Updated kingfisher scan to accept Git repository URLs as positional targets (for example kingfisher scan github.com/org/repo or kingfisher scan https://gitlab.com/group/project.git) without requiring --git-url.

Mick Grove 2026-02-26 23:14:18 -07:00
commit 0ae4e8445c
25 changed files with 333 additions and 87 deletions

@ -3,6 +3,10 @@
All notable changes to this project will be documented in this file.
## [v1.85.0]
- Report viewer: added `--view-report-port` and `--view-report-address` to `kingfisher scan --view-report`, and `--address` to `kingfisher view`, so the embedded report server can bind to `0.0.0.0` and be reached from the host when running in Docker. Use `--view-report-address 0.0.0.0` with `-p 7890:7890` (or `--view-report-port 7891` with `-p 7891:7891`) to view the HTML report at http://localhost:7890 from your host.
- Updated `kingfisher scan` to accept Git repository URLs as positional targets (for example `kingfisher scan github.com/org/repo` or `kingfisher scan https://gitlab.com/group/project.git`) without requiring `--git-url`.
- Deprecated `--git-url` while preserving backward compatibility; using the flag now emits a migration warning to prefer positional URL targets.
- Updated README/integration/usage/install/demo examples and CLI tests to use positional Git URL scanning syntax.
- Added `--turbo` mode: sets `--commit-metadata=false`, `--no-base64`, disables language detection, and disables tree-sitter parsing...for maximum scan speed. Findings will omit Git commit context (author, date, commit hash) and will not include Base64-decoded secrets.
- SQLite database scanning: kingfisher now detects and extracts SQLite files (`.db`, `.sqlite`, `.sqlite3`, etc.), dumping each table as SQL text with named columns so secrets stored in database rows are scannable. Controlled by the existing `--extract-archives` flag.
- Python bytecode (.pyc) scanning: extracts string constants from compiled Python (`.pyc`, `.pyo`) files via marshal parsing so secrets embedded in bytecode are scannable. Controlled by `--extract-archives`.

@ -75,7 +75,7 @@ NOTE: Replay has been slowed down for demo
## Report Viewer Demo
Explore Kingfisher's built-in report viewer and its `--access-map`, which can show what the token (AWS, GCP, Azure, GitHub, GitLab, and Slack...more coming) can actually access.
Note: when you pass `--view-report`, Kingfisher starts a **localhost-only** web server on port `7890` and opens it in your default browser. You'll see this near the end of the scan output, and **Kingfisher will keep running** until you stop it.
Note: when you pass `--view-report`, Kingfisher starts a web server on port `7890` (default) and opens it in your default browser. By default it binds to `127.0.0.1` for security. You'll see this near the end of the scan output, and **Kingfisher will keep running** until you stop it.
```bash
INFO kingfisher::cli::commands::view: Starting access-map viewer address=127.0.0.1:7890
@ -242,13 +242,44 @@ KF_SLACK_TOKEN="xoxp-..." kingfisher scan slack "api_key OR password"
docker run --rm -v "$PWD":/src ghcr.io/mongodb/kingfisher:latest scan /src
```
### 19: Output JSON results
### 19: Run with Docker and view report in browser
To run a scan in Docker and view the HTML report on your host machine, use `--view-report-address 0.0.0.0` so the server is reachable from outside the container, and map the port with `-p`:
```bash
docker run --rm \
-v "$PWD":/src \
-p 7890:7890 \
ghcr.io/mongodb/kingfisher:latest \
scan https://github.com/leaktk/fake-leaks \
--access-map \
--view-report \
--view-report-address 0.0.0.0
```
Then open **http://localhost:7890** in your browser. If port 7890 is already in use, use `--view-report-port` and map accordingly:
```bash
docker run --rm \
-v "$PWD":/src \
-p 7891:7891 \
ghcr.io/mongodb/kingfisher:latest \
scan https://github.com/leaktk/fake-leaks \
--access-map \
--view-report \
--view-report-port 7891 \
--view-report-address 0.0.0.0
```
Then open **http://localhost:7891**.
### 20: Output JSON results
```bash
kingfisher scan /path/to/code --format json --output findings.json
```
### 20: Map blast radius of discovered credentials
### 21: Map blast radius of discovered credentials
```bash
kingfisher scan /path/to/code --access-map --view-report

@ -240,11 +240,10 @@ kingfisher scan /path/to/local/repo --branch <ref>
kingfisher scan C:\\src\\repo --branch <commit-hash>
```
The same diff-focused workflow works when cloning repositories on the fly with `--git-url`. Kingfisher automatically tries remote-tracking names like `origin/main` and `origin/feature-1`, so you can target the branches involved in a pull request without performing a local checkout first.
The same diff-focused workflow works when cloning repositories on the fly by passing a Git URL directly to `scan`. Kingfisher automatically tries remote-tracking names like `origin/main` and `origin/feature-1`, so you can target the branches involved in a pull request without performing a local checkout first.
```bash
kingfisher scan \
--git-url https://github.com/org/repo.git \
kingfisher scan https://github.com/org/repo.git \
--since-commit main \
--branch development
```
@ -256,16 +255,14 @@ When `--since-commit` is omitted, specifying `--branch` scans the requested ref
kingfisher scan ~/tmp/repo --branch feature-123
# Or scan a branch when cloning on the fly
kingfisher scan \
--git-url https://github.com/org/repo.git \
kingfisher scan https://github.com/org/repo.git \
--branch origin/feature-123
```
In CI systems that expose the base and head commits explicitly, you can pass those SHAs directly while still using `--git-url`:
In CI systems that expose the base and head commits explicitly, you can pass those SHAs directly while scanning a Git URL:
```bash
kingfisher scan \
--git-url git@github.com:org/repo.git \
kingfisher scan https://github.com/org/repo.git \
--since-commit "$BASE_COMMIT" \
--branch "$PR_HEAD_COMMIT"
```
@ -341,8 +338,8 @@ kingfisher scan /path/to/repo --rule-stats
- `--no-base64`: By default, Kingfisher finds and decodes base64 blobs and scans them for secrets. This adds a slight performance overhead; use this flag to disable it.
- `--confidence <LEVEL>`: (low|medium|high)
- `--min-entropy <VAL>`: Override default threshold
- `--include-contributors`: When using `--git-url` for GitHub or GitLab, include contributor-owned repos in the scan
- `--git-clone-dir <DIR>`: Choose the parent directory for cloned repos and scan artifacts (use with `--git-url`)
- `--include-contributors`: When scanning GitHub or GitLab URLs, include contributor-owned repos in the scan
- `--git-clone-dir <DIR>`: Choose the parent directory for cloned repos and scan artifacts (use with Git URL scans)
- `--keep-clones`: Preserve cloned repositories on disk after a scan completes
- `--repo-clone-limit <N>`: Cap the number of GitHub/GitLab repositories cloned when enumerating orgs/groups or contributor repos
- `--no-binary`: Skip binary files

@ -341,7 +341,7 @@ docker run --rm \
-e KF_GITHUB_TOKEN=ghp_… \
-v "$PWD":/proj \
ghcr.io/mongodb/kingfisher:latest \
scan --git-url https://github.com/org/private_repo.git
scan https://github.com/org/private_repo.git
# Scan an S3 bucket
# Credentials can come from KF_AWS_KEY/KF_AWS_SECRET, --role-arn, or --profile
@ -377,6 +377,15 @@ docker run --rm \
scan /src \
--format json \
--output /out/findings.json
# Scan and view the HTML report in your browser (Docker)
# Use --view-report-address 0.0.0.0 and -p to expose the report server to the host
docker run --rm \
-v "$PWD":/src \
-p 7890:7890 \
ghcr.io/mongodb/kingfisher:latest \
scan /src --access-map --view-report --view-report-address 0.0.0.0
# Then open http://localhost:7890 in your browser
```
## PyPI Wheels

@ -157,7 +157,8 @@ kingfisher scan github --organization my-org \
### Scan remote GitHub repository
`--git-url` clones the repository and scans its files and history. When the URL
Pass a repository URL as a positional scan target to clone and scan its files and history.
(The legacy `--git-url` flag still works but is deprecated.) When the URL
targets GitHub and you pass `--include-contributors`, Kingfisher enumerates
repository contributors and attempts to clone **all public repos owned by those
contributors**—a common offensive and blue-team pivot when developers leak
@ -176,9 +177,9 @@ extras counts against API rate limits and private artifacts require a
Use `--git-clone-dir` to choose where cloned repositories land and
`--keep-clones` to preserve them for follow-on analysis.
> **Why does `--git-url` sometimes report fewer findings than scanning a local checkout?**
> **Why can scanning a remote URL report fewer findings than scanning a local checkout?**
>
> Remote clones created via `--git-url` default to `--mirror`/bare mode so Kingfisher only
> Remote clones default to `--mirror`/bare mode so Kingfisher only
> reads the Git history. When you point Kingfisher at an existing working tree (for example
> `kingfisher scan ./repo`), it enumerates both the filesystem contents *and* the Git
> history. Any secrets that are present in the checked-out files therefore appear twice:
@ -188,23 +189,23 @@ Use `--git-clone-dir` to choose where cloned repositories land and
```bash
# Scan the repository only
kingfisher scan --git-url https://github.com/org/repo.git
kingfisher scan github.com/org/repo
# Scan the repository plus contributor repos, but cap the crawl
kingfisher scan --git-url https://github.com/org/repo.git \
kingfisher scan https://github.com/org/repo.git \
--include-contributors \
--repo-clone-limit 250
# Keep clones for later manual inspection
kingfisher scan --git-url https://github.com/org/repo.git \
kingfisher scan https://github.com/org/repo.git \
--git-clone-dir ./kingfisher-clones \
--keep-clones
# Include issues, wiki, and owner gists
kingfisher scan --git-url https://github.com/org/repo.git --repo-artifacts
kingfisher scan https://github.com/org/repo.git --repo-artifacts
# Private repositories or artifacts
KF_GITHUB_TOKEN="ghp_…" kingfisher scan --git-url https://github.com/org/private_repo.git --repo-artifacts
KF_GITHUB_TOKEN="ghp_…" kingfisher scan https://github.com/org/private_repo.git --repo-artifacts
```
## GitLab
@ -239,7 +240,7 @@ kingfisher scan gitlab --group my-group \
### Scan remote GitLab repository by URL
`--git-url` by itself clones the project repository. When the URL targets
A Git URL target by itself clones the project repository. When the URL targets
GitLab and you pass `--include-contributors`, Kingfisher enumerates contributors
and tries to clone **their other public projects** to catch secrets that escape
the main repo. Apply `--repo-clone-limit` to cap the total repos cloned during
@ -258,23 +259,23 @@ to preserve them for later review.
```bash
# Scan the repository only
kingfisher scan --git-url https://gitlab.com/group/project.git
kingfisher scan gitlab.com/group/project.git
# Scan the repository plus contributor projects, but cap the crawl
kingfisher scan --git-url https://gitlab.com/group/project.git \
kingfisher scan https://gitlab.com/group/project.git \
--include-contributors \
--repo-clone-limit 250
# Keep clones for later manual inspection
kingfisher scan --git-url https://gitlab.com/group/project.git \
kingfisher scan https://gitlab.com/group/project.git \
--git-clone-dir ./kingfisher-clones \
--keep-clones
# Include issues, wiki, and snippets
kingfisher scan --git-url https://gitlab.com/group/project.git --repo-artifacts
kingfisher scan https://gitlab.com/group/project.git --repo-artifacts
# Private projects or artifacts
KF_GITLAB_TOKEN="glpat-…" kingfisher scan --git-url https://gitlab.com/group/private_project.git --repo-artifacts
KF_GITLAB_TOKEN="glpat-…" kingfisher scan https://gitlab.com/group/private_project.git --repo-artifacts
```
### List GitLab repositories
@ -360,17 +361,17 @@ kingfisher scan gitea --organization my-org \
### Scan remote Gitea repository by URL
`--git-url` clones the repository and scans its history. Adding `--repo-artifacts`
A Git URL target clones the repository and scans its history. Adding `--repo-artifacts`
also clones the repository wiki if one exists. Private repositories and wikis
require `KF_GITEA_TOKEN` (and `KF_GITEA_USERNAME` when cloning via HTTPS).
```bash
# Scan the repository only
kingfisher scan --git-url https://gitea.com/org/repo.git
kingfisher scan https://gitea.com/org/repo.git
# Include the repository wiki (if present)
KF_GITEA_TOKEN="gtoken" KF_GITEA_USERNAME="org" \
kingfisher scan --git-url https://gitea.com/org/repo.git --repo-artifacts
kingfisher scan https://gitea.com/org/repo.git --repo-artifacts
```
### List Gitea repositories
@ -414,17 +415,17 @@ kingfisher scan bitbucket --workspace my-team \
### Scan remote Bitbucket repository by URL
`--git-url` clones the repository and scans its files and history. To inspect
A Git URL target clones the repository and scans its files and history. To inspect
Bitbucket artifacts such as issues, add `--repo-artifacts`. Private artifacts
require credentials (see [Authenticate to Bitbucket](#authenticate-to-bitbucket)).
```bash
# Scan the repository only
kingfisher scan --git-url https://bitbucket.org/hashashash/secretstest.git
kingfisher scan https://bitbucket.org/hashashash/secretstest.git
# Include repository issues
KF_BITBUCKET_TOKEN="$BITBUCKET_TOKEN" \
kingfisher scan --git-url https://bitbucket.org/workspace/project.git --repo-artifacts
kingfisher scan https://bitbucket.org/workspace/project.git --repo-artifacts
```
### List Bitbucket repositories

@ -105,7 +105,7 @@ Add `--access-map` to enrich JSON, JSONL, BSON, pretty, and SARIF reports with a
kingfisher view kingfisher.json
```
The `view` subcommand starts a local-only server (default port `7890`) that bundles the HTML, CSS, and JavaScript for the access-map viewer directly into the Kingfisher binary. Provide a JSON or JSONL report to load it automatically and Kingfisher will open your browser, or open the page and upload a report in the browser. If port 7890 is already in use, Kingfisher will exit and tell you to re-run with `--port <PORT>`.
The `view` subcommand starts a server (default port `7890`, bind address `127.0.0.1`) that bundles the HTML, CSS, and JavaScript for the access-map viewer directly into the Kingfisher binary. Provide a JSON or JSONL report to load it automatically and Kingfisher will open your browser, or open the page and upload a report in the browser. If port 7890 is already in use, re-run with `--port <PORT>`. To allow access from Docker or other hosts, use `--address 0.0.0.0`.
### Pipe any text directly into Kingfisher by passing `-`
@ -348,11 +348,10 @@ kingfisher scan /path/to/local/repo --branch <ref>
kingfisher scan C:\\src\\repo --branch <commit-hash>
```
The same diff-focused workflow works when cloning repositories on the fly with `--git-url`. Kingfisher automatically tries remote-tracking names like `origin/main` and `origin/feature-1`, so you can target the branches involved in a pull request without performing a local checkout first.
The same diff-focused workflow works when cloning repositories on the fly by passing a Git URL directly to `scan`. Kingfisher automatically tries remote-tracking names like `origin/main` and `origin/feature-1`, so you can target the branches involved in a pull request without performing a local checkout first.
```bash
kingfisher scan \
--git-url https://github.com/org/repo.git \
kingfisher scan https://github.com/org/repo.git \
--since-commit main \
--branch development
```
@ -364,16 +363,14 @@ When `--since-commit` is omitted, specifying `--branch` scans the requested ref
kingfisher scan ~/tmp/repo --branch feature-123
# Or scan a branch when cloning on the fly
kingfisher scan \
--git-url https://github.com/org/repo.git \
kingfisher scan https://github.com/org/repo.git \
--branch origin/feature-123
```
In CI systems that expose the base and head commits explicitly, you can pass those SHAs directly while still using `--git-url`:
In CI systems that expose the base and head commits explicitly, you can pass those SHAs directly while scanning a Git URL:
```bash
kingfisher scan \
--git-url git@github.com:org/repo.git \
kingfisher scan https://github.com/org/repo.git \
--since-commit "$BASE_COMMIT" \
--branch "$PR_HEAD_COMMIT"
```
@ -530,7 +527,7 @@ kingfisher scan github --organization my-org \
### Scan remote GitHub repository
`--git-url` clones the repository and scans its files and history. When the URL targets GitHub and you pass `--include-contributors`, Kingfisher enumerates repository contributors and attempts to clone **all public repos owned by those contributors**—a common offensive and blue-team pivot when developers leak secrets in personal or side projects. Use `--repo-clone-limit` to cap how many repositories are cloned during this enumeration.
Pass a repository URL as a positional scan target to clone and scan its files and history. (The legacy `--git-url` flag still works but is deprecated.) When the URL targets GitHub and you pass `--include-contributors`, Kingfisher enumerates repository contributors and attempts to clone **all public repos owned by those contributors**—a common offensive and blue-team pivot when developers leak secrets in personal or side projects. Use `--repo-clone-limit` to cap how many repositories are cloned during this enumeration.
**NOTE**: This may cause you to be temporarily rate-limited by GitHub. Providing a token (`KF_GITHUB_TOKEN`) raises the rate limit.
@ -538,29 +535,29 @@ To inspect related server-side data, supply `--repo-artifacts`. This flag pulls
Use `--git-clone-dir` to choose where cloned repositories land and `--keep-clones` to preserve them for follow-on analysis.
> **Why does `--git-url` sometimes report fewer findings than scanning a local checkout?**
> **Why can scanning a remote URL report fewer findings than scanning a local checkout?**
>
> Remote clones created via `--git-url` default to `--mirror`/bare mode so Kingfisher only reads the Git history. When you point Kingfisher at an existing working tree (for example `kingfisher scan ./repo`), it enumerates both the filesystem contents *and* the Git history. Any secrets that are present in the checked-out files therefore appear twice: once from the working tree path and once from the commit where the secret entered the history. To replicate the remote behavior locally, either scan a bare clone or disable history scanning with `--git-history none` when targeting a working tree.
> Remote clones default to `--mirror`/bare mode so Kingfisher only reads the Git history. When you point Kingfisher at an existing working tree (for example `kingfisher scan ./repo`), it enumerates both the filesystem contents *and* the Git history. Any secrets that are present in the checked-out files therefore appear twice: once from the working tree path and once from the commit where the secret entered the history. To replicate the remote behavior locally, either scan a bare clone or disable history scanning with `--git-history none` when targeting a working tree.
```bash
# Scan the repository only
kingfisher scan --git-url https://github.com/org/repo.git
kingfisher scan github.com/org/repo
# Scan the repository plus contributor repos, but cap the crawl
kingfisher scan --git-url https://github.com/org/repo.git \
kingfisher scan https://github.com/org/repo.git \
--include-contributors \
--repo-clone-limit 250
# Keep clones for later manual inspection
kingfisher scan --git-url https://github.com/org/repo.git \
kingfisher scan https://github.com/org/repo.git \
--git-clone-dir ./kingfisher-clones \
--keep-clones
# Include issues, wiki, and owner gists
kingfisher scan --git-url https://github.com/org/repo.git --repo-artifacts
kingfisher scan https://github.com/org/repo.git --repo-artifacts
# Private repositories or artifacts
KF_GITHUB_TOKEN="ghp_…" kingfisher scan --git-url https://github.com/org/private_repo.git --repo-artifacts
KF_GITHUB_TOKEN="ghp_…" kingfisher scan https://github.com/org/private_repo.git --repo-artifacts
```
---
@ -594,7 +591,7 @@ kingfisher scan gitlab --group my-group \
### Scan remote GitLab repository by URL
`--git-url` by itself clones the project repository. When the URL targets GitLab and you pass `--include-contributors`, Kingfisher enumerates contributors and tries to clone **their other public projects** to catch secrets that escape the main repo. Apply `--repo-clone-limit` to cap the total repos cloned during this pivot.
A Git URL target by itself clones the project repository. When the URL targets GitLab and you pass `--include-contributors`, Kingfisher enumerates contributors and tries to clone **their other public projects** to catch secrets that escape the main repo. Apply `--repo-clone-limit` to cap the total repos cloned during this pivot.
**NOTE**: This may cause you to be temporarily rate-limited by GitLab. Providing a token (`KF_GITLAB_TOKEN`) raises the rate limit.
@ -604,23 +601,23 @@ Use `--git-clone-dir` to choose where cloned projects land and `--keep-clones` t
```bash
# Scan the repository only
kingfisher scan --git-url https://gitlab.com/group/project.git
kingfisher scan gitlab.com/group/project.git
# Scan the repository plus contributor projects, but cap the crawl
kingfisher scan --git-url https://gitlab.com/group/project.git \
kingfisher scan https://gitlab.com/group/project.git \
--include-contributors \
--repo-clone-limit 250
# Keep clones for later manual inspection
kingfisher scan --git-url https://gitlab.com/group/project.git \
kingfisher scan https://gitlab.com/group/project.git \
--git-clone-dir ./kingfisher-clones \
--keep-clones
# Include issues, wiki, and snippets
kingfisher scan --git-url https://gitlab.com/group/project.git --repo-artifacts
kingfisher scan https://gitlab.com/group/project.git --repo-artifacts
# Private projects or artifacts
KF_GITLAB_TOKEN="glpat-…" kingfisher scan --git-url https://gitlab.com/group/private_project.git --repo-artifacts
KF_GITLAB_TOKEN="glpat-…" kingfisher scan https://gitlab.com/group/private_project.git --repo-artifacts
```
### List GitLab repositories
@ -705,15 +702,15 @@ kingfisher scan gitea --organization my-org \
### Scan remote Gitea repository by URL
`--git-url` clones the repository and scans its history. Adding `--repo-artifacts` also clones the repository wiki if one exists. Private repositories and wikis require `KF_GITEA_TOKEN` (and `KF_GITEA_USERNAME` when cloning via HTTPS).
A Git URL target clones the repository and scans its history. Adding `--repo-artifacts` also clones the repository wiki if one exists. Private repositories and wikis require `KF_GITEA_TOKEN` (and `KF_GITEA_USERNAME` when cloning via HTTPS).
```bash
# Scan the repository only
kingfisher scan --git-url https://gitea.com/org/repo.git
kingfisher scan https://gitea.com/org/repo.git
# Include the repository wiki (if present)
KF_GITEA_TOKEN="gtoken" KF_GITEA_USERNAME="org" \
kingfisher scan --git-url https://gitea.com/org/repo.git --repo-artifacts
kingfisher scan https://gitea.com/org/repo.git --repo-artifacts
```
### List Gitea repositories
@ -757,15 +754,15 @@ kingfisher scan bitbucket --workspace my-team \
### Scan remote Bitbucket repository by URL
`--git-url` clones the repository and scans its files and history. To inspect Bitbucket artifacts such as issues, add `--repo-artifacts`. Private artifacts require credentials (see [Authenticate to Bitbucket](#authenticate-to-bitbucket)).
A Git URL target clones the repository and scans its files and history. To inspect Bitbucket artifacts such as issues, add `--repo-artifacts`. Private artifacts require credentials (see [Authenticate to Bitbucket](#authenticate-to-bitbucket)).
```bash
# Scan the repository only
kingfisher scan --git-url https://bitbucket.org/hashashash/secretstest.git
kingfisher scan https://bitbucket.org/hashashash/secretstest.git
# Include repository issues
KF_BITBUCKET_TOKEN="$BITBUCKET_TOKEN" \
kingfisher scan --git-url https://bitbucket.org/workspace/project.git --repo-artifacts
kingfisher scan https://bitbucket.org/workspace/project.git --repo-artifacts
```
### List Bitbucket repositories

@ -6,7 +6,7 @@ Set TypingSpeed 60ms
Set Framerate 60
Set PlaybackSpeed 1.3
Type "kingfisher scan --git-url https://github.com/leaktk/fake-leaks.git --access-map --view-report"
Type "kingfisher scan https://github.com/leaktk/fake-leaks.git --access-map --view-report"
Enter
Wait+Screen@30s /(report|findings|summary|kingfisher)/

@ -31,8 +31,8 @@ pub struct InputSpecifierArgs {
#[arg(num_args = 0.., value_hint = ValueHint::AnyPath)]
pub path_inputs: Vec<PathBuf>,
/// Clone and scan the Git repository at the given URL
#[arg(long, value_hint = ValueHint::Url)]
/// Deprecated: clone and scan a Git repository URL. Prefer positional targets: `kingfisher scan github.com/org/repo`
#[arg(long = "git-url", value_hint = ValueHint::Url)]
pub git_url: Vec<GitUrl>,
/// Parent directory for cloned Git repositories and scan artifacts
@ -421,7 +421,14 @@ impl InputSpecifierArgs {
}
/// Emit deprecation warnings for legacy top-level provider flags.
pub fn emit_deprecated_warnings(&self) {
pub fn emit_deprecated_warnings(&self, used_legacy_git_url_flag: bool) {
if used_legacy_git_url_flag {
warn_deprecated_provider(
"Git URL",
"Passing repository URLs with `--git-url` is deprecated. Pass the URL as a positional scan target instead, e.g. `kingfisher scan github.com/org/repo`.",
);
}
if self.using_legacy_github_flags() {
warn_deprecated_provider(
"GitHub",

@ -1,6 +1,10 @@
use anyhow::bail;
use clap::{Args, Subcommand, ValueEnum, ValueHint};
use std::path::{Path, PathBuf};
use std::{
net::IpAddr,
path::{Path, PathBuf},
str::FromStr,
};
use strum::Display;
use tracing::debug;
use url::Url;
@ -17,6 +21,7 @@ use crate::{
inputs::{ContentFilteringArgs, InputSpecifierArgs},
output::{OutputArgs, ReportOutputFormat},
rules::RuleSpecifierArgs,
view,
},
global::RAM_GB,
},
@ -202,6 +207,11 @@ pub struct ScanArgs {
/// Disable rule-level `ignore_if_contains` filtering for pattern requirements
#[arg(global = true, long = "no-ignore-if-contains", default_value_t = false)]
pub no_ignore_if_contains: bool,
#[arg(skip)]
pub view_report_port: u16,
#[arg(skip)]
pub view_report_address: String,
}
/// Confidence levels for findings
@ -232,6 +242,24 @@ pub struct ScanCommandArgs {
#[arg(global = true, long = "view-report", default_value_t = false)]
pub view_report: bool,
/// Port for the report viewer when using --view-report (default 7890)
#[arg(
global = true,
long = "view-report-port",
default_value_t = view::DEFAULT_PORT,
value_name = "PORT"
)]
pub view_report_port: u16,
/// Bind address for the report viewer when using --view-report (default 127.0.0.1). Use 0.0.0.0 to allow access from Docker or other hosts.
#[arg(
global = true,
long = "view-report-address",
default_value = view::DEFAULT_ADDRESS,
value_name = "ADDRESS"
)]
pub view_report_address: String,
#[command(subcommand)]
pub provider: Option<ScanInputCommand>,
}
@ -253,11 +281,34 @@ pub enum ListRepositoriesCommand {
}
impl ScanCommandArgs {
fn infer_positional_git_urls(&mut self) {
let mut inferred_git_urls = Vec::new();
let mut retained_paths = Vec::new();
for path in self.scan_args.input_specifier_args.path_inputs.drain(..) {
if path.as_path() == Path::new("-") || path.exists() {
retained_paths.push(path);
continue;
}
if let Some(git_url) = parse_git_url_target(&path) {
inferred_git_urls.push(git_url);
} else {
retained_paths.push(path);
}
}
self.scan_args.input_specifier_args.path_inputs = retained_paths;
self.scan_args.input_specifier_args.git_url.extend(inferred_git_urls);
}
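
The partitioning step in `infer_positional_git_urls` can be sketched in isolation. This is a minimal, std-only sketch rather than the crate's code: `partition_inputs` and its `classify` callback are hypothetical names, and URLs are returned as plain `String`s instead of `GitUrl` values.

```rust
use std::path::{Path, PathBuf};

/// Splits positional inputs into (inputs kept as paths, inputs reclassified as URLs).
/// `classify` is a hypothetical stand-in for `parse_git_url_target`.
fn partition_inputs<F>(inputs: Vec<PathBuf>, classify: F) -> (Vec<PathBuf>, Vec<String>)
where
    F: Fn(&Path) -> Option<String>,
{
    let mut paths = Vec::new();
    let mut urls = Vec::new();
    for input in inputs {
        // "-" (stdin) and anything that exists on disk always stay paths,
        // so the classifier never sees them.
        if input.as_path() == Path::new("-") || input.exists() {
            paths.push(input);
            continue;
        }
        match classify(&input) {
            Some(url) => urls.push(url),
            // Not URL-like: keep it as a path and let later stages report it.
            None => paths.push(input),
        }
    }
    (paths, urls)
}
```

Because `-` and existing paths are retained before classification runs, a local directory whose name happens to look like `host/org/repo` is still scanned as a path. In the diff, `used_legacy_git_url_flag` is read before this step runs, so URLs inferred from positional targets do not trigger the `--git-url` deprecation warning.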
/// Convert CLI arguments into a scan or repository-listing operation.
pub fn into_operation(mut self) -> anyhow::Result<ScanOperation> {
let mut used_provider_subcommand = false;
self.scan_args.view_report = self.view_report;
self.scan_args.view_report_port = self.view_report_port;
self.scan_args.view_report_address = self.view_report_address.clone();
if let Some(provider) = self.provider.take() {
used_provider_subcommand = true;
@ -466,9 +517,12 @@ impl ScanCommandArgs {
}
}
let used_legacy_git_url_flag = !self.scan_args.input_specifier_args.git_url.is_empty();
self.infer_positional_git_urls();
if !self.scan_args.input_specifier_args.has_any_input() {
bail!(
"Specify a path, --git-url, or use a provider subcommand such as 'kingfisher scan github'"
"Specify a path or Git URL (for example: 'kingfisher scan github.com/org/repo'), or use a provider subcommand such as 'kingfisher scan github'"
);
}
@ -483,7 +537,7 @@ impl ScanCommandArgs {
}
if !used_provider_subcommand {
self.scan_args.input_specifier_args.emit_deprecated_warnings();
self.scan_args.input_specifier_args.emit_deprecated_warnings(used_legacy_git_url_flag);
}
if self.scan_args.manage_baseline {
@ -503,6 +557,44 @@ impl ScanCommandArgs {
}
}
fn parse_git_url_target(path: &Path) -> Option<GitUrl> {
let raw = path.to_str()?.trim();
if raw.is_empty() || raw == "-" || raw.contains('\\') {
return None;
}
if let Ok(url) = GitUrl::from_str(raw) {
return Some(url);
}
if raw.contains("://")
|| raw.starts_with('/')
|| raw.starts_with("./")
|| raw.starts_with("../")
|| raw.starts_with('~')
{
return None;
}
let (host, suffix) = raw.split_once('/')?;
if host.is_empty() || suffix.is_empty() {
return None;
}
let path_segments = suffix.split('/').filter(|segment| !segment.is_empty()).count();
if path_segments < 2 {
return None;
}
let host_looks_valid =
host.contains('.') || host == "localhost" || host.parse::<IpAddr>().is_ok();
if !host_looks_valid {
return None;
}
GitUrl::from_str(&format!("https://{raw}")).ok()
}
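
The shorthand heuristic in `parse_git_url_target` can be exercised on its own. The sketch below mirrors the checks shown above under two assumptions: it returns a normalized `String` rather than the crate's `GitUrl`, and it accepts explicit `http(s)://` URLs by prefix where the real code defers to `GitUrl::from_str`. The name `normalize_git_target` is hypothetical.

```rust
use std::net::IpAddr;

/// Hypothetical stand-in for `parse_git_url_target`, using only std types.
fn normalize_git_target(raw: &str) -> Option<String> {
    let raw = raw.trim();
    // Empty input, stdin, and Windows-style paths are never URLs.
    if raw.is_empty() || raw == "-" || raw.contains('\\') {
        return None;
    }
    // Assumption: explicit http(s) URLs pass through unchanged.
    if raw.starts_with("https://") || raw.starts_with("http://") {
        return Some(raw.to_string());
    }
    // Other schemes and obvious filesystem paths are rejected.
    if raw.contains("://")
        || raw.starts_with('/')
        || raw.starts_with("./")
        || raw.starts_with("../")
        || raw.starts_with('~')
    {
        return None;
    }
    let (host, suffix) = raw.split_once('/')?;
    if host.is_empty() || suffix.is_empty() {
        return None;
    }
    // Require at least `owner/repo` after the host.
    let segments = suffix.split('/').filter(|s| !s.is_empty()).count();
    if segments < 2 {
        return None;
    }
    // The host must look like a domain, `localhost`, or an IP address.
    let host_ok = host.contains('.') || host == "localhost" || host.parse::<IpAddr>().is_ok();
    if !host_ok {
        return None;
    }
    Some(format!("https://{raw}"))
}
```

Under these rules `github.com/org/repo` becomes `https://github.com/org/repo`, while `github.com/org` (one segment) and `src/lib/mod.rs` (host without a dot) are left alone. A nonexistent relative path whose first component contains a dot would still pass the host check, which is why the caller tests `path.exists()` first.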
#[derive(Subcommand, Debug, Clone)]
pub enum ScanInputCommand {
/// Scan local files, directories, or Git repositories
@ -552,7 +644,7 @@ pub struct FilesystemScanArgs {
#[arg(value_name = "PATH", value_hint = ValueHint::AnyPath)]
pub paths: Vec<PathBuf>,
/// Git repository URLs to clone and scan
/// Deprecated: Git repository URLs to clone and scan. Prefer positional targets.
#[arg(long = "git-url", value_hint = ValueHint::Url)]
pub git_url: Vec<GitUrl>,
}

@ -22,6 +22,9 @@ pub const DEFAULT_PORT: u16 = 7890;
// Embedded viewer assets - force rebuild
static VIEWER_ASSETS: Dir<'_> = include_dir!("$CARGO_MANIFEST_DIR/docs/access-map-viewer");
/// Default bind address for the report viewer (localhost only for security).
pub const DEFAULT_ADDRESS: &str = "127.0.0.1";
/// View a Kingfisher access-map report locally.
#[derive(clap::Args, Debug)]
pub struct ViewArgs {
@ -33,6 +36,10 @@ pub struct ViewArgs {
#[arg(long, default_value_t = DEFAULT_PORT)]
pub port: u16,
/// Bind address for the report viewer (default 127.0.0.1). Use 0.0.0.0 to allow access from Docker or other hosts.
#[arg(long, default_value = DEFAULT_ADDRESS, value_name = "ADDRESS")]
pub address: String,
#[arg(skip)]
pub open_browser: bool,
@ -45,8 +52,10 @@ struct AppState {
report: Option<Vec<u8>>,
}
pub fn ensure_port_available(port: u16) -> Result<()> {
StdTcpListener::bind(("127.0.0.1", port)).map_err(|err| match err.kind() {
pub fn ensure_port_available(port: u16, address: &str) -> Result<()> {
let addr: std::net::IpAddr =
address.parse().context("Invalid bind address for report viewer")?;
StdTcpListener::bind((addr, port)).map_err(|err| match err.kind() {
std::io::ErrorKind::AddrInUse => anyhow!(
"Port {} is already in use. Re-run with --port <PORT> to choose a different port.",
port
@ -81,14 +90,15 @@ pub async fn run(args: ViewArgs) -> Result<()> {
None
};
let listener =
TcpListener::bind(("127.0.0.1", args.port)).await.map_err(|err| match err.kind() {
std::io::ErrorKind::AddrInUse => anyhow!(
"Port {} is already in use. Re-run with --port <PORT> to choose a different port.",
args.port
),
_ => err.into(),
})?;
let addr: std::net::IpAddr =
args.address.parse().context("Invalid bind address for report viewer")?;
let listener = TcpListener::bind((addr, args.port)).await.map_err(|err| match err.kind() {
std::io::ErrorKind::AddrInUse => anyhow!(
"Port {} is already in use. Re-run with --port <PORT> to choose a different port.",
args.port
),
_ => err.into(),
})?;
let address: SocketAddr =
listener.local_addr().context("Failed to read local listener address")?;
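
The hunks above parse the bind address as a `std::net::IpAddr` before binding, so only IP literals such as `127.0.0.1` or `0.0.0.0` are accepted; a host name like `localhost` fails at the parse step. A minimal standard-library sketch of that pattern follows. The helper name `check_bind` and its `String` error type are hypothetical, chosen to keep the example free of `anyhow` and `tokio`; it is not the project's code.

```rust
use std::io::ErrorKind;
use std::net::{IpAddr, TcpListener};

/// Hypothetical stand-in for the viewer's port check (illustration only).
/// Parses `address` as an IP literal, then tries to bind `address:port`.
fn check_bind(address: &str, port: u16) -> Result<TcpListener, String> {
    // Parsing happens first, so a bad address is reported before any bind attempt.
    let ip: IpAddr = address
        .parse()
        .map_err(|_| format!("invalid bind address: {address}"))?;

    // (IpAddr, u16) implements ToSocketAddrs, so it can be passed to bind directly.
    TcpListener::bind((ip, port)).map_err(|err| match err.kind() {
        ErrorKind::AddrInUse => format!("port {port} is already in use"),
        _ => err.to_string(),
    })
}

fn main() {
    // Port 0 asks the OS for any free port, so this does not collide with other services.
    let listener = check_bind("127.0.0.1", 0).expect("loopback bind should succeed");
    println!("bound to {}", listener.local_addr().unwrap());

    // Host names are not IP literals, so they are rejected at the parse step.
    assert!(check_bind("localhost", 0).is_err());
}
```

Binding to `0.0.0.0` exposes the listener on every interface, which is why the diff keeps `127.0.0.1` as the default.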

@ -964,6 +964,8 @@ pub(crate) fn create_minimal_scan_args() -> crate::cli::commands::scan::ScanArgs
turbo: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_timeout: 10,
validation_retries: 1,
validation_rps: None,

@ -239,7 +239,10 @@ async fn async_main(args: CommandLineArgs) -> Result<()> {
Command::Scan(scan_command) => match scan_command.into_operation()? {
ScanOperation::Scan(mut scan_args) => {
if scan_args.view_report {
view::ensure_port_available(view::DEFAULT_PORT)?;
view::ensure_port_available(
scan_args.view_report_port,
&scan_args.view_report_address,
)?;
}
let view_scan_started_at = chrono::Local::now();
let view_scan_start_time = Instant::now();
@ -320,7 +323,8 @@ async fn async_main(args: CommandLineArgs) -> Result<()> {
let report_bytes = serde_json::to_vec_pretty(&envelope)?;
let view_args = view::ViewArgs {
report: None,
port: view::DEFAULT_PORT,
port: scan_args.view_report_port,
address: scan_args.view_report_address.clone(),
open_browser: true,
report_bytes: Some(report_bytes),
};
@ -580,6 +584,8 @@ fn create_default_scan_args() -> cli::commands::scan::ScanArgs {
turbo: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: view::DEFAULT_PORT,
view_report_address: view::DEFAULT_ADDRESS.to_string(),
validation_timeout: 10,
validation_retries: 1,
validation_rps: None,

@ -1792,6 +1792,8 @@ mod tests {
skip_aws_account_file: None,
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_timeout: 10,
validation_retries: 1,
validation_rps: None,

@ -196,6 +196,8 @@ mod tests {
turbo: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_timeout: 10,
validation_retries: 1,
validation_rps: None,

@ -12,7 +12,6 @@ fn parse_git_clone_dir_and_keep_clones() -> anyhow::Result<()> {
let args = CommandLineArgs::try_parse_from([
"kingfisher",
"scan",
"--git-url",
"https://github.com/octocat/Hello-World.git",
"--git-clone-dir",
dir.path().to_str().unwrap(),
@ -41,8 +40,7 @@ fn keep_clones_defaults_to_false() -> anyhow::Result<()> {
let args = CommandLineArgs::try_parse_from([
"kingfisher",
"scan",
"--git-url",
"https://github.com/octocat/Hello-World.git",
"github.com/octocat/Hello-World",
"--no-update-check",
])?;
@ -62,6 +60,70 @@ fn keep_clones_defaults_to_false() -> anyhow::Result<()> {
Ok(())
}
#[test]
fn deprecated_git_url_flag_still_parses() -> anyhow::Result<()> {
let args = CommandLineArgs::try_parse_from([
"kingfisher",
"scan",
"--git-url",
"https://github.com/octocat/Hello-World.git",
"--no-update-check",
])?;
let command = match args.command {
Command::Scan(scan_args) => scan_args,
other => panic!("unexpected command parsed: {:?}", other),
};
let scan_args = match command.into_operation()? {
ScanOperation::Scan(scan_args) => scan_args,
op => panic!("expected scan operation, got {:?}", op),
};
assert_eq!(scan_args.input_specifier_args.git_url.len(), 1);
assert_eq!(
scan_args.input_specifier_args.git_url[0].as_str(),
"https://github.com/octocat/Hello-World.git"
);
assert!(scan_args.input_specifier_args.path_inputs.is_empty());
Ok(())
}
#[test]
fn positional_git_url_examples_parse() -> anyhow::Result<()> {
let examples = [
("github.com/kubernetes/kubernetes", "https://github.com/kubernetes/kubernetes"),
("https://github.com/org/repo", "https://github.com/org/repo"),
("gitlab.com/gitlab-org/gitlab", "https://gitlab.com/gitlab-org/gitlab"),
(
"https://gitlab.com/namespace/project.git",
"https://gitlab.com/namespace/project.git",
),
];
for (input, expected) in examples {
let args =
CommandLineArgs::try_parse_from(["kingfisher", "scan", input, "--no-update-check"])?;
let command = match args.command {
Command::Scan(scan_args) => scan_args,
other => panic!("unexpected command parsed: {:?}", other),
};
let scan_args = match command.into_operation()? {
ScanOperation::Scan(scan_args) => scan_args,
op => panic!("expected scan operation, got {:?}", op),
};
assert_eq!(scan_args.input_specifier_args.git_url.len(), 1);
assert_eq!(scan_args.input_specifier_args.git_url[0].as_str(), expected);
assert!(scan_args.input_specifier_args.path_inputs.is_empty());
}
Ok(())
}
#[test]
fn turbo_mode_applies_speed_first_defaults() -> anyhow::Result<()> {
let args = CommandLineArgs::try_parse_from([
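
The `positional_git_url_examples_parse` test above expects scheme-less targets such as `github.com/kubernetes/kubernetes` to come out as `https://` URLs, while targets that already carry a scheme pass through unchanged. The diff does not show how Kingfisher tells a repository URL apart from a local path, so the following is only a sketch under an assumed heuristic (a first segment containing a dot, followed by at least two path segments). The function name `normalize_git_target` is hypothetical.

```rust
/// Hypothetical normalizer (illustration only, not Kingfisher's implementation).
/// Returns Some(url) when the input looks like a remote Git repository,
/// or None when it should be treated as a local path.
fn normalize_git_target(input: &str) -> Option<String> {
    // Inputs that already carry an HTTP(S) scheme are kept as they are.
    if input.starts_with("https://") || input.starts_with("http://") {
        return Some(input.to_string());
    }

    // Assumed heuristic: "<host.with.dot>/<owner>/<repo>" is a remote repository.
    let mut parts = input.splitn(2, '/');
    let host = parts.next()?;
    let rest = parts.next()?;
    let looks_like_host = host.contains('.') && !host.starts_with('.');

    if looks_like_host && rest.contains('/') {
        Some(format!("https://{input}"))
    } else {
        None
    }
}

fn main() {
    for target in ["github.com/kubernetes/kubernetes", "https://github.com/org/repo", "./src/main.rs"] {
        match normalize_git_target(target) {
            Some(url) => println!("{target} -> clone {url}"),
            None => println!("{target} -> scan as local path"),
        }
    }
}
```

A heuristic like this misclassifies a relative directory such as `my.dir/sub/file`, so a real implementation would likely also check whether the path exists on disk.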

@ -158,6 +158,8 @@ fn run_skiplist(skip_regex: Vec<String>, skip_skipword: Vec<String>) -> Result<u
turbo: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_retries: 1,
validation_rps: None,
validation_rps_rule: Vec::new(),

@ -158,6 +158,8 @@ fn test_bitbucket_remote_scan() -> Result<()> {
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_retries: 1,
validation_rps: None,
validation_rps_rule: Vec::new(),

@ -178,6 +178,8 @@ rules:
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_retries: 1,
validation_rps: None,
validation_rps_rule: Vec::new(),

@ -165,6 +165,8 @@ fn test_github_remote_scan() -> Result<()> {
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_retries: 1,
validation_rps: None,
validation_rps_rule: Vec::new(),

@ -163,6 +163,8 @@ fn test_gitlab_remote_scan() -> Result<()> {
turbo: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_retries: 1,
validation_rps: None,
validation_rps_rule: Vec::new(),
@ -331,6 +333,8 @@ fn test_gitlab_remote_scan_no_history() -> Result<()> {
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_retries: 1,
validation_rps: None,
validation_rps_rule: Vec::new(),

@ -141,6 +141,8 @@ async fn test_redact_hashes_finding_values() -> Result<()> {
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_retries: 1,
validation_rps: None,
validation_rps_rule: Vec::new(),

@ -146,6 +146,8 @@ impl TestContext {
turbo: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_retries: 1,
validation_rps: None,
validation_rps_rule: Vec::new(),
@ -300,6 +302,8 @@ async fn test_scan_slack_messages() -> Result<()> {
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_retries: 1,
validation_rps: None,
validation_rps_rule: Vec::new(),

@ -221,6 +221,8 @@ async fn test_validation_cache_and_depvars() -> Result<()> {
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_retries: 1,
validation_rps: None,
validation_rps_rule: Vec::new(),

@ -164,6 +164,8 @@ impl TestContext {
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_retries: 1,
validation_rps: None,
validation_rps_rule: Vec::new(),
@ -305,6 +307,8 @@ impl TestContext {
turbo: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report_port: 7890,
view_report_address: "127.0.0.1".to_string(),
validation_retries: 1,
validation_rps: None,
validation_rps_rule: Vec::new(),

@ -4,7 +4,7 @@ use predicates::str::contains;
#[test]
fn scan_homebrew_github_no_findings() -> anyhow::Result<()> {
Command::new(assert_cmd::cargo::cargo_bin!("kingfisher"))
.args(["scan", "--git-url", "https://github.com/homebrew/.github", "--no-update-check"])
.args(["scan", "https://github.com/homebrew/.github", "--no-update-check"])
.assert()
.success()
.stdout(contains("|Findings....................: 0"))