- Security: git clone provider tokens (`KF_GITHUB_TOKEN`, `KF_GITLAB_TOKEN`, `KF_GITEA_TOKEN`, `KF_AZURE_TOKEN`, `KF_HUGGINGFACE_TOKEN`) are now installed as host-scoped, HTTPS-only credential helpers (`credential.https://<host>.helper`) instead of unscoped global ones, so a malicious clone target can no longer capture them via an auth challenge. Trusted hosts derive from each provider's SaaS default plus any configured `--<provider>-api-url`/`--azure-base-url`/`--endpoint`, preserving GitHub Enterprise and other self-hosted flows.
- Security: `--output` report files are opened with `O_NOFOLLOW` (with a symlink pre-check on non-Unix) so a symlink planted at the report path inside a scanned repository can no longer redirect the write to truncate or overwrite an arbitrary file.
- Security: single-stream gzip/bzip2/xz/zlib decompression is now bounded by a 512 MB decompressed-byte cap, preventing a small compression bomb from exhausting disk during a scan.
- Added 3 detection and validation rules for Cognition Devin API credentials: `kingfisher.devin.1` (legacy personal keys, `apk_user_` prefix), `kingfisher.devin.2` (legacy service keys, `apk_` prefix), and `kingfisher.devin.3` (v3 service-user tokens, `cog_` prefix / RFC 4648 base32). Live validation uses `GET /v1/sessions` for `apk_*` keys and `GET /v3/self` for `cog_` tokens.
- Fixed asymmetric JWT validation panics by using a single `jsonwebtoken` crypto backend and adding RS256 regression coverage. Thanks @AgentEnder. [#386](https://github.com/mongodb/kingfisher/pull/386)
- Validator panics now fail that validation result instead of crashing the scan, with panic payloads kept out of cached and user-visible validation responses. Thanks @AgentEnder. [#387](https://github.com/mongodb/kingfisher/pull/387)
- Reduced `failed to spawn thread` errors in validation-heavy scans by capping Tokio blocking pools for the main and artifact-fetcher runtimes and raising the Unix soft `RLIMIT_NPROC` before worker startup.
- Archive scanning now reaches inside Android/iOS app packages: added `apk`, `aab`, and `ipa` to the recognized ZIP-based archive formats so secrets embedded in APK/AAB/IPA contents (e.g. `classes*.dex`, `res/values/strings.xml`) are extracted and matched.
- Git repository scans now extract archive blobs encountered in the object database, not just on the filesystem. Previously a `.zip`/`.jar`/`.apk`/`.tar.gz` committed to a repo was scanned as raw compressed bytes, so secrets inside it were invisible. The git enumerator fans each archive entry out as a synthetic `<archive>!<entry>` blob with the original commit metadata. Honors `--no-extract-archives` for opt-out.
- Fixed tar-wrapped archive extraction for `.tgz` and `.tar.*` files, and made dependent credential validation deduplication preserve per-occurrence context so repeated secrets validate with the correct nearby companion value.
- Performance: ZIP-based git blobs ≤ 64 MB extract entirely in memory (no temp-file round trip), beating the v1.99.0 baseline by ~15% on a 80 GiB monorepo despite scanning ~300K additional archive-content blobs. Larger archives auto-fall-back to a disk-streaming extractor.
- Memory safety: hard caps on archive extraction — 64 MB compressed pre-flight, 256 MB aggregate decompressed per archive (in-memory and disk paths), 512 MB per entry, plus a `PK\x03\x04` magic-byte gate. Worst-case footprint is bounded at ~`num_jobs * 320 MB`.
- Release binary trimmed from 34 MB to 26 MB (~24% smaller). Switched `jsonwebtoken` to its `rust_crypto` backend (eliminates our scanner's pull on `aws-lc-rs`), bumped workspace `hmac` 0.12→0.13, `sha1` 0.10→0.11, `sha2` 0.10→0.11 to deduplicate our internal crypto code with the AWS sigv4 side, and migrated affected call sites in `kingfisher-core`, `kingfisher-rules`, and `kingfisher-scanner` to the digest-0.11 API (`hex::encode` for hex digests, explicit `KeyInit` import for HMAC).
- Fixed [#371](https://github.com/mongodb/kingfisher/issues/371): `pip install kingfisher-bin` on glibc Linux distros (Ubuntu, Debian, RHEL, Fedora, …) installed a macOS Mach-O binary and failed with `OSError: [Errno 8] Exec format error`. Linux wheels are now tagged `manylinux_2_17_<arch>.musllinux_1_2_<arch>` (instead of `musllinux_1_2_<arch>` only), so pip accepts them on both glibc-2.17+ and musl distros. The `pypi/hatch_build.py` hook now hard-fails when `KINGFISHER_PYPI_WHEEL_TAG` is unset, and the publish workflow refuses to upload any `py3-none-any.whl`, so the v1.92.0-era pure-Python wheel cannot recur.
-`--self-update` (alias `--update`) on a scan or other command now **re-execs into the freshly installed binary** so the current invocation completes with the new code and the latest detection rules. Previously the on-disk binary was replaced but the running process kept using the old in-memory version, requiring a second invocation to pick up the changes. On Unix this is a true `exec()` (same PID); on Windows the new binary is spawned and the parent exits with its status code. The explicit `kingfisher self-update` subcommand still updates and exits without re-execing. Self-update now also covers Windows arm64 (the asset was already published; the runtime cfg map gained the missing arm). See `docs/ADVANCED.md` → *Update Checks*.
-`--include-contributors` now respects `--github-repo-type` when enumerating contributor-owned repositories: by default contributor forks are excluded (matching the existing `Source` default), previously they were always included regardless of the flag. Added a new `--github-repo-type all` option to opt into the prior behavior of scanning both source and fork repos for contributors, organizations, and users.
- **Access Map:** Pinecone API keys (validated `kingfisher.pinecone.1`): caller resources via `GET /indexes` (with serverless cloud/region or pod environment metadata, deletion-protection state) and `GET /collections`; standalone `kingfisher access-map pinecone` (alias `pinecone.io`).
- Added `--blast-radius` as an alias for `--access-map` on `kingfisher scan`, and `kingfisher blast-radius <provider>` as an alias for the `kingfisher access-map <provider>` subcommand, so the user-facing "blast radius" concept matches the CLI invocation.
- **Webhook alerting — Discord, Mattermost, and Google Chat targets:** `--alert-format` now accepts `discord` (color-coded embeds), `mattermost` (Slack-compatible attachments), and `googlechat` (`cardsV2` cards). Discord and Google Chat URLs are auto-inferred from the webhook host; Mattermost requires `--alert-format mattermost` since it is always self-hosted. All five chat targets (Slack, Teams, Discord, Mattermost, Google Chat) plus the Generic JSON sink can be combined in a single run via repeated `--alert-webhook` flags or `alerts.webhooks` entries in `kingfisher.yaml`.
- **Webhook alerting — `--alert-detail` mode:** new `--alert-detail auto|summary|detail` flag controls per-finding verbosity. `auto` (default) renders inline findings for ≤ 25 filtered results and drops to a summary card for larger scans so high-volume runs do not flood the channel. `summary` always suppresses per-finding blocks; `detail` always renders them. Per-webhook overrides are available via `detail:` in `kingfisher.yaml`.
- **Webhook alerting — `--alert-report-url` pivot link:** pass a CI run URL (or set `KINGFISHER_ALERT_REPORT_URL`) to embed a one-click "Full report →" link in every chat payload. In GitHub Actions, pair with `github.server_url/${{ github.repository }}/actions/runs/${{ github.run_id }}` to land the responder directly in the SARIF view for that run.
- **Webhook alerting — fingerprints in chat payloads:** every finding rendered in detail mode now includes its stable `fingerprint` ID (e.g. `fp:1635470773610661884`), matching the value emitted in JSON/JSONL/SARIF/baseline outputs. SOAR playbooks and SIEM rules can use these IDs to dedupe across runs without a separate correlation step.
- **Webhook alerting — scan target in all alert modes:** the "Target" line in chat payloads now correctly reflects the actual scan target for all input modes (GitHub org/user, GitLab group, Bitbucket workspace, S3/GCS bucket, Docker image, Jira/Confluence, Slack, Teams, Postman, etc.), not just local path scans.
- **`kingfisher config init` subcommand:** convert an existing `kingfisher scan ...` invocation into a reusable `kingfisher.yaml` by replacing `scan` with `config init` (e.g. `kingfisher config init --confidence high --redact --exclude vendor/ > kingfisher.yaml`). Only flags the user actually supplied appear in the output — clap defaults are stripped — and scan-target inputs are dropped. Writes to stdout by default, or to `--out FILE` (with `--force` to overwrite).
- **Access Map UI redesign** in the report viewer: identities are now grouped into collapsible per-provider sections (admin-bearing providers first); permissions are classified by severity (admin / privilege escalation / risky / read-only) with color-coded badges and rollup chips on each card header; the expanded card body renders permissions **once per group** with a "These permissions apply to all N resources above" banner instead of repeating the same 50+ badges per resource; duplicate-named identities (e.g., multiple MongoDB `admin` tokens) now display a discriminator subtitle (`identity_id · access_type`) so they're tellable apart; new "Critical only" toolbar toggle (persisted in `localStorage`) hides read-only permissions and zero-risk identities; the stats bar gained an admin-permission count. Imported TruffleHog/Gitleaks reports keep the previous flat rendering as a backwards-compatible fallback. Underlying JSON now includes `permissions_by_severity` and an `identity.context` discriminator on each `AccessMapEntry`.
- Bounded disk usage for large multi-repo scans (e.g. `--include-contributors --repo-artifacts` against orgs with thousands of repos): cloning, artifact fetching, and scanning now run concurrently through bounded channels, and each cloned repo is removed from the temp directory as soon as its scan completes. On-disk footprint stays roughly `O(num_jobs)` regardless of total repo count instead of growing without bound. `--keep-clones` and `--git-clone-dir` opt out of the per-repo cleanup as before.
- Parallelized `--repo-artifacts` fetching with `buffer_unordered(num_jobs)` so issue/PR/wiki API calls run concurrently and stream into the scan loop, replacing the previous per-repo serial loop that delayed the start of scanning by hours on large fan-outs.
- Streamed `--format json` output as compact one-envelope-per-line so concatenated per-repo emits from the parallel scan path produce valid JSONL that `kingfisher view` can load. Pipe through `jq .` for pretty-printed output.
- Fixed a panic in the lexer when a string literal ends in a trailing backslash (`'... \`); the escape handling now clamps past-EOF so `extract_literal_values` returns instead of slicing out of bounds.
- Added first-class **Postman** scanning target: new `kingfisher scan postman` subcommand (and equivalent `--postman-*` flags) fetches workspaces, collections, and environments via the Postman API and scans them for hard-coded credentials in request `auth` blocks, pre-request/test scripts, saved example responses, and — notably — `secret`-typed environment variables, which the API returns in plaintext despite the UI mask. Selectors: `--workspace`, `--collection`, `--environment`, `--all`, with optional `--include-mocks-monitors` and `--api-url` for self-hosted endpoints. Authenticates via `KF_POSTMAN_TOKEN` (or `POSTMAN_API_KEY`) sent as `X-Api-Key`; honors `X-RateLimit-RetryAfter` on 429s. Findings link back to `https://go.postman.co/...` URLs in reports.
- Fixed [#359](https://github.com/mongodb/kingfisher/issues/359): added `kingfisher.github.9` to detect the new ~520-character stateless GitHub App installation token format (`ghs_<APP_ID>_<JWT>`). The legacy 36-character `ghs_` rule (`kingfisher.github.5`) is retained for older / GHES-issued tokens that are still in circulation.
- Added provider endpoint overrides for validation and revocation via global `--endpoint PROVIDER=URL` and `--endpoint-config FILE`, with built-in support for self-hosted GitHub, GitLab, Gitea, Jira, Confluence, and Artifactory instances.
- **Report viewer cross-tool triage:** when a Kingfisher report is loaded alongside a Gitleaks or TruffleHog report, matching imported findings are enriched with Kingfisher's validation verdict, validation response, validate command, and revoke command. Matching is keyed on `commit + file + line` with a `file + line` fallback, and enriched rows show an "Enriched by Kingfisher" callout in the detail panel plus an "Enriched" chip in the findings table. Added a **Source** column to the findings table; a new **Duplicates Removed by Tool** dashboard panel showing per-tool cards for Kingfisher / TruffleHog / Gitleaks; and an upload-time **Deduplicate findings** toggle (on by default) so users can inspect the raw rows before fingerprint dedup when needed.
- Fixed [#344](https://github.com/mongodb/kingfisher/issues/344): baseline fingerprints no longer have to be hexadecimal. The fingerprint value emitted by scan output (JSON, JSONL, pretty, SARIF) can now be copied directly into a baseline file and will match on the next scan. `--manage-baseline` now writes fingerprints in decimal to match scan output, and legacy 16-char hex (and `0x`-prefixed hex) entries continue to be accepted, so existing baseline files keep working unchanged.
- Expanded the bundled ruleset to **942 rules** (820 standalone detectors + 122 dependent rules), with **484 standalone detectors** now including live HTTP / service-specific validation.
- Documentation: expanded coverage of the **Report Viewer & Triager** across `README.md`, `docs/USAGE.md`, and the docs site (`docs-site/docs/features/report-viewer.md`, `docs-site/docs/usage/basic-scanning.md`). The same viewer is available locally via `kingfisher view <report.json>` and as a hosted static upload-based page at [https://mongodb.github.io/kingfisher/viewer/](https://mongodb.github.io/kingfisher/viewer/). Both forms import Kingfisher, Gitleaks, and TruffleHog JSON/JSONL for cross-tool triage with fingerprint-based deduplication and blast-radius rendering.
- Added archive extraction for three Korean formats: HWPX (Hancom OWPML ZIP container), HWP (Hancom 5.x OLE2/CFBF binary — streams decoded via raw DEFLATE / zlib fallbacks), and EGG (ALZip; registered for enumeration and scanned as raw bytes since no open-source extractor exists).
- Added live HTTP validation for 18 rules across 15 providers: Val Town, Polar, hCaptcha, Thunderstore, Elastic Cloud (2 rules), LlamaCloud, Gemfury (2 rules), Vonage, ThingsBoard, Zapier, Facebook Access Token, GitLab Session Cookie, PostHog Feature Flags, Unkey API Key, and Hop.io (2 rules).
- Added revocation support for 7 rules across 6 providers: Discord webhooks (single-step DELETE), DigitalOcean PATs (self-revoke via OAuth), and multi-step HttpMultiStep revocation for LaunchDarkly, Resend, Linode, and Netlify (2 rules). Built-in revocation coverage is now 34 provider families with 53 revocation-enabled rules.
- **Access Map:** monday.com API tokens (validated `kingfisher.monday.1`) and Asana personal access / OAuth tokens (validated `kingfisher.asana.3`, `kingfisher.asana.4`, `kingfisher.asana.5`). Monday maps the caller via the `me { account, teams }` GraphQL query and enumerates accessible workspaces and boards; Asana resolves the caller via `/users/me` and enumerates accessible workspaces, organizations, projects, and team memberships. Standalone `kingfisher access-map monday` and `kingfisher access-map asana`.
- **Report viewer:** Import Gitleaks and TruffleHog JSON into the bundled local viewer with deduplication for repeated imported findings, and publish a static upload-based viewer on the docs site for GitHub Pages hosting. See `docs/USAGE.md`.
- Fixed parser-based context gating so assignment-style contextual secrets still scan in raw text when parser verification is unavailable, instead of being dropped.
- Fixed dependent-variable pairing for HTTP validation so rules use the nearest helper match in-file, and updated Pinata detection/validation to reliably catch API key IDs, API secrets, and JWTs, including key+secret validation.
- Removed 17 direct dependencies from the root crate by dropping unused deps (`p256`, `ed25519-dalek`, `jsonwebtoken`, `gitlab`, `lazy_static`, `base32`, `pem`, `byteorder`, `reqwest-middleware`, `sha1`, `time`, `ring`, `num_cpus`, `strum_macros`), replacing `once_cell` with `std::sync::{LazyLock, OnceLock}`, and using `std::thread::available_parallelism()` in place of `num_cpus`. Salt generation now uses `rand` instead of `ring`, and all `strum_macros::Display` imports are consolidated under `strum::Display`.
- Migrated the workspace to Rust Edition 2024 (MSRV 1.94) and refactored nested `if let` chains in core/scanner hot paths (content-type detection, origin parsing, GCP/Harness/Azure DevOps access maps, GitHub/GitLab repo parsing, dependent-variable pairing) to use stable let-chains for flatter control flow.
- Tightened lint hygiene by converting stable `#[allow(...)]` attributes to `#[expect(...)]` across the workspace (e.g. `dead_code`, `clippy::too_many_arguments`, `clippy::large_enum_variant`) so the compiler surfaces stale suppressions as warnings.
- Fixed scan performance regression: the rule profiler was unconditionally active even without `--rule-stats`, causing RwLock contention across scan threads. Scans are now ~15% faster than v1.94.0.
- Replaced tree-sitter with a lighter parser-based context verifier built from handwritten lexers plus `tl`/`cssparser`, preserving context-dependent matching while cutting about 19 MB from the release binary.
- Added a `validation: type: Raw` exception path for provider-specific checks, with new raw validators for Azure Batch, FTP, Kraken, LDAP, RabbitMQ, and Redis. Also added stable request-scoped template values plus new Liquid filters for HMAC-SHA384 hex output and timestamp generation.
- Expanded live validation coverage for several built-in rules, including Agora, Bitfinex, DocuSign, Dwolla, GitLab, KuCoin, RingCentral, Snowflake, Tableau, Trello, and Webex. Also tightened newly added helper regex to avoid high-match scan regressions, and made preflight-blocked raw validations report as skipped/not attempted instead of failed.
- Updated vendored `vectorscan-rs` from v0.0.5 (Vectorscan 5.4.11) to v0.0.6 (Vectorscan 5.4.12). The upstream crate now ships pre-extracted sources instead of a tarball+patch, and fixes the `cpu_native` feature flag. Local Windows and musl build patches have been re-applied.
- **Access Map: added 21 new blast radius providers**, bringing the total to 39. New providers: Airtable, Algolia, Artifactory, Auth0, CircleCI, DigitalOcean, Fastly, HubSpot, IBM Cloud, Jira, MySQL, PayPal, Plaid, SendGrid, Sendinblue/Brevo, Shopify, Square, Stripe, Terraform Cloud, JFrog Xray, and Zendesk. Each provider maps leaked credentials to their effective identity, permissions, and exposed resources.
- **Access Map: expanded provider depth** for existing integrations. AWS now enumerates SQS, SNS, RDS, ECR, and SSM Parameter Store in addition to the earlier core services; Azure Storage now maps Blob containers, File shares, and Queues from account keys; OpenAI now enumerates visible models, files, assistants, and fine-tuning jobs; Hugging Face now includes datasets and Spaces alongside models; Anthropic now surfaces visible organization API keys.
- Folded in a set of safe dependency bumps from open maintenance PRs, including `strum`, `sysinfo`, `hmac`, `sha1`, `sha2`, `gitlab`, and `oci-client`, with small compatibility fixes in runtime hashing, system memory detection, and Azure signing code.
- Added Mermaid architecture documentation in `docs/ARCHITECTURE.md`, covering the main Kingfisher components, command paths, and scan flow at a high level.
- Expanded `docs/LIBRARY.md` with Mermaid diagrams showing the relationships and internal structure of `kingfisher-core`, `kingfisher-rules`, and `kingfisher-scanner`.
- Added live HTTP validation for Etsy, JFrog, Octopus Deploy, OpenShift, and Private AI where provider documentation supported reliable token-only checks.
- Added `hmac_sha256_b64key` Liquid filter for HMAC-SHA256 signing with base64-encoded keys (decodes key to raw bytes before signing), enabling correct Azure Notification Hub SAS validation.
- Integrated SLSA v3 provenance generation into the release workflow; hash computation now scopes to build artifacts only for idempotent re-runs.
- Removed Zapier webhook live validation (GET to a catch hook triggers the Zap).
- Hardened Heroku revocation regex to prevent crossing JSON object boundaries when extracting authorization IDs.
- Fixed Zendesk subdomain regex to reject trailing hyphens; renamed `ZENDESK_SUBDOMAIN` to `ZENDESK_HOST` for clarity.
- Fixed Stytch and Polymarket trailing `\b` boundaries that prevented matching base64-padded secrets ending with `=`.
- Tightened Kubernetes API Server URL pattern to require kube-specific identifiers, preventing bootstrap tokens from binding to unrelated `server:` entries.
- Added SSRF protection for credential validation: outbound HTTP requests now block connections to loopback, private, link-local, and other non-public IP addresses. HTTP redirect targets are DNS-resolved and validated against the same SSRF rules. Use `--allow-internal-ips` to opt out when scanning internal infrastructure.
- Remediated current RustSec vulnerability findings by upgrading core dependencies including `gix`, `mysql_async`, `axum`, `indicatif`, `quick-xml`, and `console`.
- Added `make audit-deps` to run `cargo audit` locally and report vulnerable dependencies.
- Refreshed pinned GitHub Actions for `swatinem/rust-cache`, `msys2/setup-msys2`, and `ncipollo/release-action`, and configured Dependabot to ignore selected GitHub Action major-version bumps.
- OpenSSF Scorecard hardening: added `SECURITY.md`, `.github/dependabot.yml`, pinned all GitHub Actions by SHA, fixed dangerous workflow expression injection patterns, added top-level `permissions: {}` to `pypi.yml`, and added SLSA provenance generation for releases.
- Added ClusterFuzzLite integration with four fuzz targets (entropy, location mapping, base64 decoding, span deduplication) and a `make fuzz` target for local fuzzing.
- Added `--max-validation-response-length <BYTES>` for `scan` to control validation response storage truncation (default: `2048`, `0` disables truncation).
- Updated `--full-validation-response` to bypass both validation storage truncation and reporter truncation, preserving complete response bodies end-to-end for parsing/reporting workflows.
- Added Testkube detection/validation coverage with `kingfisher.testkube.*` rules for API keys plus dependent organization/environment IDs used for live API validation.
- Added TOON output for `scan`, `validate`, and `revoke`, optimized for LLM/agent workflows; prefer `--format toon` when calling Kingfisher from an LLM.
- Expanded built-in revocation support with new YAML revocation flows for Cloudflare, Confluent, Doppler, Mapbox, Particle.io, Twitch, and additional Vercel token formats.
- Added revocation coverage documentation: new `docs/REVOCATION_PROVIDERS.md` matrix and README links highlighting supported revocation providers/rule IDs.
- Access Map: added Microsoft Teams provider. Parses Incoming Webhook URLs (legacy and workflow-based) to extract tenant and webhook identity, probes for active status, and reports channel-level blast radius. Supports standalone `access-map microsoftteams` (alias `msteams`) and automatic mapping for validated `kingfisher.msteams.*` and `kingfisher.microsoftteamswebhook.*` findings.
- Added Microsoft Teams scan target: `kingfisher scan teams "QUERY"` searches Teams messages via Microsoft Graph Search API and scans them for secrets, mirroring the Slack integration.
- Requires `KF_TEAMS_TOKEN` environment variable (Microsoft Graph access token with `ChannelMessage.Read.All` or `Chat.Read` permissions).
- Findings reference Teams message URLs in reports; see `docs/USAGE.md` and `docs/INTEGRATIONS.md` for authentication setup.
- Tree-sitter fallback behavior changed to be strictly additive: when parser context is unavailable, findings now fall back to Hyperscan/Vectorscan matches instead of being suppressed.
- Fixed dependent-rule reporting gaps (for example Algolia API keys) by preserving regex findings when tree-sitter is unavailable, while still marking validation as skipped when dependency inputs are missing.
- Expanded parser queries for C, Go, Java, JavaScript, and TypeScript to improve assignment/literal capture coverage (including template/raw string handling in JS/TS/Go).
- Added parser query quality gates: compile-time query validation tests plus fixture-based capture-count regression tests backed by `testdata/parsers/tree_sitter_capture_baseline.json`.
- Added inline-ignore coverage for directives placed on the line immediately before a single-line secret match.
- Updated tree-sitter documentation wording to align with `--turbo` terminology.
- Tree-sitter verification now runs for blobs from `0` bytes up to `128 KiB` (previously `1 KiB` to `64 KiB`), while remaining a post-regex verification step applied only to context-dependent candidate matches from Hyperscan/Vectorscan.
- False-positive reduction: Hyperscan/Vectorscan still scans everything first, then tree-sitter performs a second-pass verification only on auto-classified context-dependent findings; self-identifying/token-explicit findings stay regex-first.
- Hardened Perplexity API key validation to reject auth failures (`401`/`403`) and avoid false "Active Credential" results from error payloads.
- Fixed Yelp API key validation false positives by switching to an auth-enforcing endpoint (`/v3/businesses/search`) and adding explicit auth error guards.
- Tightened regex specificity for newly added rules by replacing broad variable-length token captures with explicit fixed formats/lengths and aligned examples to pass `rules check`.
- GitLab scanning: honor OS-trusted internal CAs without requiring `SSL_CERT_FILE`, and preserve custom GitLab API ports in repository enumeration and artifact fetching.
- Windows builds: replaced `buildwin.bat` flow with Makefile-driven MinGW targets for `windows-x64` and `windows-arm64`, producing static `kingfisher.exe` artifacts packaged as `kingfisher-windows-*.zip` with checksums.
- GitHub Actions (`ci.yml`, `release.yml`): Windows jobs now build and test both x64 and arm64 via a matrix using `make windows-x64` / `make windows-arm64`.
- Report viewer: added `--view-report-port` and `--view-report-address` to `kingfisher scan --view-report`, and `--address` to `kingfisher view`, so the embedded report server can bind to `0.0.0.0` and be reached from the host when running in Docker. Use `--view-report-address 0.0.0.0` with `-p 7890:7890` (or `--view-report-port 7891` with `-p 7891:7891`) to view the HTML report at http://localhost:7890 from your host.
- Updated `kingfisher scan` to accept Git repository URLs as positional targets (for example `kingfisher scan github.com/org/repo` or `kingfisher scan https://gitlab.com/group/project.git`) without requiring `--git-url`.
- Deprecated `--git-url` while preserving backward compatibility; using the flag now emits a migration warning to prefer positional URL targets.
- Updated README/integration/usage/install/demo examples and CLI tests to use positional Git URL scanning syntax.
- Jira scanning: added `kingfisher scan jira --include-comments` and `--include-changelog` to scan per-issue comments and changelog entries, with paginated Jira comment fetching and ADF text normalization preserved for issue/comment content.
- Added `--turbo` mode: sets `--commit-metadata=false`, `--no-base64`, disables language detection, and disables tree-sitter parsing...for maximum scan speed. Findings will omit Git commit context (author, date, commit hash) and will not include Base64-decoded secrets.
- SQLite database scanning: kingfisher now detects and extracts SQLite files (`.db`, `.sqlite`, `.sqlite3`, etc.), dumping each table as SQL text with named columns so secrets stored in database rows are scannable. Extraction is enabled by default and can be disabled with `--no-extract-archives`.
- Python bytecode (.pyc) scanning: extracts string constants from compiled Python (`.pyc`, `.pyo`) files via marshal parsing so secrets embedded in bytecode are scannable. Extraction is enabled by default and can be disabled with `--no-extract-archives`.
- Access Map: added Buildkite provider. Enumerates token scopes, user identity, organizations, and pipelines with severity classification based on scope risk.
- Access Map: added Harness provider. Uses `x-api-key` authentication to enumerate organizations/projects when permitted (best-effort).
- Added Weights & Biases support: new `kingfisher.wandb.2` rule for `wandb_v1_...` keys (legacy `kingfisher.wandb.1` retained), plus Access Map provider/CLI support (`weightsandbiases`, alias `wandb`).
- Reports: always emit `validate`/`revoke` command hints when supported by a rule (no suppression for missing template vars).
- Kingfisher can now generate an auditor-friendly HTML report: `--format html --output kingfisher-audit.html`
- Architecture: split `matcher.rs` into a `src/matcher/` module directory with focused sub-modules (`base64_decode`, `captures`, `conversion`, `dedup`, `filter`, `fingerprint`). Decomposed `filter_match` into smaller validation helpers.
- Architecture: refactored `scanner/runner.rs` god function into phase-based helpers (`enumerate_all_repos`, `fetch_all_artifacts`, `run_sequential_scan`, `run_parallel_scan`, etc.) with a `ValidationDeps` type alias.
- Architecture: consolidated duplicated matching primitives (base64 detection, dedup, fingerprinting, secret capture selection) into `kingfisher-scanner::primitives` as the single source of truth; both the scanner crate and binary now share one implementation.
- Architecture: introduced `TokenAccessMapper` trait for access map providers, implemented for GitHub, GitLab, Slack, HuggingFace, Gitea, and Bitbucket.
- Architecture: moved `content_type` module to `kingfisher-core` crate where it logically belongs (zero binary-crate dependencies).
- Library crates: added an external-consumer integration test (`tests/library_crates_external_project.rs`) and fixed `kingfisher-scanner` manifest wiring by making `serde` a required dependency, ensuring `kingfisher-core`/`kingfisher-rules`/`kingfisher-scanner` compile and run from a non-kingfisher Rust project.
- Added revocation for GitHub App Server-to-Server tokens (`ghs_`, `kingfisher.github.5`) via `DELETE /installation/token`. Note: `ghu_` (user-to-server) tokens cannot be self-revoked; they require the GitHub App's client credentials or manual revocation via GitHub Settings.
- Viewer: replaced the Access Map tree view with a card-based layout showing identity, resource count, permission tags, and token details at a glance with expandable inline detail.
- Viewer: added per-finding Blast Radius section linking findings to their access map entries with an auto-generated risk rationale (critical/high/medium/low) based on credential status, resource count, and permission severity.
- Viewer: added two new report types — Risk Report (findings + blast radius per credential, for researchers/bug bounty) and Scan Report (executive summary + scan metadata + findings table, for defenders/tickets). Both support "Active credentials only" filtering.
- Viewer: redesigned the Access Map export report to match the Scan/Risk report quality with summary stats, per-identity cards, token details, and resource/permission grids.
- Viewer: added scan metadata bar (timestamp, target, duration, version) to the Dashboard view.
- Added Vercel credential rules for new token formats introduced February 2026: `vcp_` (personal access), `vci_` (integration), `vca_` (app access), `vcr_` (app refresh), `vck_` (AI Gateway API key). All use CRC32/Base62 checksum validation. Legacy 24-char format retained as `kingfisher.vercel.1`.
- Added revocation support for Vercel app tokens (`vca_`, `vcr_`) via `https://api.vercel.com/login/oauth/token/revoke`. Requires `VERCEL_APP_CLIENT_ID` (or `NEXT_PUBLIC_VERCEL_APP_CLIENT_ID`) and `VERCEL_APP_CLIENT_SECRET`.
- Fixed validate/revoke command generation to omit regex named captures (e.g., `BODY`, `CHECKSUM`) when they are not used by validation/revocation templates, so rules like Vercel no longer produce unnecessary `--var BODY=...` arguments.
- Fixed HTTP validation incorrectly marking valid credentials as inactive when response bodies exceeded 2048 bytes. Matchers (`JsonValid`, `WordMatch`, etc.) now run against the full response; only the stored preview remains truncated for reporting.
- Added optional validation rate limiting via `--validation-rps` (global) and repeatable `--validation-rps-rule <RULE_SELECTOR=RPS>` (per-rule override) for both `scan` and `validate`. Throttling now applies across built-in validator types (HTTP/gRPC plus AWS, GCP, Coinbase, MongoDB, Postgres, MySQL, JDBC, JWT, and Azure Storage). Rule selectors support the short form (for example, `github=2` matches `kingfisher.github.*`) with longest-prefix precedence when multiple selectors apply.
- Prevented transient HTTP validation failures (429/5xx) from being cached, avoiding cache poisoning that could suppress later successful validations in the same scan.
- Added `kingfisher.temporal.1` rule for Temporal Cloud API keys (namespace-scoped and user-scoped JWT formats) with Temporal-specific pattern matching.
- Added Temporal Cloud active credential validation via `GET https://saas-api.tmprl.cloud/cloud/current-identity` using bearer auth, so Temporal keys validate against provider APIs instead of generic OIDC discovery.
- Fixed JWT issuer normalization to treat bare host issuers (e.g. `iss: "temporal.io"`) as HTTPS URLs during discovery, avoiding low-level URL builder failures.
- Added `crates/kingfisher-rules/build.rs` to ensure embedded rule assets rebuild when files under `crates/kingfisher-rules/data` change.
- Added `--full-validation-response` flag to include complete validation response bodies without truncation. By default, validation responses are still truncated to 512 characters for readability. When enabled, users can parse and present full validation responses as needed (e.g., for GitHub token validation responses that include user metadata beyond the first 512 characters).
- Fixed AWS access key validation to support temporary/session keys (ASIA prefix) in addition to long-lived keys (AKIA prefix).
- Consolidated all validator implementations into the `kingfisher-scanner` crate to eliminate code duplication. Validators for AWS, Azure, Coinbase, GCP, JWT, JDBC, MongoDB, MySQL, Postgres, and HTTP are now maintained in a single location with proper feature gating.
- Added "Skipped Validations" counter to scan summary output to distinguish between validations that failed (HTTP errors, connection failures) and validations that were skipped due to missing preconditions (e.g., missing dependent rules). This provides better visibility into validation coverage for large scans.
- Improved error messages for `kingfisher validate` command when rules require dependent variables from `depends_on` sections. Now clearly explains which variables are needed and from which dependent rules they are normally captured.
- Fixed `validate_command` and `revoke_command` generation in scan output to include all required `--var` arguments for rules with `depends_on` sections (e.g., PubNub, Azure Storage). Commands now include dependent variables like `--var SUBSCRIPTIONTOKEN=<value>` or `--var AZURENAME=<value>`.
- Updated Azure Storage validation to use `AZURENAME` variable (matching the `depends_on_rule` configuration) with `STORAGE_ACCOUNT` maintained as a backward-compatible alias.
- Added internal `dependent_captures` field to match records to preserve variables from dependent rules through the validation pipeline for accurate command generation.
- Added `--tls-mode <strict|lax|off>` global flag to control TLS certificate validation behavior during credential validation:
-`strict` (default): Full WebPKI certificate validation with trusted CA chains, hostname verification, and expiration checks
-`lax`: Accept self-signed or unknown CA certificates, useful for database connections (PostgreSQL, MySQL, MongoDB) and services using private CAs (e.g., Amazon RDS)
-`off`: Disable all TLS validation (equivalent to legacy `--ignore-certs`)
- Added rule-level `tls_mode` field allowing individual rules to opt into relaxed TLS validation when appropriate. Rules for PostgreSQL, MySQL, MongoDB, JDBC, and JWT now include `tls_mode: lax` by default.
- The `--ignore-certs` flag remains supported as a deprecated alias for `--tls-mode=off` for backward compatibility.
- Updated documentation to explain TLS validation modes and their security implications.
- Added comprehensive test coverage for TLS mode functionality including unit tests, integration tests, and rule configuration verification.
- Added `validate_command` and `revoke_command` fields to scan output (pretty, JSON, JSONL, BSON, SARIF formats) showing the exact `kingfisher validate` or `kingfisher revoke` command to run for each finding. The `validate_command` is included for all findings with validation support; `revoke_command` is included only for active credentials with revocation support. These fields are omitted when `--redact` is used since they contain the secret value.
- Added `kingfisher-auto` pre-commit hook that automatically downloads and caches the appropriate binary for your platform (no Docker or manual installation required).
- Added `kingfisher-pre-commit-auto.sh` and `kingfisher-pre-commit-auto.ps1` scripts for automatic binary download in Git hooks (Linux, macOS, Windows support).
- Fixed validation deduplication for rules with nested unnamed captures (e.g. `(?<REGEX>...(ABC|DEF)...)`) to use the primary capture for grouping, ensuring each unique match triggers a separate validation request.
- Added trace-level (`-vv`) logging for internal validation dedup keys and grouping to aid debugging.
- Skipped per-repository report writes when an output file is specified and emit a single aggregated report after multi-repository scans to preserve full output content in files.
- Will now prefer git history findings when identical secrets appear in both current files and git history (dedup only).
- Fixed report viewer to add support for opening JSONL.
- Add opt-in contributor repository enumeration for GitHub/GitLab `--git-url` scans with `--include-contributors`, plus `--repo-clone-limit` to cap repo cloning.
- Add `--git-clone-dir` to set the parent clone directory and `--keep-clones` to preserve cloned repos after scans.
- Added several new rules.
- Added configurable validation timeout and retry settings for `kingfisher scan`.
- Added `--staged` argument to support new `pre-commit` behavior and added integration coverage to ensure validated secrets block commits when used as pre-commit hook
- Added new rules for AWS Bedrock, Voyage.ai, Posthog, Atlassian
- Added an embedded web-based report and access-map viewer via `kingfisher view` subcommand that can load JSON or JSONL reports passed on the CLI (or upload them in the browser)
- Reduced per-match memory usage by compacting stored source locations and interning repeated capture names.
- Stored optional validation response bodies as boxed strings to avoid allocating empty payloads and to streamline validator caches.
- Parallelized git cloning based on the configured job count and begin scanning repositories as soon as each clone finishes to reduce end-to-end scan times.
- Combined per-repository results into a single aggregate summary after scans complete.
- Added initial access-map support and report viewer html file. Currently beta features.
- Filter out empty 'KF_BITBUCKET_*' environment values when constructing the Bitbucket authentication configuration so blank variables no longer override valid credentials
- Added `pattern_requirements` checks to rules, providing lightweight post-regex character-class validation without lookarounds. See docs/RULES.md for detail
- Added an `ignore_if_contains` option to `pattern_requirements` to drop matches containing case-insensitive placeholder words, with tests covering the new behavior.
- Added checksum comparisons to `pattern_requirements`, new `suffix`, `crc32`, and `base62` Liquid filters, and verbose logging so mismatched checksums are skipped with context rather than reported as findings.
- Split GitHub token detections into fine-grained/fixed-format variants and enforce checksum validation for modern GitHub token families (PAT, OAuth, App, refresh) while preserving legacy coverage.
- Fixed local filesystem scans to keep `open_path_as_is` enabled when opening Git repositories and only disable it for diff-based scans.
- Created Linux and Windows specific installer script
- Updated diff-focused scanning so `--branch-root-commit` can be provided alongside `--branch`, letting you diff from a chosen commit while targeting a specific branch tip (still defaulting back to the `--branch` ref when the commit is omitted).
- Removed the `--bitbucket-username`, `--bitbucket-token`, and `--bitbucket-oauth-token` flags in favour of `KF_BITBUCKET_*` environment variables when authenticating to Bitbucket.
- Added provider-specific `kingfisher scan` subcommands (for example `kingfisher scan github …`) that translate into the legacy flags under the hood. The new layout keeps backwards compatibility while removing the wall of provider options from `kingfisher scan --help`.
- Updated the README so every provider example (GitHub, GitLab, Bitbucket, Azure Repos, Gitea, Hugging Face, Slack, Jira, Confluence, S3, GCS, Docker) uses the new subcommand style.
- Legacy provider flags (for example `--github-user`, `--gitlab-group`, `--bitbucket-workspace`, `--s3-bucket`) still work but now emit a deprecation warning to encourage migration to the new `kingfisher scan <provider>` flow.
- Kept the direct `kingfisher scan /path/to/dir` flow for local filesystem / local git repo scans while adding a `--list-only` switch to each provider subcommand so repository enumeration no longer requires the standalone `github repos`, `gitlab repos`, etc. commands.
- Removed the legacy top-level provider commands (`kingfisher github`, `kingfisher gitlab`, `kingfisher gitea`, `kingfisher bitbucket`, `kingfisher azure`, `kingfisher huggingface`) now that enumeration lives under `kingfisher scan <provider> --list-only`.
- Fixed `kingfisher scan github …` (and other provider-specific subcommands) so they no longer demand placeholder path arguments before the CLI accepts the request.
- Fixed `kingfisher scan` so that providing `--branch` without `--since-commit` now diffs the branch against the empty tree and scans every commit reachable from that branch.
- Added first-class Hugging Face scanning support, including CLI enumeration, token authentication, and integration with remote scans.
- Condensed GitError formatting to report the exit status and the first informative lines from stdout/stderr, producing concise git clone failure logs.
- Added support for scanning Google Cloud Storage buckets via `--gcs-bucket`, including optional prefixes and service-account authentication.
- Added `--skip-aws-account` (now accepting comma-separated values) and `--skip-aws-account-file` to bypass live AWS validation for known canary/honey-token account IDs without triggering alerts. Kingfisher now ships with several canary AWS account IDs pre-seeded in the skip list and now reports matching findings as "Not Attempted" with the "Response" containing "(skip list entry)" so it's clear that validation was intentionally skipped and why.
- Added: repeatable `--ignore-comment <TOKEN>` flag to reuse inline directives from other scanners (for example `NOSONAR`, `kics-scan ignore`, `gitleaks:allow`, etc)
- Respect user color settings in update messages by using the same color helper as the main reporter, ensuring consistent output and no ANSI codes on update check, when color is disabled
- Added first-class Gitea support, including CLI commands, environment-based authentication, documentation, and integration with scans and repository enumeration.
- Enabled ANSI formatting in the tracing formatter whenever stderr is attached to a terminal so colorized updater messages render correctly instead of showing escape sequences.
- Added diff-only Git scanning via `--since-commit` and `--branch`, including remote-aware ref resolution so CI jobs can pair `--git-url` clones with pull request branches
- Added `--github-exclude` and `--gitlab-exclude` options to skip specific repositories when scanning or listing GitHub and GitLab sources, including support for gitignore-style glob patterns
- Replaced quadratic match filtering with a per-rule span map, fixing missed secrets in extremely large files and improving scan performance
- Support scanning extremely large files by chunking input into 1GiB segments with small overlaps, avoiding vectorscan buffer limits while preserving match offsets
- Always use chunked vectorscan, eliminating the slow regex fallback for blobs over 4GiB
- Skip Base64 scanning for blobs over 64 MB to avoid a second pass over massive files
- Increased max-file-size default to 64 MB (up from 25 MB)
- Decode Base64 blobs and scan their contents for secrets while skipping short strings for performance. This has a small performance impact and can be disabled with `--no-base64`
- Added rules for sendbird, mattermost, langchain, notion
- JWT validation hardened to reject alg:none by default (only allowed if explicitly configured), require iss for OIDC/JWKS verification, ensuring "Active Credential" means cryptographically verified and time-valid, not just unexpired
- Updated the Git cloning logic to include all refs and minimize clone output, allowing Kingfisher to analyze pull request and deleted branch history
- Expanded directory exclusion handling to interpret plain patterns as prefixes, ensuring options like --exclude .git also skip all nested paths
- Updated baseline management to track encountered findings and remove entries that are no longer present, saving the baseline file whenever entries are pruned or new matches are added
- Dropped the “prevalidated” flag from rule definitions and validation logic so every finding now flows through the standard active/inactive/unknown pipeline, simplifying rule configuration and preventing special‑case bypasses
- GitLab: Matched GitLab group repository listings to glab by only enumerating projects that belong directly to each group, without automatically traversing subgroups
- Remote scans with `--git-history=none` now clone repositories with a working tree and scan the current files instead of erroring with "No inputs to scan".
- New `--self-update` flag installs updates when available
- New `--no-update-check` flag disables update checks
- Updated rules
## [1.11.0] 2025-06-21
- Increased default value for number of scanning jobs to improve validation speed
- Fixed issue where some API responses (e.g. GitHub's `/user` endpoint) include required fields like `"name"` beyond the first 512 bytes. Truncating earlier causes `WordMatch` checks to fail even for active credentials. Increased the limit to keep a larger slice of the body while still bounding memory usage.
## [1.10.0] 2025-06-20
- Updated de-dupe fingerprint to include the content of the match