MongoDB Kingfisher - secret detection and live validation

Kingfisher

Kingfisher is a blazingly fast secret-scanning and validation tool built in Rust. It combines Intel's hardware-accelerated Hyperscan regex engine with language-aware parsing via Tree-Sitter, and ships with hundreds of built-in rules to detect, validate, and triage secrets before they ever reach production.

Kingfisher originated as a fork of Nosey Parker by Praetorian Security, Inc, and is built atop their incredible work and the work contributed by the Nosey Parker community.

Kingfisher extends Nosey Parker with live secret validation via cloud-provider APIs, augments regex detection with tree-sitter for code parsing, adds GitLab support, and builds a Windows x64 binary.

MongoDB Blog: Introducing Kingfisher: Real-Time Secret Detection and Validation

Key Features

  • Performance: Multithreaded, Hyperscan-powered scanning for massive codebases
  • Language-Aware Accuracy: AST parsing in 20+ languages via Tree-Sitter reduces context-less regex matches. See docs/PARSING.md
  • Built-In Validation: Hundreds of built-in detection rules, many with live-credential validators that call the relevant service APIs (AWS, Azure, GCP, Stripe, etc.) to confirm a secret is active. You can extend or override the library by adding YAML-defined rules on the command line—see docs/RULES.md for details
  • Git History Scanning: Scan local repos, remote GitHub/GitLab orgs/users, or arbitrary GitHub/GitLab repos

Getting Started

Installation

On macOS, install with Homebrew:

brew install kingfisher

Pre-built binaries are also available in the Releases section of this page.

Or you may compile for your platform via make:

# NOTE: Requires Docker
make linux
# macOS
make darwin
# Windows x64 --- requires building from a Windows host with Visual Studio installed
./buildwin.bat -force
# Build all targets
make linux-all # builds both x64 and arm64
make darwin-all # builds both x64 and arm64
make all # builds for every OS and architecture supported

Write Custom Rules!

Kingfisher ships with hundreds of rules with HTTP and service-specific validation checks (AWS, Azure, GCP, etc.) to confirm if a detected string is a live credential.

However, you may want to add your own custom rules, or modify a detection to better suit your needs / environment.

First, review docs/RULES.md to learn how to create custom Kingfisher rules.

Once you've done that, you can define your custom rules in a YAML file and provide it to Kingfisher at runtime, with no recompiling required!

Usage

Basic Examples

Note

  kingfisher scan detects whether the input is a Git repository or a plain directory—no extra flags required.

Scan with secret validation

kingfisher scan /path/to/code
## NOTE: This path can refer to:
# 1. a local git repo
# 2. a directory with many git repos
# 3. or just a folder with files and subdirectories

## To explicitly prevent scanning git commit history add:
#   `--git-history=none`

Scan a directory containing multiple Git repositories

kingfisher scan /projects/monorepodir

Scan a Git repository without validation

kingfisher scan ~/src/myrepo --no-validate

Display only secrets confirmed active by third-party APIs

kingfisher scan /path/to/repo --only-valid

Output JSON and capture to a file

kingfisher scan . --format json | tee kingfisher.json

Output SARIF directly to disk

kingfisher scan /path/to/repo --format sarif --output findings.sarif

Pipe any text directly into Kingfisher by passing -

cat /path/to/file.py | kingfisher scan -
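
Because kingfisher scan - reads text from standard input, it can also be fed the staged portion of a Git diff. The sketch below wraps that in a function that could be called from a pre-commit hook. scan_staged_changes is a hypothetical helper name, and the sketch assumes both git and kingfisher are on the PATH.

```shell
# Hypothetical helper: scan only what is staged for the next commit.
# Assumes git and kingfisher are on PATH. The function's exit status is
# kingfisher's, because it is the last command in the pipeline.
scan_staged_changes() {
  git diff --cached | kingfisher scan -
}

# Example use inside .git/hooks/pre-commit (left commented out here):
# scan_staged_changes || { echo "possible secret in staged changes" >&2; exit 1; }
```

Since the diff is scanned as plain text, any finding refers to the piped stream, not to a file path in the repository.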

Scan using a rule family with one flag

(prefix matching: --rule kingfisher.aws loads kingfisher.aws.*)

# Only apply AWS-related rules (kingfisher.aws.1 + kingfisher.aws.2)
kingfisher scan /path/to/repo --rule kingfisher.aws
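
Read literally, the prefix rule above means a rule ID belongs to a family when it equals the family name or continues it after a dot. The following is a minimal sketch of that reading, not Kingfisher's own matching code. rule_in_family is a hypothetical helper, and the dot-boundary behavior is an assumption.

```shell
# Hypothetical helper illustrating dot-delimited prefix matching.
# Returns 0 when rule_id is the family itself or a dotted descendant of it.
# The dot boundary is an assumption about how --rule prefixes are applied.
rule_in_family() {
  family=$1
  rule_id=$2
  case "$rule_id" in
    "$family" | "$family".*) return 0 ;;
    *) return 1 ;;
  esac
}

# rule_in_family kingfisher.aws kingfisher.aws.1    -> succeeds
# rule_in_family kingfisher.aws kingfisher.azure.1  -> fails
```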

Display rule performance statistics

kingfisher scan /path/to/repo --rule-stats

Scan while ignoring likely test files

# Scan source but skip likely unit / integration tests
kingfisher scan ./my-project --ignore-tests

Exclude specific paths

# Skip all Python files and any directory named tests
kingfisher scan ./my-project \
  --exclude '*.py' \
  --exclude tests

To see which files are being skipped, enable verbose output (-v) when scanning. It reports any files skipped by the baseline file or by --exclude:

# Skip all Python files and any directory named tests, and report to stderr any skipped files
kingfisher scan ./my-project \
  --exclude '*.py' \
  --exclude tests \
  -v

Scanning GitHub

Scan GitHub organization (requires KF_GITHUB_TOKEN)

kingfisher scan --github-organization my-org

Scan remote GitHub repository

kingfisher scan --git-url https://github.com/org/repo.git

# Optionally provide a GitHub Token
KF_GITHUB_TOKEN="ghp_…" kingfisher scan --git-url https://github.com/org/private_repo.git


Scanning GitLab

Scan GitLab group (requires KF_GITLAB_TOKEN)

kingfisher scan --gitlab-group my-group

Scan GitLab user

kingfisher scan --gitlab-user johndoe

Scan remote GitLab repository by URL

kingfisher scan --git-url https://gitlab.com/group/project.git

List GitLab repositories

kingfisher gitlab repos list --group my-group

Environment Variables for Tokens

Variable          Purpose
KF_GITHUB_TOKEN   GitHub Personal Access Token
KF_GITLAB_TOKEN   GitLab Personal Access Token

Set them temporarily per command:

KF_GITLAB_TOKEN="glpat-…" kingfisher scan --gitlab-group my-group

Or export for the session:

export KF_GITLAB_TOKEN="glpat-…"

If no token is provided, Kingfisher still works for public repositories.


Exit Codes

Code   Meaning
0      No findings
200    Findings discovered
205    Validated findings discovered
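
In CI, these exit codes can drive a pass/fail decision. The sketch below shows one possible policy: fail only on validated findings. gate_on_exit_code is a hypothetical helper, and treating any other code as a failure is an assumption, not documented behavior.

```shell
# Hypothetical CI gate: map a Kingfisher exit code to a message and a
# pass (0) / fail (1) status. The policy is illustrative only.
gate_on_exit_code() {
  case "$1" in
    0)   echo "pass: no findings"; return 0 ;;
    200) echo "warn: findings discovered"; return 0 ;;
    205) echo "fail: validated findings discovered"; return 1 ;;
    *)   echo "fail: unexpected exit code $1"; return 1 ;;
  esac
}

# Example use (left commented out here):
# code=0; kingfisher scan /path/to/code || code=$?
# gate_on_exit_code "$code" || exit 1
```

A stricter pipeline could return 1 for code 200 as well. The point is only that 200 and 205 can be told apart without parsing the report.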

Update Checks

Kingfisher automatically queries GitHub for a newer release when it starts and tells you whether an update is available.

  • Hands-free updates: Add --self-update to any Kingfisher command

    • If a newer version exists, Kingfisher will download it, replace the running binary, and re-launch itself with the exact same arguments
    • If the update fails or no newer release is found, the current run proceeds as normal
  • Disable version checks: Pass --no-update-check to skip both the startup and shutdown checks entirely


List Built-in Rules

kingfisher rules list

To scan using only your own my_rules.yaml you could run:

kingfisher scan \
  --load-builtins=false \
  --rules-path path/to/my_rules.yaml \
  ./src/

To add your rules alongside the built-ins:

kingfisher scan \
  --rules-path ./custom-rules/ \
  --rules-path my_rules.yml \
  ~/path/to/project-dir/

Other Examples

# Check custom rules - this ensures all regular expressions compile, and can match the rule's `examples` in the YML file
kingfisher rules check --rules-path ./my_rules.yml

# List GitHub repos
kingfisher github repos list --user my-user
kingfisher github repos list --organization my-org

Notable Scan Options

  • --no-dedup: Report every occurrence of a finding (disables the default deduplication behavior)
  • --confidence <LEVEL>: Set the confidence level (low|medium|high)
  • --min-entropy <VAL>: Override the default entropy threshold
  • --no-binary: Skip binary files
  • --no-extract-archives: Do not scan inside archives
  • --extraction-depth <N>: Specifies how deep nested archives should be extracted and scanned (default: 2)
  • --redact: Replaces discovered secrets with a one-way hash for secure output
  • --ignore-tests: Skip files or directories whose path component contains test, spec, fixture, example, or sample (case-insensitive)
  • --exclude <PATTERN>: Skip any file or directory whose path matches this glob pattern (repeatable, uses gitignore-style syntax)
  • --baseline-file <FILE>: Ignore matches listed in a baseline YAML file
  • --manage-baseline: Create or update the baseline file with current findings
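
To make the --ignore-tests description above concrete, here is a minimal sketch of a check that follows its wording: a case-insensitive substring match on the path. It is an approximation for illustration, not Kingfisher's actual filter, and looks_like_test_path is a hypothetical helper.

```shell
# Hypothetical helper: succeed when any path component contains one of the
# listed keywords, ignoring case. The keywords contain no "/", so matching
# against the whole lowercased path is equivalent to checking each component.
looks_like_test_path() {
  lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$lower" in
    *test* | *spec* | *fixture* | *example* | *sample*) return 0 ;;
    *) return 1 ;;
  esac
}

# looks_like_test_path src/Tests/auth.py   -> succeeds
# looks_like_test_path src/main.rs         -> fails
```

Under this literal reading, a name such as inspector.rs would also match because it contains spec. Whether Kingfisher behaves the same way is not verified here, so run with -v to confirm what is actually skipped.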

Build a Baseline / Detect New Secrets

There are situations where a repository already contains checked-in secrets, but you want to ensure no new secrets are introduced. A baseline file lets you document the known findings so that future scans report only findings not already in that list.

The easiest way to create a baseline is to run a normal scan with the --manage-baseline flag (typically at a low confidence level to capture all potential matches):

kingfisher scan /path/to/code \
  --confidence low \
  --manage-baseline \
  --baseline-file ./baseline-file.yml

Use the same YAML file with the --baseline-file option on future scans to hide all recorded findings:

kingfisher scan /path/to/code \
  --baseline-file /path/to/baseline-file.yaml

See docs/BASELINE.md for full details.
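
A CI job might wrap the baseline scan so that a missing baseline file is reported clearly before Kingfisher runs at all. The sketch below uses only the --baseline-file flag shown above. scan_new_secrets and its return code 2 for a missing file are hypothetical choices, not part of Kingfisher.

```shell
# Hypothetical wrapper: scan a target against an existing baseline file.
# Returns 2 if the baseline is missing (an arbitrary choice for this
# sketch); otherwise returns kingfisher's own exit status.
scan_new_secrets() {
  target=$1
  baseline=$2
  if [ ! -f "$baseline" ]; then
    echo "baseline file not found: $baseline" >&2
    return 2
  fi
  kingfisher scan "$target" --baseline-file "$baseline"
}

# Example use (left commented out here):
# scan_new_secrets /path/to/code ./baseline-file.yml
```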

Finding Fingerprint

docs/FINGERPRINT.md details the four-field formula (rule SHA-1, origin label, start and end offsets) hashed with XXH3-64 to create Kingfisher's 64-bit finding fingerprint. It explains how this ID powers safe deduplication and how --no-dedup can be used to show every raw match.

Rule Performance Profiling

Use --rule-stats to collect timing information for every rule. After scanning, the summary prints a Rule Performance Stats section showing how many matches each rule produced along with its slowest and average match times. This is useful when creating or debugging rules.

CLI Options

kingfisher scan --help

Business Value

By integrating Kingfisher into your development lifecycle, you can:

  • Prevent Costly Breaches
    Early detection of embedded credentials avoids expensive incident response, legal fees, and reputation damage
  • Automate Compliance
    Enforce secret-scanning policies across GitOps, CI/CD, and pull requests to help satisfy SOC 2, PCI DSS, GDPR, and other standards
  • Reduce Noise, Focus on Real Threats
    Validation logic filters out false positives and highlights only active, valid secrets (--only-valid)
  • Accelerate Dev Workflows
    Run in parallel across dozens of languages, integrate with GitHub Actions or any pipeline, and shift security left to minimize delays

The Risk of Leaked Secrets

Embedding credentials in code repositories is a pervasive, ever-present risk that leads directly to data breaches:

  1. Uber (2016)

    • Incident: Attackers stole GitHub credentials, retrieved an AWS key from a developer's private repo, and accessed data on 57 million riders and 600,000 drivers.
    • Sources: BBC News, Ars Technica
  2. AWS

    • Incident: An AWS engineer accidentally published log files and CloudFormation templates containing AWS key pairs (including “rootkey.csv”) to a public GitHub repo.
    • Sources: The Register, UpGuard
  3. Infosys

    • Incident: Infosys published an internal PyPI package embedding a FullAdminAccess AWS key for a Johns Hopkins data bucket; the key remained active for over a year.
    • Sources: The Stack, Tom Forbes Blog
  4. Microsoft

    • Incident: Microsoft's AI research GitHub repo included an overly permissive Azure SAS token, exposing 38 TB of private data (workstation backups, 30,000+ Teams messages).
    • Sources: Wiz Blog, TechCrunch
  5. GitHub

    • Incident: GitHub discovered its RSA SSH host private key was briefly exposed in a public repository and rotated it out of caution.
    • Sources: GitHub Blog

Left unchecked, leaked secrets can lead to unauthorized access, pivoting within your environment, regulatory fines, and brand-damaging incident response costs.

Benchmark Results

See docs/COMPARISON.md

Roadmap

  • More rules
  • Auto-updater
  • Packages for Linux (deb, rpm)
  • Please file a feature request if you have specific features you'd like added

License

Apache-2.0 License