# Kingfisher: Open Source Secret Scanner with Live Validation
<p align="center">
<img src="docs/kingfisher_logo.png" alt="Kingfisher Logo" width="126" height="173" style="vertical-align: right;" />
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Detection Rules](https://img.shields.io/badge/Detection%20Rules-540-2ea043.svg)](https://github.com/mongodb/kingfisher)<br>
[![ghcr downloads](https://ghcr-badge.elias.eu.org/shield/mongodb/kingfisher/kingfisher)](https://github.com/mongodb/kingfisher/pkgs/container/kingfisher)<br>

Kingfisher is an open source secret scanner and **live secret validation** tool built in Rust.

It combines Intel's SIMD-accelerated regex engine (Hyperscan) with language-aware parsing to achieve high accuracy at massive scale, and **ships with hundreds of built-in rules** to detect, **validate**, and triage leaked API keys, tokens, and credentials before they ever reach production.

Designed for offensive security engineers and blue-team defenders alike, Kingfisher helps you scan repositories, cloud storage, chat, docs, and CI pipelines to find and verify exposed secrets quickly.

</p>
**Learn more:** [Introducing Kingfisher: Real-Time Secret Detection and Validation](https://www.mongodb.com/blog/post/product-release-announcements/introducing-kingfisher-real-time-secret-detection-validation)
## What Is Kingfisher?
Kingfisher is a high-performance, open source secret detection tool for source code and developer platforms. If you are searching for a "GitHub secret scanner," "API key scanner," "token leak detection," or "Git secrets scanner," this project is built for that workflow.
- Scan code, Git history, and integrated platforms (GitHub, GitLab, Azure Repos, Bitbucket, Gitea, Hugging Face, Jira, Confluence, Slack, Docker, AWS S3, and Google Cloud Storage)
- Validate discovered credentials against provider APIs to reduce false positives
- Revoke supported secrets directly from the CLI
- Generate JSON, SARIF, and HTML outputs for security teams, compliance, and CI
## Key Features
### Multiple Scan Targets
<div align="center">

| Files / Dirs | Local Git | GitHub | GitLab | Azure Repos | Bitbucket | Gitea | Hugging Face |
|:-------------:|:----------:|:------:|:------:|:-------------:|:----------:|:------:|:-------------:|
| <img src="./docs/assets/icons/files.svg" height="40" alt="Files / Dirs"/><br/><sub>Files / Dirs</sub> | <img src="./docs/assets/icons/local-git.svg" height="40" alt="Local Git"/><br/><sub>Local Git</sub> | <img src="./docs/assets/icons/github.svg" height="40" alt="GitHub"/><br/><sub>GitHub</sub> | <img src="./docs/assets/icons/gitlab.svg" height="40" alt="GitLab"/><br/><sub>GitLab</sub> | <img src="./docs/assets/icons/azure-devops.svg" height="40" alt="Azure Repos"/><br/><sub>Azure Repos</sub> | <img src="./docs/assets/icons/bitbucket.svg" height="40" alt="Bitbucket"/><br/><sub>Bitbucket</sub> | <img src="./docs/assets/icons/gitea.svg" height="40" alt="Gitea"/><br/><sub>Gitea</sub> |<img src="./docs/assets/icons/huggingface.svg" height="40" width="40" alt="Hugging Face"/><br/><sub>Hugging Face</sub> |

| Docker | Jira | Confluence | Slack | AWS S3 | Google Cloud Storage |
|:------:|:----:|:-----------:|:-----:|:------:|:---:|
| <img src="./docs/assets/icons/docker.svg" height="40" alt="Docker"/><br/><sub>Docker</sub> | <img src="./docs/assets/icons/jira.svg" height="40" alt="Jira"/><br/><sub>Jira</sub> | <img src="./docs/assets/icons/confluence.svg" height="40" alt="Confluence"/><br/><sub>Confluence</sub> | <img src="./docs/assets/icons/slack.svg" height="40" alt="Slack"/><br/><sub>Slack</sub> | <img src="./docs/assets/icons/aws-s3.svg" height="40" alt="AWS S3"/><br/><sub>AWS&nbsp;S3</sub> | <img src="./docs/assets/icons/gcs.svg" height="40" alt="Google Cloud Storage"/><br/><sub>Cloud Storage</sub> |
</div>
### Performance, Accuracy, and Hundreds of Rules
- **Performance**: multithreaded, Hyperscan-powered scanning built for huge codebases
- **Extensible rules**: hundreds of built-in detectors plus YAML-defined custom rules ([docs/RULES.md](/docs/RULES.md))
- **Validate & Revoke**: live validation of discovered secrets, plus direct revocation for supported platforms (GitHub, GitLab, Slack, AWS, GCP, and more) ([docs/USAGE.md](/docs/USAGE.md))
- **Blast Radius Mapping**: instantly map leaked keys to their effective cloud identities and exposed resources with `--access-map`. Supports AWS, GCP, Azure, GitHub, and GitLab, with support for more token types coming.
- **Broad AI SaaS coverage**: finds and validates tokens for OpenAI, Anthropic, Google Gemini, Cohere, AWS Bedrock, Voyage AI, Mistral, Stability AI, Replicate, xAI (Grok), Ollama, LangChain, Perplexity, Weights & Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, Together.ai, Zhipu, and many more
- **Compressed Files**: Supports extracting and scanning compressed files for secrets
- **SQLite Database Scanning**: Automatically extracts and scans SQLite database contents for secrets stored in table rows
- **Python Bytecode (.pyc) Scanning**: Extracts and scans string constants from compiled Python (`.pyc`, `.pyo`) files
- **Baseline management**: generate and track baselines to suppress known secrets ([docs/BASELINE.md](/docs/BASELINE.md))
- **Checksum-aware detection**: verifies tokens with built-in checksums (e.g., GitHub, Confluent, Zuplo) — no API calls required
- **Built-in Report Viewer**: Visualize and triage findings locally with `kingfisher view ./report-file.json`
- **Audit reporting**: Generate compliance-oriented HTML reports with scan metadata and validation ordering
- **Library crates**: Embed Kingfisher's scanning engine in your own Rust applications ([docs/LIBRARY.md](docs/LIBRARY.md))
# Benchmark Results
See [docs/COMPARISON.md](docs/COMPARISON.md) for benchmark results and performance comparisons.
<p align="center">
<img src="docs/runtime-comparison.png" alt="Kingfisher Runtime Comparison" style="vertical-align: center;" />
</p>
## Basic Usage Demo
```bash
kingfisher scan /path/to/scan --view-report
```
Note: the replay has been slowed down for the demo.

![Kingfisher secret scanning demo](docs/kingfisher-usage-01.gif)
## Report Viewer Demo
Explore Kingfisher's built-in report viewer and its `--access-map` option, which shows what a token can actually access. Supported providers include AWS, GCP, Azure, GitHub, GitLab, and Slack, with more coming.

Note: when you pass `--view-report`, Kingfisher starts a web server on port `7890` (default) and opens it in your default browser. By default it binds to `127.0.0.1` for security. You'll see output like the following near the end of the scan, and **Kingfisher will keep running** until you stop it.

```bash
INFO kingfisher::cli::commands::view: Starting access-map viewer address=127.0.0.1:7890
Serving access-map viewer at http://127.0.0.1:7890 (Ctrl+C to stop)
```
**Usage:**
```bash
kingfisher scan /path/to/scan --access-map --view-report
```
![Kingfisher access map and report viewer demo](docs/kingfisher-usage-access-map-01.gif)
**Click to view video**
[![Demo](docs/demos/findings-thumbnail.png)](https://github.com/user-attachments/assets/d33ee7a6-c60a-4e42-88e0-ac03cb429a46)
# Table of Contents
- [What Is Kingfisher?](#what-is-kingfisher)
- [Key Features](#key-features)
- [Compliance and Audit-Ready Scans](#compliance-and-audit-ready-scans)
- [Benchmark Results](#benchmark-results)
- [Getting Started](#getting-started)
- [Quick Start](#quick-start)
- [Installation](#installation)
- [Detection Rules](#detection-rules)
- [Usage Examples](#usage-examples)
- [Platform Integrations](#platform-integrations)
- [Environment Variables](#environment-variables)
- [Advanced Features](#advanced-features)
- [Documentation](#documentation)
- [Library Usage](#library-usage)
- [Roadmap](#roadmap)
- [License](#license)

# Getting Started
## Quick Start
### 1: Install Kingfisher ([INSTALLATION.md](docs/INSTALLATION.md))

```bash
# Homebrew (Linux/macOS)
brew install kingfisher

# Or install from PyPI with uv
uv tool install kingfisher-bin

# Or use the install script (Linux/macOS)
curl -sSL https://raw.githubusercontent.com/mongodb/kingfisher/main/scripts/install-kingfisher.sh | bash

# Or use the PowerShell-based install script on Windows
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass -Force
Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/mongodb/kingfisher/main/scripts/install-kingfisher.ps1' -OutFile install-kingfisher.ps1
./install-kingfisher.ps1

# Or run with Docker (no install required)
docker run --rm -v "$PWD":/src ghcr.io/mongodb/kingfisher:latest scan /src
```

### 2: Scan a directory for secrets ([USAGE.md](/docs/USAGE.md))

```bash
kingfisher scan /path/to/code
```

### 3: Scan and view results in a browser

```bash
kingfisher scan /path/to/code --view-report
```

### 4: Show only validated (live) secrets

```bash
kingfisher scan /path/to/code --only-valid
```

### 5: Revoke a discovered secret

```bash
# Revoke a GitHub token
kingfisher revoke --rule github "ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

# Revoke AWS credentials (sets access key to Inactive)
kingfisher revoke --rule aws --arg "AKIAIOSFODNN7EXAMPLE" "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
```

### 6: Scan a GitHub organization ([INTEGRATIONS.md](docs/INTEGRATIONS.md))

```bash
KF_GITHUB_TOKEN="ghp_..." kingfisher scan github --organization my-org
```

### 7: Scan a GitLab group

```bash
KF_GITLAB_TOKEN="glpat-..." kingfisher scan gitlab --group my-group
```

### 8: Scan Azure Repos

```bash
KF_AZURE_PAT="pat" kingfisher scan azure --organization my-org
```

### 9: Scan a Bitbucket workspace

```bash
KF_BITBUCKET_TOKEN="token" kingfisher scan bitbucket --workspace my-team
```

### 10: Scan a Gitea organization

```bash
KF_GITEA_TOKEN="token" kingfisher scan gitea --organization my-org
```

### 11: Scan Hugging Face

```bash
KF_HUGGINGFACE_TOKEN="hf_..." kingfisher scan huggingface --organization my-org
```

### 12: Scan an S3 bucket

```bash
kingfisher scan s3 bucket-name --prefix path/
```

### 13: Scan Google Cloud Storage

```bash
kingfisher scan gcs bucket-name --prefix path/
```

### 14: Scan a Docker image

```bash
kingfisher scan docker ghcr.io/org/image:latest
```

### 15: Scan Jira issues

```bash
KF_JIRA_TOKEN="token" kingfisher scan jira --url https://jira.company.com --jql "project = SEC"
```

Add `--include-comments` and/or `--include-changelog` to expand the scan beyond the issue body.

### 16: Scan Confluence pages

```bash
KF_CONFLUENCE_TOKEN="token" kingfisher scan confluence --url https://confluence.company.com --cql "label = secret"
```

### 17: Scan Slack messages

```bash
KF_SLACK_TOKEN="xoxp-..." kingfisher scan slack "api_key OR password"
```

### 18: Run with Docker (no install required)

```bash
docker run --rm -v "$PWD":/src ghcr.io/mongodb/kingfisher:latest scan /src
```

### 19: Run with Docker and view report in browser
To run a scan in Docker and view the HTML report on your host machine, use `--view-report-address 0.0.0.0` so the server is reachable from outside the container, and map the port with `-p`:
```bash
docker run --rm \
  -v "$PWD":/src \
  -p 7890:7890 \
  ghcr.io/mongodb/kingfisher:latest \
  scan https://github.com/leaktk/fake-leaks \
  --access-map \
  --view-report \
  --view-report-address 0.0.0.0
```
Then open **http://localhost:7890** in your browser. If port 7890 is already in use, use `--view-report-port` and map accordingly:
```bash
docker run --rm \
  -v "$PWD":/src \
  -p 7891:7891 \
  ghcr.io/mongodb/kingfisher:latest \
  scan https://github.com/leaktk/fake-leaks \
  --access-map \
  --view-report \
  --view-report-port 7891 \
  --view-report-address 0.0.0.0
```
Then open **http://localhost:7891**.
### 20: Output JSON results

```bash
kingfisher scan /path/to/code --format json --output findings.json
```

### 21: Map blast radius of discovered credentials

```bash
kingfisher scan /path/to/code --access-map --view-report
```

## Installation

Kingfisher supports multiple installation methods:

- **Homebrew**: `brew install kingfisher` ![Homebrew Formula Version](https://img.shields.io/homebrew/v/kingfisher)
- **PyPI with uv**: `uv tool install kingfisher-bin`
- **Pre-built releases**: Download from [GitHub Releases](https://github.com/mongodb/kingfisher/releases)
- **Install scripts**: One-line installers for Linux, macOS, and Windows - [INSTALLATION.md](docs/INSTALLATION.md)
- **Docker**: `docker run ghcr.io/mongodb/kingfisher:latest`
- **Pre-commit hooks**: Integrate with git hooks, pre-commit framework, or Husky
- **Compile from source**: Build with `make` for your platform

**For complete installation instructions and pre-commit hook setup, see [docs/INSTALLATION.md](docs/INSTALLATION.md).**

# Detection Rules
Kingfisher ships with [hundreds of rules](crates/kingfisher-rules/data/rules/) that cover everything from classic cloud keys to the latest AI SaaS tokens. Below is an overview:

| Category | What we catch |
|----------|---------------|
| **AI SaaS APIs** | OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Stability AI, Replicate, xAI (Grok), Ollama, LangChain, Perplexity, Weights & Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, Together.ai, Zhipu, and more |
| **Cloud Providers** | AWS, Azure, GCP, Alibaba Cloud, DigitalOcean, IBM Cloud, Cloudflare, Temporal Cloud, and more |
| **Dev & CI/CD** | GitHub/GitLab tokens, CircleCI, TravisCI, TeamCity, Docker Hub, npm, PyPI, Vercel, and more |
| **Messaging & Comms** | Slack, Discord, Microsoft Teams, Twilio, Mailgun, SendGrid, Mailchimp, and more |
| **Databases & Data Ops** | MongoDB Atlas, PlanetScale, Postgres DSNs, Grafana Cloud, Datadog, Dynatrace, and more |
| **Payments & Billing** | Stripe, PayPal, Square, GoCardless, and more |
| **Security & DevSecOps** | Snyk, Dependency-Track, CodeClimate, Codacy, OpsGenie, PagerDuty, and more |
| **Misc. SaaS & Tools** | 1Password, Adobe, Atlassian/Jira, Asana, Netlify, Baremetrics, and more |
## Write Custom Rules
Kingfisher ships with hundreds of rules with HTTP and service-specific validation checks (AWS, Azure, GCP, etc.) to confirm whether a detected string is a live credential.
However, you may want to add your own custom rules, or modify a detection to better suit your needs or environment.

**For complete rule documentation, see [docs/RULES.md](docs/RULES.md).**

### Checksum Intelligence
Modern API tokens increasingly include **built-in checksums**, short internal digests that make each credential self-verifiable. (For background, see [GitHub's write-up on their newer token formats](https://github.blog/engineering/platform-security/behind-githubs-new-authentication-token-formats/) and why checksums slash false positives.)
Kingfisher supports **checksum-aware matching** in rules, enabling **offline structural verification** of credentials *without* calling third-party APIs.

By validating each token's internal checksum (for tokens that support checksums), Kingfisher filters out nearly all false positives for those token types, automatically skipping structurally invalid or fake tokens before validation ever runs.

**Why this matters**

- **Offline verification**: no API call required
- **Industry-aligned**: compatible with prefix + checksum token designs (e.g., modern PATs)
- **Lower false positives**: invalid tokens are filtered out by structure alone

**Learn more**: implementation details and templating are documented in **[docs/RULES.md](docs/RULES.md)**.
# Usage Examples
> **Note**: `kingfisher scan` automatically detects whether the input is a Git repository or a plain directory—no extra flags required.
## Basic Scanning
```bash
# Scan with secret validation
kingfisher scan /path/to/code
# NOTE: This path can refer to:
# 1. a local git repo
# 2. a directory with many git repos
# 3. or just a folder with files and subdirectories

# Scan without validation
kingfisher scan ~/src/myrepo --no-validate

# Turbo mode: run as fast as possible by disabling Git commit metadata, Base64 decoding,
# MIME sniffing, language detection, and tree-sitter parsing
# (findings omit commit context, Base64-only matches, MIME type, and language metadata)
kingfisher scan ~/src/myrepo --turbo

# Display only secrets confirmed active by third-party APIs
kingfisher scan /path/to/repo --only-valid

# Output JSON and capture to a file
kingfisher scan . --format json | tee kingfisher.json

# Output SARIF directly to disk
kingfisher scan /path/to/repo --format sarif --output findings.sarif
```
## Access Map and Visualization
**Stop Guessing, Start Mapping: Understand Your True Blast Radius**

Finding a leaked credential is only the first step. The critical question isn't just "Is this a secret?" but "What can an attacker do with it?"

Kingfisher's `--access-map` feature transforms secret detection from a simple alert into a comprehensive threat assessment. Instead of leaving you with a cryptic API key, Kingfisher actively authenticates against the relevant provider (AWS, GCP, Azure Storage, Azure DevOps, GitHub, GitLab, or Slack) to map the full extent of the credential's power.

* Instant Identity Resolution: Immediately identify who the key belongs to, whether it's a specific IAM user, an assumed role, or a service account.
* Visualize the Blast Radius: See exactly which resources (S3 buckets, EC2 instances, projects, storage containers) are exposed and at risk.

```bash
# Generate access map during scan
kingfisher scan /path/to/code --access-map --view-report
# View access-map reports locally
kingfisher view kingfisher.json
```
> **Use the access map functionality only when you are authorized to inspect the target account, as Kingfisher will issue additional network requests to determine what access the secret grants**
## Direct Secret Validation & Revocation
```bash
# Validate a known secret without scanning
kingfisher validate --rule opsgenie "12345678-9abc-def0-1234-56789abcdef0"
# Validate from stdin
echo "ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" | kingfisher validate --rule github -
# Revoke a Slack token
kingfisher revoke --rule slack "xoxb-..."
# Revoke a GitHub PAT
kingfisher revoke --rule github "ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
```
Validation throttling is also available for direct validation:
- `--validation-rps <RPS>` sets a global request rate.
- `--validation-rps-rule <RULE_SELECTOR=RPS>` sets per-rule overrides (repeatable).
- Rule selectors accept short names, so `github=2` matches `kingfisher.github.*`.
```bash
# Limit direct validation to 1 req/sec for GitHub rules
kingfisher validate --rule github "ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
--validation-rps-rule github=1
```
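The selector rule described above can be pictured with a small shell sketch. It assumes a selector matches when, after adding the optional `kingfisher.` prefix, it equals the rule ID or is a dot-delimited prefix of it. The `selector_matches` helper and the rule IDs are hypothetical names for illustration, not Kingfisher's actual implementation.

```shell
# Hypothetical helper: does a --validation-rps-rule selector cover a rule ID?
# Assumption: matching is by dot-delimited prefix, with "kingfisher." optional.
selector_matches() {
  sel="$1"
  rule_id="$2"
  # Add the optional "kingfisher." prefix when the short form is used.
  case "$sel" in
    kingfisher.*) ;;
    *) sel="kingfisher.$sel" ;;
  esac
  # Match the rule ID itself or anything beneath it.
  case "$rule_id" in
    "$sel" | "$sel".*) return 0 ;;
    *) return 1 ;;
  esac
}

# Illustrative rule IDs, not taken from the shipped rule set.
selector_matches github kingfisher.github.1 && echo "github covers kingfisher.github.1"
selector_matches github kingfisher.gitlab.1 || echo "github does not cover kingfisher.gitlab.1"
```

Under this reading, `github=2` throttles every rule in the `kingfisher.github` family while leaving `kingfisher.gitlab.*` at the global rate.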
## Compliance and Audit-Ready Scans
Kingfisher is built to support compliance and security-assurance goals, not just detection. In addition to finding secrets, it helps teams produce evidence that secure development controls are operating.
- **Audit scan output**: generate a standalone HTML report with scan timestamp, report generation time, validation status, and file-level links for findings
- **Evidence-friendly metadata**: include version, scan stats, and sanitized command arguments for review workflows
- **Control narrative support**: demonstrate that hardcoded credentials/secrets are actively detected and triaged in CI/CD and developer workflows
```bash
# Generate an audit-ready HTML report
kingfisher scan /path/to/code --format html --output kingfisher-audit.html
```
## Advanced Scanning Options
```bash
# Pipe any text directly into Kingfisher
cat /path/to/file.py | kingfisher scan -

# Limit maximum file size scanned (default: 256 MB)
kingfisher scan /some/file --max-file-size 500

# Turbo mode: equivalent to --commit-metadata=false --no-base64 and disables MIME sniffing,
# language detection/tree-sitter parsing for maximum speed
# No Git commit metadata (author, date, hash), Base64 decoding, MIME, or language metadata in findings
kingfisher scan /path/to/repo --turbo

# Scan using a rule family
kingfisher scan /path/to/repo --rule kingfisher.aws

# Display rule performance statistics
kingfisher scan /path/to/repo --rule-stats

# Throttle validation request rate globally
kingfisher scan /path/to/repo --validation-rps 5

# Override specific rule families (kingfisher. prefix optional)
kingfisher scan /path/to/repo \
  --validation-rps 10 \
  --validation-rps-rule github=2 \
  --validation-rps-rule pypi=0.5

# Include full validation response bodies (not truncated to 512 characters)
# Useful for parsing complete validation responses (e.g., GitHub token metadata)
kingfisher scan /path/to/repo --full-validation-response

# Exclude specific paths
kingfisher scan ./my-project \
  --exclude '*.py' \
  --exclude '[Tt]ests'

# Scan changes in CI pipelines
kingfisher scan . \
  --since-commit origin/main \
  --branch "$CI_BRANCH"
```
> Validation rate limiting applies to all built-in validator types (HTTP/gRPC, cloud SDK validators such as AWS/GCP/Coinbase, and database/token validators such as MongoDB, Postgres, MySQL, JDBC, JWT, and Azure Storage). `Raw` validators are excluded.
# Platform Integrations

Kingfisher can scan multiple platforms and services directly:

**Version Control & Code Hosting:**
- GitHub (organizations, users, repositories)
- GitLab (groups, users, projects)
- Azure Repos (organizations, projects)
- Bitbucket (workspaces, users, repositories)
- Gitea (organizations, users, repositories)
- Hugging Face (models, datasets, spaces)

**Cloud Storage:**
- AWS S3
- Google Cloud Storage

**Containers:**
- Docker (images from registries)

**Collaboration & Documentation:**
- Jira (issues via JQL queries)
- Confluence (pages via CQL queries)
- Slack (messages via search queries)

See **[docs/INTEGRATIONS.md](docs/INTEGRATIONS.md)** for complete integration documentation and authentication setup.

## Quick Examples
```bash
# Scan AWS S3 bucket
kingfisher scan s3 bucket-name --prefix path/

# Scan Google Cloud Storage
kingfisher scan gcs bucket-name

# Scan Docker image
kingfisher scan docker ghcr.io/owasp/wrongsecrets/wrongsecrets-master:latest-master

# Scan GitHub organization
kingfisher scan github --organization my-org

# Scan GitLab group
kingfisher scan gitlab --group my-group

# Scan Azure Repos
kingfisher scan azure --organization my-org

# Scan Jira issues
KF_JIRA_TOKEN="token" kingfisher scan jira --url https://jira.company.com \
  --jql "project = TEST AND status = Open"

# Scan Jira issues, comments, and changelog entries
KF_JIRA_TOKEN="token" kingfisher scan jira --url https://jira.company.com \
  --jql "project = TEST AND status = Open" \
  --include-comments \
  --include-changelog

# Scan Confluence pages
KF_CONFLUENCE_TOKEN="token" kingfisher scan confluence --url https://confluence.company.com \
  --cql "label = secret"

# Scan Slack messages
KF_SLACK_TOKEN="xoxp-..." kingfisher scan slack "from:username has:link"
```
**For detailed integration instructions and authentication setup, see [docs/INTEGRATIONS.md](docs/INTEGRATIONS.md).**
## Environment Variables
| Variable | Purpose |
| ----------------- | ---------------------------- |
| `KF_GITHUB_TOKEN` | GitHub Personal Access Token |
| `KF_GITLAB_TOKEN` | GitLab Personal Access Token |
| `KF_GITEA_TOKEN` | Gitea Personal Access Token |
| `KF_GITEA_USERNAME` | Username for private Gitea clones (used with `KF_GITEA_TOKEN`) |
| `KF_AZURE_TOKEN` / `KF_AZURE_PAT` | Azure Repos Personal Access Token |
| `KF_AZURE_USERNAME` | Username to use with Azure Repos PATs (defaults to `pat` when unset) |
| `KF_BITBUCKET_TOKEN` | Bitbucket Cloud workspace API token or Bitbucket Server PAT |
| `KF_BITBUCKET_USERNAME` | Optional Bitbucket username for legacy app passwords or server tokens |
| `KF_BITBUCKET_APP_PASSWORD` | Legacy Bitbucket app password (deprecated September 9, 2025; disabled June 9, 2026) |
| `KF_BITBUCKET_OAUTH_TOKEN` | Bitbucket OAuth or PAT token |
| `KF_HUGGINGFACE_TOKEN` | Hugging Face access token for API enumeration and git cloning |
| `KF_HUGGINGFACE_USERNAME` | Optional username for Hugging Face git operations (defaults to `hf_user`) |
| `KF_JIRA_TOKEN` | Jira API token |
| `KF_CONFLUENCE_TOKEN` | Confluence API token |
| `KF_SLACK_TOKEN` | Slack API token |
| `KF_DOCKER_TOKEN` | Docker registry token (`user:pass` or bearer token). If unset, credentials from the Docker keychain are used |
| `KF_AWS_KEY`, `KF_AWS_SECRET`, and `KF_AWS_SESSION_TOKEN` | AWS credentials for S3 bucket scanning. The session token is optional and only needed for temporary credentials |

Set them temporarily per command:

```bash
KF_GITLAB_TOKEN="glpat-…" kingfisher scan gitlab --group my-group
```
Or export for the session:
```bash
export KF_GITLAB_TOKEN="glpat-…"
```
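In scripts, it can help to fail fast when a token variable is missing instead of letting the scan start unauthenticated. The following is a minimal sketch of such a pre-flight check; `require_env` is a hypothetical helper written for this example and is not a Kingfisher command.

```shell
# Hypothetical pre-flight check: fail fast when a token variable is missing.
# require_env is an illustrative helper, not part of Kingfisher.
require_env() {
  # printenv only sees exported variables; unset and empty are treated alike.
  if [ -z "$(printenv "$1")" ]; then
    echo "missing required environment variable: $1" >&2
    return 1
  fi
}

# Example: only start the GitLab scan when the token is present.
# (The scan itself is commented out so the snippet runs without Kingfisher.)
if require_env KF_GITLAB_TOKEN; then
  echo "KF_GITLAB_TOKEN is set"
  # kingfisher scan gitlab --group my-group
fi
```

The same check works for any variable in the table above, for example `require_env KF_GITHUB_TOKEN` before a GitHub organization scan.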
# Advanced Features
Kingfisher offers powerful features for complex scanning scenarios. See **[docs/ADVANCED.md](docs/ADVANCED.md)** for complete advanced documentation.
## Baseline Management
Track known secrets and detect only new ones:
```bash
# Create/update baseline
kingfisher scan /path/to/code \
  --confidence low \
  --manage-baseline \
  --baseline-file ./baseline-file.yml

# Scan with baseline (suppress known findings)
kingfisher scan /path/to/code \
  --baseline-file ./baseline-file.yml
```
## Filtering and Suppression
```bash
# Skip known false positives
kingfisher scan --skip-regex '(?i)TEST_KEY' path/
kingfisher scan --skip-word dummy path/

# Skip AWS canary tokens
kingfisher scan /path/to/code \
  --skip-aws-account "171436882533,534261010715"

# Inline ignore directives in code
# Add `kingfisher:ignore` on the same line or surrounding lines
```
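As a rough illustration of the inline directive, the snippet below writes a throwaway file whose placeholder value carries `kingfisher:ignore` on the same line. The file name, variable name, and value are made up for this example; see [docs/ADVANCED.md](docs/ADVANCED.md) for the authoritative description of how directives are honored.

```shell
# Create a throwaway fixture with a placeholder value marked for suppression.
# The file name, variable name, and value are made up for illustration.
cat > ignore-demo.py <<'EOF'
EXAMPLE_API_KEY = "not-a-real-secret"  # kingfisher:ignore
EOF

# Confirm the directive sits on the same line as the flagged value.
grep -n 'kingfisher:ignore' ignore-demo.py
```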
## CI Pipeline Scanning
```bash
# Scan only changes between branches
kingfisher scan . \
  --since-commit origin/main \
  --branch "$CI_BRANCH"

# Scan specific commit range
kingfisher scan /tmp/repo --branch feature-1 \
  --branch-root-commit "$(git -C /tmp/repo merge-base main feature-1)"
```
**For more advanced features including confidence levels, validation tuning, and custom rules, see [docs/ADVANCED.md](docs/ADVANCED.md).**
# Documentation
| Document | Description |
|----------|-------------|
| [INSTALLATION.md](docs/INSTALLATION.md) | Complete installation guide including pre-commit hooks setup for git, pre-commit framework, and Husky |
| [INTEGRATIONS.md](docs/INTEGRATIONS.md) | Platform-specific scanning guide (GitHub, GitLab, AWS S3, Docker, Jira, Confluence, Slack, etc.) |
| [ACCESS_MAP.md](docs/ACCESS_MAP.md) | Access map: supported tokens and credential formats (GitHub/GitLab/Slack/AWS/GCP/Azure Storage/Postgres/MongoDB) |
| [ADVANCED.md](docs/ADVANCED.md) | Advanced features: baselines, confidence levels, validation tuning, CI scanning, and more |
| [RULES.md](docs/RULES.md) | Writing custom detection rules, pattern requirements, and checksum intelligence |
| [BASELINE.md](docs/BASELINE.md) | Baseline management for tracking known secrets and detecting new ones |
| [LIBRARY.md](docs/LIBRARY.md) | Using Kingfisher as a Rust library in your own applications |
| [FINGERPRINT.md](docs/FINGERPRINT.md) | Understanding finding fingerprints and deduplication |
| [COMPARISON.md](docs/COMPARISON.md) | Benchmark results and performance comparisons |
| [PARSING.md](docs/PARSING.md) | Language-aware parsing details |
| [TREE_SITTER.md](docs/TREE_SITTER.md) | Tree-sitter scanning flow, verification gates, and fallback behavior |
# Library Usage
**Beta feature:** Kingfisher's scanning engine is available as a set of Rust library crates (`kingfisher-core`, `kingfisher-rules`, `kingfisher-scanner`) that can be embedded in other applications, so you can integrate secret scanning directly into your own tools and workflows.

**For complete documentation and examples, see [docs/LIBRARY.md](docs/LIBRARY.md).**

# Exit Codes
| Code | Meaning |
| ---- | ----------------------------- |
| 0 | No findings |
| 200 | Findings discovered |
| 205 | Validated findings discovered |
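
In CI, these exit codes can drive a pass/fail decision. The sketch below maps a code to a label using only the three values from the table; treating any other code as `unexpected` is an assumption of this example, and `classify_exit` is a hypothetical helper rather than part of Kingfisher.

```shell
# Hypothetical CI helper: translate a Kingfisher exit code into a label.
# Codes 0, 200, and 205 come from the table above; treating anything else
# as "unexpected" is an assumption of this sketch.
classify_exit() {
  case "$1" in
    0)   echo "clean" ;;
    200) echo "findings" ;;
    205) echo "validated-findings" ;;
    *)   echo "unexpected" ;;
  esac
}

# Example wiring, commented out so the snippet runs without Kingfisher:
# kingfisher scan . --format json --output findings.json
# status=$(classify_exit "$?")
# [ "$status" = "validated-findings" ] && exit 1

classify_exit 205   # prints "validated-findings"
```

A pipeline might fail the build only on `validated-findings` and surface plain `findings` as a warning, depending on how strict the team wants the gate to be.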
# Lineage and Evolution
Kingfisher began as an internal fork of [Nosey Parker](https://github.com/praetorian-inc/noseyparker), used as a high-performance foundation for secret detection.

Since then it has evolved far beyond that starting point, introducing live validation, hundreds of new rules, additional scan targets, and major architectural changes across nearly every subsystem.

**Key areas of evolution**
- **Live validation** of detected secrets directly within rules
- **Hundreds of new built-in rules** and an expanded YAML rule schema
- **Baseline management** to suppress known findings over time
- **Tree-sitter parsing** layered on Hyperscan for language-aware detection
- **More scan targets** (GitLab, Bitbucket, Gitea, Jira, Confluence, Slack, S3, GCS, Docker, Hugging Face, etc.)
- **Compressed Files**, **SQLite database**, and **Python bytecode (.pyc)** scanning support
- **New storage model** (in-memory + Bloom filter, replacing SQLite)
- **Unified workflow** with JSON/BSON/SARIF outputs
- **Cross-platform builds** for Linux, macOS, and Windows
# Roadmap
- More rules
- More targets
- Please file a [feature request](https://github.com/mongodb/kingfisher/issues), or open a PR, if you have features you'd like added
# License
[Apache License 2.0](LICENSE)