added AGENTS.md

This commit is contained in:
Mick Grove 2026-03-04 22:45:41 -08:00
commit dbdc5c0c82
2 changed files with 140 additions and 0 deletions

2
.gitignore vendored
View file

@ -220,5 +220,7 @@ rust-project.json
/dist-pypi/
*.whl
.claude/
# Created by https://www.toptal.com/developers/gitignore/api/python
# Edit at https://www.toptal.com/developers/gitignore?templates=python

138
AGENTS.md Normal file
View file

@ -0,0 +1,138 @@
# AGENTS.md
Guidance for coding agents working in this repository.
## Project Overview
Kingfisher is an open-source secret scanner and live secret validator written in Rust by MongoDB. It detects, validates, and helps remediate leaked API keys, tokens, and credentials across code repositories, git history, and integrated platforms.
Key capabilities:
- Secret detection with 500+ built-in rules (YAML-based, SIMD-accelerated via Hyperscan/vectorscan)
- Live credential validation against provider APIs
- Direct secret revocation from CLI
- Blast radius mapping (AWS, GCP, Azure, GitHub, GitLab, Slack)
- Output formats: JSON, SARIF, interactive HTML
- Platform integrations: GitHub, GitLab, Azure Repos, Bitbucket, Gitea, Hugging Face, S3, GCS, Docker, Jira, Confluence, Slack
## Scope
- Applies to the entire repository rooted at this file.
- If a deeper `AGENTS.md` exists in a subdirectory, that file takes precedence for its subtree.
## Project Overview
- Project: `kingfisher` (Rust)
- Purpose: secret detection, live validation, and remediation tooling
- Primary binary: `kingfisher`
## Repository Structure
- `src/`: main binary source
- `src/cli/commands/`: CLI command implementations
- `src/validation/`: provider-specific credential validators
- `src/matcher/`: pattern matching engine
- `src/scanner/`: core scanning logic
- `src/parser/`: language-aware parsing (`tree-sitter`)
- `src/reporter/`: JSON/SARIF/HTML report generation
- `src/access_map/`: access mapping analysis
- `crates/kingfisher-core/`: shared types and core logic
- `crates/kingfisher-rules/`: rule loading and rule data
- `crates/kingfisher-rules/data/rules/`: YAML detection rules
- `crates/kingfisher-scanner/`: embeddable high-level scanning API
- `tests/`: integration/e2e tests
- `testdata/`: test fixtures
- `docs/`: user and developer docs
- `vendor/vectorscan-rs/`: vendored vectorscan bindings
## Toolchain and Environment
- Shell assumptions in build scripts: `bash` with `set -eu -o pipefail`
- Workspace minimum Rust version: `1.90` (`Cargo.toml`)
- `make check-rust` enforces `>= 1.92.0` for build targets
- Prefer `rg` and `rg --files` for fast code/file search.
## Quick Commands
- Development build: `cargo build`
- Release build: `cargo build --release`
- Tests (preferred wrapper): `make tests`
- Tests (direct): `cargo test --workspace --all-targets`
- Nextest (if installed): `cargo nextest run --workspace --all-targets`
- Format: `cargo fmt --all`
- Lint: `cargo clippy --workspace --all-targets -- -D warnings`
- Clean: `make clean`
## Build Targets (Makefile)
- Host convenience:
- `make linux`
- `make darwin`
- Explicit platform archives:
- `make linux-x64`
- `make linux-arm64`
- `make darwin-x64`
- `make darwin-arm64`
- `make windows-x64` (Windows host only)
- Ubuntu bare-metal (Zig/cargo-zigbuild flow):
- `make ubuntu-x64`
- `make ubuntu-arm64`
## Code Style
- Rust formatting is defined in `rustfmt.toml` (`max_width = 100`, 4-space indentation, Unix newlines, reordered imports).
- Keep edits minimal and scoped; preserve existing conventions in touched files.
- Prefer clear, maintainable fixes over broad refactors unless requested.
## Architecture Notes
- Rules are YAML-driven and loaded from `crates/kingfisher-rules/data/rules/`.
- Allocator feature flags are in root `Cargo.toml`:
- `use-mimalloc` (default)
- `use-jemalloc`
- `system-alloc`
- Validation modules live in `crates/kingfisher-scanner/src/validation/`; optional validation feature sets are defined in `crates/kingfisher-scanner/Cargo.toml` (e.g., `validation-aws`, `validation-gcp`, `validation-database`, `validation-all`).
## Validation and Revocation Policy
- Default rule: define validation logic in rule YAML (`validation:` block), not Rust code.
- Code-based validation in `crates/kingfisher-scanner/src/validation/` is an exception path for cases that cannot be expressed reliably in YAML alone (for example AWS, GCP, Coinbase, MongoDB, and similar complex/provider-specific flows).
- Treat Rust validation additions as rare; prefer extending YAML-based validation first.
- For rules that include validation, add a `revocation:` section whenever the third-party API safely supports revocation.
## Common Development Tasks
- Add a detection rule: follow the workflow below and validate with relevant tests.
- Add a CLI command: implement under `src/cli/commands/` and register in the CLI command wiring.
- Add a validator (rare exception path): implement in `src/validation/` and wire feature flags/dependencies in `crates/kingfisher-scanner/Cargo.toml` only when YAML validation cannot express the required logic.
## Rule Authoring Workflow
Use this when creating or updating rules in `crates/kingfisher-rules/data/rules/`.
1. Pick a nearby reference rule file in the same provider family and copy its structure.
2. Define a stable rule id (`id`) and detection regex (`pattern`) under `rules:`.
3. Include `examples` that must match.
4. Set guardrails:
- `min_entropy` for high-entropy tokens.
- `pattern_requirements` (e.g., `min_digits`, `min_uppercase`, `min_lowercase`, `min_special_chars`, `ignore_if_contains`) when format constraints are known.
- `pattern_requirements.checksum` when provider formats include check digits/signatures.
5. Add `validation` only when a reliable provider/API check exists.
6. Put validation in YAML by default; only use Rust validator logic for rare, justified exceptions.
7. Add `revocation` when the provider API supports safe revocation and the flow is well understood.
8. If a rule needs context from another match (for example ID + secret pair), use `depends_on_rule` and consider `visible: false` on the helper rule.
9. Verify locally:
- `cargo test -p kingfisher-rules`
- `cargo test --workspace --all-targets`
- `kingfisher scan ./testdata --rule <rule-family-or-id> --rule-stats`
- If validation is implemented: `kingfisher validate --rule <rule-id> <token-or-secret>`
## Rule Docs (Read Before Editing)
- `docs/RULES.md`:
- `Rule Schema` for required/optional fields
- `Character Requirements` for `pattern_requirements`, `ignore_if_contains`, and `checksum`
- `Templating with Liquid` for request signing/transforms
- `Multi-Step Revocation` for complex revoke flows
- `Writing Custom Rules` and examples for best practices
## Workflow Expectations for Agents
- Do not revert user-authored or unrelated in-progress changes.
- Prefer targeted patches.
- After changes, run the narrowest relevant tests first, then broader checks when practical.
- If validation commands cannot be run, report exactly what was skipped and why.
## Documentation Pointers
- `docs/USAGE.md`
- `docs/ADVANCED.md`
- `docs/RULES.md`
- `docs/INSTALLATION.md`
- `docs/INTEGRATIONS.md`
- `docs/LIBRARY.md`