Guidance for coding agents working in this repository.
## Project Overview
Kingfisher is an open-source secret scanner and live secret validator written in Rust by MongoDB. It detects, validates, and helps remediate leaked API keys, tokens, and credentials across code repositories, git history, and integrated platforms.
Key capabilities:
- Secret detection with 500+ built-in rules (YAML-based, SIMD-accelerated via Hyperscan/vectorscan)
- Live credential validation against provider APIs
- Validation modules live in `crates/kingfisher-scanner/src/validation/`; optional validation feature sets are defined in `crates/kingfisher-scanner/Cargo.toml` (e.g., `validation-raw`, `validation-aws`, `validation-gcp`, `validation-database`, `validation-all`).
- Raw validators use `validation: { type: Raw, content: <name> }` and are the ad-hoc exception path for provider-specific or protocol-specific validation that cannot be expressed reliably in YAML alone. Implement them in `crates/kingfisher-scanner/src/validation/raw.rs`.
- If a Rust exception path is required, prefer adding a raw validator before introducing a new typed validator. Add a new typed validator only when it represents a reusable schema-level validation family.
- Do not convert existing typed validators to `Raw` just for consistency.
- Add a validator (rare exception path): implement it in `crates/kingfisher-scanner/src/validation/`, prefer `raw.rs` for one-off provider flows, and wire the narrowest feature/dependencies in `crates/kingfisher-scanner/Cargo.toml` only when YAML validation cannot express the required logic.
3. Include `examples` that must match. These can be tested with `cargo test check_rules` or `kingfisher rules check --rules-path crates/kingfisher-rules/data/rules/slack.yml --load-builtins=false --no-update-check`
-`pattern_requirements` (e.g., `min_digits`, `min_uppercase`, `min_lowercase`, `min_special_chars`, `ignore_if_contains`) when format constraints are known.
-`pattern_requirements.checksum` when provider formats include check digits/signatures.
5. Add `validation` only when a reliable provider/API check exists.
6. Put validation in YAML by default. If YAML cannot express the check, use an existing typed validator or `type: Raw` exception path; add new Rust validator logic only for rare, justified cases.
10. Confidence for rules should be set at `confidence: medium`
11. The `pattern` field must contain a valid Hyperscan/Vectorscan regular expression. Lookahead and lookbehind assertions aren’t supported. Because inefficient or overly broad regex can degrade performance, patterns should be as specific as possible and written to minimize false positives.
1.**Writing `pattern`**: Start with `(?x)` (free-spacing). Use one unnamed capture `( ... )` around the secret—it becomes `{{ TOKEN }}`. Use `\b` word boundaries and `(?: ... )` for non-capturing structure. For flexible context between keywords and token, use `(?:.|[\n\r]){0,N}?`. Hyperscan doesn't support `(?=...)`; use `pattern_requirements` (e.g. `min_digits`) instead.
- Prefer `kingfisher scan --format toon` when invoking Kingfisher from an LLM or agent workflow; keep `pretty` for interactive human CLI use unless the task explicitly calls for a different format.