diff --git a/CHANGELOG.md b/CHANGELOG.md
index dabda9a..8f41796 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,14 +3,16 @@
 All notable changes to this project will be documented in this file.

 ## [v1.62.0]
-- This release is focused on further improving detection accuracy, before even attempting to validate findings.
-- Added `pattern_requirements` checks to rules, providing lightweight post-regex character-class validation without lookarounds.
-- Added an `ignore_if_contains` option to `PatternRequirements` to drop matches containing case-insensitive placeholder words, with tests covering the new behavior.
+- Added `pattern_requirements` checks to rules, providing lightweight post-regex character-class validation without lookarounds. See docs/RULES.md for detail
+- Added an `ignore_if_contains` option to `pattern_requirements` to drop matches containing case-insensitive placeholder words, with tests covering the new behavior.
 - Updated rules to adopt the new `pattern_requirements` support.
 - Added checksum comparisons to `pattern_requirements`, new `suffix`, `crc32`, and `base62` Liquid filters, and verbose logging so mismatched checksums are skipped with context rather than reported as findings.
 - Split GitHub token detections into fine-grained/fixed-format variants and enforce checksum validation for modern GitHub token families (PAT, OAuth, App, refresh) while preserving legacy coverage.
+- Added a rule for Zuplo tokens.
+- Added checksum calculation for Confluent, GitHub, and Zuplo tokens, which can drastically reduce false positive reports.
+- Improved OpsGenie validation.
 - Automatically enable `--no-dedup` when `--manage-baseline` is supplied so baseline management keeps every finding.
-
+- This release is focused on further improving detection accuracy, before even attempting to validate findings.

 ## [v1.61.0]
 - Fixed local filesystem scans to keep `open_path_as_is` enabled when opening Git repositories and only disable it for diff-based scans.
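The changelog entries above mention checksum comparisons built on `crc32` and `base62` filters. As a rough illustration of the idea, and not Kingfisher's actual implementation (which expresses this through Liquid templates in rule files), the sketch below assumes a token laid out as prefix, body, then a 6-character base62 encoding of the CRC-32 of the body, the layout GitHub has described for its newer tokens. The function names, the alphabet ordering, and the choice to checksum only the body are assumptions made for this example.

```rust
/// Assumed base62 alphabet ordering: digits, then uppercase, then lowercase.
const BASE62: &[u8; 62] = b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

/// CRC-32 (IEEE, reflected polynomial 0xEDB88320), computed bit by bit.
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Encodes `value` in base62, left-padded with '0' to at least `width` characters.
fn base62_padded(mut value: u32, width: usize) -> String {
    let mut digits = Vec::new();
    while value > 0 {
        digits.push(BASE62[(value % 62) as usize]);
        value /= 62;
    }
    while digits.len() < width {
        digits.push(b'0');
    }
    digits.reverse();
    String::from_utf8(digits).expect("alphabet is ASCII")
}

/// Hypothetical helper: returns true when the last 6 characters of the token
/// (after `prefix`) equal the padded base62 CRC-32 of the characters before them.
fn checksum_matches(token: &str, prefix: &str) -> bool {
    let rest = match token.strip_prefix(prefix) {
        Some(rest) => rest,
        None => return false,
    };
    // Need a non-empty ASCII body plus the 6-character checksum.
    if rest.len() <= 6 || !rest.is_ascii() {
        return false;
    }
    let (body, checksum) = rest.split_at(rest.len() - 6);
    base62_padded(crc32(body.as_bytes()), 6) == checksum
}
```

A candidate that matches the regex but fails `checksum_matches` can be dropped without any network call, which is the effect the changelog describes as skipping mismatched checksums rather than reporting them.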
diff --git a/README.md b/README.md
index 72736b9..81b38b1 100644
--- a/README.md
+++ b/README.md
@@ -36,6 +36,7 @@ For a look at how Kingfisher has grown from its early foundations into today's f
 - **Broad AI SaaS coverage**: finds and validates tokens for OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Stability AI, Replicate, xAI (Grok), Ollama, Langchain, Perplexity, Weights & Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, Together.ai, Zhipu, and many more
 - **Compressed Files**: Supports extracting and scanning compressed files for secrets
 - **Baseline management**: generate and track baselines to suppress known secrets ([docs/BASELINE.md](/docs/BASELINE.md))
+- **Checksum-aware detection**: verifies tokens with built-in checksums (e.g., GitHub, Confluent, Zuplo) — no API calls required

 **Learn more:** [Introducing Kingfisher: Real‑Time Secret Detection and Validation](https://www.mongodb.com/blog/post/product-release-announcements/introducing-kingfisher-real-time-secret-detection-validation)

@@ -68,6 +69,7 @@ See ([docs/COMPARISON.md](docs/COMPARISON.md))
 - [🔍 Detection Rules at a Glance](#-detection-rules-at-a-glance)
   - [📝 Write Custom Rules!](#-write-custom-rules)
     - [Pattern requirements and placeholder filtering](#pattern-requirements-and-placeholder-filtering)
+    - [🔐 Checksum Intelligence (New!)](#-checksum-intelligence-new)
 - [🎉 Usage](#-usage)
   - [Basic Examples](#basic-examples)
     - [Scan with secret validation](#scan-with-secret-validation)
@@ -343,6 +345,26 @@ checksum mismatch lengths so you can confirm why a finding was suppressed.

 Once you've done that, you can provide your custom rules (defined in a YAML file) and provide it to Kingfisher at runtime --- no recompiling required!

+### 🔐 Checksum Intelligence (New!)
+
+Modern API tokens increasingly include **built-in checksums**, short internal digests that make each credential self-verifiable. (For background, see [GitHub’s write-up on their newer token formats](https://github.blog/engineering/platform-security/behind-githubs-new-authentication-token-formats/) and why checksums slash false positives.)
+
+Kingfisher supports **checksum-aware matching** in rules, enabling **offline structural verification** of credentials *without* calling third-party APIs.
+
+By validating each token’s internal checksum (for tokens that support checksums), Kingfisher eliminates nearly all false positives—automatically skipping structurally invalid or fake tokens before validation ever runs.
+
+**Why this matters**
+- ✅ **Offline verification** — no API call required
+- 🧠 **Industry-aligned** — compatible with prefix + checksum token designs (e.g., modern PATs)
+- ⚡ **Lower false positives** — invalid tokens are filtered out by structure alone
+
+**Learn more**: implementation details and templating are documented in **[docs/RULES.md](docs/RULES.md)**
+
+---
+
+
+- **Checksum-aware detection**: verifies tokens with embedded checksums (offline) to cut false positives — see [docs/RULES.md](docs/RULES.md)
+
 # 🎉 Usage

 ## Basic Examples
diff --git a/data/rules/azuredevops.yml b/data/rules/azuredevops.yml
index 8a21d80..90fa4e8 100644
--- a/data/rules/azuredevops.yml
+++ b/data/rules/azuredevops.yml
@@ -21,7 +21,7 @@ rules:
       (?xi)
       \b
       (
-        [a-z0-9]{75,76}AZDO[a-z0-9]{4,5}
+        [a-z0-9]{76}AZDO[a-z0-9]{4,5}
       )
       \b
     pattern_requirements:
diff --git a/data/rules/github.yml b/data/rules/github.yml
index 334c7ed..3aa8d7d 100644
--- a/data/rules/github.yml
+++ b/data/rules/github.yml
@@ -277,32 +277,3 @@ rules:
       - |
         GITHUB_CLIENT_ID=ac58d6da7d7a84c039b7
         GITHUB_SECRET=37d02377a3e9d849e18704c3ec883f9c5787d857
-  - name: GitHub Personal Access Token (fine-grained permissions)
-    id: kingfisher.github.9
-    pattern: |
-      (?xi)
-      (
-        github_pat_[0-9A-Z_]{82}
-      )
-    examples:
-      - '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'
-    validation:
-      type: Http
-      content:
-        request:
-          method: POST
-          url: https://api.github.com/graphql
-          headers:
-            Authorization: token {{ TOKEN }}
-            Accept: application/vnd.github+json
-            Content-Type: application/json
-          body: |
-            {
-              "query": "{ viewer { login } }"
-            }
-          response_matcher:
-            - report_response: true
-            - match_all_words: true
-              type: WordMatch
-              words:
-                - '"login"'
\ No newline at end of file
diff --git a/data/rules/zuplo.yml b/data/rules/zuplo.yml
index 22ed4c1..bbfdb7b 100644
--- a/data/rules/zuplo.yml
+++ b/data/rules/zuplo.yml
@@ -20,3 +20,17 @@ rules:
       - zpka_b3f94d8d3d4d4a6ea5c5b20d0a5bb407_18eb262b
     references:
       - https://zuplo.com/blog/api-key-authentication
+    validation:
+      type: Http
+      content:
+        request:
+          headers:
+            authorization: "Bearer {{ TOKEN }}"
+            x-api-key: "{{ TOKEN }}"
+          method: GET
+          response_matcher:
+            - report_response: true
+            - status:
+                - 200
+              type: StatusMatch
+          url: https://dev.zuplo.com/v1/who-am-i
\ No newline at end of file
diff --git a/src/liquid_filters.rs b/src/liquid_filters.rs
index 9112fb6..66a2fab 100644
--- a/src/liquid_filters.rs
+++ b/src/liquid_filters.rs
@@ -109,6 +109,41 @@ impl Filter for ReplaceFilter {
     }
 }

+#[derive(Debug, FilterParameters)]
+struct LstripCharsArgs {
+    #[parameter(
+        description = "Characters to remove from the start of the input.",
+        arg_type = "str"
+    )]
+    chars: Expression,
+}
+
+#[derive(Clone, ParseFilter, FilterReflection, Default)]
+#[filter(
+    name = "lstrip_chars",
+    description = "Removes the provided characters from the beginning of the string.",
+    parameters(LstripCharsArgs),
+    parsed(LstripCharsFilter)
+)]
+pub struct LstripChars;
+
+#[derive(Debug, FromFilterParameters, Display_filter)]
+#[name = "lstrip_chars"]
+struct LstripCharsFilter {
+    #[parameters]
+    args: LstripCharsArgs,
+}
+
+impl Filter for LstripCharsFilter {
+    fn evaluate(&self, input: &dyn ValueView, runtime: &dyn Runtime) -> Result<Value> {
+        let args = self.args.evaluate(runtime)?;
+        let chars = args.chars.to_string();
+        let input_str = input.to_kstr();
+        let trimmed = input_str.trim_start_matches(|c| chars.contains(c)).to_string();
+        Ok(Value::scalar(trimmed))
+    }
+}
+
 // ── HMAC args ─────────────────────────────────────
 #[derive(Debug, FilterParameters)]
 struct HmacArgs {
@@ -803,6 +838,7 @@ pub fn register_all(builder: liquid::ParserBuilder) -> liquid::ParserBuilder {
         .filter(RandomStringFilter::default())
         .filter(SuffixFilter::default())
         .filter(PrefixFilter::default())
+        .filter(LstripChars::default())
         .filter(Crc32Filter::default())
         .filter(Crc32DecFilter::default())
         .filter(Crc32HexFilter::default())
@@ -1013,6 +1049,16 @@ mod tests {
         assert_eq!(render(r#"{{ "hello world" | replace: "world", "mars" }}"#), "hello mars");
     }

+    #[test]
+    fn lstrip_chars_single() {
+        assert_eq!(render(r#"{{ "000abc" | lstrip_chars: "0" }}"#), "abc");
+    }
+
+    #[test]
+    fn lstrip_chars_multiple_chars() {
+        assert_eq!(render(r#"{{ "-=--token" | lstrip_chars: "-=" }}"#), "token");
+    }
+
     // -------------------------------------------------------------------------
     // iso_timestamp_no_frac filter
     // -------------------------------------------------------------------------
diff --git a/src/main.rs b/src/main.rs
index 99f0718..de047db 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -5,27 +5,26 @@
 // * Fallback - system allocator (`system-alloc` feature)
 // ────────────────────────────────────────────────────────────

-// // --- jemalloc (opt-in) ---
-// #[cfg(feature = "use-jemalloc")]
-// #[global_allocator]
-// static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
+// --- jemalloc (opt-in) ---
+#[cfg(feature = "use-jemalloc")]
+#[global_allocator]
+static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

-// // --- mimalloc (default) ---
-// #[cfg(all(not(feature = "use-jemalloc"), not(feature = "system-alloc")))]
-// #[global_allocator]
-// static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;
+// --- mimalloc (default) ---
+#[cfg(all(not(feature = "use-jemalloc"), not(feature = "system-alloc")))]
+#[global_allocator]
+static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

-// // --- system allocator (explicit opt-out) ---
-// #[cfg(feature = "system-alloc")]
-// use std::alloc::System;
-// #[cfg(feature = "system-alloc")]
-// #[global_allocator]
+// --- system allocator (explicit opt-out) ---
+#[cfg(feature = "system-alloc")]
+use std::alloc::System;
+#[cfg(feature = "system-alloc")]
+#[global_allocator]
 // static GLOBAL: System = System;

-use std::alloc::System;
-#[global_allocator]
-static GLOBAL: System = System;
-
+// use std::alloc::System;
+// #[global_allocator]
+// static GLOBAL: System = System;
 use std::{
     io::{IsTerminal, Read},
     sync::{Arc, Mutex},
diff --git a/tests/smoke_docker.rs b/tests/smoke_docker.rs
index 46e22c7..3bd4307 100644
--- a/tests/smoke_docker.rs
+++ b/tests/smoke_docker.rs
@@ -1,4 +1,3 @@
-use assert_cmd::prelude::*;
 use std::process::Command;

 #[test]
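
The `lstrip_chars` filter added in `src/liquid_filters.rs` treats its argument as a set of characters rather than a literal prefix. A minimal standalone sketch of that behaviour, outside the Liquid machinery, might look like the following; the free function is hypothetical and only mirrors the `trim_start_matches` call shown in the diff.

```rust
/// Hypothetical standalone equivalent of the filter body: strips every leading
/// character that appears anywhere in `chars`, stopping at the first character
/// that does not. Characters later in the string are left untouched.
fn lstrip_chars(input: &str, chars: &str) -> String {
    input
        .trim_start_matches(|c: char| chars.contains(c))
        .to_string()
}
```

Because the argument is a character set, `"-="` removes any leading mix of `-` and `=`, which matches the `lstrip_chars_multiple_chars` test in the diff. One plausible use is stripping leading `0` padding from an encoded checksum before comparing it.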