updated confluent rule with a checksum. Added zuplo rule with a checksum

Mick Grove 2025-11-09 08:42:16 -08:00
commit c856373fb5
8 changed files with 104 additions and 51 deletions

View file

@ -3,14 +3,16 @@
All notable changes to this project will be documented in this file.
## [v1.62.0]
- This release is focused on further improving detection accuracy, before even attempting to validate findings.
- Added `pattern_requirements` checks to rules, providing lightweight post-regex character-class validation without lookarounds.
- Added an `ignore_if_contains` option to `PatternRequirements` to drop matches containing case-insensitive placeholder words, with tests covering the new behavior.
- Added `pattern_requirements` checks to rules, providing lightweight post-regex character-class validation without lookarounds. See docs/RULES.md for details.
- Added an `ignore_if_contains` option to `pattern_requirements` to drop matches containing case-insensitive placeholder words, with tests covering the new behavior.
- Updated rules to adopt the new `pattern_requirements` support.
- Added checksum comparisons to `pattern_requirements`, new `suffix`, `crc32`, and `base62` Liquid filters, and verbose logging so mismatched checksums are skipped with context rather than reported as findings.
- Split GitHub token detections into fine-grained/fixed-format variants and enforce checksum validation for modern GitHub token families (PAT, OAuth, App, refresh) while preserving legacy coverage.
- Added a rule for Zuplo tokens.
- Added checksum calculation for Confluent, GitHub, and Zuplo tokens, which can drastically reduce false positive reports.
- Improved OpsGenie validation.
- Automatically enable `--no-dedup` when `--manage-baseline` is supplied so baseline management keeps every finding.
- This release is focused on further improving detection accuracy, before even attempting to validate findings.
## [v1.61.0]
- Fixed local filesystem scans to keep `open_path_as_is` enabled when opening Git repositories and only disable it for diff-based scans.

View file

@ -36,6 +36,7 @@ For a look at how Kingfisher has grown from its early foundations into today's f
- **Broad AI SaaS coverage**: finds and validates tokens for OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Stability AI, Replicate, xAI (Grok), Ollama, Langchain, Perplexity, Weights & Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, Together.ai, Zhipu, and many more
- **Compressed Files**: Supports extracting and scanning compressed files for secrets
- **Baseline management**: generate and track baselines to suppress known secrets ([docs/BASELINE.md](/docs/BASELINE.md))
- **Checksum-aware detection**: verifies tokens with built-in checksums (e.g., GitHub, Confluent, Zuplo) — no API calls required
**Learn more:** [Introducing Kingfisher: Real-Time Secret Detection and Validation](https://www.mongodb.com/blog/post/product-release-announcements/introducing-kingfisher-real-time-secret-detection-validation)
@ -68,6 +69,7 @@ See ([docs/COMPARISON.md](docs/COMPARISON.md))
- [🔐 Detection Rules at a Glance](#-detection-rules-at-a-glance)
- [📝 Write Custom Rules!](#-write-custom-rules)
- [Pattern requirements and placeholder filtering](#pattern-requirements-and-placeholder-filtering)
- [🔍 Checksum Intelligence (New!)](#-checksum-intelligence-new)
- [🎉 Usage](#-usage)
- [Basic Examples](#basic-examples)
- [Scan with secret validation](#scan-with-secret-validation)
@ -343,6 +345,26 @@ checksum mismatch lengths so you can confirm why a finding was suppressed.
Once you've done that, you can define your custom rules in a YAML file and provide it to Kingfisher at runtime, with no recompiling required!
### 🔍 Checksum Intelligence (New!)
Modern API tokens increasingly include **built-in checksums**, short internal digests that make each credential self-verifiable. (For background, see [GitHub's write-up on their newer token formats](https://github.blog/engineering/platform-security/behind-githubs-new-authentication-token-formats/) and why checksums slash false positives.)
Kingfisher supports **checksum-aware matching** in rules, enabling **offline structural verification** of credentials *without* calling third-party APIs.
By validating each token's internal checksum (for tokens that support checksums), Kingfisher eliminates nearly all false positives, automatically skipping structurally invalid or fake tokens before validation ever runs.
**Why this matters**
- ✅ **Offline verification** — no API call required
- 🧠 **Industry-aligned** — compatible with prefix + checksum token designs (e.g., modern PATs)
- ⚡ **Lower false positives** — invalid tokens are filtered out by structure alone
**Learn more**: implementation details and templating are documented in **[docs/RULES.md](docs/RULES.md)**
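To make the idea of offline structural verification concrete, here is a minimal sketch in plain Rust of a prefix + body + checksum check, where the trailing characters are a base62-encoded CRC-32 of the token body. It is an illustration only, not Kingfisher's implementation: the function names, the `tok_` prefix used in the usage note, and the `0-9A-Za-z` alphabet order are all assumptions, and real token families may differ in each of these details.

```rust
/// Base62 alphabet. The digit order (0-9, A-Z, a-z) is an assumption for this sketch.
const BASE62: &[u8; 62] = b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

/// CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320), computed bit by bit.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Encode `n` in base62, left-padded with '0' to at least `width` characters.
fn base62_padded(mut n: u32, width: usize) -> String {
    let mut digits = Vec::new();
    while n > 0 {
        digits.push(BASE62[(n % 62) as usize]);
        n /= 62;
    }
    while digits.len() < width {
        digits.push(b'0');
    }
    digits.reverse();
    String::from_utf8(digits).expect("alphabet is ASCII")
}

/// Hypothetical helper: true when the last `checksum_len` characters of the
/// token equal the base62-encoded CRC-32 of the body that follows `prefix`.
fn has_valid_checksum(token: &str, prefix: &str, checksum_len: usize) -> bool {
    let Some(rest) = token.strip_prefix(prefix) else {
        return false;
    };
    if !rest.is_ascii() || rest.len() <= checksum_len {
        return false;
    }
    let (body, checksum) = rest.split_at(rest.len() - checksum_len);
    base62_padded(crc32(body.as_bytes()), checksum_len) == checksum
}
```

A caller would run something like `has_valid_checksum(candidate, "tok_", 6)` on each regex match and drop the match when it returns `false`, so a random string that merely looks like a token never reaches network validation.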
---
<!-- Optional: add this one-liner to your “Performance, Accuracy, and Hundreds of Rules” bullets -->
- **Checksum-aware detection**: verifies tokens with embedded checksums (offline) to cut false positives — see [docs/RULES.md](docs/RULES.md)
# 🎉 Usage
## Basic Examples

View file

@ -21,7 +21,7 @@ rules:
(?xi)
\b
(
[a-z0-9]{75,76}AZDO[a-z0-9]{4,5}
[a-z0-9]{76}AZDO[a-z0-9]{4,5}
)
\b
pattern_requirements:

View file

@ -277,32 +277,3 @@ rules:
- |
GITHUB_CLIENT_ID=ac58d6da7d7a84c039b7
GITHUB_SECRET=37d02377a3e9d849e18704c3ec883f9c5787d857
- name: GitHub Personal Access Token (fine-grained permissions)
id: kingfisher.github.9
pattern: |
(?xi)
(
github_pat_[0-9A-Z_]{82}
)
examples:
- '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'
validation:
type: Http
content:
request:
method: POST
url: https://api.github.com/graphql
headers:
Authorization: token {{ TOKEN }}
Accept: application/vnd.github+json
Content-Type: application/json
body: |
{
"query": "{ viewer { login } }"
}
response_matcher:
- report_response: true
- match_all_words: true
type: WordMatch
words:
- '"login"'

View file

@ -20,3 +20,17 @@ rules:
- zpka_b3f94d8d3d4d4a6ea5c5b20d0a5bb407_18eb262b
references:
- https://zuplo.com/blog/api-key-authentication
validation:
type: Http
content:
request:
headers:
authorization: "Bearer {{ TOKEN }}"
x-api-key: "{{ TOKEN }}"
method: GET
response_matcher:
- report_response: true
- status:
- 200
type: StatusMatch
url: https://dev.zuplo.com/v1/who-am-i

View file

@ -109,6 +109,41 @@ impl Filter for ReplaceFilter {
}
}
#[derive(Debug, FilterParameters)]
struct LstripCharsArgs {
#[parameter(
description = "Characters to remove from the start of the input.",
arg_type = "str"
)]
chars: Expression,
}
#[derive(Clone, ParseFilter, FilterReflection, Default)]
#[filter(
name = "lstrip_chars",
description = "Removes the provided characters from the beginning of the string.",
parameters(LstripCharsArgs),
parsed(LstripCharsFilter)
)]
pub struct LstripChars;
#[derive(Debug, FromFilterParameters, Display_filter)]
#[name = "lstrip_chars"]
struct LstripCharsFilter {
#[parameters]
args: LstripCharsArgs,
}
impl Filter for LstripCharsFilter {
fn evaluate(&self, input: &dyn ValueView, runtime: &dyn Runtime) -> Result<Value> {
let args = self.args.evaluate(runtime)?;
let chars = args.chars.to_string();
let input_str = input.to_kstr();
let trimmed = input_str.trim_start_matches(|c| chars.contains(c)).to_string();
Ok(Value::scalar(trimmed))
}
}
// ── HMAC args ─────────────────────────────────────
#[derive(Debug, FilterParameters)]
struct HmacArgs {
@ -803,6 +838,7 @@ pub fn register_all(builder: liquid::ParserBuilder) -> liquid::ParserBuilder {
.filter(RandomStringFilter::default())
.filter(SuffixFilter::default())
.filter(PrefixFilter::default())
.filter(LstripChars::default())
.filter(Crc32Filter::default())
.filter(Crc32DecFilter::default())
.filter(Crc32HexFilter::default())
@ -1013,6 +1049,16 @@ mod tests {
assert_eq!(render(r#"{{ "hello world" | replace: "world", "mars" }}"#), "hello mars");
}
#[test]
fn lstrip_chars_single() {
assert_eq!(render(r#"{{ "000abc" | lstrip_chars: "0" }}"#), "abc");
}
#[test]
fn lstrip_chars_multiple_chars() {
assert_eq!(render(r#"{{ "-=--token" | lstrip_chars: "-=" }}"#), "token");
}
// -------------------------------------------------------------------------
// iso_timestamp_no_frac filter
// -------------------------------------------------------------------------

View file

@ -5,27 +5,26 @@
// * Fallback - system allocator (`system-alloc` feature)
// ────────────────────────────────────────────────────────────
// // --- jemalloc (opt-in) ---
// #[cfg(feature = "use-jemalloc")]
// #[global_allocator]
// static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
// --- jemalloc (opt-in) ---
#[cfg(feature = "use-jemalloc")]
#[global_allocator]
static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
// // --- mimalloc (default) ---
// #[cfg(all(not(feature = "use-jemalloc"), not(feature = "system-alloc")))]
// #[global_allocator]
// static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;
// --- mimalloc (default) ---
#[cfg(all(not(feature = "use-jemalloc"), not(feature = "system-alloc")))]
#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;
// // --- system allocator (explicit opt-out) ---
// #[cfg(feature = "system-alloc")]
// use std::alloc::System;
// #[cfg(feature = "system-alloc")]
// #[global_allocator]
// --- system allocator (explicit opt-out) ---
#[cfg(feature = "system-alloc")]
use std::alloc::System;
#[cfg(feature = "system-alloc")]
#[global_allocator]
// static GLOBAL: System = System;
use std::alloc::System;
#[global_allocator]
static GLOBAL: System = System;
// use std::alloc::System;
// #[global_allocator]
// static GLOBAL: System = System;
use std::{
io::{IsTerminal, Read},
sync::{Arc, Mutex},

View file

@ -1,4 +1,3 @@
use assert_cmd::prelude::*;
use std::process::Command;
#[test]