Added top-level `self-update` CLI subcommand to update the binary independently. It can now also update a binary installed via Homebrew

Mick Grove 2025-08-27 15:35:01 -07:00
commit b3f80d7a33
7 changed files with 125 additions and 142 deletions


@ -4,8 +4,8 @@ All notable changes to this project will be documented in this file.
## [1.46.0]
- Improved rules: AWS, pem
- Added rule for Ollama, Weights and Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, together.ai, zhipu,
- Added rule for Ollama, Weights and Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, together.ai, zhipu
- Added `self-update` command to update the binary independently. It can now also update a binary installed via Homebrew
## [1.45.0]
- Added `--repo-artifacts` flag to scan repository issues, gists/snippets, and wikis when cloning via `--git-url`


@ -8,21 +8,12 @@
Kingfisher is a blazingly fast secret-scanning and live validation tool built in Rust. It combines Intel's hardware-accelerated Hyperscan regex engine with language-aware parsing via Tree-Sitter, and **ships with hundreds of built-in rules** to detect, validate, and triage secrets before they ever reach production.
</p>
Kingfisher originated as a fork of Praetorian's Nosey Parker, and is built atop their incredible work and the work contributed by the Nosey Parker community.
## What Kingfisher Adds
- **Live validation** via cloud-provider APIs
- **Extra targets**: GitLab repos, S3 buckets, Docker images, Jira issues, Confluence pages, and Slack messages
- **Compressed Files**: Supports extracting and scanning compressed files for secrets
- **Baseline mode**: ignore known secrets, flag only new ones
- **Allowlist support**: suppress false positives with custom regexes or words
- **Language-aware detection** (source-code parsing) for ~20 languages
- **Native Windows** binary
Originally forked from Praetorian's Nosey Parker, Kingfisher adds live cloud-API validation; many more targets (GitLab, S3, Docker, Jira, Confluence, Slack); compressed-file extraction and scanning; baseline and allowlist controls; language-aware detection (~20 languages); and a native Windows binary. See [Origins and Divergence](#origins-and-divergence) for details.
## Key Features
- **Performance**: multithreaded, Hyperscan-powered scanning built for huge codebases
- **Extensible rules**: hundreds of built-in detectors plus YAML-defined custom rules ([docs/RULES.md](/docs/RULES.md))
- **Broad AI SaaS coverage**: finds and validates tokens for OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Stability AI, Replicate, xAI (Grok), Ollama, Langchain, Perplexity, Weights & Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, Together.ai, Zhipu, and many more
- **Multiple targets**:
- **Git history**: local repos or GitHub/GitLab orgs/users
- **Repository artifacts**: with `--repo-artifacts`, scan GitHub/GitLab repository artifacts such as issues, pull/merge requests, wikis, snippets, and owner gists in addition to code
@ -154,18 +145,18 @@ docker run --rm \
# 🔐 Detection Rules at a Glance
Kingfisher ships with hundreds of rules that cover everything from classic cloud keys to the latest LLM-API secrets. Below is an overview:
Kingfisher ships with [hundreds of rules](/data/rules/) that cover everything from classic cloud keys to the latest AI SaaS tokens. Below is an overview:
| Category | What we catch |
|----------|---------------|
| **AI / LLM APIs** | OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Stability AI, Replicate, xAI (Grok), and more
| **Cloud Providers** | AWS, Azure, GCP, Alibaba Cloud, DigitalOcean, IBM Cloud, Cloudflare, and more
| **Dev & CI/CD** | GitHub/GitLab tokens, CircleCI, TravisCI, TeamCity, Docker Hub, npm, PyPI, and more
| **Messaging & Comms** | Slack, Discord, Microsoft Teams, Twilio, Mailgun, SendGrid, Mailchimp, and more
| **Databases & Data Ops** | MongoDB Atlas, PlanetScale, Postgres DSNs, Grafana Cloud, Datadog, Dynatrace, and more
| **Payments & Billing** | Stripe, PayPal, Square, GoCardless, and more
| **Security & DevSecOps** | Snyk, Dependency-Track, CodeClimate, Codacy, OpsGenie, PagerDuty, and more
| **Misc. SaaS & Tools** | 1Password, Adobe, Atlassian/Jira, Asana, Netlify, Baremetrics, and more
| **AI SaaS APIs** | OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Stability AI, Replicate, xAI (Grok), Ollama, Langchain, Perplexity, Weights & Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, together.ai, Zhipu, and more |
| **Cloud Providers** | AWS, Azure, GCP, Alibaba Cloud, DigitalOcean, IBM Cloud, Cloudflare, and more |
| **Dev & CI/CD** | GitHub/GitLab tokens, CircleCI, TravisCI, TeamCity, Docker Hub, npm, PyPI, and more |
| **Messaging & Comms** | Slack, Discord, Microsoft Teams, Twilio, Mailgun, SendGrid, Mailchimp, and more |
| **Databases & Data Ops** | MongoDB Atlas, PlanetScale, Postgres DSNs, Grafana Cloud, Datadog, Dynatrace, and more |
| **Payments & Billing** | Stripe, PayPal, Square, GoCardless, and more |
| **Security & DevSecOps** | Snyk, Dependency-Track, CodeClimate, Codacy, OpsGenie, PagerDuty, and more |
| **Misc. SaaS & Tools** | 1Password, Adobe, Atlassian/Jira, Asana, Netlify, Baremetrics, and more |
## Write Custom Rules!
@ -543,9 +534,11 @@ Kingfisher automatically queries GitHub for a newer release when it starts and t
- **Hands-free updates**: Add `--self-update` to any Kingfisher command
* If a newer version exists, Kingfisher will download it, replace the running binary, and re-launch itself with the **exact same arguments**.
* If a newer version exists, Kingfisher will download it, replace the running binary, and re-launch itself with the **exact same arguments**.
* If the update fails or no newer release is found, the current run proceeds as normal
- **Manual update**: Run `kingfisher self-update` to update the binary without scanning
- **Disable version checks**: Pass `--no-update-check` to skip both the startup and shutdown checks entirely
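
The update rules in this list can be read as a small decision function. The following is a minimal sketch of that logic only, not Kingfisher's implementation: `UpdateFlags`, `UpdateAction`, `parse_version`, and `plan_update` are hypothetical names introduced for illustration, and the version parser handles plain `MAJOR.MINOR.PATCH` strings only.

```rust
/// Flags relevant to update handling. The field names are illustrative and
/// mirror the `--self-update` and `--no-update-check` options.
#[derive(Debug, Clone, Copy)]
pub struct UpdateFlags {
    pub self_update: bool,
    pub no_update_check: bool,
}

/// Outcome of the version check (hypothetical type, for illustration only).
#[derive(Debug, PartialEq, Eq)]
pub enum UpdateAction {
    /// `--no-update-check` was passed: do not query for releases at all.
    Skip,
    /// No newer release was found, so the current run proceeds as normal.
    UpToDate,
    /// A newer release exists, but self-update was not requested.
    Notify,
    /// A newer release exists and should be downloaded and installed.
    Install,
}

/// Parse "MAJOR.MINOR.PATCH" (optionally prefixed with `v`) into a tuple.
/// Returns `None` for anything else; pre-release tags are not handled.
pub fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let s = s.trim();
    let s = s.strip_prefix('v').unwrap_or(s);
    let mut parts = s.split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = parts.next()?.parse().ok()?;
    let patch: u64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Decide what to do given the flags, the running version, and the latest release.
pub fn plan_update(flags: UpdateFlags, current: &str, latest: &str) -> UpdateAction {
    if flags.no_update_check {
        return UpdateAction::Skip;
    }
    match (parse_version(current), parse_version(latest)) {
        // Tuples compare lexicographically: major, then minor, then patch.
        (Some(cur), Some(new)) if new > cur => {
            if flags.self_update {
                UpdateAction::Install
            } else {
                UpdateAction::Notify
            }
        }
        // Not newer, or unparseable: carry on with the current run.
        _ => UpdateAction::UpToDate,
    }
}
```

In this sketch, the `kingfisher self-update` subcommand corresponds to calling `plan_update` with `self_update: true` and `no_update_check: false`, which matches how the subcommand overrides the global flags before invoking the update check.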
# Advanced Options
@ -661,6 +654,20 @@ Use `--rule-stats` to collect timing information for every rule. After scanning,
kingfisher scan --help
```
## Origins and Divergence
Kingfisher began as a fork of Praetorian's Nosey Parker, as an experiment in adding live validation support and embedding that validation directly inside each rule.
Since that initial fork, it has diverged heavily from Nosey Parker:
- Replaced the SQLite datastore with an in-memory store + Bloom filter
- Collapsed the workflow into a single scan-and-report phase with direct JSON/BSON/SARIF outputs
- Added Tree-Sitter parsing on top of Hyperscan for deeper language-aware detection
- Removed datastore-driven reporting/annotations in favor of live validation, baselines, allowlists, and compressed-file extraction
- Expanded support for new targets (GitLab, Jira, Confluence, Slack, S3, Docker, etc.)
- Delivered cross-platform builds, including native Windows
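
To illustrate the first point in the list above, here is a minimal sketch of how an in-memory store can use a Bloom filter as a cheap pre-check for duplicate findings. It is an assumption-laden illustration, not Kingfisher's actual datastore: `BloomFilter` and `DedupStore` are hypothetical names, the filter sizes are arbitrary, and the hashing scheme (a seeded `DefaultHasher`) was chosen only to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A basic Bloom filter: `num_hashes` seeded hashes over a fixed bit vector.
pub struct BloomFilter {
    bits: Vec<bool>,
    num_hashes: u64,
}

impl BloomFilter {
    pub fn new(num_bits: usize, num_hashes: u64) -> Self {
        assert!(num_bits > 0 && num_hashes > 0);
        Self { bits: vec![false; num_bits], num_hashes }
    }

    /// Map an item to a bit position for the given seed.
    fn index<T: Hash + ?Sized>(&self, item: &T, seed: u64) -> usize {
        let mut hasher = DefaultHasher::new();
        seed.hash(&mut hasher);
        item.hash(&mut hasher);
        (hasher.finish() % self.bits.len() as u64) as usize
    }

    /// Set every bit position associated with the item.
    pub fn insert<T: Hash + ?Sized>(&mut self, item: &T) {
        for seed in 0..self.num_hashes {
            let i = self.index(item, seed);
            self.bits[i] = true;
        }
    }

    /// `false` means "definitely not inserted"; `true` means "possibly inserted".
    pub fn might_contain<T: Hash + ?Sized>(&self, item: &T) -> bool {
        (0..self.num_hashes).all(|seed| self.bits[self.index(item, seed)])
    }
}

/// In-memory store (hypothetical) that skips the exact duplicate scan
/// whenever the filter says the finding is definitely new.
pub struct DedupStore {
    filter: BloomFilter,
    findings: Vec<String>,
}

impl DedupStore {
    pub fn new() -> Self {
        Self { filter: BloomFilter::new(1 << 16, 3), findings: Vec::new() }
    }

    /// Store the finding unless an identical one exists. Returns `true` if stored.
    pub fn record(&mut self, finding: &str) -> bool {
        // The filter can yield false positives, so a hit is confirmed exactly.
        if self.filter.might_contain(finding) && self.findings.iter().any(|f| f == finding) {
            return false;
        }
        self.filter.insert(finding);
        self.findings.push(finding.to_owned());
        true
    }

    pub fn len(&self) -> usize {
        self.findings.len()
    }
}
```

The design point is that a negative answer from the filter is always correct, so most new findings avoid the exact comparison entirely; only possible duplicates pay for the confirmation step.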
# Roadmap
- More rules


@ -33,4 +33,4 @@ rules:
- "csk-6nptf4w5cx36fw58t3hkx48jvm52wm693pex5tjm29kn55yt"
- "csk-e2knhj8h3h4erp6crfx6rh52tvecj4xnwmtjf3mtrvtt54et"
- "csk-rhw8npjrp6kpv9phm55n5nv5rkkm4492jepx3yh65dc9cwe9"
- "csk-w6p3nxk3`c5249mrpmv642fffert28rwdkepffrpn8rtfr9h"
- "csk-w6p3nxk3dc5249mrpmv642fffert28rwdkepffrpn8rtfr9h"


@ -7,8 +7,7 @@ use sysinfo::{MemoryRefreshKind, RefreshKind, System};
use tracing::Level;
use crate::cli::commands::{
github::GitHubArgs, gitlab::GitLabArgs, rules::RulesArgs,
scan::ScanArgs,
github::GitHubArgs, gitlab::GitLabArgs, rules::RulesArgs, scan::ScanArgs,
};
#[deny(missing_docs)]
@ -63,6 +62,10 @@ pub enum Command {
/// Manage rules
#[command(alias = "rule")]
Rules(RulesArgs),
/// Update the Kingfisher binary
#[command(name = "self-update")]
SelfUpdate,
}
pub static RAM_GB: Lazy<Option<f64>> = Lazy::new(|| {


@ -78,9 +78,10 @@ fn main() -> anyhow::Result<()> {
// Determine the number of jobs, defaulting to the number of CPUs
let num_jobs = match args.command {
Command::Scan(ref scan_args) => scan_args.num_jobs,
Command::SelfUpdate => 1, // Self-update doesn't need a thread pool
Command::GitHub(_) => num_cpus::get(), // Default for GitHub commands
Command::GitLab(_) => num_cpus::get(), // Default for GitLab commands
Command::Rules(_) => num_cpus::get(), // Default for Rules commands
Command::Rules(_) => num_cpus::get(), // Default for Rules commands
};
// Set up the Tokio runtime with the specified number of threads
@ -171,92 +172,97 @@ pub fn determine_exit_code(datastore: &Arc<Mutex<findings_store::FindingsStore>>
}
async fn async_main(args: CommandLineArgs) -> Result<()> {
// Create a temporary directory
let temp_dir = TempDir::new().context("Failed to create temporary directory")?;
let clone_dir = temp_dir.path().to_path_buf();
// Create the in-memory datastore
let datastore = Arc::new(Mutex::new(FindingsStore::new(clone_dir)));
setup_logging(&args.global_args);
let update_msg = check_for_update(&args.global_args, None);
let global_args = args.global_args.clone();
match args.command {
Command::Scan(mut scan_args) => {
// —————————————————————————————————————————
// If no paths or a single "-", slurp stdin into a temp file
// —————————————————————————————————————————
info!(
"Launching with {} concurrent scan jobs. Use --num-jobs to override.",
&scan_args.num_jobs
);
let paths = &scan_args.input_specifier_args.path_inputs;
let is_dash = paths.iter().any(|p| p.as_os_str() == "-");
if (paths.is_empty() || is_dash) && !atty::is(atty::Stream::Stdin) {
// read all stdin
let mut buf = Vec::new();
std::io::stdin().read_to_end(&mut buf)?;
// write into temp_dir
let stdin_file = temp_dir.path().join("stdin_input");
std::fs::write(&stdin_file, buf)?;
// replace inputs
scan_args.input_specifier_args.path_inputs = vec![stdin_file.into()];
}
// now proceed exactly as before
let rules_db = Arc::new(load_and_record_rules(&scan_args, &datastore)?);
run_scan(&args.global_args, &scan_args, &rules_db, Arc::clone(&datastore)).await?;
let exit_code = determine_exit_code(&datastore);
if let Err(e) = temp_dir.close() {
eprintln!("Failed to close temporary directory: {}", e);
}
std::process::exit(exit_code);
Command::SelfUpdate => {
let mut g = global_args;
g.self_update = true;
g.no_update_check = false;
check_for_update(&g, None);
Ok(())
}
Command::Rules(ref rule_args) => match &rule_args.command {
RulesCommand::Check(check_args) => {
run_rules_check(&check_args)?;
}
RulesCommand::List(list_args) => {
run_rules_list(&list_args)?;
}
},
Command::GitHub(github_args) => match github_args.command {
GitHubCommand::Repos(repos_command) => match repos_command {
GitHubReposCommand::List(list_args) => {
github::list_repositories(
github_args.github_api_url,
args.global_args.ignore_certs,
args.global_args.use_progress(),
&list_args.repo_specifiers.user,
&list_args.repo_specifiers.organization,
list_args.repo_specifiers.all_organizations,
list_args.repo_specifiers.repo_type.into(),
)
.await?;
command => {
let temp_dir = TempDir::new().context("Failed to create temporary directory")?;
let clone_dir = temp_dir.path().to_path_buf();
let datastore = Arc::new(Mutex::new(FindingsStore::new(clone_dir)));
let update_msg = check_for_update(&global_args, None);
match command {
Command::Scan(mut scan_args) => {
info!(
"Launching with {} concurrent scan jobs. Use --num-jobs to override.",
&scan_args.num_jobs
);
let paths = &scan_args.input_specifier_args.path_inputs;
let is_dash = paths.iter().any(|p| p.as_os_str() == "-");
if (paths.is_empty() || is_dash) && !atty::is(atty::Stream::Stdin) {
let mut buf = Vec::new();
std::io::stdin().read_to_end(&mut buf)?;
let stdin_file = temp_dir.path().join("stdin_input");
std::fs::write(&stdin_file, buf)?;
scan_args.input_specifier_args.path_inputs = vec![stdin_file.into()];
}
let rules_db = Arc::new(load_and_record_rules(&scan_args, &datastore)?);
run_scan(&global_args, &scan_args, &rules_db, Arc::clone(&datastore)).await?;
let exit_code = determine_exit_code(&datastore);
if let Err(e) = temp_dir.close() {
eprintln!("Failed to close temporary directory: {}", e);
}
std::process::exit(exit_code);
}
},
},
Command::GitLab(gitlab_args) => match gitlab_args.command {
GitLabCommand::Repos(repos_command) => match repos_command {
GitLabReposCommand::List(list_args) => {
kingfisher::gitlab::list_repositories(
gitlab_args.gitlab_api_url,
args.global_args.ignore_certs,
args.global_args.use_progress(),
&list_args.repo_specifiers.user,
&list_args.repo_specifiers.group,
list_args.repo_specifiers.all_groups,
list_args.repo_specifiers.include_subgroups,
list_args.repo_specifiers.repo_type.into(),
)
.await?;
}
},
},
Command::Rules(ref rule_args) => match &rule_args.command {
RulesCommand::Check(check_args) => {
run_rules_check(&check_args)?;
}
RulesCommand::List(list_args) => {
run_rules_list(&list_args)?;
}
},
Command::GitHub(github_args) => match github_args.command {
GitHubCommand::Repos(repos_command) => match repos_command {
GitHubReposCommand::List(list_args) => {
github::list_repositories(
github_args.github_api_url,
global_args.ignore_certs,
global_args.use_progress(),
&list_args.repo_specifiers.user,
&list_args.repo_specifiers.organization,
list_args.repo_specifiers.all_organizations,
list_args.repo_specifiers.repo_type.into(),
)
.await?;
}
},
},
Command::GitLab(gitlab_args) => match gitlab_args.command {
GitLabCommand::Repos(repos_command) => match repos_command {
GitLabReposCommand::List(list_args) => {
kingfisher::gitlab::list_repositories(
gitlab_args.gitlab_api_url,
global_args.ignore_certs,
global_args.use_progress(),
&list_args.repo_specifiers.user,
&list_args.repo_specifiers.group,
list_args.repo_specifiers.all_groups,
list_args.repo_specifiers.include_subgroups,
list_args.repo_specifiers.repo_type.into(),
)
.await?;
}
},
},
Command::SelfUpdate => unreachable!(),
}
if let Some(msg) = update_msg {
info!("{msg}");
}
Ok(())
}
}
if let Some(msg) = update_msg {
info!("{msg}");
}
Ok(())
}
/// Create a default ScanArgs instance for rule loading


@ -15,11 +15,7 @@
// `style_finding_active_heading` style so that they stand out alongside normal
// scan output.
use std::{
fs,
io::{ErrorKind, IsTerminal},
path::PathBuf,
};
use std::io::{ErrorKind, IsTerminal};
use self_update::{backends::github::Update, cargo_crate_version, errors::Error as UpdError};
use semver::Version;
@ -27,17 +23,6 @@ use tracing::{error, info, warn};
use crate::{cli::global::GlobalArgs, reporter::styles::Styles};
/// Return `true` when the canonical executable path lives inside a Homebrew Cellar.
/// Works for Intel macOS (/usr/local/Cellar), Apple Silicon macOS (/opt/homebrew/Cellar)
/// and Linuxbrew (~/.linuxbrew/Cellar).
fn installed_via_homebrew() -> bool {
fn canonical_exe() -> Option<PathBuf> {
std::env::current_exe().ok().and_then(|p| fs::canonicalize(p).ok())
}
canonical_exe().map(|p| p.components().any(|c| c.as_os_str() == "Cellar")).unwrap_or(false)
}
/// Check GitHub for a newer Kingfisher release and optionally self-update.
///
/// * `base_url` lets tests point at a mock server.
@ -51,16 +36,6 @@ pub fn check_for_update(global_args: &GlobalArgs, base_url: Option<&str>) -> Opt
let use_color = std::io::stderr().is_terminal() && !global_args.quiet;
let styles = Styles::new(use_color);
let is_brew = installed_via_homebrew();
if is_brew {
info!(
"{}",
styles.style_finding_active_heading.apply_to(
"Homebrew install detected - will notify about updates but not self-update"
)
);
}
info!("{}", "Checking for updates…");
let mut builder = Update::configure();
@ -145,7 +120,7 @@ pub fn check_for_update(global_args: &GlobalArgs, base_url: Option<&str>) -> Opt
info!("{}", styles.style_finding_active_heading.apply_to(&plain));
// Attempt self-update when allowed and feasible.
if global_args.self_update && !is_brew {
if global_args.self_update {
match updater.update() {
Ok(status) => info!(
"{}",
@ -167,13 +142,6 @@ pub fn check_for_update(global_args: &GlobalArgs, base_url: Option<&str>) -> Opt
_ => error!("Failed to update: {e}"),
},
}
} else if is_brew {
info!(
"{}",
styles
.style_finding_active_heading
.apply_to("Run `brew upgrade kingfisher` to install the new version.")
);
}
Some(plain)


@ -101,7 +101,6 @@ pub async fn validate_jwt_with(token: &str, opts: &ValidateOptions) -> Result<(b
let header_val: serde_json::Value =
serde_json::from_slice(&header_json).map_err(|e| anyhow!("invalid header json: {e}"))?;
let alg_str = header_val.get("alg").and_then(|v| v.as_str()).unwrap_or("");
// --- Policy: reject `alg: none` unless explicitly allowed ------------------
if alg_str.eq_ignore_ascii_case("none") {
@ -119,7 +118,7 @@ pub async fn validate_jwt_with(token: &str, opts: &ValidateOptions) -> Result<(b
return Ok((false, "unsigned JWT (alg: none) not allowed".into()));
}
}
// Safe to decode full header now that we know alg != none
let header = decode_header(token).map_err(|e| anyhow!("decode header: {e}"))?;
let alg = header.alg;