Mick Grove 2026-01-01 22:24:57 -08:00
commit 7237a931d5
101 changed files with 4943 additions and 350 deletions

1
.gitignore vendored

@ -28,6 +28,7 @@ logs/*
# Icon must end with two \r
Icon
node_modules/
# Thumbnails
._*


@ -1,7 +1,7 @@
- id: kingfisher-docker
name: kingfisher (docker)
description: Run Kingfisher in Docker against staged changes at the repository root. No local install required.
entry: ghcr.io/kingfisher-sec/kingfisher:latest
entry: ghcr.io/mongodb/kingfisher:latest
language: docker
args: ["scan", ".", "--staged", "--quiet", "--no-update-check"]
pass_filenames: false


@ -2,6 +2,14 @@
All notable changes to this project will be documented in this file.
## [v1.73.0]
- Now prefers Git history findings when identical secrets appear in both current files and Git history (dedup only).
- Fixed report viewer to add support for opening JSONL.
- Add opt-in contributor repository enumeration for GitHub/GitLab `--git-url` scans with `--include-contributors`, plus `--repo-clone-limit` to cap repo cloning.
- Add `--git-clone-dir` to set the parent clone directory and `--keep-clones` to preserve cloned repos after scans.
- Added several new rules.
- Added configurable validation timeout and retry settings for `kingfisher scan`.
## [v1.72.0]
- Fixed deduplication for dependency-provider rules so dependent validations run per blob
- Updated Artifactory rule entropy and added a new Artifactory rule


@ -10,7 +10,7 @@ publish = false
[package]
name = "kingfisher"
version = "1.72.0"
version = "1.73.0"
description = "MongoDB's blazingly fast and accurate secret scanning and validation tool"
edition.workspace = true
rust-version.workspace = true
@ -74,6 +74,7 @@ url = "2.5.7"
include_dir = { version = "0.7", features = ["glob"] }
strum = { version = "0.26", features = ["derive"] }
sysinfo = "0.31.4"
webbrowser = "1.0.5"
reqwest = { version = "0.12", default-features = false, features = [
"json",
"gzip",

105
README.md

@ -11,6 +11,8 @@ Kingfisher is a blazingly fast secret-scanning and **live validation** tool buil
It combines Intel's SIMD-accelerated regex engine (Hyperscan) with language-aware parsing to achieve high accuracy at massive scale, and **ships with hundreds of built-in rules** to detect, **validate**, and triage secrets before they ever reach production.
Designed for offensive security engineers and blue-teamers alike, Kingfisher helps you pivot across repo ecosystems, validate exposure paths, and hunt for developer-owned leaks that spill beyond the primary codebase.
For a look at how Kingfisher has grown from its early foundations into today's full-featured scanner, see [Lineage and Evolution](#lineage-and-evolution).
</p>
@ -33,7 +35,7 @@ For a look at how Kingfisher has grown from its early foundations into today's f
### Performance, Accuracy, and Hundreds of Rules
- **Performance**: multithreaded, Hyperscan-powered scanning built for huge codebases
- **Extensible rules**: hundreds of built-in detectors plus YAML-defined custom rules ([docs/RULES.md](/docs/RULES.md))
- **Blast Radius Mapping**: instantly map leaked keys to their effective cloud identities and exposed resources with `--access-map`
- **Blast Radius Mapping**: instantly map leaked keys to their effective cloud identities and exposed resources with `--access-map`. Supports AWS, GCP, Azure, GitHub, and GitLab, with support for more token types coming.
- **Broad AI SaaS coverage**: finds and validates tokens for OpenAI, Anthropic, Google Gemini, Cohere, AWS Bedrock, Voyage AI, Mistral, Stability AI, Replicate, xAI (Grok), Ollama, Langchain, Perplexity, Weights & Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, Together.ai, Zhipu, and many more
- **Compressed Files**: Supports extracting and scanning compressed files for secrets
- **Baseline management**: generate and track baselines to suppress known secrets ([docs/BASELINE.md](/docs/BASELINE.md))
@ -51,10 +53,18 @@ See ([docs/COMPARISON.md](docs/COMPARISON.md))
</p>
## Basic Usage Demo
```bash
kingfisher scan /path/to/scan --view-report
```
NOTE: Replay has been slowed down for demo
![alt text](docs/kingfisher-usage-01.gif)
## Report Viewer Demo
Explore Kingfisher's built-in report viewer and its `--access-map` feature for visualizing access relationships: [Access map outputs and viewer](#access-map-outputs-and-viewer)
Explore Kingfisher's built-in report viewer and its `--access-map` feature, which shows what a token (AWS, GCP, Azure, GitHub, and GitLab, with more coming) can actually access: [Access map outputs and viewer](#access-map-outputs-and-viewer)
```bash
kingfisher scan /path/to/scan --access-map --view-report
```
![alt text](docs/kingfisher-usage-access-map.gif)
@ -159,6 +169,7 @@ Explore Kingfishers built-in report viewer and its `--access-map` feature for
- [To add your rules alongside the builtins:](#to-add-your-rules-alongside-the-builtins)
- [Other Examples](#other-examples)
- [Customize the HTTP User-Agent](#customize-the-http-user-agent)
- [Validation tuning flags](#validation-tuning-flags)
- [Notable Scan Options](#notable-scan-options)
- [Understanding `--confidence`](#understanding---confidence)
- [Ignore known false positives](#ignore-known-false-positives)
@ -579,15 +590,16 @@ kingfisher scan /path/to/repo --format sarif --output findings.sarif
Finding a leaked credential is only the first step. The critical question isn't just “Is this a secret?”; it's “What can an attacker do with it?”
Kingfisher's `--access-map` feature transforms secret detection from a simple alert into a comprehensive threat assessment. Instead of leaving you with a cryptic API key, Kingfisher actively authenticates against your cloud provider (AWS or GCP) to map the full extent of the credential's power.
Kingfisher's `--access-map` feature transforms secret detection from a simple alert into a comprehensive threat assessment. Instead of leaving you with a cryptic API key, Kingfisher actively authenticates against your cloud provider (AWS, GCP, Azure Storage, Azure DevOps, GitHub, or GitLab) to map the full extent of the credential's power.
* Instant Identity Resolution: Immediately identify who the key belongs to—whether it's a specific IAM user, an assumed role, or a service account.
* Visualize the Blast Radius: See exactly which resources (S3 buckets, EC2 instances, projects) are exposed and at risk.
* Visualize the Blast Radius: See exactly which resources (S3 buckets, EC2 instances, projects, storage containers) are exposed and at risk.
Add `--access-map` to enrich JSON, JSONL, BSON, pretty, and SARIF reports with an `access_map` that lists each resource the key can access and the permissions it holds on that resource (grouped when identical).
- If you validated cloud credentials without `--access-map`, Kingfisher will remind you on stderr to rerun with the flag so the access map appears in the output.
- Run `kingfisher view ./kingfisher.json` to explore a report in a local web UI
- Run `kingfisher view ./kingfisher.json` to explore a report in a local web UI (opens your browser automatically when a report is provided).
- Or use `kingfisher scan --view-report ...` to generate a JSON report, start the viewer at `http://127.0.0.1:7890`, and open it in your browser.
> **Use the access map functionality only when you are authorized to inspect the target account, as Kingfisher will issue additional network requests to determine what access the secret grants**
@ -599,7 +611,7 @@ Add `--access-map` to enrich JSON, JSONL, BSON, pretty, and SARIF reports with a
kingfisher view kingfisher.json
```
The `view` subcommand starts a local-only server (default port `7890`) that bundles the HTML, CSS, and JavaScript for the access-map viewer directly into the Kingfisher binary. Provide a JSON or JSONL report to load it automatically, or open the page and upload a report in the browser. If port 7890 is already in use, Kingfisher will exit and tell you to re-run with `--port <PORT>`.
The `view` subcommand starts a local-only server (default port `7890`) that bundles the HTML, CSS, and JavaScript for the access-map viewer directly into the Kingfisher binary. Provide a JSON or JSONL report to load it automatically and Kingfisher will open your browser, or open the page and upload a report in the browser. If port 7890 is already in use, Kingfisher will exit and tell you to re-run with `--port <PORT>`.
### Pipe any text directly into Kingfisher by passing `-`
@ -867,6 +879,7 @@ kingfisher scan docker private.registry.example.com/my-image:tag
```bash
kingfisher scan github --organization my-org
kingfisher scan github --organization my-org --repo-clone-limit 500
```
### Skip specific GitHub repositories during enumeration
@ -884,11 +897,24 @@ kingfisher scan github --organization my-org \
### Scan remote GitHub repository
`--git-url` clones the repository and scans its files and history. To also inspect
related server-side data, supply `--repo-artifacts`. This flag pulls down the
repository's issues (including pull requests), wiki, and any public gists owned by
the repository owner and scans them for secrets. Fetching these extras counts
against API rate limits and private artifacts require a `KF_GITHUB_TOKEN`.
`--git-url` clones the repository and scans its files and history. When the URL
targets GitHub and you pass `--include-contributors`, Kingfisher enumerates
repository contributors and attempts to clone **all public repos owned by those
contributors**—a common offensive and blue-team pivot when developers leak
secrets in personal or side projects. Use `--repo-clone-limit` to cap how many
repositories are cloned during this enumeration.
**NOTE**: This may cause you to be temporarily rate-limited by GitHub.
Supplying a token (`KF_GITHUB_TOKEN`) raises the rate limit.
To inspect related server-side data, supply `--repo-artifacts`. This flag pulls
down the repository's issues (including pull requests), wiki, and any public
gists owned by the repository owner and scans them for secrets. Fetching these
extras counts against API rate limits and private artifacts require a
`KF_GITHUB_TOKEN`.
Use `--git-clone-dir` to choose where cloned repositories land and
`--keep-clones` to preserve them for follow-on analysis.
> **Why does `--git-url` sometimes report fewer findings than scanning a local checkout?**.
>
@ -905,6 +931,16 @@ against API rate limits and private artifacts require a `KF_GITHUB_TOKEN`.
# Scan the repository only
kingfisher scan --git-url https://github.com/org/repo.git
# Scan the repository plus contributor repos, but cap the crawl
kingfisher scan --git-url https://github.com/org/repo.git \
--include-contributors \
--repo-clone-limit 250
# Keep clones for later manual inspection
kingfisher scan --git-url https://github.com/org/repo.git \
--git-clone-dir ./kingfisher-clones \
--keep-clones
# Include issues, wiki, and owner gists
kingfisher scan --git-url https://github.com/org/repo.git --repo-artifacts
@ -922,6 +958,7 @@ KF_GITHUB_TOKEN="ghp_…" kingfisher scan --git-url https://github.com/org/priva
kingfisher scan gitlab --group my-group
# include repositories from all nested subgroups
kingfisher scan gitlab --group my-group --include-subgroups
kingfisher scan gitlab --group my-group --repo-clone-limit 500
```
### Scan GitLab user
@ -945,15 +982,37 @@ kingfisher scan gitlab --group my-group \
### Scan remote GitLab repository by URL
`--git-url` by itself clones the project repository. To include server-side
artifacts owned by the project, add `--repo-artifacts`. Kingfisher will retrieve
the project's issues, wiki, and snippets and scan them for secrets. These extra
requests may take longer and require a `KF_GITLAB_TOKEN` for private projects.
`--git-url` by itself clones the project repository. When the URL targets
GitLab and you pass `--include-contributors`, Kingfisher enumerates contributors
and tries to clone **their other public projects** to catch secrets that escape
the main repo. Apply `--repo-clone-limit` to cap the total repos cloned during
this pivot.
**NOTE**: This may cause you to be temporarily rate-limited by GitLab.
Supplying a token (`KF_GITLAB_TOKEN`) raises the rate limit.
To include server-side artifacts owned by the project, add `--repo-artifacts`.
Kingfisher will retrieve the project's issues, wiki, and snippets and scan them
for secrets. These extra requests may take longer and require a
`KF_GITLAB_TOKEN` for private projects.
Use `--git-clone-dir` to choose where cloned projects land and `--keep-clones`
to preserve them for later review.
```bash
# Scan the repository only
kingfisher scan --git-url https://gitlab.com/group/project.git
# Scan the repository plus contributor projects, but cap the crawl
kingfisher scan --git-url https://gitlab.com/group/project.git \
--include-contributors \
--repo-clone-limit 250
# Keep clones for later manual inspection
kingfisher scan --git-url https://gitlab.com/group/project.git \
--git-clone-dir ./kingfisher-clones \
--keep-clones
# Include issues, wiki, and snippets
kingfisher scan --git-url https://gitlab.com/group/project.git --repo-artifacts
@ -1377,14 +1436,24 @@ kingfisher --user-agent-suffix "Sept 2025 testing" scan github --user my-user --
```
When omitted, Kingfisher defaults to `kingfisher/<version> Mozilla/5.0 ...`. The suffix is trimmed; passing an empty string
leaves the default unchanged.
{"$id":"1","innerException":null,"message":"VS403403: Cannot find any branches for the test-project repository.","typeName":"Microsoft.TeamFoundation.Git.Server.GitItemNotFoundException, Microsoft.TeamFoundation.Git.Server","typeKey":"GitItemNotFoundException","errorCode":0,"eventId":3000}
## Validation tuning flags
Use these options with `kingfisher scan` to customize live validation behavior:
- `--validation-timeout SECONDS`: per-request and per-match timeout for validation (default: 10, range: 1-60).
- `--validation-retries N`: number of retry attempts for validation requests (default: 1, range: 0-5).
## Notable Scan Options
- `--no-dedup`: Report every occurrence of a finding (disable the default de-duplicate behavior)
- `--no-base64`: By default, Kingfisher finds and decodes base64 blobs and scans them for secrets. This adds a slight performance overhead; use this flag to disable
- `--confidence <LEVEL>`: (low|medium|high)
- `--min-entropy <VAL>`: Override default threshold
- `--include-contributors`: When using `--git-url` for GitHub or GitLab, include contributor-owned repos in the scan
- `--git-clone-dir <DIR>`: Choose the parent directory for cloned repos and scan artifacts (use with `--git-url`)
- `--keep-clones`: Preserve cloned repositories on disk after a scan completes
- `--repo-clone-limit <N>`: Cap the number of GitHub/GitLab repositories cloned when enumerating orgs/groups or contributor repos
- `--no-binary`: Skip binary files
- `--no-extract-archives`: Do not scan inside archives
- `--extraction-depth <N>`: Specifies how deep nested archives should be extracted and scanned (default: 2)
@ -1399,6 +1468,8 @@ leaves the default unchanged.
- `--ignore-comment <DIRECTIVE>`: Honor additional inline directives from other scanners (repeatable; e.g. `--ignore-comment "gitleaks:allow"`)
- `--no-ignore`: Disable inline directives entirely so every match is reported
- `--no-ignore-if-contains`: Ignore the `ignore_if_contains` filter in rules so placeholder words still produce findings
- `--validation-timeout SECONDS`: per-request and per-match timeout for validation (default: 10, range: 1-60).
- `--validation-retries N`: number of retry attempts for validation requests (default: 1, range: 0-5).
## Understanding `--confidence`


@ -8,7 +8,7 @@ rules:
pat
[a-z0-9]{14}
\.
[a-z0-9]{62,66}
[a-z0-9]{64}
)
\b
pattern_requirements:

46
data/rules/alchemy.yml Normal file

@ -0,0 +1,46 @@
rules:
- name: Alchemy API Key
id: kingfisher.alchemy.1
pattern: |
(?xi)
\balchemy
(?:.|[\n\r]){0,96}?
(?:
/v2/
|
api[_-]?key|key|token|secret|url|endpoint|rpc
)
(?:.|[\n\r]){0,96}?
\b
(
[A-Za-z0-9_-]{24,64}
)
\b
pattern_requirements:
min_digits: 4
min_entropy: 3.5
confidence: medium
examples:
- alchemy_key="PajdHzB75s1V_7aldcQ6XbodqDCWMC7m"
- https://eth-mainnet.alchemyapi.io/v2/PajdHzB75s1V_7aldcQ6XbodqDCWMC7m
- https://eth-goerli.alchemyapi.io/v2/AGtF3w2AsccY_bfsdDleaVRehW2xGS7W
references:
- https://www.alchemy.com/rpc/ethereum
- https://www.alchemy.com/docs/reference/nft-api-endpoints/nft-api-endpoints/nft-ownership-endpoints/get-nf-ts-for-owner-v-3
validation:
type: Http
content:
request:
method: GET
url: "https://eth-mainnet.g.alchemy.com/nft/v3/{{ TOKEN }}/getNFTsForOwner?owner=0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045"
headers:
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"ownedNfts"']
- type: WordMatch
negative: true
words: ['"error"']
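
A quick way to sanity-check the `kingfisher.alchemy.1` pattern against its own `examples` is to run it through Python's `re` module. This is only a sketch: Kingfisher uses its own regex engine, which may differ from Python's in edge cases, the helper name `find_alchemy_candidates` is hypothetical, and the `min_entropy` requirement is not reproduced here.

```python
import re

# Pattern copied from the `kingfisher.alchemy.1` rule above.
# Python's `re` is used for illustration only; Kingfisher's engine may behave
# differently in edge cases.
ALCHEMY_PATTERN = re.compile(r"""(?xi)
\balchemy
(?:.|[\n\r]){0,96}?
(?:
  /v2/
  |
  api[_-]?key|key|token|secret|url|endpoint|rpc
)
(?:.|[\n\r]){0,96}?
\b
(
  [A-Za-z0-9_-]{24,64}
)
\b
""")

MIN_DIGITS = 4  # mirrors pattern_requirements.min_digits in the rule


def find_alchemy_candidates(text: str) -> list[str]:
    """Return captured keys that also satisfy the digit-count requirement.

    Hypothetical helper: the entropy check from the rule is intentionally
    left out of this sketch.
    """
    candidates = []
    for match in ALCHEMY_PATTERN.finditer(text):
        key = match.group(1)
        if sum(ch.isdigit() for ch in key) >= MIN_DIGITS:
            candidates.append(key)
    return candidates


if __name__ == "__main__":
    print(find_alchemy_candidates('alchemy_key="PajdHzB75s1V_7aldcQ6XbodqDCWMC7m"'))
```

Both the assignment-style example and the `/v2/` URL example from the rule yield the same captured key under this check.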


@ -8,6 +8,8 @@ rules:
(?: AccountKey | SharedAccessKey | SharedSecretValue) \s*=\s* ([^;]{1,100})
(?: ;|$ )
min_entropy: 3.3
pattern_requirements:
min_digits: 2
confidence: medium
examples:
- |
@ -78,6 +80,7 @@ rules:
[a-z0-9][a-z0-9-]{1,100}[a-z0-9]
)\.azurecr\.io
confidence: medium
visible: false
min_entropy: 2.0
examples:
- "myregistry.azurecr.io"


@ -36,11 +36,11 @@ rules:
["':\s=}\]\)]
(
(?:
[A-Z0-9+\-]{86,88}={1,2}
[A-Z0-9+\\/-]{86,88}={1,2}
)
|
(?:
[A-Z0-9+\-]{86,88}\b
[A-Z0-9+\\/-]{86,88}\b
)
)
pattern_requirements:
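
The character-class change in this hunk adds `/` (and a literal backslash) to the accepted alphabet. Assuming the rule targets standard base64-encoded keys (the file name is not visible in this view, so that is an inference), the sketch below shows why the old class could miss valid values. Only the first branch of the capture group is reproduced, and Python's `re` stands in for Kingfisher's engine.

```python
import base64
import re

# First alternation branch from the hunk above, before and after the change.
# The rule's surrounding prefix and the `\b` branch are omitted in this sketch.
OLD_CLASS = re.compile(r"[A-Z0-9+\-]{86,88}={1,2}", re.IGNORECASE)
NEW_CLASS = re.compile(r"[A-Z0-9+\\/-]{86,88}={1,2}", re.IGNORECASE)

# 64 bytes of 0xFF encode to 88 base64 characters, most of which are '/'.
# This is synthetic data chosen to exercise the '/' character, not a real key.
sample = base64.b64encode(b"\xff" * 64).decode("ascii")

old_matches = OLD_CLASS.fullmatch(sample) is not None
new_matches = NEW_CLASS.fullmatch(sample) is not None

if __name__ == "__main__":
    print(len(sample), old_matches, new_matches)
```

Any base64 value containing `/` fails the old class outright, which is why the wider class is needed for keys that use the standard (non URL-safe) alphabet.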

81
data/rules/coveralls.yml Normal file

@ -0,0 +1,81 @@
rules:
- name: Coveralls Repo Identifier
id: kingfisher.coveralls.1
visible: false
confidence: medium
min_entropy: 2.0
pattern: |
(?xi)
(?:
coveralls\.io/
(?:
(?:
github|bitbucket|gitlab
)
/
(
[A-Z0-9_.-]+
)
/
(
[A-Z0-9_.-]+
)
)
|
api/v1/repos/
(
github|bitbucket|gitlab
)
/
(
[A-Z0-9_.-]+
)
)
examples:
- https://coveralls.io/github/lemurheavy/coveralls-public
- https://coveralls.io/gitlab/group/project
- https://coveralls.io/api/v1/repos/github/octocat/hello-world
- name: Coveralls Personal API Token
id: kingfisher.coveralls.2
pattern: |
(?xi)
\b
coveralls
(?:.|[\n\r]){0,1}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:.|[\n\r]){0,32}?
\b
(
[A-Z0-9-]{37}
)
\b
pattern_requirements:
min_digits: 3
min_entropy: 3.3
confidence: medium
examples:
- coveralls_SECRETTOKEN abcdefghijklmnopqrstuvwxyzab12345cdef
- coveralls-SECRET-KEY mnopqrstuvwxyzabcdefghi12345678901234
- coveralls_PRIVATEKEY-1234567890abcdefghijklmnopqrstuvwxyza
references:
- https://docs.coveralls.io/api-repos-endpoint
- https://docs.coveralls.io/api-introduction
depends_on_rule:
- rule_id: kingfisher.coveralls.1
variable: COVERALLS_REPO_ID
validation:
type: Http
content:
request:
method: GET
url: "https://coveralls.io/api/v1/repos/{{ COVERALLS_REPO_ID }}"
headers:
Authorization: "token {{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"service"', '"name"', '"id"']


@ -1,20 +1,52 @@
rules:
- name: Datadog API Key
id: kingfisher.datadog.3
# Helper: extract the Datadog site domain from common config/env/URLs.
# We capture the "site parameter" (domain), then validation uses https://api.<site>.
- name: Datadog Site Domain
id: kingfisher.datadog.1
visible: false
confidence: medium
min_entropy: 2.0
pattern: |
(?xi)
(?:
# env/config patterns
\b(?:DD_SITE|DATADOG_SITE|DATADOG_HOST)\b\s*[:=]\s*["']?
(?:https?://)?
(?:api\.|app\.)?
|
# raw URLs in code/docs
\bhttps?://(?:api\.|app\.)?
)?
(
datadoghq\.com
| us3\.datadoghq\.com
| us5\.datadoghq\.com
| datadoghq\.eu
| ap1\.datadoghq\.com
| ap2\.datadoghq\.com
| ddog-gov\.com
)
\b
(?:datadog|dd)
examples:
- DD_SITE=datadoghq.eu
- DATADOG_HOST=https://api.us3.datadoghq.com
- https://app.datadoghq.com
- https://api.ddog-gov.com
- name: Datadog API Key
id: kingfisher.datadog.2
pattern: |
(?xi)
\b(?:datadog|dd)
(?:.|[\n\r]){0,64}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:api[_-]?key|dd[_-]?api[_-]?key|secret|private|access|token)
(?:.|[\n\r]){0,32}?
\b
(
[A-Za-z0-9]{32}
[A-Z0-9]{32}
)
\b
pattern_requirements:
min_digits: 2
min_digits: 3
min_entropy: 3.3
confidence: medium
examples:
@ -30,84 +62,55 @@ rules:
headers:
Accept: application/json
DD-API-KEY: "{{ TOKEN }}"
response_matcher:
- report_response: true
- status:
- 200
type: StatusMatch
- type: WordMatch
words:
- '"Forbidden"'
negative: true
- name: Datadog Application Key
id: kingfisher.datadog.3
pattern: |
(?xi)
\b(?:datadog|dd)
(?:.|[\n\r]){0,64}?
(?:app(?:lication)?[_-]?key|dd[_-]?application[_-]?key|secret|private|access|token)
(?:.|[\n\r]){0,32}?
\b
(
[A-Za-z0-9-]{40}
)
\b
pattern_requirements:
min_digits: 3
min_entropy: 3.5
confidence: medium
examples:
- DD_APPLICATION_KEY=abcDEF0123456789abcDEF0123456789abcDEF01
references:
- https://docs.datadoghq.com/account_management/api-app-keys/
- https://docs.datadoghq.com/getting_started/site/
depends_on_rule:
- rule_id: kingfisher.datadog.2
variable: DD_API_KEY
- rule_id: kingfisher.datadog.1
variable: DD_SITE_DOMAIN
validation:
type: Http
content:
request:
method: GET
# Datadog recommends /api/v2/validate_keys to verify app keys with the key pair
url: "https://api.{{ DD_SITE_DOMAIN }}/api/v2/validate_keys"
headers:
Accept: application/json
DD-API-KEY: "{{ DD_API_KEY }}"
DD-APPLICATION-KEY: "{{ TOKEN }}"
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
# - name: Datadog API Key
# id: kingfisher.datadog.1
# pattern: |
# (?xi)
# \b
# datadog
# (?:.|[\n\r]){0,64}?
# (?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
# (?:.|[\n\r]){0,32}?
# \b
# (
# [a-z0-9]{32}
# )
# \b
# pattern_requirements:
# min_digits: 2
# min_entropy: 3.3
# confidence: medium
# examples:
# - datadog-secrettoken-0024a29224affe29d173c0bf99e5a89d
# references:
# - https://docs.datadoghq.com/account_management/api-app-keys/
# validation:
# type: Http
# content:
# request:
# headers:
# Accept: application/json
# DD-API-KEY: '{{ TOKEN }}'
# DD-APPLICATION-KEY: '{{ APPKEY }}'
# method: GET
# response_matcher:
# - report_response: true
# - status:
# - 200
# type: StatusMatch
# url: https://api.datadoghq.com/api/v2/current_user
# depends_on_rule:
# - rule_id: kingfisher.datadog.2
# variable: APPKEY
# - name: Datadog API Key (API-only validation)
# id: kingfisher.datadog.3
# pattern: |
# (?xi)
# \b
# (?:datadog|dd)
# (?:.|[\n\r]){0,64}?
# (?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)?
# (?:.|[\n\r]){0,32}?
# \b
# (
# [A-Za-z0-9]{32}
# )
# \b
# pattern_requirements:
# min_digits: 2
# min_entropy: 3.3
# confidence: medium
# examples:
# - DD_API_KEY=0024a29224affe29d173c0bf99e5a89d
# references:
# - https://docs.datadoghq.com/account_management/api-app-keys/
# validation:
# type: Http
# content:
# request:
# method: GET
# url: https://api.datadoghq.com/api/v1/validate
# headers:
# Accept: application/json
# DD-API-KEY: "{{ TOKEN }}"
# response_matcher:
# - report_response: true
# - type: StatusMatch
# status: [200]
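
The new Datadog Application Key rule validates a key pair against a site-specific host, with `DD_SITE_DOMAIN` and `DD_API_KEY` supplied by the two rules it depends on. The sketch below shows how those three values would combine into the request described in the `validation` block. The function name is hypothetical, and plain string formatting stands in for Kingfisher's template substitution. Nothing is sent over the network.

```python
def build_app_key_validation(
    site_domain: str, api_key: str, app_key: str
) -> tuple[str, dict[str, str]]:
    """Assemble the URL and headers described by the rule's validation block.

    Hypothetical helper for illustration; it mirrors the template
    `https://api.{{ DD_SITE_DOMAIN }}/api/v2/validate_keys` from the rule.
    """
    url = f"https://api.{site_domain}/api/v2/validate_keys"
    headers = {
        "Accept": "application/json",
        "DD-API-KEY": api_key,          # from kingfisher.datadog.2
        "DD-APPLICATION-KEY": app_key,  # the token matched by kingfisher.datadog.3
    }
    return url, headers


if __name__ == "__main__":
    # Placeholder values only; these are not real credentials.
    url, headers = build_app_key_validation(
        "datadoghq.eu",
        "0" * 32,
        "abcDEF0123456789abcDEF0123456789abcDEF01",
    )
    print(url)
```

Because the host is derived from the captured site domain, a key pair issued on an EU or US3 site is checked against that site rather than the default `datadoghq.com`.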

40
data/rules/datagov.yml Normal file

@ -0,0 +1,40 @@
rules:
- name: Data.gov API Key
id: kingfisher.datagov.1
pattern: |
(?xi)
\b
data\.gov
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:.|[\n\r]){0,32}?
\b
(
[a-zA-Z0-9]{40}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- data.gov_api_key=pBZm2kXbuPdRfzYyarRT0bvcWAisnJg98YJzBJyJ
- data.gov_token=plZJDnKs4OrPeV8wgBr2fYO6VnXb1YPEcVaZbnYI
references:
- https://api.data.gov/docs/developer-manual/
- https://developer.nrel.gov/docs/api-key/
- https://developer.nrel.gov/docs/errors/
validation:
type: Http
content:
request:
method: GET
# NREL (developer.nrel.gov) uses api.data.gov-managed keys and accepts api_key as a query param.
url: "https://developer.nrel.gov/api/alt-fuel-stations/v1.json?limit=1&api_key={{ TOKEN }}"
headers:
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]

39
data/rules/disqus.yml Normal file

@ -0,0 +1,39 @@
rules:
- name: Disqus API Key
id: kingfisher.disqus.1
pattern: |
(?xi)
\b
disqus
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:.|[\n\r]){0,32}?
\b
(
[a-zA-Z0-9]{64}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.3
confidence: medium
examples:
- disqus_secret_key = jK5HbxY2QrPn7vMNL8tADcF3mWg4kXqR9sBdZyE1hVuT6fGwJpC0nI9vUxY2aM3K
- DISQUS_PRIVATE_TOKEN = Nh7vRf3mKp9wXc5tJq2YbL8sAg4dB6TzWeUx1nGQjCkPyDHVME0aI1FSx2Z5vY3n
references:
- https://disqus.com/api/docs/requests/
- https://disqus.com/api/docs/threads/list/
validation:
type: Http
content:
request:
method: GET
url: "https://disqus.com/api/3.0/threads/list.json?limit=1&api_secret={{ TOKEN }}"
headers:
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"code":0', '"response"']

61
data/rules/endorlabs.yml Normal file

@ -0,0 +1,61 @@
rules:
- name: Endor Labs API Key
id: kingfisher.endorlabs.1
visible: false
confidence: medium
min_entropy: 3.0
pattern: |
(?xi)
\b
ENDOR_API_CREDENTIALS_KEY
(?:.|[\n\r]){0,32}?
(
endr\+[A-Za-z0-9-]{16}
)
\b
examples:
- ENDOR_API_CREDENTIALS_KEY=endr+foo1234567890abc
pattern_requirements:
min_digits: 2
- name: Endor Labs API Secret
id: kingfisher.endorlabs.2
pattern: |
(?xi)
\b
ENDOR_API_CREDENTIALS_SECRET
(?:.|[\n\r]){0,32}?
(
endr\+[A-Za-z0-9-]{16}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- ENDOR_API_CREDENTIALS_SECRET=endr+bar1234567890abc
references:
- https://docs.endorlabs.com/rest-api/authentication/
depends_on_rule:
- rule_id: kingfisher.endorlabs.1
variable: ENDOR_API_KEY
validation:
type: Http
content:
request:
method: POST
# Endor Labs exchanges key+secret for an ENDOR_TOKEN via this endpoint
url: https://api.endorlabs.com/v1/auth/api-key
headers:
Content-Type: application/json
Accept: application/json
body: |
{"key":"{{ ENDOR_API_KEY }}","secret":"{{ TOKEN }}"}
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: JsonValid
- type: WordMatch
words: ['"token"']

42
data/rules/eventbrite.yml Normal file

@ -0,0 +1,42 @@
rules:
- name: Eventbrite API Key
id: kingfisher.eventbrite.1
pattern: |
(?xi)
\b
eventbrite
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:.|[\n\r]){0,32}?
\b
(
[0-9A-Z]{20}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- eventbrite secretkey X7W8HTTHLVYXPPVRZJZS
- eventbrite privatekey YTR4GR5T89WQP8HJKLDF
- '"eventbrite private access key ZXC2JK3HV4TY5UIO6PLK"'
- eventbrite token ABCDEF1234567890QRST
references:
- https://www.eventbrite.com/platform/docs/authentication
- https://www.eventbrite.com/platform/docs/organizations
validation:
type: Http
content:
request:
method: GET
url: https://www.eventbriteapi.com/v3/users/me/organizations/
headers:
Authorization: "Bearer {{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"organizations"']

47
data/rules/exaai.yml Normal file

@ -0,0 +1,47 @@
rules:
- name: Exa AI API Key
id: kingfisher.exa.1
pattern: |
(?xi)
(?:
\b(?:exa|exa[_-]?api|exa[_-]?key|exa[_-]?api[_-]?key)\b
(?:.|[\n\r]){0,96}?
|
\bx-api-key\b
(?:\s*[:=]\s*|(?:.|[\n\r]){0,16}?)
)
\b
(
[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}
)
\b
pattern_requirements:
min_digits: 4
min_entropy: 3.0
confidence: medium
examples:
- EXA_API_KEY=3f5a9c1e-2b4d-4a6f-8c10-1d2e3f4a5b6c
- 'exa_api_key: "3f5a9c1e-2b4d-4a6f-8c10-1d2e3f4a5b6c"'
- 'x-api-key: 3f5a9c1e-2b4d-4a6f-8c10-1d2e3f4a5b6c'
references:
- https://docs.exa.ai/reference/answer
- https://docs.exa.ai/reference/getting-started
validation:
type: Http
content:
request:
method: POST
url: https://api.exa.ai/answer
headers:
x-api-key: "{{ TOKEN }}"
Content-Type: application/json
Accept: application/json
body: |
{"query":"ping","text":false}
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: JsonValid
- type: WordMatch
words: ['"answer"']

36
data/rules/fleetbase.yml Normal file

@ -0,0 +1,36 @@
rules:
- name: Fleetbase API Key
id: kingfisher.fleetbase.1
pattern: |
(?xi)
\b
(
flb_(?:live|test)_[0-9a-zA-Z]{20,64}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- flb_live_1234567890abcdefGHIJ
- flb_test_1234567890abcdefGHIJ
- 'Authorization: Bearer flb_live_1234567890abcdefGHIJ'
categories:
- api
- secret
references:
- https://docs.fleetbase.io/developers/api/
validation:
type: Http
content:
request:
method: GET
url: "https://api.fleetbase.io/v1"
headers:
Authorization: "Bearer {{ TOKEN }}"
Accept: "application/json"
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]

69
data/rules/foursquare.yml Normal file

@ -0,0 +1,69 @@
rules:
- name: Foursquare Client ID
id: kingfisher.foursquare.client_id.1
visible: false
confidence: low
min_entropy: 0.0
pattern: |
(?xi)
(?:
\bclient_id\b\s*[:=]\s*["']?
|
\bclient_id=
)
(
[0-9A-Z]{48}
)
\b
examples:
- client_id=0F12A345BB67C8D901EFG23H45IJKL67MNO89PQ12RST34UV
- 'client_id: "0F12A345BB67C8D901EFG23H45IJKL67MNO89PQ12RST34UV"'
- name: Foursquare Client Secret
id: kingfisher.foursquare.1
pattern: |
(?xi)
(?:
\bfoursquare\b
(?:.|[\n\r]){0,32}?
)?
(?:
\bclient_secret\b\s*[:=]\s*["']?
|
\bclient_secret=
|
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
)
(?:.|[\n\r]){0,32}?
\b
(
[0-9A-Z]{48}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- 'client_secret=0F12A345BB67C8D901EFG23H45IJKL67MNO89PQ12RST34UV'
- 'foursquare client_secret: "0F12A345BB67C8D901EFG23H45IJKL67MNO89PQ12RST34UV"'
references:
- https://docs.foursquare.com/developer/reference/v2-authentication
- https://docs.foursquare.com/developer/reference/upcoming-changes
depends_on_rule:
- rule_id: kingfisher.foursquare.client_id.1
variable: FOURSQUARE_CLIENT_ID
validation:
type: Http
content:
request:
method: GET
url: "https://api.foursquare.com/v2/venues/search?ll=34.0522,-118.2437&query=coffee&client_id={{ FOURSQUARE_CLIENT_ID }}&client_secret={{ TOKEN }}&v=20211019&limit=1"
headers:
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"response"', '"venues"']

62
data/rules/freshdesk.yml Normal file

@ -0,0 +1,62 @@
rules:
- name: Freshdesk Domain
id: kingfisher.freshdesk.1
visible: false
confidence: low
min_entropy: 0.0
pattern: |
(?xi)
\b
(
[0-9a-z-]{1,63}\.freshdesk\.com
)
\b
examples:
- acme-support.freshdesk.com
- mycompany-helpdesk.freshdesk.com
- name: Freshdesk API Key
id: kingfisher.freshdesk.2
pattern: |
(?xi)
\b
freshdesk
(?:.|[\n\r]){0,64}?
(?:api[_-]?key|secret|private|access|key|token)
(?:.|[\n\r]){0,32}?
\b
(
[0-9A-Z]{20}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.3
confidence: medium
examples:
- 'FRESHDESK_API_KEY=abcdefghij1234567890'
- 'freshdesk token: ABCDEFGHIJ1234567890'
references:
- https://developers.freshdesk.com/api/#authentication
- https://developers.freshworks.com/docs/app-sdk/v3.0/support_agent/rest-apis/
depends_on_rule:
- rule_id: kingfisher.freshdesk.1
variable: FRESHDESK_DOMAIN
validation:
type: Http
content:
request:
method: GET
url: "https://{{ FRESHDESK_DOMAIN }}/api/v2/agents/me"
headers:
Accept: application/json
# Freshdesk API key auth is HTTP Basic where username=apikey and password can be any dummy value (commonly "X").
# Docs note you can use a dummy password and (when using Authorization header) base64("apikey:X")
Authorization: "Basic {{ TOKEN | append: ':X' | b64enc }}"
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: JsonValid
- type: WordMatch
words: ['"id"']
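
The `Authorization` template in the Freshdesk rule appends `:X` to the key and base64-encodes the result, as the comments in the rule describe. A minimal Python equivalent, with a hypothetical function name and assuming the template filters behave like ordinary string concatenation followed by standard base64 encoding, looks like this:

```python
import base64


def freshdesk_basic_auth(api_key: str) -> str:
    """Build the Basic auth header value: base64("<api_key>:X").

    Hypothetical helper mirroring `{{ TOKEN | append: ':X' | b64enc }}`.
    The password "X" is a dummy value, per the comments in the rule.
    """
    credentials = f"{api_key}:X".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


if __name__ == "__main__":
    # Example key taken from the rule's `examples`; not a real credential.
    print(freshdesk_basic_auth("abcdefghij1234567890"))
```

Decoding the header value should give back the key followed by `:X`, which is a convenient check when debugging a failed validation.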


@ -20,22 +20,18 @@ rules:
type: Http
content:
request:
method: POST
url: https://api.github.com/graphql
method: GET
url: https://api.github.com/user
headers:
Authorization: token {{ TOKEN }}
Accept: application/vnd.github+json
Content-Type: application/json
body: |
{
"query": "{ viewer { login } }"
}
response_matcher:
- report_response: true
- match_all_words: true
type: WordMatch
words:
- '"login"'
- '"id"'
- name: GitHub Personal Access Token
id: kingfisher.github.2
pattern: |
@ -65,22 +61,18 @@ rules:
type: Http
content:
request:
method: POST
url: https://api.github.com/graphql
method: GET
url: https://api.github.com/user
headers:
Authorization: token {{ TOKEN }}
Accept: application/vnd.github+json
Content-Type: application/json
body: |
{
"query": "{ viewer { login } }"
}
response_matcher:
- report_response: true
- match_all_words: true
type: WordMatch
words:
- '"login"'
- '"id"'
- name: GitHub OAuth Access Token
id: kingfisher.github.3
pattern: |
@ -107,22 +99,18 @@ rules:
type: Http
content:
request:
method: POST
url: https://api.github.com/graphql
method: GET
url: https://api.github.com/user
headers:
Authorization: token {{ TOKEN }}
Accept: application/vnd.github+json
Content-Type: application/json
body: |
{
"query": "{ viewer { login } }"
}
response_matcher:
- report_response: true
- match_all_words: true
type: WordMatch
words:
- '"login"'
- '"id"'
- name: GitHub App User-to-Server Token
id: kingfisher.github.4
pattern: |
@ -141,22 +129,18 @@ rules:
type: Http
content:
request:
method: POST
url: https://api.github.com/graphql
method: GET
url: https://api.github.com/user
headers:
Authorization: token {{ TOKEN }}
Accept: application/vnd.github+json
Content-Type: application/json
body: |
{
"query": "{ viewer { login } }"
}
response_matcher:
- report_response: true
- match_all_words: true
type: WordMatch
words:
- '"login"'
- '"id"'
- name: GitHub App Server-to-Server Token
id: kingfisher.github.5
pattern: |
@ -175,22 +159,18 @@ rules:
type: Http
content:
request:
method: POST
url: https://api.github.com/graphql
method: GET
url: https://api.github.com/user
headers:
Authorization: token {{ TOKEN }}
Accept: application/vnd.github+json
Content-Type: application/json
body: |
{
"query": "{ viewer { login } }"
}
response_matcher:
- report_response: true
- match_all_words: true
type: WordMatch
words:
- '"login"'
- '"id"'
- name: GitHub Refresh Token
id: kingfisher.github.6
pattern: |
@ -206,22 +186,18 @@ rules:
type: Http
content:
request:
method: POST
url: https://api.github.com/graphql
method: GET
url: https://api.github.com/user
headers:
Authorization: token {{ TOKEN }}
Accept: application/vnd.github+json
Content-Type: application/json
body: |
{
"query": "{ viewer { login } }"
}
response_matcher:
- report_response: true
- match_all_words: true
type: WordMatch
words:
- '"login"'
- '"id"'
- name: GitHub Client ID
id: kingfisher.github.7
pattern: |

View file

@ -2,10 +2,12 @@ rules:
- name: Grafana API Token
id: kingfisher.grafana.1
pattern: |
(?xi)
(?x)
\b
(
eyJrIjoi[a-z0-9]{60,100}
eyJrIjoi
[A-Za-z0-9+/]{40,380}
={0,2}
)
\b
pattern_requirements:
@ -13,21 +15,42 @@ rules:
min_entropy: 3.3
confidence: medium
examples:
- 'Authorization: Bearer eyJrIjoiWHZiSWd5NzdCYUZnNUtibE8obUpESmE2bzJYNDRIc1UiLCJuIjoibXlrZXkiLCJpZCI7MX1'
- 'Authorization: Bearer eyJrIjoiWHZiSWd5NzdCYUZnNUtibE8obUpESmE2bzJYNDRIc1UiLCJuIjoibXlrZXkiLCJpZCI6MX0='
- 'admin_client = GrafanaClient("eyJrIjoiY21sM1JRYjB6RnVYSTNLenRWQkFEaWN2bXI2V202U2IiLCJuIjoiYWRtaW5rZXkiLCJpZCI6MX0=", host=grafana_host, port=3000, protocol="http")'
references:
- https://grafana.com/docs/grafana/latest/developers/http_api/auth/
- https://grafana.com/docs/grafana/latest/developer-resources/api-reference/http-api/authentication/
- https://grafana.com/docs/grafana/latest/developer-resources/api-reference/http-api/org/
depends_on_rule:
- rule_id: kingfisher.grafana.4
variable: GRAFANADOMAIN
validation:
type: Http
content:
request:
method: GET
url: "https://{{ GRAFANADOMAIN }}/api/org"
headers:
Authorization: "Bearer {{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: JsonValid
- type: WordMatch
words: ['"id"', '"name"']
- name: Grafana Cloud API Token
id: kingfisher.grafana.2
pattern: |
(?xi)
\b
(?xi)
\b
(
glc_
glc_
[a-z0-9+/]{40,150}
={0,2}
)
\b
pattern_requirements:
min_digits: 2
min_lowercase: 2
@ -37,20 +60,21 @@ rules:
- ' "token": "glc_eyJrIjoiZjI0YzZkNGEwZDBmZmZjMmUzNTU3ODcxMmY0ZWZlNTQ1NTljMDFjOCIsIm6iOiJteXRva3VuIiwiaWQiOjF8"'
- 'grafana = glc_etLvNLoNMLt7MTczNNwNbN6Nm1ldGEtbW9paxRvcmlpZt14ZXN4NNwNatN6NLCxdKeH7KTUvWpNqCrHlMKE9EhLcZH7to'
references:
- https://grafana.com/docs/grafana-cloud/developer-resources/api-reference/cloud-api/#regions
- https://grafana.com/docs/grafana/latest/developer-resources/api-reference/cloud-api/
validation:
type: Http
content:
request:
headers:
Authorization: Bearer {{ TOKEN }}
method: GET
url: https://grafana.com/api/stack-regions
headers:
Authorization: "Bearer {{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- status:
- 200
type: StatusMatch
url: https://grafana.com/api/stack-regions
- type: StatusMatch
status: [200]
- type: JsonValid
- name: Grafana Service Account Token
id: kingfisher.grafana.3
@ -67,50 +91,50 @@ rules:
confidence: medium
examples:
- |
curl -H "Authorization: Bearer glsa_HOruNAb7SOiCdshU7algkrq7FDsNSLAa_55e2f8be" -X GET '<grafana_url>/api/access-control/user/permissions' | jq
curl -H "Authorization: Bearer glsa_HOruNAb7SOiCdshU7algkrq7FDsNSLAa_55e2f8be" -X GET '<grafana_url>/api/org' | jq
- |
// getData()
// {
// let url="http://localhost:4200/api/search"
// const headers = new HttpHeaders({
// 'Content-Type': 'application/json',
// 'Authorization': `Bearer glsa_Sof0HKi3agxrQP9qm5r2G98VacBNwV5P_9b638c45`
// })
// return this.http.get(url, {headers: headers});
// }
// headers: { Authorization: `Bearer glsa_Sof0HKi3agxrQP9qm5r2G98VacBNwV5P_9b638c45` }
references:
- https://grafana.com/blog/new-in-grafana-9-1-service-accounts-are-now-ga/
- https://grafana.com/docs/grafana/latest/administration/service-accounts/
- https://grafana.com/docs/grafana/latest/developer-resources/api-reference/http-api/org/
depends_on_rule:
- rule_id: kingfisher.grafana.4
variable: GRAFANADOMAIN
validation:
type: Http
content:
request:
method: GET
url: "https://{{ GRAFANADOMAIN }}/api/org"
headers:
Authorization: Bearer {{ TOKEN }}
Authorization: "Bearer {{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- status:
- 200
type: StatusMatch
url: "{{ GRAFANADOMAIN }}/api/access-control/me"
depends_on_rule:
- rule_id: kingfisher.grafana.4
variable: GRAFANADOMAIN
- report_response: true
- type: StatusMatch
status: [200]
- type: JsonValid
- type: WordMatch
words: ['"id"', '"name"']
- name: Grafana Domain
id: kingfisher.grafana.4
pattern: |
(?xi)
(?:https?://)?
(?:[A-Z0-9-]+\.){0,32}
grafana\.[A-Z0-9.-]{3,32}
(?::\d{2,5})?
(?:[/?\#]\S*)?
min_entropy: 3.0
\b
(
(?:[a-z0-9-]+\.){0,16}
grafana\.[a-z0-9.-]{2,64}
(?::\d{2,5})?
)
\b
min_entropy: 3.0
visible: false
confidence: medium
examples:
- https://grafana.example.com
- http://grafana.prod.eu-west.mycorp.internal:3000/login
- https://api.team1.grafana.services.cluster.local/health
- grafana.example.com
- grafana.prod.eu-west.mycorp.internal:3000
- api.team1.grafana.services.cluster.local
- grafana.dev.foo-bar.co.uk

data/rules/guardian.yml Normal file

@ -0,0 +1,39 @@
rules:
- name: Guardian API Key
id: kingfisher.guardian.1
pattern: |
(?xi)
\b
guardian
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN|API)
(?:.|[\n\r]){0,32}?
\b
(
[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- guardian SECRET=abcdef12-1234-abcd-5678-abcdef123456
- guardianPRIVATEKEY=abcdef12-1234-abcd-5678-abcdef123456
references:
- https://open-platform.theguardian.com/documentation/
- https://open-platform.theguardian.com/documentation/section
validation:
type: Http
content:
request:
method: GET
url: "https://content.guardianapis.com/sections?api-key={{ TOKEN }}"
headers:
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"status":"ok"']

data/rules/gumroad.yml Normal file

@ -0,0 +1,42 @@
rules:
- name: Gumroad Access Token
id: kingfisher.gumroad.1
pattern: |
(?xi)
\b
gumroad
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN|ACCESS_TOKEN|OAUTH)
(?:.|[\n\r]){0,48}?
\b
(
[a-f0-9]{64}
|
[A-Z0-9-]{43}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- gumroad_access_token=abf11e4ab2850ffd50ef690257f7a1c998a443059513d1a4826f2b3159620505
- gumroadSECRET = abf11e4ab2850ffd50ef690257f7a1c998a443059513d1a4826f2b3159620505
- gumroadPRIVATE-abf11e4ab2850ffd50ef690257f7a1c998a443059513d1a4826f2b3159620505
references:
- https://gumroad.com/api
- https://gumroad.com/help/article/280-create-application-api
validation:
type: Http
content:
request:
method: GET
url: "https://api.gumroad.com/v2/user?access_token={{ TOKEN }}"
headers:
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"success":true', '"user"']

data/rules/hereapi.yml Normal file

@ -0,0 +1,40 @@
rules:
- name: HERE API Key
id: kingfisher.hereapi.1
pattern: |
(?xi)
\b
hereapi
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN|APIKEY)
(?:.|[\n\r]){0,32}?
\b
(
[A-Z0-9_-]{43}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- "hereapi_key=XxK6G3m_pQ8nR2vT4wY9jL5bN7cA1dF3hJ0iM4eP9su"
- "HEREAPI_SECRET=ZzY8xW6vU4tS2qP0nM5kJ9hF7dC1bA3gL8iK4eR9wQm"
references:
- https://stackoverflow.com/questions/65610274/here-geocoding-api-not-working-inside-my-react-app
- https://github.com/spara/geocoding_tutorial
validation:
type: Http
content:
request:
method: GET
url: "https://geocode.search.hereapi.com/v1/geocode?q=Berlin&apiKey={{ TOKEN }}"
headers:
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
# Successful geocode responses include an "items" array.
- type: WordMatch
words: ['"items"']

data/rules/honeycomb.yml Normal file

@ -0,0 +1,41 @@
rules:
- name: Honeycomb API Key
id: kingfisher.honeycomb.1
pattern: |
(?xi)
\b
honeycomb
(?:.|[\n\r]){0,16}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:.|[\n\r]){0,32}?
\b
(
[0-9a-f]{32}|
[0-9a-zA-Z]{22}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- honeycomb_secret_key=8f14e45fceea167a5a36dedd4bea2543
- honeycomb_token=z0d1f2bcaloumn3456789P
references:
- https://api-docs.honeycomb.io/api/auth
- https://docs.honeycomb.io/api/
validation:
type: Http
content:
request:
method: GET
url: https://api.honeycomb.io/1/auth
headers:
X-Honeycomb-Team: "{{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"id"', '"type"', '"team"', '"environment"']

data/rules/imagekit.yml Normal file

@ -0,0 +1,40 @@
rules:
- name: ImageKit Private API Key
id: kingfisher.imagekit.1
pattern: |
(?xi)
\b
imagekit
(?:.|[\n\r]){0,64}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN|PRIVATE_KEY)
(?:.|[\n\r]){0,64}?
\b
(
private_[A-Z0-9_-]{8,128}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.2
confidence: medium
examples:
- IMAGEKIT_PRIVATE_KEY=private_rGAPQJbhBx
- imagekit token private_AbCdEf0123456789GhIjKlMn
references:
- https://imagekit.io/docs/api-keys
- https://imagekit.io/docs/api-reference/account-management-api/url-endpoints/list-url-endpoints
validation:
type: Http
content:
request:
method: GET
url: "https://api.imagekit.io/v1/accounts/url-endpoints"
headers:
Authorization: "Basic {{ TOKEN | append: ':' | b64enc }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"urlEndpoint"']

data/rules/infura.yml Normal file

@ -0,0 +1,43 @@
rules:
- name: Infura API Key
id: kingfisher.infura.1
pattern: |
(?xi)
\b
infura
(?:.|[\n\r]){0,32}?
\b
(
[0-9a-z]{32}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- https://mainnet.infura.io/v3/7238211010344719ad14a89db874158c
- infuraKEYwithspecial-abcdef1234567890abcdef1234567890
references:
- https://www.infura.io/docs
- https://docs.metamask.io/services/reference/ethereum/json-rpc-methods/
validation:
type: Http
content:
request:
method: POST
url: "https://mainnet.infura.io/v3/{{ TOKEN }}"
headers:
Content-Type: application/json
Accept: application/json
body: |
{"jsonrpc":"2.0","id":1,"method":"eth_chainId","params":[]}
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: JsonValid
- type: WordMatch
words: ['"result"']
- type: WordMatch
negative: true
words: ["invalid project id", "project id required in the URL", "invalid project id or project secret"]
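The `response_matcher` list in the Infura rule chains four checks: a status match, a JSON validity check, a positive word match, and a negative word match. The Python sketch below shows one way such a chain could be evaluated. It assumes that every matcher must pass, that word matching is a case-sensitive substring test, and that a negative matcher fails when any listed phrase appears; none of these assumptions is confirmed by the diff, and `looks_valid` is a hypothetical name.

```python
import json


def looks_valid(status: int, body: str) -> bool:
    """Apply the four checks from the rule's response_matcher list in order."""
    rejected = [
        "invalid project id",
        "project id required in the URL",
        "invalid project id or project secret",
    ]
    if status != 200:  # StatusMatch: status [200]
        return False
    try:  # JsonValid: the body must parse as JSON
        json.loads(body)
    except ValueError:
        return False
    if '"result"' not in body:  # WordMatch: the key "result" must be present
        return False
    # Negative WordMatch (assumed semantics): fail if any rejected phrase appears.
    return not any(phrase in body for phrase in rejected)


if __name__ == "__main__":
    print(looks_valid(200, '{"jsonrpc":"2.0","id":1,"result":"0x1"}'))
```

The negative matcher matters because a JSON-RPC endpoint can report an error while still returning HTTP 200, so the status code alone would not distinguish a live key from a rejected one.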

View file

@ -16,6 +16,7 @@ rules:
ignore_if_contains:
- "****"
- "xxxx"
- "example"
min_entropy: 3.3
confidence: medium
validation:

data/rules/jotform.yml Normal file

@ -0,0 +1,35 @@
rules:
- name: Jotform API Key
id: kingfisher.jotform.1
pattern: |
(?xi)
\b
jotform
(?:.|[\n\r]){0,64}?
(?:api[_-]?key|apikey|token|secret|key)
(?:.|[\n\r]){0,32}?
\b
(
[0-9A-Z]{32}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- jotform apikey=abcde12345abcde67890abcde12345fg
references:
- https://api.jotform.com/docs/
validation:
type: Http
content:
request:
method: GET
url: "https://api.jotform.com/user/usage?apiKey={{ TOKEN }}"
headers:
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]

data/rules/jumpcloud.yml Normal file

@ -0,0 +1,40 @@
rules:
- name: Jumpcloud API Key
id: kingfisher.jumpcloud.1
pattern: |
(?xi)
\b
jumpcloud
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:.|[\n\r]){0,32}?
\b
(
[a-zA-Z0-9]{40}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- jumpcloud_api_key=1a2b3c4d5e6f7g8h9i0j1a2b3c4d5e6f7g8h9i0j
- JUMPCLOUD_SECRET=k9l8m7n6o5p4q3r2s1t0k9l8m7n6o5p4q3r2s1t0
references:
- https://docs.jumpcloud.com/api/
- https://jumpcloud.com/support/retrieve-object-ids-from-the-api
validation:
type: Http
content:
request:
method: GET
url: "https://console.jumpcloud.com/api/systemusers?limit=1&skip=0"
headers:
x-api-key: "{{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"results"', '"totalCount"']

data/rules/klaviyo.yml Normal file

@ -0,0 +1,33 @@
rules:
- name: Klaviyo API Key
id: kingfisher.klaviyo.1
pattern: |
(?xi)
\b
klaviyo
(?:.|[\n\r]){0,16}?
\b
(
pk_[A-Z0-9]{34}
)
\b
min_entropy: 3.3
confidence: medium
examples:
- klaviyo_key = pk_abcd1234fghij5678klmn9012opqr3456s
validation:
type: Http
content:
request:
method: GET
url: https://a.klaviyo.com/api/accounts
headers:
Revision: "2023-02-22"
Authorization: "Klaviyo-API-Key {{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"data"']

data/rules/looker.yml Normal file

@ -0,0 +1,85 @@
rules:
- name: Looker Base URL
id: kingfisher.looker.1
visible: false
confidence: low
min_entropy: 2.0
pattern: |
(?xi)
\b
(
https?://[a-z0-9.-]+(?::\d{2,5})?
)
(?:/api/(?:4\.0|3\.1))?
\b
examples:
- https://example.cloud.looker.com
- https://example.cloud.looker.com:19999
- https://example.cloud.looker.com:19999/api/4.0
- name: Looker Client ID
id: kingfisher.looker.2
confidence: medium
min_entropy: 3.0
pattern: |
(?xi)
\blooker
(?:.|[\n\r]){0,64}?
\b
(
[a-z0-9]{20}
)
\b
pattern_requirements:
min_digits: 2
examples:
- LOOKER_CLIENT_ID=1a2b3c4d5e6f7g8h9i0j
- 'looker client_id: "0a1b2c3d4e5f6g7h8i9j"'
references:
- https://docs.cloud.google.com/looker/docs/api-auth
- name: Looker Client Secret
id: kingfisher.looker.3
confidence: medium
min_entropy: 3.5
pattern: |
(?xi)
\b
looker
(?:.|[\n\r]){0,64}?
\b
(
[a-z0-9]{24}
)
\b
pattern_requirements:
min_digits: 2
examples:
- LOOKER_CLIENT_SECRET=1a2b3c4d5e6f7g8h9i0j1k2l
- 'looker client_secret: "0a1b2c3d4e5f6g7h8i9j0k1l"'
references:
- https://docs.cloud.google.com/looker/docs/api-auth
- https://docs.cloud.google.com/looker/docs/reference/looker-api/latest/methods/ApiAuth/login
depends_on_rule:
- rule_id: kingfisher.looker.1
variable: LOOKER_BASE_URL
- rule_id: kingfisher.looker.2
variable: LOOKER_CLIENT_ID
validation:
type: Http
content:
request:
method: POST
url: "{{ LOOKER_BASE_URL }}/api/4.0/login"
headers:
Content-Type: application/x-www-form-urlencoded
Accept: application/json
body: |
client_id={{ LOOKER_CLIENT_ID }}&client_secret={{ TOKEN }}
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: JsonValid
- type: WordMatch
words: ['"access_token"']
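The Looker Client Secret rule depends on two helper rules, and their captures are exposed to the request templates as `LOOKER_BASE_URL` and `LOOKER_CLIENT_ID` alongside `TOKEN`. The Python sketch below illustrates that substitution with a simplified placeholder replacer. It is an assumption-laden stand-in for the real template engine (it ignores filters such as `b64enc`), and `render` is a hypothetical helper.

```python
import re


def render(template: str, variables: dict) -> str:
    """Replace each `{{ NAME }}` placeholder with its value; unknown names raise KeyError."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: variables[match.group(1)],
        template,
    )


# Placeholder values taken from the rules' own examples, not real credentials.
variables = {
    "LOOKER_BASE_URL": "https://example.cloud.looker.com:19999",
    "LOOKER_CLIENT_ID": "1a2b3c4d5e6f7g8h9i0j",
    "TOKEN": "1a2b3c4d5e6f7g8h9i0j1k2l",
}

url = render("{{ LOOKER_BASE_URL }}/api/4.0/login", variables)
body = render("client_id={{ LOOKER_CLIENT_ID }}&client_secret={{ TOKEN }}", variables)

if __name__ == "__main__":
    print(url)
    print(body)
```

Because the Base URL pattern keeps the optional `/api/4.0` suffix outside its capture group, appending `/api/4.0/login` in the template does not duplicate the path.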

data/rules/mailjet.yml Normal file

@ -0,0 +1,75 @@
rules:
- name: MailJetSMS API Key
id: kingfisher.mailjet.1
pattern: |
(?xi)
\b
mailjet
(?:.|[\n\r]){0,32}?
\b
(
[A-Z0-9]{32}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- mailjet ABCDEFGHIJKLMNOPQRSTUVWXYZ012345
- mailjet-token 9A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P
references:
- https://dev.mailjet.com/sms/reference/overview/authentication/
- https://www.postman.com/mailjet-api/mailjet-s-public-workspace/request/velnqvd/retrieve-a-count-of-all-sms-messages
validation:
type: Http
content:
request:
method: GET
url: "https://api.mailjet.com/v4/sms/count"
headers:
Accept: "application/vnd.mailjetsms+json; version=3"
Authorization: "Bearer {{ TOKEN }}"
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"Count"']
- name: MailJet Basic Auth
id: kingfisher.mailjet.2
pattern: |
(?xi)
\b
mailjet
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:.|[\n\r]){0,32}?
\b
(
[A-Z0-9]{87}=
)
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- mailjet_token = neno01fy530zukbtvq8xunwec74b7m7lsmzha8su93zdvy4mp4dc5gctfa2rcwetllcjzncirjv58se7iwkehhh=
references:
- https://dev.mailjet.com/email/reference/overview/authentication/
- https://www.postman.com/mailjet-api/mailjet-s-public-workspace/request/5pnoxig/retrieve-all-api-keys
validation:
type: Http
content:
request:
method: GET
url: "https://api.mailjet.com/v3/REST/apikey?Limit=1"
headers:
Authorization: "Basic {{ TOKEN }}"
Accept: "application/json"
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"Data"', '"Count"']

View file

@ -85,6 +85,7 @@ rules:
ignore_if_contains:
- "****"
- "xxxx"
- "example"
min_entropy: 3
examples:
- client = mongoc_client_new ("mongodb+srv://someuser:hunter2@my-atlas-rd941.mongodb.net/test?retryWrites=true&w=majority");

View file

@ -36,6 +36,7 @@ rules:
ignore_if_contains:
- "****"
- "xxxx"
- "example"
min_entropy: 3.3
confidence: medium
examples:

View file

@ -38,10 +38,9 @@ rules:
id: kingfisher.notion.2
pattern: |
(?xi)
notion
(?:.|[\\n\r]){0,32}?
\b
(
ntn_[A-Z0-9]{40,55}
ntn_[0-9]{11}[A-Za-z0-9]{35}
)
min_entropy: 4.0
confidence: medium

data/rules/nylas.yml Normal file

@ -0,0 +1,62 @@
rules:
# Helper: capture the Nylas API base URI (data residency) from config/env so validation hits the right region.
- name: Nylas API URI
id: kingfisher.nylas.api_uri.1
visible: false
confidence: medium
min_entropy: 2.0
pattern: |
(?xi)
\b
(
https://api\.(?:us|eu)\.nylas\.com
)
\b
examples:
- https://api.us.nylas.com
- https://api.eu.nylas.com
- name: Nylas API Key
id: kingfisher.nylas.1
pattern: |
(?xi)
\b
nylas
(?:.|[\n\r]){0,64}?
(?:api[_-]?key|apikey|secret|private|access|token)
(?:.|[\n\r]){0,64}?
\b
(
nyk_[A-Z0-9]{67} # common v3 API key format (71 chars total)
|
[0-9A-Z]{30} # legacy/older patterns seen in repos
)
\b
pattern_requirements:
min_digits: 4
min_entropy: 3.3
confidence: medium
examples:
- NYLAS_API_KEY=nyk_0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234
- nylas_token = 2temab2qpfioneggb01j2dhfllqgiu
references:
- https://developer.nylas.com/docs/v3/auth/hosted-oauth-apikey/
- https://developer.nylas.com/docs/v3/notifications/
depends_on_rule:
- rule_id: kingfisher.nylas.api_uri.1
variable: NYLAS_API_URI
validation:
type: Http
content:
request:
method: GET
url: "{{ NYLAS_API_URI }}/v3/webhooks"
headers:
Authorization: "Bearer {{ TOKEN }}"
Accept: "application/json"
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"request_id"', '"data"']

data/rules/openrouter.yml Normal file

@ -0,0 +1,36 @@
rules:
- name: OpenRouter API Key
id: kingfisher.openrouter.1
pattern: |
(?xi)
\b
(
sk-or-v1-[0-9a-f]{64}
)
\b
pattern_requirements:
min_digits: 4
min_entropy: 4.0
confidence: high
examples:
- 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
- 'Authorization: Bearer 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'
references:
- https://openrouter.ai/docs/api/reference/authentication
- https://openrouter.ai/docs/api/api-reference/credits/get-credits
validation:
type: Http
content:
request:
method: GET
url: https://openrouter.ai/api/v1/credits
headers:
Authorization: "Bearer {{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: JsonValid
- type: WordMatch
words: ['"data"', '"total_credits"', '"total_usage"']
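The OpenRouter rule combines a prefix-anchored pattern with a `min_digits` requirement. The Python sketch below applies both to a line of text. It uses Python's `re` module as a stand-in for Kingfisher's regex engine, which may behave differently, and it omits the `min_entropy` check. `find_openrouter_keys` is a hypothetical helper.

```python
import re

# Same verbose, case-insensitive pattern as the rule; whitespace is ignored under (?x).
OPENROUTER_PATTERN = re.compile(
    r"""(?xi)
    \b
    (
      sk-or-v1-[0-9a-f]{64}
    )
    \b
    """
)


def find_openrouter_keys(text: str, min_digits: int = 4) -> list:
    """Return captured keys that also contain at least `min_digits` digits."""
    keys = []
    for match in OPENROUTER_PATTERN.finditer(text):
        key = match.group(1)
        if sum(ch.isdigit() for ch in key) >= min_digits:
            keys.append(key)
    return keys


if __name__ == "__main__":
    # Example line copied from the rule's `examples` list.
    line = (
        "Authorization: Bearer "
        "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"
    )
    print(find_openrouter_keys(line))
```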

data/rules/optimizely.yml Normal file

@ -0,0 +1,38 @@
rules:
- name: Optimizely Personal Access Token
id: kingfisher.optimizely.1
pattern: |
(?xi)
\b
optimizely
(?:.|[\n\r]){0,64}?
\b
(
[0-9A-Z:-]{54}
)
\b
pattern_requirements:
min_digits: 4
min_entropy: 3.6
confidence: medium
examples:
- OPTIMIZELY_TOKEN=AbCDefGhijKlmnOpqrStuvWxYz01-23:45AbCDefGhijKlmnOpqrSt
- 'Optimizely Authorization: Bearer AbCDefGhijKlmnOpqrStuvWxYz01-23:45AbCDefGhijKlmnOpqrSt'
references:
- https://docs.developers.optimizely.com/web-experimentation/docs/rest-api-getting-started
- https://docs.developers.optimizely.com/feature-experimentation/reference/get_me
- https://docs.developers.optimizely.com/web-experimentation/docs/api-conventions
validation:
type: Http
content:
request:
method: GET
url: https://api.optimizely.com/v2/me
headers:
Authorization: "Bearer {{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: JsonValid

data/rules/owlbot.yml Normal file

@ -0,0 +1,39 @@
rules:
- name: Owlbot API Key
id: kingfisher.owlbot.1
pattern: |
(?xi)
\b
owlbot
(?:.|[\n\r]){0,64}?
(?:api[_-]?key|secret|private|access|token|key)
(?:.|[\n\r]){0,64}?
\b
(
[a-f0-9]{40}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- "owlbot SECRET b7d21c0e88e9a3c5938fb045b2b6a5e693eaf9d1"
- "owlbot TOKEN 8a5de3a89b7e4f29bf728b45adcdea6ea3410c78"
references:
- https://owlbot.info/
validation:
type: Http
content:
request:
method: GET
url: "https://owlbot.info/api/v4/dictionary/owl?format=json"
headers:
Authorization: "Token {{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"word"', '"definitions"']

View file

@ -0,0 +1,44 @@
rules:
- name: PackageCloud API Key
id: kingfisher.packagecloud.1
pattern: |
(?xi)
\b
packagecloud
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN|API[_-]?TOKEN|AUTH)
(?:.|[\n\r]){0,32}?
\b
(
[0-9a-f]{48}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- packagecloud accessKEY 1234567890abcdef1234567890abcdef1234567890abcdef
- "packagecloud:token=1234567890abcdef1234567890abcdef1234567890abcdef"
- |
"config": {
"packagecloud_secret": "1234567890abcdef1234567890abcdef1234567890abcdef"
}
- packagecloudPRIVATEkey 1234567890abcdef1234567890abcdef1234567890abcdef
references:
- https://packagecloud.io/docs/api
validation:
type: Http
content:
request:
method: GET
url: "https://packagecloud.io/api/v1/distributions.json"
headers:
Authorization: "Basic {{ TOKEN | append: ':' | b64enc }}"
Accept: "application/json"
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"deb"', '"rpm"']

View file

@ -39,7 +39,6 @@ rules:
Accept: application/json
response_matcher:
- report_response: true
- type: JsonValid
- type: WordMatch
words:
- '"users":'

data/rules/paystack.yml Normal file

@ -0,0 +1,40 @@
rules:
- name: Paystack API Key
id: kingfisher.paystack.1
pattern: |
(?xi)
\b
(
sk_
[a-z]{1,}
_
[A-Z0-9]{40}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.3
confidence: medium
examples:
- sk_test_abcdef1234567890abcdef1234567890abcdef12
- sk_live_gwjaoi1234567890abcdef1234567890abcdef12
references:
- https://paystack.com/docs/api/authentication/
- https://paystack.com/docs/api/transfer-control/
validation:
type: Http
content:
request:
method: GET
# Uses the Check Balance endpoint instead of /customer.
url: https://api.paystack.co/balance
headers:
Authorization: "Bearer {{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: JsonValid
- type: WordMatch
words: ['"message":"Balances retrieved"', '"data"']

data/rules/pdflayer.yml Normal file

@ -0,0 +1,47 @@
rules:
- name: PdfLayer API Key
id: kingfisher.pdflayer.1
pattern: |
(?xi)
(?:
\b
pdflayer
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN|ACCESS_KEY)
(?:.|[\n\r]){0,32}?
|
\bapi\.pdflayer\.com/api/convert\?[^ \t\r\n"'<>]*\baccess_key\s*=\s*
)
\b
(
[a-z0-9]{32}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- pdflayer_key=1234567890abcdef1234567890abcdef
- PDFLAYER_ACCESS_TOKEN=abcdef1234567890abcdef1234567890
- pdflayer_secret=0123456789abcdef0123456789abcdef
references:
- https://pdflayer.com/documentation
validation:
type: Http
content:
request:
method: GET
# Use Sandbox Mode (test=1) and intentionally omit document_url/document_html.
# This yields a JSON error response (instead of generating a PDF) and should not count
# toward monthly API volume per docs.
url: "https://api.pdflayer.com/api/convert?access_key={{ TOKEN }}&test=1"
headers:
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
negative: true
words: ['"invalid_access_key"', '"missing_access_key"']

View file

@ -2,7 +2,7 @@ rules:
- name: Perplexity AI API Key
id: kingfisher.perplexity.1
pattern: |
(?xi)
(?x)
\b
(
pplx-[A-Za-z0-9]{48}

View file

@ -28,6 +28,7 @@ rules:
ignore_if_contains:
- "****"
- "xxxx"
- "example"
min_entropy: 3.3
confidence: medium
examples:

data/rules/rapidapi.yml Normal file

@ -0,0 +1,41 @@
rules:
- name: RapidAPI Key
id: kingfisher.rapidapi.1
pattern: |
(?xi)
\b
rapidapi
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:.|[\n\r]){0,32}?
\b
(
[A-Za-z0-9_-]{50}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 4.0
confidence: medium
examples:
- rapidapi_key=abcdefghij1234567890ABCDEFGHIJ1234567890abcdefghij
- '"rapidapiKey":"ABCDEFGHIJ1234567890abcdefghij1234567890ABCDEFGHIJ"'
references:
- https://docs.rapidapi.com/docs/configuring-api-security
- https://docs.rapidapi.com/docs/keys-and-key-rotation
validation:
type: Http
content:
request:
method: GET
url: "https://weatherapi-com.p.rapidapi.com/current.json?q=London"
headers:
x-rapidapi-key: "{{ TOKEN }}"
x-rapidapi-host: "weatherapi-com.p.rapidapi.com"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"country"']

data/rules/riot.yml Normal file

@ -0,0 +1,55 @@
rules:
- name: Riot Platform Host
id: kingfisher.riot.1
visible: false
confidence: medium
min_entropy: 1.0
pattern: |
(?xi)
\b
(
(?:br1|eun1|euw1|jp1|kr|la1|la2|na1|oc1|ru|tr1
|ph2|sg2|th2|tw2|vn2
|americas|europe|asia)
\.api\.riotgames\.com
)
\b
examples:
- na1.api.riotgames.com
- euw1.api.riotgames.com
- americas.api.riotgames.com
- name: Riot Games API Key
id: kingfisher.riot.2
pattern: |
(?xi)
\b
(
RGAPI-[a-z0-9_-]{36}
)
\b
pattern_requirements:
min_digits: 4
min_entropy: 3.0
confidence: medium
examples:
- RGAPI-4sb3f6a1-2941-5a81-9c23-4bf3a83c14f3
references:
- https://developer.riotgames.com/docs/lol
- https://developer.riotgames.com/apis
depends_on_rule:
- rule_id: kingfisher.riot.1
variable: RIOT_PLATFORM_HOST
validation:
type: Http
content:
request:
method: GET
url: "https://{{ RIOT_PLATFORM_HOST }}/lol/status/v4/platform-data"
headers:
X-Riot-Token: "{{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]

View file

@ -11,6 +11,7 @@ rules:
\.
[0-9A-Z_-]{39,47}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
@ -18,7 +19,7 @@ rules:
examples:
- " 'SENDGRID_API_KEYSID': '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'"
- "var sendgrid = require('sendgrid')('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');"
- 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
- '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'
references:
- https://docs.sendgrid.com/ui/account-and-settings/api-keys
validation:

View file

@ -3,18 +3,19 @@ rules:
id: kingfisher.sentry.1
pattern: |
(?xi)
\b
sentry
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:.|[\n\r]){0,32}?
\b
(
[a-f0-9]{64}
[a-f0-9]{64}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
min_entropy: 3.0
confidence: medium
examples:
- SENTRY_TOKEN=cbadefcbadefcbadefcbadefcbadefcbadefcbadefcbadefcbadefcbadefcbad
@ -39,19 +40,32 @@ rules:
- name: Sentry Organization Token
id: kingfisher.sentry.2
pattern: |
(?xi)
(?x)
\b
(
sntrys_eyJpYXQiO[a-zA-Z0-9+/]{10,200}(?:LCJyZWdpb25fdXJs|InJlZ2lvbl91cmwi|cmVnaW9uX3VybCI6)[a-zA-Z0-9+/]{10,200}={0,2}_[a-zA-Z0-9+/]{43}
sntrys_eyJpYXQiO
[a-zA-Z0-9+/]{10,192}
(?:
LCJyZWdpb25fdXJs
| InJlZ2lvbl91cmwi
| cmVnaW9uX3VybCI6
)
[a-zA-Z0-9+/]{10,192}
={0,2}
_
[a-zA-Z0-9+/]{43}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 4.2
min_entropy: 4.5
confidence: medium
examples:
- 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
- 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
references:
- https://docs.sentry.io/api/auth/
- https://github.com/getsentry/rfcs/blob/main/text/0091-ci-upload-tokens.md
validation:
type: Http
content:
@ -71,9 +85,11 @@ rules:
id: kingfisher.sentry.3
pattern: |
(?xi)
\b
(
sntryu_[a-f0-9]{64}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5

View file

@ -5,7 +5,7 @@ rules:
(?xi)
\b
(
(?:shpat|shpca|shppa|shpss)_[a-f0-9]{30,34}
(?:shpat|shpca|shppa|shpss)_[a-f0-9]{32}
)
\b
pattern_requirements:

View file

@ -5,7 +5,7 @@ rules:
(?xi)
\b
(
sgp_(?:[a-f0-9]{16}_local_)?[a-f0-9]{40}
sgp_(?:[A-F0-9]{16}|local)_[A-F0-9]{40}|sgp_[A-F0-9]{40}
)
\b
pattern_requirements:

data/rules/sslmate.yml Normal file

@ -0,0 +1,39 @@
rules:
- name: SslMate API Key
id: kingfisher.sslmate.1
pattern: |
(?xi)
\b
sslmate
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:.|[\n\r]){0,32}?
\b
(
[A-Z0-9]{36}
)
\b
pattern_requirements:
min_digits: 4
min_entropy: 3.5
confidence: medium
examples:
- sslmate_key="10000r90kriAAAAAseZJwsawemws03jdlmZY"
- SSLMATE_SECRET_KEY=ABCDEFGHIJ1234567890ABCDEFGHIJ123456
references:
- https://sslmate.com/help/reference/apiv2
validation:
type: Http
content:
request:
method: GET
url: "https://sslmate.com/api/v2/certs/example.com"
headers:
Authorization: "Basic {{ TOKEN | append: ':' | b64enc }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"cn"', '"exists"']

data/rules/statuspage.yml Normal file

@ -0,0 +1,42 @@
rules:
- name: Statuspage API Key
id: kingfisher.statuspage.1
pattern: |
(?xi)
\b
statuspage
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN|API)
(?:.|[\n\r]){0,48}?
\b
(
[0-9a-f]{64}
|
# Legacy UUID-ish keys (seen in older configs)
[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.5
confidence: medium
examples:
- 'statuspage_api: OAuth 89a229ce1a8dbcf9ff30430fbe35eb4c0426574bca932061892cefd2138aa4b1'
- statuspage_key = 123e4567-e89b-12d3-a456-426614174000
references:
- https://developer.statuspage.io/
validation:
type: Http
content:
request:
method: GET
url: "https://api.statuspage.io/v1/pages"
headers:
Authorization: "OAuth {{ TOKEN }}"
Accept: "application/json"
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"id"', '"name"']

39
data/rules/yandex.yml Normal file
View file

@ -0,0 +1,39 @@
rules:
- name: Yandex API Key
id: kingfisher.yandex.1
pattern: |
(?xi)
\b
yandex
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:.|[\n\r]){0,32}?
\b
(
[A-Z0-9.]{83}
)
\b
pattern_requirements:
min_digits: 2
min_entropy: 3.3
confidence: medium
examples:
- "yandex_api_key= 'pdct.1.1.20218925T124723Z.07193b9c567c0c90.ebba3042fcf1acfc4d682db12c01a5289f9769c0'"
- "yandex_secret=1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTU"
references:
- https://yandex.com/dev/dictionary
- https://pkg.go.dev/github.com/unitrans/unitrans/src/translator/backend_particular
validation:
type: Http
content:
request:
method: GET
url: "https://dictionary.yandex.net/api/v1/dicservice.json/lookup?key={{ TOKEN }}&lang=en-ru&text=time&ui=en"
headers:
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"def"']

39
data/rules/yelp.yml Normal file
View file

@ -0,0 +1,39 @@
rules:
- name: Yelp API Key
id: kingfisher.yelp.1
pattern: |
(?xi)
\b
yelp
(?:.|[\n\r]){0,32}?
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)
(?:.|[\n\r]){0,32}?
\b
(
[a-zA-Z0-9_=.\-]{128}
)
\b
pattern_requirements:
min_digits: 6
min_entropy: 3.8
confidence: medium
examples:
- yelp_token = wiuck20l8j-oWwCd9r53FqpN6ELB7K03zGw-ccUQR7uLHc9NaWubovOMdGdyFqIGGM4aVK6nxQ1DreDZn_qBYU4jky_5kQRVkiIDPSheCPggY3WzyRzi27kxoOpoYAYx
references:
- https://docs.developer.yelp.com/docs/places-authentication
- https://docs.developer.yelp.com/reference/v3_all_categories
validation:
type: Http
content:
request:
method: GET
url: "https://api.yelp.com/v3/categories?locale=en_US"
headers:
Authorization: "Bearer {{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"categories"']

35
data/rules/zohocrm.yml Normal file
View file

@ -0,0 +1,35 @@
rules:
- name: Zoho CRM API Access Token
id: kingfisher.zohocrm.1
pattern: |
(?xi)
\b
(
1000\.[a-f0-9]{32}\.[a-f0-9]{32}
)
\b
pattern_requirements:
min_digits: 4
min_entropy: 3.5
confidence: medium
examples:
- 1000.a23f12b4c5d6e7f8901234567890abc1.23d4e5f67890abcdef1234567890abcd
- 1000.123fa4b5c678d90eabcdef1234567890.ab12c3d4e5f6a7890bcd12ef345678ab
references:
- https://www.zoho.com/crm/developer/docs/api/v8/access-refresh.html
- https://www.zoho.com/crm/developer/docs/api/v8/get-users.html
validation:
type: Http
content:
request:
method: GET
url: "https://www.zohoapis.com/crm/v7/users?type=CurrentUser"
headers:
Authorization: "Zoho-oauthtoken {{ TOKEN }}"
Accept: application/json
response_matcher:
- report_response: true
- type: StatusMatch
status: [200]
- type: WordMatch
words: ['"users"']
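
The Zoho CRM pattern has no keyword anchor; it relies entirely on the token's shape: the literal prefix `1000`, then two dot-separated runs of 32 hexadecimal characters. As a rough sketch, the same shape check can be written without a regex engine. `looks_like_zoho_token` is a hypothetical name, and unlike the rule (which searches inside larger text using `\b` boundaries) it tests a whole string only.

```rust
/// Returns true if `s` has the shape `1000.<32 hex>.<32 hex>`.
/// Hex digits are accepted in either case, mirroring the `(?xi)` flags on the rule.
fn looks_like_zoho_token(s: &str) -> bool {
    // A segment qualifies when it is exactly 32 ASCII hex digits.
    let is_hex32 = |p: &str| p.len() == 32 && p.bytes().all(|b| b.is_ascii_hexdigit());
    let mut parts = s.split('.');
    // Exactly three dot-separated parts are allowed; a fourth means no match.
    match (parts.next(), parts.next(), parts.next(), parts.next()) {
        (Some("1000"), Some(a), Some(b), None) => is_hex32(a) && is_hex32(b),
        _ => false,
    }
}
```

Both strings listed under `examples` in the rule satisfy this check; the `pattern_requirements` thresholds (`min_digits`, `min_entropy`) are separate filters and are not modeled here.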

View file

@ -453,6 +453,9 @@
.badge-aws { background: #fff7ed; color: #c2410c; border-color: #fed7aa; }
.badge-gcp { background: #eff6ff; color: #1e40af; border-color: #bfdbfe; }
.badge-azure { background: #ecfeff; color: #0e7490; border-color: #a5f3fc; }
.badge-github { background: #f4f4f5; color: #18181b; border-color: #d4d4d8; }
.badge-gitlab { background: #fff1f2; color: #be123c; border-color: #fecdd3; }
.badge-perm { background: #ecfdf5; color: #16a34a; border-color: #bbf7d0; font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace; }
.detail-grid {
@ -842,7 +845,7 @@
Dashboard
</button>
<button class="nav-button" data-view-target="view-access">
<span class="nav-icon">🗺</span>
<span class="nav-icon">🕸</span>
Access Map
</button>
<button class="nav-button" data-view-target="view-findings">
@ -948,6 +951,68 @@
<div id="am-detail-cloud">-</div>
</div>
</div>
<div id="am-token-container" class="hidden">
<h4 style="margin-bottom:10px; border-bottom:1px solid var(--border); padding-bottom:6px;">Token Details</h4>
<div class="detail-grid">
<div class="detail-field">
<label>Token Name</label>
<div id="am-token-name">-</div>
</div>
<div class="detail-field">
<label>Username</label>
<div id="am-token-username">-</div>
</div>
<div class="detail-field">
<label>Token Type</label>
<div id="am-token-type">-</div>
</div>
<div class="detail-field">
<label>Account Type</label>
<div id="am-token-account-type">-</div>
</div>
<div class="detail-field">
<label>User ID</label>
<div id="am-token-user">-</div>
</div>
<div class="detail-field">
<label>Company</label>
<div id="am-token-company">-</div>
</div>
<div class="detail-field">
<label>Created At</label>
<div id="am-token-created">-</div>
</div>
<div class="detail-field">
<label>Location</label>
<div id="am-token-location">-</div>
</div>
<div class="detail-field">
<label>Last Used</label>
<div id="am-token-last-used">-</div>
</div>
<div class="detail-field">
<label>Email</label>
<div id="am-token-email">-</div>
</div>
<div class="detail-field">
<label>Expires At</label>
<div id="am-token-expires">-</div>
</div>
<div class="detail-field">
<label>Profile URL</label>
<div id="am-token-url">-</div>
</div>
<div class="detail-field">
<label>Instance Version</label>
<div id="am-token-version">-</div>
</div>
<div class="detail-field">
<label>Enterprise</label>
<div id="am-token-enterprise">-</div>
</div>
</div>
<div id="am-token-scopes" style="display:flex; flex-wrap:wrap; gap:6px;"></div>
</div>
<div id="am-perms-container" class="hidden">
<h4 style="margin-bottom:10px; border-bottom:1px solid var(--border); padding-bottom:6px;">Permissions</h4>
<div id="am-perms-list" style="display:flex; flex-wrap:wrap; gap:6px;"></div>
@ -1298,7 +1363,7 @@
return JSON.parse(text);
} catch (e) {
const trimmed = text.trim();
const parts = trimmed.split(/\n(?={)/g);
const parts = trimmed.split(/\r?\n(?=\s*{)/g).filter(Boolean);
if (parts.length > 1) {
try {
return JSON.parse("[" + parts.join(",") + "]");
@ -1308,6 +1373,53 @@
return null;
}
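// Merge findings and access-map entries from a single report object or from an
// array of report/JSONL entries. Keeps the first report-shaped entry as
// mainReport and the last stats-bearing entry as statsReport.
function collectReportData(entries) {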
const findings = [];
const accessMap = [];
let mainReport = null;
let statsReport = null;
const items = Array.isArray(entries) ? entries : [entries];
items.forEach((item) => {
if (!item || typeof item !== "object") return;
if (Array.isArray(item.findings) || Array.isArray(item.access_map)) {
if (!mainReport) {
mainReport = item;
}
}
if (Array.isArray(item.findings)) {
findings.push(...item.findings);
}
if (Array.isArray(item.access_map)) {
accessMap.push(...item.access_map);
}
if (item.rule && item.finding) {
findings.push(item);
}
if (item.provider && item.account) {
accessMap.push(item);
}
if (
typeof item.scan_duration !== "undefined" ||
typeof item.scanDuration !== "undefined" ||
typeof item.bytes_scanned !== "undefined" ||
item.kingfisher ||
item.stats ||
item.summary
) {
statsReport = item;
}
});
return { findings, accessMap, mainReport, statsReport };
}
function processFile(file) {
loaderText.textContent = 'Processing "' + file.name + '"…';
loader.classList.remove("hidden");
@ -1408,64 +1520,33 @@
let parsed = parsePossiblyMultiJson(text);
if (parsed === null) {
const lines = text.split("\\n");
const tmpFindings = [];
const tmpAccessMap = [];
let mainReport = null;
let statsReport = null;
const lines = text.split(/\r?\n/);
const entries = [];
for (let i = 0; i < lines.length; i++) {
const line = lines[i].trim();
if (!line || line[0] !== "{") continue;
try {
const obj = JSON.parse(line);
if (!mainReport && (Array.isArray(obj.findings) || Array.isArray(obj.access_map))) {
mainReport = obj;
}
if (typeof obj.scan_duration !== "undefined" || obj.kingfisher || typeof obj.bytes_scanned !== "undefined") {
statsReport = obj;
}
if (obj.findings) tmpFindings.push(...obj.findings);
else if (obj.access_map) tmpAccessMap.push(...obj.access_map);
else if (obj.rule && obj.finding) tmpFindings.push(obj);
else if (obj.provider && obj.account) tmpAccessMap.push(obj);
entries.push(obj);
} catch (errLine) {
console.warn("Skipping invalid JSON line", i);
}
}
findings = tmpFindings;
accessMap = tmpAccessMap;
rawData = Object.assign({}, mainReport || {}, statsReport || {});
} else {
let mainReport = null;
let statsReport = null;
if (Array.isArray(parsed)) {
parsed.forEach((item) => {
if (!item || typeof item !== "object") return;
if (!mainReport && (Array.isArray(item.findings) || Array.isArray(item.access_map))) {
mainReport = item;
}
if (typeof item.scan_duration !== "undefined" || item.kingfisher || typeof item.bytes_scanned !== "undefined") {
statsReport = item;
}
});
} else {
const item = parsed;
if (Array.isArray(item.findings) || Array.isArray(item.access_map)) {
mainReport = item;
}
if (typeof item.scan_duration !== "undefined" || item.kingfisher || typeof item.bytes_scanned !== "undefined") {
statsReport = item;
}
const collected = collectReportData(entries);
findings = collected.findings;
accessMap = collected.accessMap;
if (collected.mainReport || collected.statsReport) {
rawData = Object.assign({}, collected.mainReport || {}, collected.statsReport || {});
}
} else {
const collected = collectReportData(parsed);
findings = collected.findings;
accessMap = collected.accessMap;
if (collected.mainReport || collected.statsReport || parsed) {
rawData = Object.assign({}, collected.mainReport || {}, collected.statsReport || {});
}
const dataForFindings = mainReport || statsReport || {};
findings = Array.isArray(dataForFindings.findings) ? dataForFindings.findings : [];
accessMap = Array.isArray(dataForFindings.access_map) ? dataForFindings.access_map : [];
rawData = Object.assign({}, dataForFindings, statsReport || {});
}
currentPage = 1;
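
The fallback path in this hunk treats the input as JSONL: it splits on `\n` or `\r\n`, trims each line, skips anything that does not start with `{`, and only then tries to parse. A small sketch of that pre-filter, written in Rust for consistency with the rest of the project rather than the viewer's JavaScript; `jsonl_candidates` is a hypothetical name and JSON parsing itself is left out.

```rust
/// Returns the trimmed lines that could hold a JSON object:
/// blank lines and lines not starting with `{` are skipped.
fn jsonl_candidates(text: &str) -> Vec<&str> {
    text.lines() // splits on "\n" and also strips a trailing "\r"
        .map(str::trim)
        .filter(|line| line.starts_with('{'))
        .collect()
}
```

Each surviving line would then go through a JSON parser, with parse failures skipped rather than aborting the whole load, as the viewer does.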
@ -2180,9 +2261,9 @@
filteredAccessMapView.forEach((entry) => {
const identity = entry.identity;
const account = identity.account || "unknown-identity";
const account = formatIdentityLabel(identity);
const identityNameMatches = entry.identityNameMatches;
const idNode = createTreeNode(account, "identity", true);
const idNode = createTreeNode(account, "identity", true, identity.provider);
idNode.container.style.borderLeft = "none";
idNode.container.style.marginLeft = "0";
idNode.header.addEventListener("click", (e) => {
@ -2248,7 +2329,7 @@
const identities = [];
accessMap.forEach((identity) => {
const account = identity.account || "unknown-identity";
const account = formatIdentityLabel(identity);
const groups = Array.isArray(identity.groups) ? identity.groups : [];
const identityNameMatches = Boolean(filterLower) && account.toLowerCase().includes(filterLower);
@ -2315,7 +2396,7 @@
}
}
function createTreeNode(label, type, isOpen) {
function createTreeNode(label, type, isOpen, provider) {
const container = document.createElement("div");
container.className = "tree-node";
@ -2331,8 +2412,8 @@
const icon = document.createElement("span");
icon.className = "node-icon icon-" + type;
if (type === "identity") icon.textContent = "👤";
else if (type === "resource") icon.textContent = "Hx";
else if (type === "group") icon.textContent = "📁";
else if (type === "resource") icon.textContent = "📦";
else if (type === "group") icon.textContent = "🗂️";
const text = document.createElement("span");
text.textContent = label;
@ -2343,6 +2424,9 @@
header.appendChild(toggle);
header.appendChild(icon);
header.appendChild(text);
if (type === "identity" && provider) {
addBadge(header, provider.toUpperCase(), providerBadgeClass(provider));
}
const children = document.createElement("div");
children.className = "tree-children";
@ -2395,6 +2479,22 @@
container.appendChild(span);
}
function providerBadgeClass(provider) {
const normalized = String(provider || "").toLowerCase();
if (normalized === "aws") return "badge-aws";
if (normalized === "gcp") return "badge-gcp";
if (normalized === "azure") return "badge-azure";
if (normalized === "github") return "badge-github";
if (normalized === "gitlab") return "badge-gitlab";
// Unrecognized providers fall back to the AWS badge styling.
return "badge-aws";
}
function formatIdentityLabel(identity) {
const base = identity && identity.account ? identity.account : "unknown-identity";
const provider = identity && identity.provider ? identity.provider.toUpperCase() : null;
return provider ? `[${provider}] ${base}` : base;
}
function setDetailName(text, link) {
const nameEl = document.getElementById("am-detail-name");
nameEl.innerHTML = "";
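
The two helpers added in this hunk normalize the provider name to pick a badge class and prefix the identity label with the upper-cased provider. A hedged Rust rendering of the same mapping is sketched below; the function names are illustrative and the viewer itself remains JavaScript.

```rust
/// Maps a provider name to its badge CSS class, case-insensitively.
fn provider_badge_class(provider: &str) -> &'static str {
    match provider.to_ascii_lowercase().as_str() {
        "aws" => "badge-aws",
        "gcp" => "badge-gcp",
        "azure" => "badge-azure",
        "github" => "badge-github",
        "gitlab" => "badge-gitlab",
        // Same fallback as the viewer: unknown providers reuse the AWS styling.
        _ => "badge-aws",
    }
}

/// Builds the tree label: "[PROVIDER] account", or just the account when no
/// provider is known. Missing or empty accounts become "unknown-identity".
fn format_identity_label(account: Option<&str>, provider: Option<&str>) -> String {
    let base = account.filter(|a| !a.is_empty()).unwrap_or("unknown-identity");
    match provider.filter(|p| !p.is_empty()) {
        Some(p) => format!("[{}] {}", p.to_uppercase(), base),
        None => base.to_string(),
    }
}
```

One consequence of the fallback is that a provider such as `azure_devops` is rendered with the AWS colors unless a dedicated class is added.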
@ -2553,10 +2653,28 @@
const cloudField = document.getElementById("am-detail-cloud");
const permsContainer = document.getElementById("am-perms-container");
const permsList = document.getElementById("am-perms-list");
const tokenContainer = document.getElementById("am-token-container");
const tokenName = document.getElementById("am-token-name");
const tokenUsername = document.getElementById("am-token-username");
const tokenType = document.getElementById("am-token-type");
const tokenAccountType = document.getElementById("am-token-account-type");
const tokenUser = document.getElementById("am-token-user");
const tokenCompany = document.getElementById("am-token-company");
const tokenCreated = document.getElementById("am-token-created");
const tokenLocation = document.getElementById("am-token-location");
const tokenLastUsed = document.getElementById("am-token-last-used");
const tokenEmail = document.getElementById("am-token-email");
const tokenExpires = document.getElementById("am-token-expires");
const tokenUrl = document.getElementById("am-token-url");
const tokenVersion = document.getElementById("am-token-version");
const tokenEnterprise = document.getElementById("am-token-enterprise");
const tokenScopes = document.getElementById("am-token-scopes");
meta.innerHTML = "";
permsList.innerHTML = "";
permsContainer.classList.add("hidden");
tokenScopes.innerHTML = "";
tokenContainer.classList.add("hidden");
let detailName = "";
let detailLink = null;
@ -2569,7 +2687,44 @@
const provider = (data.provider || "unknown").toUpperCase();
cloudField.textContent = provider;
if (data.provider) {
addBadge(meta, provider, data.provider === "gcp" ? "badge-gcp" : "badge-aws");
addBadge(meta, provider, providerBadgeClass(data.provider));
}
if (data.token_details) {
const details = data.token_details;
tokenName.textContent = details.name || "-";
tokenUsername.textContent = details.username || "-";
tokenType.textContent = details.token_type || "-";
tokenAccountType.textContent = details.account_type || "-";
tokenUser.textContent = details.user_id || "-";
tokenCompany.textContent = details.company || "-";
tokenCreated.textContent = details.created_at || "-";
tokenLocation.textContent = details.location || "-";
tokenLastUsed.textContent = details.last_used_at || "-";
tokenEmail.textContent = details.email || "-";
tokenExpires.textContent = details.expires_at || "-";
if (details.url) {
tokenUrl.innerHTML = "";
const urlLink = document.createElement("a");
urlLink.href = details.url;
urlLink.target = "_blank";
urlLink.rel = "noreferrer noopener";
urlLink.textContent = details.url;
tokenUrl.appendChild(urlLink);
} else {
tokenUrl.textContent = "-";
}
tokenVersion.textContent =
(data.provider_metadata && data.provider_metadata.version) || "-";
tokenEnterprise.textContent =
data.provider_metadata && typeof data.provider_metadata.enterprise === "boolean"
? data.provider_metadata.enterprise
? "true"
: "false"
: "-";
if (Array.isArray(details.scopes)) {
details.scopes.forEach((scope) => addBadge(tokenScopes, scope, "badge-perm"));
}
tokenContainer.classList.remove("hidden");
}
} else if (type === "resource") {
icon.textContent = "📦";
@ -2582,7 +2737,7 @@
const provider = (data.provider || "unknown").toUpperCase();
cloudField.textContent = provider;
if (data.provider) {
addBadge(meta, provider, data.provider === "gcp" ? "badge-gcp" : "badge-aws");
addBadge(meta, provider, providerBadgeClass(data.provider));
}
if (Array.isArray(data.permissions) && data.permissions.length) {

Binary file not shown.

Before: 6.2 MiB. After: 1.8 MiB.

Binary file not shown.

After: 91 KiB.

Binary file not shown.

Before: 7.1 MiB.

View file

@ -1,11 +1,15 @@
use anyhow::{bail, Result};
use anyhow::Result;
use schemars::JsonSchema;
use serde::Serialize;
use crate::cli::commands::access_map::{AccessMapArgs, AccessMapProvider};
mod aws;
mod azure;
mod azure_devops;
mod gcp;
mod github;
mod gitlab;
mod report;
/// Run the identity mapping workflow for the selected cloud provider.
@ -14,6 +18,8 @@ pub async fn run(args: AccessMapArgs) -> Result<()> {
AccessMapProvider::Gcp => gcp::map_access(args.credential_path.as_deref()).await?,
AccessMapProvider::Aws => aws::map_access(&args).await?,
AccessMapProvider::Azure => azure::map_access(&args).await?,
AccessMapProvider::Github => github::map_access(&args).await?,
AccessMapProvider::Gitlab => gitlab::map_access(&args).await?,
};
let json = serde_json::to_string_pretty(&result)?;
@ -37,6 +43,14 @@ pub enum AccessMapRequest {
Aws { access_key: String, secret_key: String, session_token: Option<String> },
/// A GCP service account JSON document.
Gcp { credential_json: String },
/// An Azure storage account JSON document.
Azure { credential_json: String, containers: Option<Vec<String>> },
/// An Azure DevOps personal access token with organization.
AzureDevops { token: String, organization: String },
/// A GitHub token.
Github { token: String },
/// A GitLab token.
Gitlab { token: String },
}
/// Structured output describing the resolved identity and its risk profile.
@ -62,6 +76,13 @@ pub struct AccessMapResult {
pub recommendations: Vec<String>,
/// Additional risk notes derived from permissions and impersonation exposure.
pub risk_notes: Vec<String>,
/// Optional access token metadata (for GitHub/GitLab).
#[serde(skip_serializing_if = "Option::is_none")]
pub token_details: Option<AccessTokenDetails>,
/// Optional provider metadata (for GitLab instance details, etc.).
#[serde(skip_serializing_if = "Option::is_none")]
pub provider_metadata: Option<ProviderMetadata>,
}
/// Identity details such as email or ARN.
@ -131,6 +152,32 @@ pub enum Severity {
Critical,
}
/// Optional metadata for access tokens.
#[derive(Debug, Serialize, Clone, Default, JsonSchema)]
pub struct AccessTokenDetails {
pub name: Option<String>,
pub username: Option<String>,
pub account_type: Option<String>,
pub company: Option<String>,
pub location: Option<String>,
pub email: Option<String>,
pub url: Option<String>,
pub token_type: Option<String>,
pub created_at: Option<String>,
pub last_used_at: Option<String>,
pub expires_at: Option<String>,
pub user_id: Option<String>,
#[serde(skip_serializing_if = "Vec::is_empty", default)]
pub scopes: Vec<String>,
}
/// Optional metadata about the provider instance.
#[derive(Debug, Serialize, Clone, Default, JsonSchema)]
pub struct ProviderMetadata {
pub version: Option<String>,
pub enterprise: Option<bool>,
}
/// Map a batch of credentials to their effective identities.
pub async fn map_requests(requests: Vec<AccessMapRequest>) -> Vec<AccessMapResult> {
let mut results = Vec::new();
@ -147,6 +194,22 @@ pub async fn map_requests(requests: Vec<AccessMapRequest>) -> Vec<AccessMapResul
.await
.unwrap_or_else(|err| build_failed_result("gcp", "service_account", err))
}
AccessMapRequest::Azure { credential_json, containers } => {
azure::map_access_from_json_with_hints(&credential_json, containers.as_deref())
.await
.unwrap_or_else(|err| build_failed_result("azure", "storage_account", err))
}
AccessMapRequest::AzureDevops { token, organization } => {
azure_devops::map_access_from_token(&token, &organization)
.await
.unwrap_or_else(|err| build_failed_result("azure_devops", "pat", err))
}
AccessMapRequest::Github { token } => github::map_access_from_token(&token)
.await
.unwrap_or_else(|err| build_failed_result("github", "token", err)),
AccessMapRequest::Gitlab { token } => gitlab::map_access_from_token(&token)
.await
.unwrap_or_else(|err| build_failed_result("gitlab", "token", err)),
};
results.push(mapped);
@ -186,6 +249,8 @@ fn build_failed_result(cloud: &str, identity_label: &str, err: anyhow::Error) ->
severity: Severity::Medium,
recommendations: build_recommendations(Severity::Medium),
risk_notes: vec![format!("Identity mapping failed: {err}")],
token_details: None,
provider_metadata: None,
}
}
@ -234,7 +299,7 @@ pub(crate) fn build_recommendations(severity: Severity) -> Vec<String> {
recs
}
/// Fallback handler for unsupported providers.
async fn unsupported_provider(provider: &AccessMapProvider) -> Result<AccessMapResult> {
bail!("Identity mapping for {:?} is not implemented", provider)
}
// /// Fallback handler for unsupported providers.
// async fn unsupported_provider(provider: &AccessMapProvider) -> Result<AccessMapResult> {
// bail!("Identity mapping for {:?} is not implemented", provider)
// }

View file

@ -20,7 +20,7 @@ use crate::cli::commands::access_map::AccessMapArgs;
use super::{
build_default_account_resource, build_recommendations, AccessMapResult, AccessSummary,
PermissionSummary, ResourceExposure, RoleBinding, Severity,
AccessTokenDetails, PermissionSummary, ResourceExposure, RoleBinding, Severity,
};
pub async fn map_access(args: &AccessMapArgs) -> Result<AccessMapResult> {
@ -138,6 +138,22 @@ async fn map_access_with_config(config: SdkConfig) -> Result<AccessMapResult> {
severity,
recommendations,
risk_notes,
token_details: Some(AccessTokenDetails {
name: account_id.clone(),
username: None,
account_type: None,
company: None,
location: None,
email: None,
url: None,
token_type: Some("access_key".into()),
created_at: None,
last_used_at: None,
expires_at: None,
user_id: Some(arn.clone()),
scopes: Vec::new(),
}),
provider_metadata: None,
})
}

View file

@ -1,9 +1,220 @@
use anyhow::Result;
use anyhow::{anyhow, Context, Result};
use base64::{engine::general_purpose::STANDARD as b64, Engine as _};
use chrono::Utc;
use hmac::{Hmac, Mac};
use quick_xml::{events::Event, Reader};
use reqwest::{header::HeaderValue, Client};
use serde_json::Value as JsonValue;
use sha2::Sha256;
use crate::cli::commands::access_map::AccessMapArgs;
use super::AccessMapResult;
use super::{
build_recommendations, AccessMapResult, AccessSummary, PermissionSummary, ResourceExposure,
RoleBinding, Severity,
};
pub async fn map_access(args: &AccessMapArgs) -> Result<AccessMapResult> {
super::unsupported_provider(&args.provider).await
let path = args
.credential_path
.as_deref()
.ok_or_else(|| anyhow!("Azure access-map requires a credential JSON path"))?;
let data = std::fs::read_to_string(path).context("Failed to read credential file")?;
map_access_from_json(&data).await
}
pub async fn map_access_from_json(data: &str) -> Result<AccessMapResult> {
map_access_from_json_with_hints(data, None).await
}
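/// Map an Azure storage account key to its exposure. A non-empty `containers_hint`
/// is used as the container list; otherwise containers are enumerated via the
/// Blob service, and an enumeration failure is recorded as a risk note.
pub async fn map_access_from_json_with_hints(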
data: &str,
containers_hint: Option<&[String]>,
) -> Result<AccessMapResult> {
let (storage_account, storage_key) = parse_storage_credentials(data)?;
let mut risk_notes =
vec!["Storage account keys grant full control over the storage account".to_string()];
let containers = match containers_hint {
Some(list) if !list.is_empty() => list.to_vec(),
_ => match list_containers(&storage_account, &storage_key).await {
Ok(list) => list,
Err(err) => {
risk_notes.push(format!("Container enumeration failed: {err}"));
Vec::new()
}
},
};
let severity = Severity::Critical;
let permissions =
PermissionSummary { admin: vec!["storage:*".into()], ..PermissionSummary::default() };
let roles = vec![RoleBinding {
name: "storage_account_key".into(),
source: "shared_key".into(),
permissions: vec!["storage:*".into()],
}];
let mut resources = Vec::new();
resources.push(ResourceExposure {
resource_type: "storage_account".into(),
name: storage_account.clone(),
permissions: vec!["storage:*".into()],
risk: "critical".into(),
reason: "Storage account accessible with shared key".into(),
});
if containers.is_empty() {
resources.push(ResourceExposure {
resource_type: "storage_container".into(),
name: String::new(),
permissions: vec!["storage:*".into()],
risk: "critical".into(),
reason: "Container list unavailable; storage account key still grants full access"
.into(),
});
} else {
for container in containers {
resources.push(ResourceExposure {
resource_type: "storage_container".into(),
name: container,
permissions: vec!["storage:*".into()],
risk: "critical".into(),
reason: "Container accessible with shared key".into(),
});
}
}
let identity = AccessSummary {
id: storage_account,
access_type: "storage_account_key".into(),
project: None,
tenant: None,
account_id: None,
};
Ok(AccessMapResult {
cloud: "azure".into(),
identity,
roles,
permissions,
resources,
severity,
recommendations: build_recommendations(severity),
risk_notes,
token_details: None,
provider_metadata: None,
})
}
fn parse_storage_credentials(data: &str) -> Result<(String, String)> {
let token: JsonValue = serde_json::from_str(data)?;
let storage_account = token["storage_account"]
.as_str()
.ok_or_else(|| anyhow!("Missing storage_account in credential JSON"))?
.to_string();
let storage_key = token["storage_key"]
.as_str()
.ok_or_else(|| anyhow!("Missing storage_key in credential JSON"))?
.to_string();
Ok((storage_account, storage_key))
}
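/// List blob containers using Shared Key authorization, following `NextMarker`
/// pagination until no further marker is returned.
async fn list_containers(storage_account: &str, storage_key: &str) -> Result<Vec<String>> {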
let mut containers = std::collections::BTreeSet::new();
let mut marker: Option<String> = None;
loop {
let now_rfc = Utc::now().format("%a, %d %b %Y %H:%M:%S GMT").to_string();
let mut url = reqwest::Url::parse(&format!(
"https://{account}.blob.core.windows.net/",
account = storage_account
))?;
{
let mut query = url.query_pairs_mut();
query.append_pair("comp", "list");
if let Some(marker_value) = marker.as_deref() {
query.append_pair("marker", marker_value);
}
}
let canon_headers = format!("x-ms-date:{now_rfc}\nx-ms-version:2023-11-03\n");
let mut canon_resource = format!("/{account}/\ncomp:list", account = storage_account);
if let Some(marker_value) = marker.as_deref() {
canon_resource.push_str(&format!("\nmarker:{marker_value}"));
}
let string_to_sign = format!(
"GET\n\n\n\n\n\n\n\n\n\n\n\n{headers}{resource}",
headers = canon_headers,
resource = canon_resource
);
let key_bytes = b64.decode(storage_key)?;
let mut mac = Hmac::<Sha256>::new_from_slice(&key_bytes)
.map_err(|_| anyhow!("invalid key length"))?;
mac.update(string_to_sign.as_bytes());
let signature = b64.encode(mac.finalize().into_bytes());
let mut headers = reqwest::header::HeaderMap::new();
headers.insert("x-ms-date", HeaderValue::from_str(&now_rfc)?);
headers.insert("x-ms-version", HeaderValue::from_static("2023-11-03"));
headers.insert(
"Authorization",
HeaderValue::from_str(&format!(
"SharedKey {account}:{sig}",
account = storage_account,
sig = signature
))?,
);
let client = Client::builder().build()?;
let resp = client.get(url).headers(headers).send().await?;
let status = resp.status();
let body_txt = resp.text().await?;
if !status.is_success() {
return Err(anyhow!(
"Azure Storage list containers failed (HTTP {}): {}",
status,
body_txt
));
}
let mut reader = Reader::from_str(&body_txt);
reader.config_mut().trim_text(true);
let mut buf = Vec::new();
let mut next_marker: Option<String> = None;
loop {
match reader.read_event_into(&mut buf) {
Ok(Event::Eof) => break,
Ok(Event::Start(e)) if e.name().as_ref().eq_ignore_ascii_case(b"name") => {
let text = reader.read_text(e.name())?;
let name = text.into_owned();
if !name.is_empty() {
containers.insert(name);
}
}
Ok(Event::Start(e)) if e.name().as_ref().eq_ignore_ascii_case(b"nextmarker") => {
let text = reader.read_text(e.name())?;
let value = text.into_owned();
if !value.trim().is_empty() {
next_marker = Some(value);
}
}
Err(e) => return Err(anyhow!("XML parse error: {e}")),
_ => {}
}
buf.clear();
}
if next_marker.is_none() {
break;
}
marker = next_marker;
}
Ok(containers.into_iter().collect())
}
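
The least obvious part of `list_containers` is the string that gets signed: the verb, eleven standard header slots left empty, the canonicalized `x-ms-*` headers, and the canonicalized resource with its query parameters. The sketch below rebuilds only that string with the standard library so the layout is easier to inspect; `list_containers_string_to_sign` is a hypothetical helper, and the HMAC-SHA256 signing and Base64 steps are deliberately omitted.

```rust
/// Builds the Shared Key string-to-sign for a "List Containers" GET request,
/// matching the layout used in `list_containers`. Signing is not performed here.
fn list_containers_string_to_sign(
    account: &str,
    date_rfc1123: &str,
    marker: Option<&str>,
) -> String {
    // Verb line, then eleven empty standard headers (Content-Encoding through
    // Range), so "GET" is followed by twelve newlines in total.
    let mut s = String::from("GET\n");
    s.push_str(&"\n".repeat(11));
    // Canonicalized x-ms-* headers, sorted by name, each newline-terminated.
    s.push_str(&format!("x-ms-date:{}\nx-ms-version:2023-11-03\n", date_rfc1123));
    // Canonicalized resource: "/account/" followed by query parameters sorted by name.
    s.push_str(&format!("/{}/\ncomp:list", account));
    if let Some(m) = marker {
        s.push_str(&format!("\nmarker:{}", m));
    }
    s
}
```

The result would be signed with HMAC-SHA256 using the Base64-decoded account key and sent as `Authorization: SharedKey <account>:<signature>`, as the function above does.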

View file

@ -0,0 +1,619 @@
use anyhow::{anyhow, Context, Result};
use base64::{engine::general_purpose::STANDARD as b64, Engine as _};
use reqwest::{header, Client, Url};
use serde::Deserialize;
use tracing::warn;
use crate::validation::GLOBAL_USER_AGENT;
use super::{
build_recommendations, AccessMapResult, AccessSummary, AccessTokenDetails, PermissionSummary,
ResourceExposure, RoleBinding, Severity,
};
const AZURE_DEVOPS_PROFILE_API: &str =
"https://app.vssps.visualstudio.com/_apis/profile/profiles/me?api-version=7.1-preview.1";
const AZURE_DEVOPS_API_VERSION: &str = "7.1-preview.1";
const AZURE_DEVOPS_TOKEN_ADMIN_VERSION: &str = "7.1";
#[derive(Deserialize)]
struct AzureDevopsProfile {
#[serde(rename = "displayName")]
display_name: Option<String>,
#[serde(rename = "publicAlias")]
public_alias: Option<String>,
#[serde(rename = "emailAddress")]
email_address: Option<String>,
id: Option<String>,
}
#[derive(Deserialize)]
struct AzureDevopsProject {
name: String,
#[serde(default)]
visibility: Option<String>,
#[serde(default)]
_state: Option<String>,
}
#[derive(Deserialize)]
struct AzureDevopsRepo {
name: String,
#[serde(rename = "isDisabled", default)]
is_disabled: bool,
#[serde(default)]
project: AzureDevopsProjectRef,
}
#[derive(Deserialize, Default)]
struct AzureDevopsProjectRef {
name: Option<String>,
}
#[derive(Deserialize)]
struct AzureDevopsListResponse<T> {
value: Vec<T>,
}
#[derive(Deserialize)]
struct AzureDevopsIdentity {
#[serde(rename = "subjectDescriptor")]
subject_descriptor: Option<String>,
}
#[derive(Clone, Deserialize)]
struct AzureDevopsPat {
#[serde(rename = "displayName")]
display_name: Option<String>,
#[serde(rename = "validFrom")]
valid_from: Option<String>,
#[serde(rename = "validTo")]
valid_to: Option<String>,
#[serde(rename = "userId")]
user_id: Option<String>,
scope: Option<String>,
}
pub async fn map_access_from_token(token: &str, organization: &str) -> Result<AccessMapResult> {
let org = normalize_org(organization);
if org.is_empty() {
return Err(anyhow!("Azure DevOps access-map requires a valid organization name"));
}
let client = Client::builder()
.user_agent(GLOBAL_USER_AGENT.as_str())
.build()
.context("Failed to build Azure DevOps HTTP client")?;
let auth_header = build_auth_header(token)?;
let (profile, scopes, user_data) = match fetch_profile(&client, auth_header.clone()).await {
Ok(value) => value,
Err(err) => {
warn!("Azure DevOps access-map: profile lookup failed: {err}");
(
AzureDevopsProfile {
display_name: None,
public_alias: None,
email_address: None,
id: None,
},
Vec::new(),
AzureDevopsUserData::default(),
)
}
};
let pat_details =
fetch_pat_details(&client, &org, auth_header.clone(), &profile, &scopes).await;
let projects = list_projects(&client, &org, auth_header.clone()).await?;
let repos = list_repositories(&client, &org, auth_header.clone(), &projects).await?;
let identity_id = profile
.email_address
.clone()
.or_else(|| user_data.email.clone())
.or(profile.public_alias.clone())
.or(profile.display_name.clone())
.or(profile.id.clone())
.or_else(|| user_data.user_id.clone())
.unwrap_or_else(|| "azure_devops_user".to_string());
let identity = AccessSummary {
id: identity_id,
access_type: "pat".into(),
project: Some(org.clone()),
tenant: None,
account_id: None,
};
let mut resources = Vec::new();
let mut permissions = PermissionSummary::default();
let mut risk_notes = Vec::new();
let mut seen_repos = std::collections::BTreeSet::new();
for repo in &repos {
let risk = if repo.is_disabled { Severity::Low } else { Severity::Medium };
let reason = if repo.is_disabled {
"Repository is disabled but visible to the token".to_string()
} else {
"Accessible Azure DevOps repository".to_string()
};
let mut repo_permissions = Vec::new();
repo_permissions.push("repo:read".to_string());
permissions.read_only.push("repo:read".to_string());
let repo_name = match repo.project.name.as_deref() {
Some(project_name) if !project_name.is_empty() => {
format!("{}/{}", project_name, repo.name)
}
_ => repo.name.clone(),
};
if !seen_repos.insert(repo_name.clone()) {
continue;
}
resources.push(ResourceExposure {
resource_type: "repository".into(),
name: repo_name,
permissions: repo_permissions,
risk: severity_to_str(risk).to_string(),
reason,
});
}
permissions.read_only.sort();
permissions.read_only.dedup();
let severity = derive_severity(&projects, &repos);
let mut roles = Vec::new();
if !scopes.is_empty() {
roles.push(RoleBinding {
name: "token_scopes".into(),
source: "azure_devops".into(),
permissions: scopes.clone(),
});
}
if repos.is_empty() {
for project in &projects {
let is_private = project
.visibility
.as_deref()
.is_some_and(|v| v.eq_ignore_ascii_case("private"));
let risk = if is_private { Severity::Medium } else { Severity::Low };
let reason = if is_private {
"Accessible private Azure DevOps project".to_string()
} else {
"Accessible public Azure DevOps project".to_string()
};
resources.push(ResourceExposure {
resource_type: "project".into(),
name: project.name.clone(),
permissions: vec!["project:read".to_string()],
risk: severity_to_str(risk).to_string(),
reason,
});
}
if projects.is_empty() {
resources.push(ResourceExposure {
resource_type: "organization".into(),
name: org.clone(),
permissions: Vec::new(),
risk: severity_to_str(Severity::Low).to_string(),
reason: "Azure DevOps organization associated with the token".into(),
});
}
risk_notes.push("Token did not enumerate any repositories".into());
}
if roles.is_empty() {
risk_notes
.push("Azure DevOps did not report PAT scopes; review the token permissions".into());
}
let pat_scopes =
pat_details.as_ref().map(|pat| parse_pat_scopes(pat.scope.as_deref())).unwrap_or_default();
let token_scopes = if scopes.is_empty() { pat_scopes.clone() } else { scopes.clone() };
Ok(AccessMapResult {
cloud: "azure_devops".into(),
identity,
roles,
permissions,
resources,
severity,
recommendations: build_recommendations(severity),
risk_notes,
token_details: Some(AccessTokenDetails {
name: pat_details
.as_ref()
.and_then(|pat| pat.display_name.clone())
.filter(|value| !value.trim().is_empty())
.or_else(|| {
profile
.display_name
.clone()
.or(profile.public_alias.clone())
.filter(|value| !value.trim().is_empty())
}),
username: profile.public_alias.clone().filter(|value| !value.trim().is_empty()),
account_type: None,
company: None,
location: None,
email: profile.email_address.clone().filter(|value| !value.trim().is_empty()),
url: None,
token_type: Some("pat".into()),
created_at: pat_details.as_ref().and_then(|pat| pat.valid_from.clone()),
last_used_at: None,
expires_at: pat_details.as_ref().and_then(|pat| pat.valid_to.clone()),
user_id: pat_details
.as_ref()
.and_then(|pat| pat.user_id.clone())
.or(profile.id.clone())
.or_else(|| user_data.user_id.clone())
.or(profile.email_address.clone())
.or(profile.public_alias.clone()),
scopes: token_scopes,
}),
provider_metadata: None,
})
}
#[derive(Default)]
struct AzureDevopsUserData {
user_id: Option<String>,
email: Option<String>,
}
async fn fetch_profile(
client: &Client,
auth_header: header::HeaderValue,
) -> Result<(AzureDevopsProfile, Vec<String>, AzureDevopsUserData)> {
let profile_url = Url::parse(AZURE_DEVOPS_PROFILE_API).expect("valid Azure DevOps profile URL");
let resp = client
.get(profile_url)
.header(header::AUTHORIZATION, auth_header)
.send()
.await
.context("Azure DevOps access-map: failed to fetch user profile")?;
if !resp.status().is_success() {
return Err(anyhow!(
"Azure DevOps access-map: profile lookup failed with HTTP {}",
resp.status()
));
}
let scopes = resp
.headers()
.get("x-vss-token-scopes")
.and_then(|val| val.to_str().ok())
.map(|value| {
value
.split(',')
.map(|scope| scope.trim().to_string())
.filter(|scope| !scope.is_empty())
.collect::<Vec<_>>()
})
.unwrap_or_default();
let user_data = parse_user_data(resp.headers().get("x-vss-userdata"));
let profile = resp.json().await.context("Azure DevOps access-map: invalid profile JSON")?;
Ok((profile, scopes, user_data))
}
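/// Reduce an organization name or URL (for example `https://dev.azure.com/acme/`)
/// to its final path segment.
fn normalize_org(raw: &str) -> String {
raw.trim().trim_matches('/').split('/').next_back().unwrap_or("").trim().to_string()
}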
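/// Build the HTTP Basic `Authorization` value for a PAT: an empty username with
/// the token as the password.
fn build_auth_header(token: &str) -> Result<header::HeaderValue> {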
let encoded = b64.encode(format!(":{token}"));
header::HeaderValue::from_str(&format!("Basic {encoded}"))
.context("Failed to build Azure DevOps auth header")
}
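/// Parse the `x-vss-userdata` header, treated here as `<user id>:<email>`;
/// either part may be absent.
fn parse_user_data(value: Option<&header::HeaderValue>) -> AzureDevopsUserData {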
let Some(value) = value.and_then(|val| val.to_str().ok()) else {
return AzureDevopsUserData::default();
};
let mut parts = value.splitn(2, ':');
let user_id = parts.next().map(|item| item.trim().to_string());
let email = parts.next().map(|item| item.trim().to_string());
AzureDevopsUserData {
user_id: user_id.filter(|item| !item.is_empty()),
email: email.filter(|item| !item.is_empty()),
}
}
async fn fetch_pat_details(
client: &Client,
organization: &str,
auth_header: header::HeaderValue,
profile: &AzureDevopsProfile,
scopes: &[String],
) -> Option<AzureDevopsPat> {
let subject_descriptor =
fetch_subject_descriptor(client, organization, auth_header.clone(), profile).await?;
let mut url = Url::parse(&format!(
"https://vssps.dev.azure.com/{organization}/_apis/tokenadmin/personalaccesstokens/"
))
.ok()?;
url.path_segments_mut().ok()?.push(&subject_descriptor);
url.query_pairs_mut().append_pair("api-version", AZURE_DEVOPS_TOKEN_ADMIN_VERSION);
let resp = client
.get(url)
.header(header::ACCEPT, "application/json")
.header(header::AUTHORIZATION, auth_header)
.send()
.await
.ok()?;
if !resp.status().is_success() {
return None;
}
let payload: AzureDevopsListResponse<AzureDevopsPat> = resp.json().await.ok()?;
select_matching_pat(&payload.value, scopes, profile.id.as_deref())
}
async fn fetch_subject_descriptor(
client: &Client,
organization: &str,
auth_header: header::HeaderValue,
profile: &AzureDevopsProfile,
) -> Option<String> {
let mut attempts: Vec<(Option<&str>, Option<&str>)> = Vec::new();
if let Some(identity_id) = profile.id.as_deref().filter(|value| !value.trim().is_empty()) {
attempts.push((Some(identity_id), None));
}
if let Some(email) = profile.email_address.as_deref().filter(|value| !value.trim().is_empty()) {
attempts.push((None, Some(email)));
}
if let Some(alias) = profile.public_alias.as_deref().filter(|value| !value.trim().is_empty()) {
attempts.push((None, Some(alias)));
}
if let Some(display_name) =
profile.display_name.as_deref().filter(|value| !value.trim().is_empty())
{
attempts.push((None, Some(display_name)));
}
for (identity_id, search_value) in attempts {
let mut url =
Url::parse(&format!("https://vssps.dev.azure.com/{organization}/_apis/identities"))
.ok()?;
url.query_pairs_mut()
.append_pair("api-version", AZURE_DEVOPS_TOKEN_ADMIN_VERSION)
.append_pair("queryMembership", "None");
if let Some(identity_id) = identity_id {
url.query_pairs_mut().append_pair("identityIds", identity_id);
} else if let Some(search_value) = search_value {
url.query_pairs_mut()
.append_pair("searchFilter", "General")
.append_pair("filterValue", search_value);
}
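let Ok(resp) = client
.get(url)
.header(header::ACCEPT, "application/json")
.header(header::AUTHORIZATION, auth_header.clone())
.send()
.await
else {
// A transport error on one lookup should not block the remaining attempts.
continue;
};
if !resp.status().is_success() {
continue;
}
// Likewise, an unparseable body only rules out this attempt.
let Ok(payload) = resp.json::<AzureDevopsListResponse<AzureDevopsIdentity>>().await else {
continue;
};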
if let Some(descriptor) = payload
.value
.into_iter()
.filter_map(|identity| identity.subject_descriptor)
.find(|value| !value.trim().is_empty())
{
return Some(descriptor);
}
}
None
}
fn parse_pat_scopes(scope: Option<&str>) -> Vec<String> {
scope
.map(|value| {
value
.split_whitespace()
.map(|entry| entry.trim().to_string())
.filter(|entry| !entry.is_empty())
.collect::<Vec<_>>()
})
.unwrap_or_default()
}
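/// Choose the PAT record that most plausibly corresponds to the token in use:
/// keep records for `user_id` (or with no user id), prefer those whose scopes
/// cover `scopes`, then take the one with the latest `valid_from`.
fn select_matching_pat(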
pats: &[AzureDevopsPat],
scopes: &[String],
user_id: Option<&str>,
) -> Option<AzureDevopsPat> {
if pats.is_empty() {
return None;
}
let mut candidates: Vec<&AzureDevopsPat> = pats
.iter()
.filter(|pat| {
if let Some(user_id) = user_id {
if let Some(pat_user_id) = pat.user_id.as_deref() {
return pat_user_id == user_id;
}
}
true
})
.collect();
let mut desired_scopes = scopes.to_vec();
desired_scopes.sort();
desired_scopes.dedup();
if !desired_scopes.is_empty() {
let scope_matches: Vec<&AzureDevopsPat> = candidates
.iter()
.copied()
.filter(|pat| {
let mut pat_scopes = parse_pat_scopes(pat.scope.as_deref());
pat_scopes.sort();
pat_scopes.dedup();
if pat_scopes.is_empty() {
return false;
}
pat_scopes == desired_scopes
|| desired_scopes.iter().all(|scope| pat_scopes.contains(scope))
})
.collect();
if !scope_matches.is_empty() {
candidates = scope_matches;
}
}
candidates.into_iter().max_by_key(|pat| pat.valid_from.as_deref().unwrap_or_default()).cloned()
}
async fn list_repositories(
client: &Client,
organization: &str,
auth_header: header::HeaderValue,
projects: &[AzureDevopsProject],
) -> Result<Vec<AzureDevopsRepo>> {
let url = format!(
"https://dev.azure.com/{organization}/_apis/git/repositories?api-version={AZURE_DEVOPS_API_VERSION}"
);
let resp = client
.get(url)
.header(header::ACCEPT, "application/json")
.header(header::AUTHORIZATION, auth_header.clone())
.send()
.await
.context("Azure DevOps access-map: failed to list repositories")?;
let mut repos = if resp.status().is_success() {
let payload: AzureDevopsListResponse<AzureDevopsRepo> =
resp.json().await.context("Azure DevOps access-map: invalid repo JSON")?;
payload.value
} else {
warn!("Azure DevOps access-map: repository enumeration failed with HTTP {}", resp.status());
Vec::new()
};
if !repos.is_empty() || projects.is_empty() {
return Ok(repos);
}
for project in projects {
let project_name = project.name.trim();
if project_name.is_empty() {
continue;
}
let mut project_repos =
list_project_repositories(client, organization, project_name, auth_header.clone())
.await
.unwrap_or_else(|err| {
warn!(
"Azure DevOps access-map: project repo enumeration failed for {project_name}: {err}"
);
Vec::new()
});
repos.append(&mut project_repos);
}
Ok(repos)
}
async fn list_project_repositories(
client: &Client,
organization: &str,
project: &str,
auth_header: header::HeaderValue,
) -> Result<Vec<AzureDevopsRepo>> {
let url = format!(
"https://dev.azure.com/{organization}/{project}/_apis/git/repositories?api-version={AZURE_DEVOPS_API_VERSION}"
);
let resp = client
.get(url)
.header(header::ACCEPT, "application/json")
.header(header::AUTHORIZATION, auth_header)
.send()
.await
.context("Azure DevOps access-map: failed to list project repositories")?;
if !resp.status().is_success() {
return Err(anyhow!(
"Azure DevOps access-map: project repository enumeration failed with HTTP {}",
resp.status()
));
}
let payload: AzureDevopsListResponse<AzureDevopsRepo> =
resp.json().await.context("Azure DevOps access-map: invalid repo JSON")?;
Ok(payload.value)
}
async fn list_projects(
client: &Client,
organization: &str,
auth_header: header::HeaderValue,
) -> Result<Vec<AzureDevopsProject>> {
let url = format!(
"https://dev.azure.com/{organization}/_apis/projects?api-version={AZURE_DEVOPS_API_VERSION}"
);
let resp = client
.get(url)
.header(header::ACCEPT, "application/json")
.header(header::AUTHORIZATION, auth_header)
.send()
.await
.context("Azure DevOps access-map: failed to list projects")?;
if !resp.status().is_success() {
warn!("Azure DevOps access-map: project enumeration failed with HTTP {}", resp.status());
return Ok(Vec::new());
}
let payload: AzureDevopsListResponse<AzureDevopsProject> =
resp.json().await.context("Azure DevOps access-map: invalid project JSON")?;
Ok(payload.value)
}
fn derive_severity(projects: &[AzureDevopsProject], repos: &[AzureDevopsRepo]) -> Severity {
if !repos.is_empty()
|| projects.iter().any(|project| {
project.visibility.as_deref().is_some_and(|v| v.eq_ignore_ascii_case("private"))
})
{
Severity::Medium
} else {
Severity::Low
}
}
fn severity_to_str(severity: Severity) -> &'static str {
match severity {
Severity::Low => "low",
Severity::Medium => "medium",
Severity::High => "high",
Severity::Critical => "critical",
}
}
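
The PAT selection performed by `select_matching_pat` above can be exercised in isolation. The following is a minimal sketch under stated assumptions, not the project's code: `Pat`, `split_scopes`, and `pick_pat` are hypothetical names, and `Pat` keeps only the three fields the selection reads.

```rust
// Illustrative sketch only. `Pat` is a simplified, hypothetical stand-in for the
// `AzureDevopsPat` type in the diff; `pick_pat` follows the same selection order
// as `select_matching_pat` but is not the project's implementation.

#[derive(Clone, Debug, PartialEq)]
pub struct Pat {
    pub user_id: Option<String>,
    /// Whitespace-separated scopes, e.g. "vso.code vso.build".
    pub scope: Option<String>,
    /// Timestamp string; ISO 8601 values sort correctly as plain strings.
    pub valid_from: Option<String>,
}

/// Split a whitespace-separated scope string into a sorted, de-duplicated list.
pub fn split_scopes(scope: Option<&str>) -> Vec<String> {
    let mut scopes: Vec<String> =
        scope.unwrap_or_default().split_whitespace().map(str::to_string).collect();
    scopes.sort();
    scopes.dedup();
    scopes
}

pub fn pick_pat(pats: &[Pat], desired: &[String], user_id: Option<&str>) -> Option<Pat> {
    // Step 1: drop PATs that name a different user; PATs without a user id stay.
    let mut candidates: Vec<&Pat> = pats
        .iter()
        .filter(|pat| match (user_id, pat.user_id.as_deref()) {
            (Some(wanted), Some(actual)) => wanted == actual,
            _ => true,
        })
        .collect();

    // Step 2: when the token's scopes are known, prefer PATs that cover all of
    // them. If none do, fall back to the unfiltered candidates.
    if !desired.is_empty() {
        let covering: Vec<&Pat> = candidates
            .iter()
            .copied()
            .filter(|pat| {
                let have = split_scopes(pat.scope.as_deref());
                !have.is_empty() && desired.iter().all(|scope| have.contains(scope))
            })
            .collect();
        if !covering.is_empty() {
            candidates = covering;
        }
    }

    // Step 3: the most recent `valid_from` wins; a missing value sorts first.
    candidates.into_iter().max_by(|a, b| a.valid_from.cmp(&b.valid_from)).cloned()
}
```

Because the match is heuristic (several PATs can share a user and a scope set), the selected record is best read as a likely candidate rather than a confirmed identification of the presented token.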

View file

@ -199,6 +199,8 @@ pub async fn map_access_from_json(data: &str) -> Result<AccessMapResult> {
severity,
recommendations,
risk_notes,
token_details: None,
provider_metadata: None,
})
}

420
src/access_map/github.rs Normal file
View file

@ -0,0 +1,420 @@
use anyhow::{anyhow, Context, Result};
use reqwest::{header, Client, Url};
use serde::Deserialize;
use tracing::warn;
use crate::{cli::commands::access_map::AccessMapArgs, validation::GLOBAL_USER_AGENT};
use super::{
build_recommendations, AccessMapResult, AccessSummary, AccessTokenDetails, PermissionSummary,
ResourceExposure, RoleBinding, Severity,
};
const DEFAULT_GITHUB_API: &str = "https://api.github.com";
#[derive(Deserialize)]
struct GitHubUser {
login: String,
id: u64,
#[serde(default)]
name: Option<String>,
#[serde(default)]
email: Option<String>,
#[serde(default)]
company: Option<String>,
#[serde(default)]
location: Option<String>,
#[serde(default)]
html_url: Option<String>,
#[serde(default)]
r#type: String,
}
#[derive(Deserialize)]
struct GitHubRepo {
full_name: String,
private: bool,
permissions: Option<GitHubRepoPermissions>,
}
#[derive(Deserialize)]
struct GitHubOrg {
login: String,
}
#[derive(Deserialize)]
struct GitHubOrgMembership {
organization: GitHubOrg,
#[serde(default)]
role: String,
#[serde(default)]
state: String,
}
#[derive(Clone, Deserialize)]
struct GitHubRepoPermissions {
admin: bool,
push: bool,
pull: bool,
}
/// Read a GitHub token from `args.credential_path` and map the access it grants.
pub async fn map_access(args: &AccessMapArgs) -> Result<AccessMapResult> {
let token = if let Some(path) = args.credential_path.as_deref() {
let raw = std::fs::read_to_string(path)
.with_context(|| format!("Failed to read GitHub token from {}", path.display()))?;
raw.trim().to_string()
} else {
return Err(anyhow!("GitHub access-map requires a validated token from scan results"));
};
map_access_from_token(&token).await
}
pub async fn map_access_from_token(token: &str) -> Result<AccessMapResult> {
let api_url = Url::parse(DEFAULT_GITHUB_API).expect("valid GitHub API URL");
let client = Client::builder()
.user_agent(GLOBAL_USER_AGENT.as_str())
.build()
.context("Failed to build GitHub HTTP client")?;
let user_resp = client
.get(api_url.join("user")?)
.header(header::AUTHORIZATION, format!("token {token}"))
.header(header::ACCEPT, "application/vnd.github+json")
.send()
.await
.context("GitHub access-map: failed to fetch user info")?;
if !user_resp.status().is_success() {
return Err(anyhow!(
"GitHub access-map: user lookup failed with HTTP {}",
user_resp.status()
));
}
let oauth_scopes = parse_csv_header(user_resp.headers().get("x-oauth-scopes"));
let token_expiration = user_resp
.headers()
.get("github-authentication-token-expiration")
.and_then(|val| val.to_str().ok())
.map(|value| value.trim().to_string())
.filter(|value| !value.is_empty());
let token_type = user_resp
.headers()
.get("github-authentication-token-type")
.and_then(|val| val.to_str().ok())
.map(|value| value.trim().to_string())
.filter(|value| !value.is_empty());
let user: GitHubUser =
user_resp.json().await.context("GitHub access-map: invalid user JSON")?;
let identity = AccessSummary {
id: user.login.clone(),
access_type: if user.r#type.is_empty() {
"user".into()
} else {
user.r#type.to_lowercase()
},
project: None,
tenant: None,
account_id: None,
};
let repos = list_accessible_repos(&client, &api_url, token).await?;
let mut risk_notes = Vec::new();
let mut resources = Vec::new();
let mut permissions = PermissionSummary::default();
let org_scopes = org_scopes(&oauth_scopes);
let org_memberships =
list_org_memberships(&client, &api_url, token).await.unwrap_or_else(|err| {
warn!("GitHub access-map: org membership lookup failed: {err}");
Vec::new()
});
for membership in org_memberships.into_iter().filter(|m| m.state == "active") {
let mut org_permissions = org_scopes.clone();
if !membership.role.trim().is_empty() {
org_permissions.push(format!("org_role:{}", membership.role.trim()));
}
org_permissions.sort();
org_permissions.dedup();
if org_permissions.is_empty() {
continue;
}
let risk = if org_permissions.iter().any(|perm| perm.contains("admin")) {
Severity::High
} else if org_permissions.iter().any(|perm| perm.contains("write")) {
Severity::Medium
} else {
Severity::Low
};
resources.push(ResourceExposure {
resource_type: "organization".into(),
name: membership.organization.login,
permissions: org_permissions.clone(),
risk: severity_to_str(risk).to_string(),
reason: "Organization membership available to the token".into(),
});
}
for repo in &repos {
let perms = repo.permissions.clone().unwrap_or(GitHubRepoPermissions {
admin: false,
push: false,
pull: true,
});
let mut repo_perms = Vec::new();
if perms.admin {
repo_perms.push("repo:admin".to_string());
}
if perms.push {
repo_perms.push("repo:write".to_string());
}
if perms.pull {
repo_perms.push("repo:read".to_string());
}
let risk = if perms.admin {
Severity::High
} else if perms.push {
Severity::Medium
} else {
Severity::Low
};
let reason = if repo.private {
"Accessible private repository".to_string()
} else {
"Accessible public repository".to_string()
};
resources.push(ResourceExposure {
resource_type: "repository".into(),
name: repo.full_name.clone(),
permissions: repo_perms.clone(),
risk: severity_to_str(risk).to_string(),
reason,
});
if perms.admin {
permissions.admin.push("repo:admin".to_string());
} else if perms.push {
permissions.risky.push("repo:write".to_string());
} else if perms.pull {
permissions.read_only.push("repo:read".to_string());
}
}
permissions.admin.sort();
permissions.admin.dedup();
permissions.risky.sort();
permissions.risky.dedup();
permissions.read_only.sort();
permissions.read_only.dedup();
let severity = derive_severity(&repos);
let mut roles = Vec::new();
if !oauth_scopes.is_empty() {
roles.push(RoleBinding {
name: "token_scopes".into(),
source: "github".into(),
permissions: oauth_scopes.clone(),
});
}
if repos.is_empty() {
resources.push(ResourceExposure {
resource_type: "account".into(),
name: user.login.clone(),
permissions: Vec::new(),
risk: severity_to_str(Severity::Low).to_string(),
reason: "GitHub account associated with the token".into(),
});
risk_notes.push("Token did not enumerate any repositories".into());
}
if roles.is_empty() {
risk_notes.push(
"GitHub did not report OAuth scopes; fine-grained tokens may omit scope headers".into(),
);
}
let user_display_name = user
.name
.clone()
.filter(|value| !value.trim().is_empty())
.or_else(|| Some(user.login.clone()));
let user_identifier = if let Some(email) = user.email.as_ref().filter(|v| !v.trim().is_empty())
{
format!("{} ({email})", user.login)
} else {
user.login.clone()
};
Ok(AccessMapResult {
cloud: "github".into(),
identity,
roles,
permissions,
resources,
severity,
recommendations: build_recommendations(severity),
risk_notes,
token_details: Some(AccessTokenDetails {
name: user_display_name,
username: Some(user.login.clone()),
account_type: Some(user.r#type.clone()).filter(|value| !value.trim().is_empty()),
company: user.company.clone().filter(|value| !value.trim().is_empty()),
location: user.location.clone().filter(|value| !value.trim().is_empty()),
email: user.email.clone().filter(|value| !value.trim().is_empty()),
url: user.html_url.clone().filter(|value| !value.trim().is_empty()),
token_type,
created_at: None,
last_used_at: None,
expires_at: token_expiration,
user_id: Some(user_identifier),
scopes: oauth_scopes.clone(),
}),
provider_metadata: None,
})
}
fn parse_csv_header(value: Option<&header::HeaderValue>) -> Vec<String> {
value
.and_then(|val| val.to_str().ok())
.map(|scopes| {
scopes
.split(',')
.map(|scope| scope.trim().to_string())
.filter(|scope| !scope.is_empty())
.collect::<Vec<_>>()
})
.unwrap_or_default()
}
async fn list_accessible_repos(
client: &Client,
api_url: &Url,
token: &str,
) -> Result<Vec<GitHubRepo>> {
let mut repos = Vec::new();
let mut page = 1u32;
let per_page = 100u32;
loop {
let mut url = api_url.join("user/repos")?;
url.query_pairs_mut()
.append_pair("per_page", &per_page.to_string())
.append_pair("page", &page.to_string());
let resp = client
.get(url)
.header(header::AUTHORIZATION, format!("token {token}"))
.header(header::ACCEPT, "application/vnd.github+json")
.send()
.await
.context("GitHub access-map: failed to list repositories")?;
if !resp.status().is_success() {
warn!("GitHub access-map: repo enumeration failed with HTTP {}", resp.status());
break;
}
let mut page_repos: Vec<GitHubRepo> =
resp.json().await.context("GitHub access-map: invalid repository JSON")?;
let count = page_repos.len();
repos.append(&mut page_repos);
if count < per_page as usize {
break;
}
page += 1;
}
Ok(repos)
}
async fn list_org_memberships(
client: &Client,
api_url: &Url,
token: &str,
) -> Result<Vec<GitHubOrgMembership>> {
let mut orgs = Vec::new();
let mut page = 1u32;
let per_page = 100u32;
loop {
let mut url = api_url.join("user/memberships/orgs")?;
url.query_pairs_mut()
.append_pair("per_page", &per_page.to_string())
.append_pair("page", &page.to_string());
let resp = client
.get(url)
.header(header::AUTHORIZATION, format!("token {token}"))
.header(header::ACCEPT, "application/vnd.github+json")
.send()
.await
.context("GitHub access-map: failed to list org memberships")?;
if !resp.status().is_success() {
warn!(
"GitHub access-map: org membership enumeration failed with HTTP {}",
resp.status()
);
break;
}
let mut page_orgs: Vec<GitHubOrgMembership> =
resp.json().await.context("GitHub access-map: invalid org JSON")?;
let count = page_orgs.len();
orgs.append(&mut page_orgs);
if count < per_page as usize {
break;
}
page += 1;
}
Ok(orgs)
}
fn derive_severity(repos: &[GitHubRepo]) -> Severity {
let mut severity = Severity::Low;
for repo in repos {
let perms = repo.permissions.as_ref();
if perms.is_some_and(|p| p.admin) {
return Severity::High;
}
if perms.is_some_and(|p| p.push) {
severity = Severity::Medium;
}
}
severity
}
fn org_scopes(scopes: &[String]) -> Vec<String> {
let mut result: Vec<String> = scopes
.iter()
.filter(|scope| scope.contains(":org") || scope.contains(":enterprise"))
.cloned()
.collect();
result.sort();
result.dedup();
result
}
fn severity_to_str(severity: Severity) -> &'static str {
match severity {
Severity::Low => "low",
Severity::Medium => "medium",
Severity::High => "high",
Severity::Critical => "critical",
}
}
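
Both `list_accessible_repos` and `list_org_memberships` above share one pagination pattern: request pages of 100 and stop at the first short page or the first non-success response. A minimal sketch of that loop, assuming a hypothetical `collect_pages` helper and a caller-supplied `fetch_page` closure in place of the HTTP call:

```rust
// Illustrative sketch only. `collect_pages` and `fetch_page` are hypothetical
// names; the closure stands in for the HTTP request made in the diff.

/// Collect every item from a page-numbered listing.
///
/// `fetch_page(page)` returns `Some(items)` for a successful response and `None`
/// for a failed one. Pages are numbered from 1.
pub fn collect_pages<T>(
    per_page: usize,
    mut fetch_page: impl FnMut(u32) -> Option<Vec<T>>,
) -> Vec<T> {
    let mut items = Vec::new();
    let mut page = 1u32;
    loop {
        // A failed request keeps what was gathered so far and stops, matching
        // the warn-and-break behavior of the functions above.
        let Some(mut batch) = fetch_page(page) else {
            break;
        };
        let count = batch.len();
        items.append(&mut batch);
        // A short (or empty) page means there is nothing left to fetch.
        if count < per_page {
            break;
        }
        page += 1;
    }
    items
}
```

One consequence of the short-page rule is that a listing whose size is an exact multiple of `per_page` costs one extra request, which returns an empty page. Note also that a failure midway yields a partial list rather than an error, so callers cannot distinguish "complete" from "truncated" without an additional signal.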

309
src/access_map/gitlab.rs Normal file
View file

@ -0,0 +1,309 @@
use anyhow::{anyhow, Context, Result};
use reqwest::{header, Client, Url};
use serde::Deserialize;
use tracing::warn;
use crate::{cli::commands::access_map::AccessMapArgs, validation::GLOBAL_USER_AGENT};
use super::{
build_recommendations, AccessMapResult, AccessSummary, AccessTokenDetails, PermissionSummary,
ProviderMetadata, ResourceExposure, RoleBinding, Severity,
};
const DEFAULT_GITLAB_API: &str = "https://gitlab.com/api/v4/";
#[derive(Deserialize)]
struct GitLabProject {
path_with_namespace: String,
visibility: String,
permissions: Option<GitLabProjectPermissions>,
}
#[derive(Clone, Deserialize)]
struct GitLabProjectPermissions {
project_access: Option<GitLabAccess>,
group_access: Option<GitLabAccess>,
}
#[derive(Clone, Deserialize)]
struct GitLabAccess {
access_level: u32,
}
#[derive(Deserialize)]
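struct GitLabTokenInfo {
// The JSON key is `id`; the leading underscore only marks the field as unused.
#[serde(rename = "id")]
_id: Option<u64>,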
name: Option<String>,
created_at: Option<String>,
last_used_at: Option<String>,
expires_at: Option<String>,
scopes: Option<Vec<String>>,
user_id: Option<u64>,
}
#[derive(Deserialize)]
struct GitLabMetadata {
version: Option<String>,
enterprise: Option<bool>,
}
pub async fn map_access(args: &AccessMapArgs) -> Result<AccessMapResult> {
let token = if let Some(path) = args.credential_path.as_deref() {
let raw = std::fs::read_to_string(path)
.with_context(|| format!("Failed to read GitLab token from {}", path.display()))?;
raw.trim().to_string()
} else {
return Err(anyhow!("GitLab access-map requires a validated token from scan results"));
};
map_access_from_token(&token).await
}
pub async fn map_access_from_token(token: &str) -> Result<AccessMapResult> {
let api_url = Url::parse(DEFAULT_GITLAB_API).expect("valid GitLab API URL");
let client = Client::builder()
.user_agent(GLOBAL_USER_AGENT.as_str())
.build()
.context("Failed to build GitLab HTTP client")?;
let token_info = fetch_token_info(&client, &api_url, token).await;
let identity_label = token_info
.as_ref()
.and_then(|info| info.name.clone())
.or_else(|| {
token_info
.as_ref()
.and_then(|info| info.user_id)
.map(|user_id| format!("gitlab_user_{user_id}"))
})
.unwrap_or_else(|| "gitlab_token".to_string());
let identity = AccessSummary {
id: identity_label,
access_type: "token".into(),
project: None,
tenant: None,
account_id: None,
};
let scopes = token_info.as_ref().and_then(|info| info.scopes.clone());
let projects = list_accessible_projects(&client, &api_url, token).await?;
let metadata = fetch_instance_metadata(&client, &api_url, token).await;
let mut risk_notes = Vec::new();
let mut resources = Vec::new();
let mut permissions = PermissionSummary::default();
for project in &projects {
let access_level =
project.permissions.as_ref().map(effective_access_level).unwrap_or_default();
let (perm_label, severity) = access_level_to_risk(access_level);
resources.push(ResourceExposure {
resource_type: "project".into(),
name: project.path_with_namespace.clone(),
permissions: vec![perm_label.to_string()],
risk: severity_to_str(severity).to_string(),
reason: format!("Accessible {} project", project.visibility),
});
match severity {
Severity::High | Severity::Critical => permissions.admin.push(perm_label.to_string()),
Severity::Medium => permissions.risky.push(perm_label.to_string()),
Severity::Low => permissions.read_only.push(perm_label.to_string()),
}
}
permissions.admin.sort();
permissions.admin.dedup();
permissions.risky.sort();
permissions.risky.dedup();
permissions.read_only.sort();
permissions.read_only.dedup();
let severity = derive_severity(&projects);
let mut roles = Vec::new();
if let Some(ref scopes) = scopes {
if !scopes.is_empty() {
roles.push(RoleBinding {
name: "token_scopes".into(),
source: "gitlab".into(),
permissions: scopes.clone(),
});
}
}
if projects.is_empty() {
resources.push(ResourceExposure {
resource_type: "account".into(),
name: token_info
.as_ref()
.and_then(|info| info.name.clone())
.unwrap_or_else(|| identity.id.clone()),
permissions: Vec::new(),
risk: severity_to_str(Severity::Low).to_string(),
reason: "GitLab account associated with the token".into(),
});
risk_notes.push("Token did not enumerate any projects".into());
}
if roles.is_empty() {
risk_notes.push("GitLab did not report token scopes".into());
}
let token_details = token_info.as_ref().map(|info| AccessTokenDetails {
name: info.name.clone(),
username: None,
account_type: None,
company: None,
location: None,
email: None,
url: None,
token_type: None,
created_at: info.created_at.clone(),
last_used_at: info.last_used_at.clone(),
expires_at: info.expires_at.clone(),
user_id: info.user_id.map(|user_id| user_id.to_string()),
scopes: scopes.clone().unwrap_or_default(),
});
Ok(AccessMapResult {
cloud: "gitlab".into(),
identity,
roles,
permissions,
resources,
severity,
recommendations: build_recommendations(severity),
risk_notes,
token_details,
provider_metadata: metadata
.map(|info| ProviderMetadata { version: info.version, enterprise: info.enterprise }),
})
}
async fn fetch_token_info(client: &Client, api_url: &Url, token: &str) -> Option<GitLabTokenInfo> {
let resp = client
.get(api_url.join("personal_access_tokens/self").ok()?)
.header("PRIVATE-TOKEN", token)
.header(header::ACCEPT, "application/json")
.send()
.await
.ok()?;
if !resp.status().is_success() {
return None;
}
resp.json().await.ok()
}
async fn fetch_instance_metadata(
client: &Client,
api_url: &Url,
token: &str,
) -> Option<GitLabMetadata> {
let resp = client
.get(api_url.join("metadata").ok()?)
.header("PRIVATE-TOKEN", token)
.header(header::ACCEPT, "application/json")
.send()
.await
.ok()?;
if !resp.status().is_success() {
return None;
}
resp.json().await.ok()
}
async fn list_accessible_projects(
client: &Client,
api_url: &Url,
token: &str,
) -> Result<Vec<GitLabProject>> {
let mut projects = Vec::new();
let mut page = 1u32;
let per_page = 100u32;
loop {
let mut url = api_url.join("projects")?;
url.query_pairs_mut()
.append_pair("min_access_level", "10")
.append_pair("per_page", &per_page.to_string())
.append_pair("page", &page.to_string());
let resp = client
.get(url)
.header("PRIVATE-TOKEN", token)
.header(header::ACCEPT, "application/json")
.send()
.await
.context("GitLab access-map: failed to list projects")?;
if !resp.status().is_success() {
warn!("GitLab access-map: project enumeration failed with HTTP {}", resp.status());
break;
}
let next_page = resp
.headers()
.get("x-next-page")
.and_then(|value| value.to_str().ok())
.and_then(|value| value.parse::<u32>().ok());
let mut page_projects: Vec<GitLabProject> =
resp.json().await.context("GitLab access-map: invalid project JSON")?;
let count = page_projects.len();
projects.append(&mut page_projects);
if count < per_page as usize || next_page.is_none() {
break;
}
page = next_page.unwrap_or(page + 1);
}
Ok(projects)
}
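/// Return the higher of the project-level and group-level access grants, or 0
/// when neither is present.
fn effective_access_level(perms: &GitLabProjectPermissions) -> u32 {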
let project_level = perms.project_access.as_ref().map(|access| access.access_level);
let group_level = perms.group_access.as_ref().map(|access| access.access_level);
project_level.max(group_level).unwrap_or_default()
}
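/// Map a GitLab access level (10 = guest through 50 = owner) to a permission
/// label and a severity.
fn access_level_to_risk(access_level: u32) -> (&'static str, Severity) {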
match access_level {
50 => ("project:owner", Severity::High),
40 => ("project:maintainer", Severity::High),
30 => ("project:developer", Severity::Medium),
20 => ("project:reporter", Severity::Low),
10 => ("project:guest", Severity::Low),
_ => ("project:access", Severity::Low),
}
}
fn derive_severity(projects: &[GitLabProject]) -> Severity {
let mut severity = Severity::Low;
for project in projects {
let access_level =
project.permissions.as_ref().map(effective_access_level).unwrap_or_default();
let (_, project_severity) = access_level_to_risk(access_level);
match project_severity {
Severity::High | Severity::Critical => return Severity::High,
Severity::Medium => severity = Severity::Medium,
Severity::Low => {}
}
}
severity
}
fn severity_to_str(severity: Severity) -> &'static str {
match severity {
Severity::Low => "low",
Severity::Medium => "medium",
Severity::High => "high",
Severity::Critical => "critical",
}
}
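
The GitLab severity logic above combines three small steps: take the higher of the project and group grants, map that level to a label and severity, and report the worst severity across all projects. A self-contained sketch of those steps, assuming a hypothetical three-level `Risk` enum in place of the crate's `Severity`:

```rust
// Illustrative sketch only. `Risk`, `effective_level`, `level_to_risk`, and
// `overall_risk` are hypothetical stand-ins for the items in the diff.

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Risk {
    Low,
    Medium,
    High,
}

/// Higher of the two grants, or 0 when the token has neither.
pub fn effective_level(project: Option<u32>, group: Option<u32>) -> u32 {
    // `Option` orders `None` below every `Some`, so `max` picks the higher grant.
    project.max(group).unwrap_or_default()
}

/// Label and risk for a GitLab access level, using the same table as the diff.
pub fn level_to_risk(level: u32) -> (&'static str, Risk) {
    match level {
        50 => ("project:owner", Risk::High),
        40 => ("project:maintainer", Risk::High),
        30 => ("project:developer", Risk::Medium),
        20 => ("project:reporter", Risk::Low),
        10 => ("project:guest", Risk::Low),
        _ => ("project:access", Risk::Low),
    }
}

/// Worst risk across all projects; an empty list is treated as low risk.
pub fn overall_risk(levels: &[u32]) -> Risk {
    levels.iter().map(|&level| level_to_risk(level).1).max().unwrap_or(Risk::Low)
}
```

Deriving `Ord` on the enum lets the aggregate be a plain `max`, which is one way to express the early-return loop in `derive_severity`; the result is the same because `High` is the top of the scale used there.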

View file

@ -5,7 +5,7 @@ use clap::{Args, ValueEnum};
/// Inspect a cloud credential and derive the effective identity and blast radius.
#[derive(Args, Debug)]
pub struct AccessMapArgs {
/// Cloud provider: aws | gcp | azure
/// Cloud provider: aws | gcp | azure | github | gitlab
#[clap(value_parser, value_name = "PROVIDER")]
pub provider: AccessMapProvider,
@ -31,4 +31,8 @@ pub enum AccessMapProvider {
Gcp,
/// Microsoft Azure
Azure,
/// GitHub
Github,
/// GitLab
Gitlab,
}

View file

@ -35,6 +35,22 @@ pub struct InputSpecifierArgs {
#[arg(long, value_hint = ValueHint::Url)]
pub git_url: Vec<GitUrl>,
/// Parent directory for cloned Git repositories and scan artifacts
#[arg(long = "git-clone-dir", value_hint = ValueHint::DirPath, help_heading = "Git Options")]
pub git_clone_dir: Option<PathBuf>,
/// Keep cloned Git repositories on disk after the scan completes
#[arg(long = "keep-clones", default_value_t = false, help_heading = "Git Options")]
pub keep_clones: bool,
/// Limit the number of GitHub/GitLab repositories cloned during enumeration
#[arg(long = "repo-clone-limit", value_name = "COUNT")]
pub repo_clone_limit: Option<usize>,
/// Include contributor repositories when scanning GitHub or GitLab git URLs
#[arg(long = "include-contributors", default_value_t = false)]
pub include_contributors: bool,
/// Scan repositories belonging to the specified GitHub user
#[arg(long, hide = true)]
pub github_user: Vec<String>,

View file

@ -82,6 +82,26 @@ pub struct ScanArgs {
#[arg(global = true, long, short = 'n', default_value_t = false)]
pub no_validate: bool,
/// Timeout for validation requests in seconds (1-60)
#[arg(
global = true,
long = "validation-timeout",
default_value_t = 10,
value_name = "SECONDS",
value_parser = clap::value_parser!(u64).range(1..=60)
)]
pub validation_timeout: u64,
/// Number of retries for validation requests (0-5)
#[arg(
global = true,
long = "validation-retries",
default_value_t = 1,
value_name = "N",
value_parser = clap::value_parser!(u32).range(0..=5)
)]
pub validation_retries: u32,
/// Map validated cloud credentials to their effective identities; use only when
/// authorized for the target account because this triggers additional network
/// requests to determine granted access
@ -107,6 +127,10 @@ pub struct ScanArgs {
#[arg(global = true, long, default_value_t = false)]
pub no_dedup: bool,
/// Serve a JSON report locally and open the browser (http://127.0.0.1:7890)
#[arg(skip)]
pub view_report: bool,
/// Redact findings values using a secure hash
#[arg(global = true, long, short = 'r', default_value_t = false)]
pub redact: bool,
@ -188,6 +212,10 @@ pub struct ScanCommandArgs {
#[command(flatten)]
pub scan_args: ScanArgs,
/// Serve a JSON report locally and open the browser (http://127.0.0.1:7890)
#[arg(global = true, long = "view-report", default_value_t = false)]
pub view_report: bool,
#[command(subcommand)]
pub provider: Option<ScanInputCommand>,
}
@ -213,6 +241,8 @@ impl ScanCommandArgs {
pub fn into_operation(mut self) -> anyhow::Result<ScanOperation> {
let mut used_provider_subcommand = false;
self.scan_args.view_report = self.view_report;
if let Some(provider) = self.provider.take() {
used_provider_subcommand = true;
let scan_args = &mut self.scan_args;
@ -246,6 +276,9 @@ impl ScanCommandArgs {
args.specifiers.all_organizations;
scan_args.input_specifier_args.github_repo_type = args.specifiers.repo_type;
scan_args.input_specifier_args.github_api_url = args.api_url;
scan_args.input_specifier_args.repo_clone_limit = args.repo_clone_limit;
scan_args.input_specifier_args.include_contributors =
args.include_contributors;
None
}
}
@ -271,6 +304,9 @@ impl ScanCommandArgs {
args.specifiers.include_subgroups;
scan_args.input_specifier_args.gitlab_repo_type = args.specifiers.repo_type;
scan_args.input_specifier_args.gitlab_api_url = args.api_url;
scan_args.input_specifier_args.repo_clone_limit = args.repo_clone_limit;
scan_args.input_specifier_args.include_contributors =
args.include_contributors;
None
}
}
@ -505,6 +541,14 @@ pub struct GithubScanArgs {
#[command(flatten)]
pub specifiers: GitHubRepoSpecifiers,
/// Include contributor repositories when scanning git URLs
#[arg(long = "include-contributors", default_value_t = false)]
pub include_contributors: bool,
/// Limit the number of repositories cloned (including contributor repos)
#[arg(long = "repo-clone-limit", value_name = "COUNT")]
pub repo_clone_limit: Option<usize>,
/// List matching repositories without scanning them
#[arg(long = "list-only")]
pub list_only: bool,
@ -524,6 +568,14 @@ pub struct GitLabScanArgs {
#[command(flatten)]
pub specifiers: GitLabRepoSpecifiers,
/// Include contributor repositories when scanning git URLs
#[arg(long = "include-contributors", default_value_t = false)]
pub include_contributors: bool,
/// Limit the number of repositories cloned (including contributor repos)
#[arg(long = "repo-clone-limit", value_name = "COUNT")]
pub repo_clone_limit: Option<usize>,
/// List matching repositories without scanning them
#[arg(long = "list-only")]
pub list_only: bool,


@ -13,24 +13,29 @@ use axum::{
routing::get,
Router,
};
use clap::ValueHint;
use include_dir::{include_dir, Dir};
use tokio::net::TcpListener;
use tracing::info;
use tracing::{info, warn};
const DEFAULT_PORT: u16 = 7890;
pub const DEFAULT_PORT: u16 = 7890;
static VIEWER_ASSETS: Dir<'_> = include_dir!("$CARGO_MANIFEST_DIR/docs/access-map-viewer");
/// View a Kingfisher access-map report locally.
#[derive(clap::Args, Debug)]
pub struct ViewArgs {
/// Path to a JSON or JSONL access-map report to load automatically
#[arg(value_name = "REPORT", value_hint = ValueHint::FilePath)]
#[arg(value_name = "REPORT", value_hint = clap::ValueHint::FilePath)]
pub report: Option<PathBuf>,
/// Local port for the embedded viewer (default 7890)
#[arg(long, default_value_t = DEFAULT_PORT)]
pub port: u16,
#[arg(skip)]
pub open_browser: bool,
#[arg(skip)]
pub report_bytes: Option<Vec<u8>>,
}
#[derive(Clone)]
@ -40,7 +45,9 @@ struct AppState {
/// Run the `kingfisher view` subcommand.
pub async fn run(args: ViewArgs) -> Result<()> {
let report = if let Some(path) = args.report.as_ref() {
let report = if let Some(report_bytes) = args.report_bytes.as_ref() {
Some(report_bytes.clone())
} else if let Some(path) = args.report.as_ref() {
let expanded_path = expand_tilde(path)?;
let ext = path
.extension()
@ -73,12 +80,20 @@ pub async fn run(args: ViewArgs) -> Result<()> {
let address: SocketAddr =
listener.local_addr().context("Failed to read local listener address")?;
let url = format!("http://{}:{}", address.ip(), address.port());
info!(%address, "Starting access-map viewer");
eprintln!(
"Serving access-map viewer at http://{}:{} (Ctrl+C to stop)",
address.ip(),
address.port()
);
eprintln!("Serving access-map viewer at {} (Ctrl+C to stop)", url);
let open_browser = args.open_browser || args.report.is_some() || args.report_bytes.is_some();
if open_browser {
let url = url.clone();
tokio::task::spawn_blocking(move || {
if let Err(err) = webbrowser::open(&url) {
warn!(%err, "Failed to open browser for access-map viewer");
}
});
}
let state = Arc::new(AppState { report });


@ -428,7 +428,7 @@ mod tests {
Some("owner/repo")
);
assert_eq!(
parse_excluded_repo("ssh://git@example.com:3000/Owner/Repo.git").as_deref(),
parse_excluded_repo("ssh://git@exmple.com:3000/Owner/Repo.git").as_deref(),
Some("owner/repo")
);
}


@ -14,13 +14,25 @@ use octorust::{
types::{Order, ReposListOrgSort, ReposListOrgType, ReposListUserType},
Client,
};
use reqwest::StatusCode;
use serde::Deserialize;
use serde_json::Value;
use tracing::warn;
use tracing::{info, warn};
use url::Url;
use crate::{findings_store, git_url::GitUrl, validation::GLOBAL_USER_AGENT};
use std::str::FromStr;
#[derive(Deserialize)]
struct GitHubContributor {
login: Option<String>,
}
#[derive(Deserialize)]
struct GitHubRepo {
clone_url: String,
}
#[derive(Debug)]
pub struct RepoSpecifiers {
pub user: Vec<String>,
@ -214,6 +226,185 @@ fn create_github_client(github_url: &url::Url, ignore_certs: bool) -> Result<Arc
}
Ok(Arc::new(client))
}
fn normalize_api_base(api_url: &Url) -> Url {
let mut base = api_url.clone();
if !base.path().ends_with('/') {
let path = format!("{}/", base.path());
base.set_path(&path);
}
base
}
pub async fn enumerate_contributor_repo_urls(
repo_url: &GitUrl,
github_api_url: &Url,
ignore_certs: bool,
exclude_repos: &[String],
repo_clone_limit: Option<usize>,
progress_enabled: bool,
) -> Result<Vec<String>> {
let (_, owner, repo) = parse_repo(repo_url).context("invalid GitHub repo URL")?;
let exclude_set = build_exclude_matcher(exclude_repos);
let client = reqwest::Client::builder().danger_accept_invalid_certs(ignore_certs).build()?;
let token = env::var("KF_GITHUB_TOKEN").ok().filter(|t| !t.is_empty());
let api_base = normalize_api_base(github_api_url);
let mut contributor_logins = Vec::new();
let mut seen_contributors = HashSet::new();
let mut page = 1;
loop {
let mut url = api_base
.join(&format!("repos/{owner}/{repo}/contributors"))
.context("Failed to build GitHub contributors URL")?;
url.query_pairs_mut().append_pair("per_page", "100").append_pair("page", &page.to_string());
let mut req = client.get(url).header("User-Agent", GLOBAL_USER_AGENT.as_str());
if let Some(token) = token.as_ref() {
req = req.bearer_auth(token);
}
let resp = req.send().await?;
if !resp.status().is_success() {
warn_on_rate_limit("GitHub", resp.status(), "listing contributors");
break;
}
let contributors: Vec<GitHubContributor> = resp.json().await?;
if contributors.is_empty() {
break;
}
for contributor in contributors {
if let Some(login) = contributor.login {
if seen_contributors.insert(login.clone()) {
contributor_logins.push(login);
}
}
}
page += 1;
}
let (per_user_limit, total_limit) =
determine_contributor_repo_limits(repo_clone_limit, contributor_logins.len(), "GitHub");
let progress = build_contributor_progress_bar(
progress_enabled,
contributor_logins.len() as u64,
"Enumerating GitHub contributor repositories...",
);
let mut repo_urls = Vec::new();
let mut total_repo_count = 0usize;
for login in contributor_logins {
if let Some(total_limit) = total_limit {
if total_repo_count >= total_limit {
break;
}
}
let mut user_repo_count = 0usize;
page = 1;
loop {
if let Some(per_user_limit) = per_user_limit {
if user_repo_count >= per_user_limit {
break;
}
}
if let Some(total_limit) = total_limit {
if total_repo_count >= total_limit {
break;
}
}
let mut url = api_base
.join(&format!("users/{login}/repos"))
.context("Failed to build GitHub user repos URL")?;
url.query_pairs_mut()
.append_pair("per_page", "100")
.append_pair("page", &page.to_string())
.append_pair("type", "all")
.append_pair("sort", "updated")
.append_pair("direction", "desc");
let mut req = client.get(url).header("User-Agent", GLOBAL_USER_AGENT.as_str());
if let Some(token) = token.as_ref() {
req = req.bearer_auth(token);
}
let resp = req.send().await?;
if !resp.status().is_success() {
warn_on_rate_limit("GitHub", resp.status(), "listing user repositories");
break;
}
let repos: Vec<GitHubRepo> = resp.json().await?;
if repos.is_empty() {
break;
}
for repo in repos {
if let Some(per_user_limit) = per_user_limit {
if user_repo_count >= per_user_limit {
break;
}
}
if let Some(total_limit) = total_limit {
if total_repo_count >= total_limit {
break;
}
}
if should_exclude_repo(&repo.clone_url, &exclude_set) {
continue;
}
repo_urls.push(repo.clone_url);
user_repo_count += 1;
total_repo_count += 1;
}
page += 1;
}
progress.inc(1);
}
repo_urls.sort();
repo_urls.dedup();
progress.finish_and_clear();
Ok(repo_urls)
}
fn warn_on_rate_limit(service: &str, status: StatusCode, action: &str) {
if status == StatusCode::FORBIDDEN || status == StatusCode::TOO_MANY_REQUESTS {
warn!("{service} API rate limit or access restriction while {action}: HTTP {status}");
}
}
fn determine_contributor_repo_limits(
repo_clone_limit: Option<usize>,
user_count: usize,
service: &str,
) -> (Option<usize>, Option<usize>) {
let Some(limit) = repo_clone_limit else {
return (None, None);
};
if user_count == 0 {
return (Some(0), Some(limit));
}
if user_count > limit {
let per_user_limit = std::cmp::max(1, limit / 100);
info!(
"Found {user_count} {service} contributors which exceeds repo-clone-limit {limit}. \
Consider increasing repo-clone-limit; sampling {per_user_limit} repos per user until the limit is reached."
);
return (Some(per_user_limit), Some(limit));
}
let per_user_limit = std::cmp::max(1, limit / user_count);
(Some(per_user_limit), Some(limit))
}
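The limit arithmetic in `determine_contributor_repo_limits` is easier to follow with concrete numbers. The following is a minimal sketch, not part of the commit: `contributor_repo_limits` is a hypothetical, logging-free copy of the function above, reproduced only to show which `(per_user_limit, total_limit)` pair each branch yields.

```rust
/// Hypothetical, logging-free copy of the limit arithmetic above (illustration only).
/// Returns `(per_user_limit, total_limit)`.
fn contributor_repo_limits(
    repo_clone_limit: Option<usize>,
    user_count: usize,
) -> (Option<usize>, Option<usize>) {
    // No --repo-clone-limit given: nothing is capped.
    let limit = match repo_clone_limit {
        Some(limit) => limit,
        None => return (None, None),
    };
    // No contributors: zero repos per user, total cap still reported.
    if user_count == 0 {
        return (Some(0), Some(limit));
    }
    // More contributors than the cap: sample limit / 100 repos per user (at least 1).
    if user_count > limit {
        return (Some(std::cmp::max(1, limit / 100)), Some(limit));
    }
    // Otherwise split the cap evenly across contributors (at least 1 each).
    (Some(std::cmp::max(1, limit / user_count)), Some(limit))
}
```

Under these assumptions, a limit of 50 with 10 contributors allows 5 repositories per contributor, while a limit of 50 with 200 contributors falls back to 1 repository per contributor until the total of 50 is reached.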
fn build_contributor_progress_bar(
progress_enabled: bool,
length: u64,
message: &str,
) -> ProgressBar {
if progress_enabled {
let style = ProgressStyle::with_template("{spinner} {msg} {pos}/{len} [{elapsed_precise}]")
.expect("progress bar style template should compile");
let pb = ProgressBar::new(length).with_style(style).with_message(message.to_string());
pb.enable_steady_tick(Duration::from_millis(500));
pb
} else {
ProgressBar::hidden()
}
}
pub async fn enumerate_repo_urls(
repo_specifiers: &RepoSpecifiers,
github_url: url::Url,


@ -18,10 +18,11 @@ use gitlab::{
};
use globset::{Glob, GlobSet, GlobSetBuilder};
use indicatif::{ProgressBar, ProgressStyle};
use reqwest::StatusCode;
use serde::Deserialize;
use serde_json::Value;
use tokio::task;
use tracing::warn;
use tracing::{info, warn};
use url::{form_urlencoded, Url};
use crate::{findings_store, git_url::GitUrl};
@ -42,6 +43,25 @@ struct SimpleGroup {
id: u64,
}
#[derive(Deserialize)]
struct GitLabProjectId {
id: u64,
}
#[derive(Deserialize)]
struct GitLabContributor {
name: String,
email: Option<String>,
}
#[derive(Deserialize)]
struct GitLabUser {
id: u64,
#[serde(rename = "username")]
_username: String,
name: String,
email: Option<String>,
}
/// Repository filter types for GitLab
#[derive(Debug, Clone)]
pub enum RepoType {
@ -206,6 +226,15 @@ fn create_gitlab_client(gitlab_url: &Url, ignore_certs: bool) -> Result<Gitlab>
Ok(builder.build()?)
}
fn normalize_api_base(api_url: &Url) -> Url {
let mut base = api_url.clone();
if !base.path().ends_with('/') {
let path = format!("{}/", base.path());
base.set_path(&path);
}
base
}
pub async fn enumerate_repo_urls(
repo_specifiers: &RepoSpecifiers,
gitlab_url: Url,
@ -222,6 +251,217 @@ pub async fn enumerate_repo_urls(
Ok(repo_urls)
}
pub async fn enumerate_contributor_repo_urls(
repo_url: &GitUrl,
gitlab_url: &Url,
ignore_certs: bool,
exclude_repos: &[String],
repo_clone_limit: Option<usize>,
progress_enabled: bool,
) -> Result<Vec<String>> {
let (_, path) = parse_repo(repo_url).context("invalid GitLab repo URL")?;
let encoded = form_urlencoded::byte_serialize(path.as_bytes()).collect::<String>();
let exclude_set = build_exclude_matcher(exclude_repos);
let client = reqwest::Client::builder().danger_accept_invalid_certs(ignore_certs).build()?;
let token = env::var("KF_GITLAB_TOKEN").ok().filter(|t| !t.is_empty());
let api_base = normalize_api_base(gitlab_url);
let project_url = api_base
.join(&format!("api/v4/projects/{encoded}"))
.context("Failed to build GitLab project URL")?;
let mut project_req = client.get(project_url);
if let Some(token) = token.as_ref() {
project_req = project_req.header("PRIVATE-TOKEN", token);
}
let project_resp = project_req.send().await?;
if !project_resp.status().is_success() {
warn_on_rate_limit("GitLab", project_resp.status(), "fetching project metadata");
return Ok(Vec::new());
}
let project: GitLabProjectId = project_resp.json().await?;
let project_id = project.id;
let mut contributors = Vec::new();
let mut page = 1;
loop {
let mut url = api_base
.join(&format!("api/v4/projects/{project_id}/repository/contributors"))
.context("Failed to build GitLab contributors URL")?;
url.query_pairs_mut().append_pair("per_page", "100").append_pair("page", &page.to_string());
let mut req = client.get(url);
if let Some(token) = token.as_ref() {
req = req.header("PRIVATE-TOKEN", token);
}
let resp = req.send().await?;
if !resp.status().is_success() {
warn_on_rate_limit("GitLab", resp.status(), "listing contributors");
break;
}
let page_contributors: Vec<GitLabContributor> = resp.json().await?;
if page_contributors.is_empty() {
break;
}
contributors.extend(page_contributors);
page += 1;
}
let mut seen_users = HashSet::new();
let mut users = Vec::new();
for contributor in contributors {
let query = contributor.email.as_deref().unwrap_or(&contributor.name);
let mut url = api_base.join("api/v4/users").context("Failed to build GitLab users URL")?;
url.query_pairs_mut().append_pair("search", query);
let mut req = client.get(url);
if let Some(token) = token.as_ref() {
req = req.header("PRIVATE-TOKEN", token);
}
let resp = req.send().await?;
if !resp.status().is_success() {
warn_on_rate_limit("GitLab", resp.status(), "searching for contributor users");
continue;
}
let users_resp: Vec<GitLabUser> = resp.json().await?;
let matching = users_resp.into_iter().find(|user| {
contributor
.email
.as_ref()
.and_then(|email| user.email.as_ref().map(|u| (email, u)))
.map(|(email, user_email)| email.eq_ignore_ascii_case(user_email))
.unwrap_or_else(|| user.name.eq_ignore_ascii_case(&contributor.name))
});
let Some(user) = matching else {
continue;
};
if !seen_users.insert(user.id) {
continue;
}
users.push(user);
}
let (per_user_limit, total_limit) =
determine_contributor_repo_limits(repo_clone_limit, users.len(), "GitLab");
let progress = build_contributor_progress_bar(
progress_enabled,
users.len() as u64,
"Enumerating GitLab contributor repositories...",
);
let mut repo_urls = Vec::new();
let mut total_repo_count = 0usize;
for user in users {
if let Some(total_limit) = total_limit {
if total_repo_count >= total_limit {
break;
}
}
let mut user_repo_count = 0usize;
page = 1;
loop {
if let Some(per_user_limit) = per_user_limit {
if user_repo_count >= per_user_limit {
break;
}
}
if let Some(total_limit) = total_limit {
if total_repo_count >= total_limit {
break;
}
}
let mut url = api_base
.join(&format!("api/v4/users/{}/projects", user.id))
.context("Failed to build GitLab user projects URL")?;
url.query_pairs_mut()
.append_pair("per_page", "100")
.append_pair("page", &page.to_string())
.append_pair("order_by", "updated_at")
.append_pair("sort", "desc");
let mut req = client.get(url);
if let Some(token) = token.as_ref() {
req = req.header("PRIVATE-TOKEN", token);
}
let resp = req.send().await?;
if !resp.status().is_success() {
warn_on_rate_limit("GitLab", resp.status(), "listing user projects");
break;
}
let projects: Vec<SimpleProject> = resp.json().await?;
if projects.is_empty() {
break;
}
for proj in projects {
if let Some(per_user_limit) = per_user_limit {
if user_repo_count >= per_user_limit {
break;
}
}
if let Some(total_limit) = total_limit {
if total_repo_count >= total_limit {
break;
}
}
if should_exclude_repo(&proj.http_url_to_repo, &exclude_set) {
continue;
}
repo_urls.push(proj.http_url_to_repo);
user_repo_count += 1;
total_repo_count += 1;
}
page += 1;
}
progress.inc(1);
}
repo_urls.sort();
repo_urls.dedup();
progress.finish_and_clear();
Ok(repo_urls)
}
fn warn_on_rate_limit(service: &str, status: StatusCode, action: &str) {
if status == StatusCode::FORBIDDEN || status == StatusCode::TOO_MANY_REQUESTS {
warn!("{service} API rate limit or access restriction while {action}: HTTP {status}");
}
}
fn determine_contributor_repo_limits(
repo_clone_limit: Option<usize>,
user_count: usize,
service: &str,
) -> (Option<usize>, Option<usize>) {
let Some(limit) = repo_clone_limit else {
return (None, None);
};
if user_count == 0 {
return (Some(0), Some(limit));
}
if user_count > limit {
let per_user_limit = std::cmp::max(1, limit / 100);
info!(
"Found {user_count} {service} contributors which exceeds repo-clone-limit {limit}. \
Consider increasing repo-clone-limit; sampling {per_user_limit} repos per user until the limit is reached."
);
return (Some(per_user_limit), Some(limit));
}
let per_user_limit = std::cmp::max(1, limit / user_count);
(Some(per_user_limit), Some(limit))
}
fn build_contributor_progress_bar(
progress_enabled: bool,
length: u64,
message: &str,
) -> ProgressBar {
if progress_enabled {
let style = ProgressStyle::with_template("{spinner} {msg} {pos}/{len} [{elapsed_precise}]")
.expect("progress bar style template should compile");
let pb = ProgressBar::new(length).with_style(style).with_message(message.to_string());
pb.enable_steady_tick(Duration::from_millis(500));
pb
} else {
ProgressBar::hidden()
}
}
fn enumerate_repo_urls_blocking(
repo_specifiers: &RepoSpecifiers,
gitlab_url: Url,


@ -51,6 +51,7 @@ use kingfisher::{
findings_store,
findings_store::FindingsStore,
gitea, github, huggingface,
reporter::{styles::Styles, DetailsReporter},
rule_loader::RuleLoader,
rules_database::RulesDatabase,
scanner::{load_and_record_rules, run_scan},
@ -197,14 +198,25 @@ async fn async_main(args: CommandLineArgs) -> Result<()> {
Command::View(view_args) => view::run(view_args).await,
Command::AccessMap(identity_args) => access_map::run(identity_args).await,
command => {
let temp_dir = TempDir::new().context("Failed to create temporary directory")?;
let clone_dir = temp_dir.path().to_path_buf();
let datastore = Arc::new(Mutex::new(FindingsStore::new(clone_dir)));
let update_status = check_for_update_async(&global_args, None).await;
match command {
Command::Scan(scan_command) => match scan_command.into_operation()? {
ScanOperation::Scan(mut scan_args) => {
let temp_dir =
TempDir::new().context("Failed to create temporary directory")?;
let temp_dir_path = temp_dir.path().to_path_buf();
let clone_dir = if let Some(clone_dir) =
scan_args.input_specifier_args.git_clone_dir.as_ref()
{
std::fs::create_dir_all(clone_dir)?;
clone_dir.to_path_buf()
} else {
temp_dir_path.clone()
};
let keep_clones = scan_args.input_specifier_args.keep_clones
&& scan_args.input_specifier_args.git_clone_dir.is_none();
let datastore = Arc::new(Mutex::new(FindingsStore::new(clone_dir)));
info!(
"Launching with {} concurrent scan jobs. Use --num-jobs to override.",
&scan_args.num_jobs
@ -214,7 +226,7 @@ async fn async_main(args: CommandLineArgs) -> Result<()> {
if (paths.is_empty() || is_dash) && !atty::is(atty::Stream::Stdin) {
let mut buf = Vec::new();
std::io::stdin().read_to_end(&mut buf)?;
let stdin_file = temp_dir.path().join("stdin_input");
let stdin_file = temp_dir_path.join("stdin_input");
std::fs::write(&stdin_file, buf)?;
scan_args.input_specifier_args.path_inputs = vec![stdin_file.into()];
}
@ -239,9 +251,29 @@ async fn async_main(args: CommandLineArgs) -> Result<()> {
}
let exit_code = determine_exit_code(&datastore);
if let Err(e) = temp_dir.close() {
if scan_args.view_report {
let reporter = DetailsReporter {
datastore: Arc::clone(&datastore),
styles: Styles::new(global_args.use_color(std::io::stdout())),
only_valid: scan_args.only_valid,
};
let envelope = reporter.build_report_envelope(&scan_args)?;
let report_bytes = serde_json::to_vec_pretty(&envelope)?;
let view_args = view::ViewArgs {
report: None,
port: view::DEFAULT_PORT,
open_browser: true,
report_bytes: Some(report_bytes),
};
view::run(view_args).await?;
}
if keep_clones {
let _kept_path = temp_dir.keep(); // consumes TempDir; prevents auto-delete
} else if let Err(e) = temp_dir.close() {
eprintln!("Failed to close temporary directory: {}", e);
}
std::process::exit(exit_code);
}
ScanOperation::ListRepositories(list_command) => match list_command {
@ -373,6 +405,10 @@ fn create_default_scan_args() -> cli::commands::scan::ScanArgs {
input_specifier_args: InputSpecifierArgs {
path_inputs: Vec::new(),
git_url: Vec::new(),
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -467,6 +503,7 @@ fn create_default_scan_args() -> cli::commands::scan::ScanArgs {
redact: false,
git_repo_timeout: 1800,
no_dedup: false,
view_report: false,
baseline_file: None,
manage_baseline: false,
skip_regex: Vec::new(),
@ -477,6 +514,8 @@ fn create_default_scan_args() -> cli::commands::scan::ScanArgs {
no_base64: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_timeout: 10,
validation_retries: 1,
}
}
/// Run the rules check command


@ -12,13 +12,13 @@ use serde::Serialize;
use url::Url;
use crate::{
access_map::{AccessSummary, ResourceExposure},
access_map::{AccessSummary, AccessTokenDetails, ProviderMetadata, ResourceExposure},
blob::BlobMetadata,
bstring_escape::Escaped,
cli,
cli::global::GlobalArgs,
finding_data, findings_store,
matcher::Match,
matcher::{compute_finding_fingerprint, Match},
origin::{Origin, OriginSet},
rules::rule::Confidence,
validation_body::{self, ValidationResponseBody},
@ -227,6 +227,57 @@ impl DetailsReporter {
}
}
fn normalized_finding_fingerprint(m: &Match, origin: &OriginSet) -> u64 {
let finding_value = m
.groups
.captures
.get(1)
.or_else(|| m.groups.captures.get(0))
.map(|capture| capture.raw_value())
.unwrap_or("");
let offset_start = m.location.offset_span.start as u64;
let offset_end = m.location.offset_span.end as u64;
let has_file = origin.iter().any(|o| matches!(o, Origin::File(_)));
let has_git = origin.iter().any(|o| matches!(o, Origin::GitRepo(_)));
let origin_key = if has_file || has_git { "file_git" } else { "ext" };
compute_finding_fingerprint(finding_value, origin_key, offset_start, offset_end)
}
fn origin_set_contains_git(origin: &OriginSet) -> bool {
origin.iter().any(|o| matches!(o, Origin::GitRepo(_)))
}
fn merge_origins_for_dedup(mut existing: ReportMatch, incoming: ReportMatch) -> ReportMatch {
let existing_has_git = Self::origin_set_contains_git(&existing.origin);
let incoming_has_git = Self::origin_set_contains_git(&incoming.origin);
let prefer_git = existing_has_git || incoming_has_git;
if incoming_has_git && !existing_has_git {
existing = incoming.clone();
}
let mut origins = Vec::new();
let mut push_unique = |origin: &Origin| {
if !origins.iter().any(|existing| existing == origin) {
origins.push(origin.clone());
}
};
for origin in existing.origin.iter().chain(incoming.origin.iter()) {
push_unique(origin);
}
if prefer_git {
origins.retain(|origin| matches!(origin, Origin::GitRepo(_)));
}
if let Some(origin_set) = OriginSet::try_from_iter(origins) {
existing.origin = origin_set;
}
existing
}
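The git-preferring merge in `merge_origins_for_dedup` can be sketched with plain standard-library types. This is an illustration under stated assumptions, not the project's code: `Origin`, `Finding`, and `merge_for_dedup` are hypothetical stand-ins for `Origin`, `ReportMatch`, and the method above, a `Vec<Origin>` replaces `OriginSet`, and an empty result is assumed to leave the existing origins untouched (mirroring the `try_from_iter` guard).

```rust
// Hypothetical stand-ins for the reporter types; not the real definitions.
#[derive(Clone, Debug, PartialEq)]
enum Origin {
    File(String),
    GitRepo(String),
}

#[derive(Clone, Debug, PartialEq)]
struct Finding {
    snippet: String,
    origins: Vec<Origin>,
}

fn has_git(origins: &[Origin]) -> bool {
    origins.iter().any(|o| matches!(o, Origin::GitRepo(_)))
}

/// Merge two duplicate findings, preferring the one that came from git history.
fn merge_for_dedup(mut existing: Finding, incoming: Finding) -> Finding {
    let existing_has_git = has_git(&existing.origins);
    let incoming_has_git = has_git(&incoming.origins);
    let prefer_git = existing_has_git || incoming_has_git;

    // If only the incoming finding has a git origin, it replaces the existing one.
    if incoming_has_git && !existing_has_git {
        existing = incoming.clone();
    }

    // Union of origins, preserving first-seen order.
    let mut origins: Vec<Origin> = Vec::new();
    for origin in existing.origins.iter().chain(incoming.origins.iter()) {
        if !origins.contains(origin) {
            origins.push(origin.clone());
        }
    }

    // When any git origin is present, drop the non-git origins.
    if prefer_git {
        origins.retain(|o| matches!(o, Origin::GitRepo(_)));
    }
    if !origins.is_empty() {
        existing.origins = origins;
    }
    existing
}
```

In this sketch the result is the same whichever finding arrives first: a file-only finding merged with a git-history finding always yields the git-history finding with only its git origins, while two file-only findings keep the first finding and the union of both origins.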
/// If the given file path corresponds to a Confluence page downloaded to disk,
/// return the URL for that page.
fn confluence_page_url(&self, path: &std::path::Path) -> Option<String> {
@ -339,23 +390,12 @@ impl DetailsReporter {
let mut by_fp: HashMap<(u64, String), ReportMatch> = HashMap::new();
for rm in matches {
let key = (rm.m.finding_fingerprint, rm.m.rule.id().to_string());
let key = (
Self::normalized_finding_fingerprint(&rm.m, &rm.origin),
rm.m.rule.id().to_string(),
);
if let Some(existing) = by_fp.get_mut(&key) {
// merge origin sets (keep first origin, append the rest)
for o in rm.origin.iter() {
if !existing.origin.iter().any(|e| e == o) {
existing.origin = OriginSet::new(
existing.origin.first().clone(),
existing
.origin
.iter()
.skip(1)
.cloned()
.chain(std::iter::once(o.clone()))
.collect(),
);
}
}
*existing = Self::merge_origins_for_dedup(existing.clone(), rm);
continue;
}
by_fp.insert(key, rm);
@ -617,6 +657,8 @@ impl DetailsReporter {
provider: result.cloud.clone(),
account: account.clone(),
groups,
token_details: result.token_details.clone(),
provider_metadata: result.provider_metadata.clone(),
});
}
@ -774,6 +816,10 @@ pub struct AccessMapEntry {
#[serde(skip_serializing_if = "Option::is_none")]
pub account: Option<String>,
pub groups: Vec<AccessMapResourceGroup>,
#[serde(default)]
pub token_details: Option<AccessTokenDetails>,
#[serde(default)]
pub provider_metadata: Option<ProviderMetadata>,
}
#[derive(Serialize, JsonSchema, Clone, Debug)]
@ -854,6 +900,10 @@ mod tests {
input_specifier_args: InputSpecifierArgs {
path_inputs: Vec::new(),
git_url: Vec::new(),
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -934,6 +984,7 @@ mod tests {
min_entropy: None,
rule_stats: false,
no_dedup: false,
view_report: false,
redact: false,
no_base64: false,
git_repo_timeout: 1_800,
@ -946,6 +997,8 @@ mod tests {
skip_aws_account_file: None,
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_timeout: 10,
validation_retries: 1,
}
}
@ -958,7 +1011,7 @@ mod tests {
let commit_metadata = Arc::new(CommitMetadata {
commit_id: ObjectId::from_hex(b"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa").unwrap(),
committer_name: "Alice".into(),
committer_email: "alice@example.com".into(),
committer_email: "alice@exmple.com".into(),
committer_timestamp: Time::new(0, 0),
});
let blob_path = "path/in/history.txt".to_string();


@ -73,6 +73,7 @@ mod tests {
cli::commands::scan::ScanArgs {
num_jobs: 1,
no_dedup: false,
view_report: false,
rules: RuleSpecifierArgs {
rules_path: Vec::new(),
rule: vec!["all".into()],
@ -82,6 +83,10 @@ mod tests {
// local path / git URL inputs
path_inputs: Vec::new(),
git_url: Vec::new(),
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
// GitHub
github_user: Vec::new(),
@ -190,6 +195,8 @@ mod tests {
no_base64: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_timeout: 10,
validation_retries: 1,
}
}


@ -1117,7 +1117,7 @@ mod tests {
let temp = tempdir()?;
let repo_path = temp.path().join("repo");
let repo = Git2Repository::init(&repo_path)?;
let signature = Signature::now("tester", "tester@example.com")?;
let signature = Signature::now("tester", "tester@exmple.com")?;
let tracked_file = repo_path.join("secret.txt");
fs::create_dir_all(tracked_file.parent().unwrap())?;


@ -34,6 +34,42 @@ use crate::{
pub type DatastoreMessage = (OriginSet, BlobMetadata, Vec<(Option<f64>, Match)>);
fn repo_host_contains(repo_url: &GitUrl, needle: &str) -> bool {
Url::parse(repo_url.as_str())
.ok()
.and_then(|url| url.host_str().map(|host| host.to_lowercase()))
.map(|host| host.contains(needle))
.unwrap_or(false)
}
fn apply_repo_clone_limit(
repo_urls: &mut Vec<GitUrl>,
limit: Option<usize>,
predicate: impl Fn(&GitUrl) -> bool,
) {
let Some(limit) = limit else {
return;
};
let mut limited = Vec::new();
let mut remaining = Vec::new();
for url in repo_urls.drain(..) {
if predicate(&url) {
limited.push(url);
} else {
remaining.push(url);
}
}
limited.sort();
limited.dedup();
if limited.len() > limit {
limited.truncate(limit);
}
limited.extend(remaining);
limited.sort();
limited.dedup();
*repo_urls = limited;
}
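To see what `apply_repo_clone_limit` does to a mixed URL list, here is a small sketch using plain strings. It is an approximation, not the commit's code: `apply_clone_limit` is a hypothetical name, `String` stands in for `GitUrl`, and the predicate in the usage below matches on the whole URL rather than on the parsed host as `repo_host_contains` does.

```rust
/// Hypothetical string-based variant of the clone-limit filter (illustration only).
/// URLs matching `predicate` are capped at `limit`; all other URLs are kept.
fn apply_clone_limit(
    urls: &mut Vec<String>,
    limit: Option<usize>,
    predicate: impl Fn(&str) -> bool,
) {
    // Without a limit the list is left exactly as it was.
    let limit = match limit {
        Some(limit) => limit,
        None => return,
    };

    // Partition into URLs subject to the cap and URLs that bypass it.
    let mut limited = Vec::new();
    let mut remaining = Vec::new();
    for url in urls.drain(..) {
        if predicate(&url) {
            limited.push(url);
        } else {
            remaining.push(url);
        }
    }

    // Deduplicate before truncating so duplicates do not consume the budget.
    limited.sort();
    limited.dedup();
    limited.truncate(limit);

    // Re-attach the uncapped URLs and normalise the final ordering.
    limited.extend(remaining);
    limited.sort();
    limited.dedup();
    *urls = limited;
}
```

Because the capped group is sorted before truncation, the URLs that survive are the lexicographically smallest ones in that group, not necessarily the ones that were listed first.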
pub fn clone_or_update_git_repos_streaming<F>(
args: &scan::ScanArgs,
global_args: &global::GlobalArgs,
@ -173,6 +209,41 @@ pub async fn enumerate_github_repos(
exclude_repos: args.input_specifier_args.github_exclude.clone(),
};
let mut repo_urls = args.input_specifier_args.git_url.clone();
if args.input_specifier_args.include_contributors {
for repo_url in &args.input_specifier_args.git_url {
if !repo_host_contains(repo_url, "github") {
continue;
}
match github::enumerate_contributor_repo_urls(
repo_url,
&args.input_specifier_args.github_api_url,
global_args.ignore_certs,
&args.input_specifier_args.github_exclude,
args.input_specifier_args.repo_clone_limit,
global_args.use_progress(),
)
.await
{
Ok(contributor_urls) => {
for repo_string in contributor_urls {
match GitUrl::from_str(&repo_string) {
Ok(repo_url) => repo_urls.push(repo_url),
Err(e) => {
error!(
"Failed to parse contributor repo URL from {repo_string}: {e}"
);
}
}
}
}
Err(err) => {
error!(
"Failed to enumerate GitHub contributor repositories for {repo_url}: {err}"
);
}
}
}
}
if !repo_specifiers.is_empty() {
let mut progress = if global_args.use_progress() {
let style =
@ -214,6 +285,9 @@ pub async fn enumerate_github_repos(
HumanCount(num_found)
));
}
apply_repo_clone_limit(&mut repo_urls, args.input_specifier_args.repo_clone_limit, |url| {
repo_host_contains(url, "github")
});
repo_urls.sort();
repo_urls.dedup();
Ok(repo_urls)
@ -233,6 +307,41 @@ pub async fn enumerate_gitlab_repos(
};
let mut repo_urls = args.input_specifier_args.git_url.clone();
if args.input_specifier_args.include_contributors {
for repo_url in &args.input_specifier_args.git_url {
if !repo_host_contains(repo_url, "gitlab") {
continue;
}
match gitlab::enumerate_contributor_repo_urls(
repo_url,
&args.input_specifier_args.gitlab_api_url,
global_args.ignore_certs,
&args.input_specifier_args.gitlab_exclude,
args.input_specifier_args.repo_clone_limit,
global_args.use_progress(),
)
.await
{
Ok(contributor_urls) => {
for repo_string in contributor_urls {
match GitUrl::from_str(&repo_string) {
Ok(repo_url) => repo_urls.push(repo_url),
Err(e) => {
error!(
"Failed to parse contributor repo URL from {repo_string}: {e}"
);
}
}
}
}
Err(err) => {
error!(
"Failed to enumerate GitLab contributor repositories for {repo_url}: {err}"
);
}
}
}
}
if !repo_specifiers.is_empty() {
let progress = if global_args.use_progress() {
let style =
@ -277,6 +386,9 @@ pub async fn enumerate_gitlab_repos(
HumanCount(num_found)
));
}
apply_repo_clone_limit(&mut repo_urls, args.input_specifier_args.repo_clone_limit, |url| {
repo_host_contains(url, "gitlab")
});
repo_urls.sort();
repo_urls.dedup();
Ok(repo_urls)


@ -359,6 +359,8 @@ pub async fn run_async_scan(
args.num_jobs,
None,
access_map_collector.clone(),
Duration::from_secs(args.validation_timeout),
args.validation_retries,
)
.await?;
}
@ -442,6 +444,8 @@ pub async fn run_async_scan(
args.num_jobs,
Some(0..initial_match_count),
access_map_collector.clone(),
Duration::from_secs(args.validation_timeout),
args.validation_retries,
)
.await?;
}
@ -523,6 +527,8 @@ pub async fn run_async_scan(
args.num_jobs,
Some(0..match_count),
access_map.clone(),
Duration::from_secs(args.validation_timeout),
args.validation_retries,
))?;
}
}
@ -591,6 +597,8 @@ pub async fn run_async_scan(
args.num_jobs,
None,
access_map_collector.clone(),
Duration::from_secs(args.validation_timeout),
args.validation_retries,
)
.await?;
}
@ -642,7 +650,7 @@ async fn finalize_access_map(
let requests = collector.into_requests();
if requests.is_empty() {
debug!("access-map enabled but no validated AWS or GCP credentials were collected; skipping report output");
debug!("access-map enabled but no validated AWS, GCP, or Azure credentials were collected; skipping report output");
let mut ds = datastore.lock().unwrap();
ds.set_access_map_results(Vec::new());
return Ok(());
@ -707,7 +715,9 @@ fn maybe_hint_access_map(datastore: &Arc<Mutex<FindingsStore>>, args: &scan::Sca
ds.get_matches().iter().any(|entry| {
let rule = &entry.2.rule;
entry.2.validation_success
&& matches!(rule.syntax().validation, Some(Validation::AWS | Validation::GCP))
&& (matches!(rule.syntax().validation, Some(Validation::AWS | Validation::GCP))
|| rule.id().starts_with("kingfisher.github.")
|| rule.id().starts_with("kingfisher.gitlab."))
})
};

View file

@ -51,6 +51,37 @@ impl AccessMapCollector {
});
}
pub fn record_azure(&self, credential_json: &str, containers: Option<Vec<String>>) {
let key = xxhash_rust::xxh3::xxh3_64(credential_json.as_bytes());
self.inner.entry(key).or_insert_with(|| AccessMapRequest::Azure {
credential_json: credential_json.to_string(),
containers,
});
}
pub fn record_azure_devops(&self, token: &str, organization: &str) {
let key =
xxhash_rust::xxh3::xxh3_64(format!("azure_devops|{organization}|{token}").as_bytes());
self.inner.entry(key).or_insert_with(|| AccessMapRequest::AzureDevops {
token: token.to_string(),
organization: organization.to_string(),
});
}
pub fn record_github(&self, token: &str) {
let key = xxhash_rust::xxh3::xxh3_64(format!("github|{token}").as_bytes());
self.inner
.entry(key)
.or_insert_with(|| AccessMapRequest::Github { token: token.to_string() });
}
pub fn record_gitlab(&self, token: &str) {
let key = xxhash_rust::xxh3::xxh3_64(format!("gitlab|{token}").as_bytes());
self.inner
.entry(key)
.or_insert_with(|| AccessMapRequest::Gitlab { token: token.to_string() });
}
pub fn into_requests(self) -> Vec<AccessMapRequest> {
self.inner.iter().map(|entry| entry.value().clone()).collect()
}
@ -65,6 +96,8 @@ pub async fn run_secret_validation(
num_jobs: usize,
range: Option<std::ops::Range<usize>>,
access_map: Option<AccessMapCollector>,
validation_timeout: Duration,
validation_retries: u32,
) -> Result<()> {
// ── 1. Concurrency & counters ───────────────────────────────────────────
let concurrency = if num_jobs > 0 { num_jobs } else { num_cpus::get() };
@ -185,6 +218,8 @@ pub async fn run_secret_validation(
&fail,
&cache_glob,
access_map.as_ref(),
validation_timeout,
validation_retries,
)
.await;
@ -258,6 +293,8 @@ pub async fn run_secret_validation(
let fail = fail_count.clone();
let cache_glob = cache.clone();
let access_map = access_map.clone();
let validation_timeout = validation_timeout;
let validation_retries = validation_retries;
async move {
let owned = matches_for_blob
@ -292,7 +329,6 @@ pub async fn run_secret_validation(
let fail = fail.clone();
let cache_glob = cache_glob.clone();
let access_map = access_map.clone();
async move {
validate_single(
&mut rep,
@ -306,6 +342,8 @@ pub async fn run_secret_validation(
&fail,
&cache_glob,
access_map.as_ref(),
validation_timeout,
validation_retries,
)
.await;
for d in &mut dups {
@ -388,6 +426,8 @@ async fn validate_single(
fail_count: &AtomicUsize,
cache2: &Arc<SkipMap<String, CachedResponse>>,
access_map: Option<&AccessMapCollector>,
validation_timeout: Duration,
validation_retries: u32,
) {
// Build key
let dep_vars_str = dep_vars
@ -438,8 +478,18 @@ async fn validate_single(
}
// If we reach here, we're the first task to validate this key
// Perform validation
let outcome = timeout(Duration::from_secs(30), async {
validate_single_match(om, parser, client, dep_vars, missing_deps, cache2).await
let outcome = timeout(validation_timeout, async {
validate_single_match(
om,
parser,
client,
dep_vars,
missing_deps,
cache2,
validation_timeout,
validation_retries,
)
.await
})
.await;
// Store result in cache
@ -497,8 +547,11 @@ fn build_cache_key(
}
fn maybe_record_access_map(om: &OwnedBlobMatch, collector: Option<&AccessMapCollector>) {
let is_gitlab_rule = om.rule.id().starts_with("kingfisher.gitlab.");
let validation_ok =
om.validation_success || (is_gitlab_rule && om.validation_response_status.is_success());
let collector = match collector {
Some(c) if om.validation_success => c,
Some(c) if validation_ok => c,
_ => return,
};
@ -530,7 +583,67 @@ fn maybe_record_access_map(om: &OwnedBlobMatch, collector: Option<&AccessMapColl
}
}
}
_ => {}
Some(Validation::AzureStorage) => {
let storage_key = captures
.iter()
.find(|(name, ..)| name == "TOKEN")
.map(|(_, value, ..)| value.clone())
.unwrap_or_default();
let storage_account =
utils::find_closest_variable(&captures, &storage_key, "TOKEN", "AZURENAME")
.unwrap_or_default();
let mut storage_account = storage_account;
if storage_account.is_empty() {
storage_account =
extract_azure_storage_account_from_body(&om.validation_response_body)
.unwrap_or_default();
}
let containers_hint =
extract_azure_storage_containers_from_body(&om.validation_response_body);
if !storage_account.is_empty() && !storage_key.is_empty() {
let creds_json = format!(
r#"{{"storage_account":"{}","storage_key":"{}"}}"#,
storage_account, storage_key
);
collector.record_azure(&creds_json, containers_hint);
}
}
_ => {
if om.rule.id().starts_with("kingfisher.github.") {
if let Some((_, value, ..)) = captures.iter().find(|(name, ..)| name == "TOKEN") {
if !value.is_empty() {
collector.record_github(value);
}
}
}
if om.rule.id().starts_with("kingfisher.azure.devops.") {
let token = captures
.iter()
.find(|(name, ..)| name == "TOKEN")
.map(|(_, value, ..)| value.clone())
.unwrap_or_default();
let mut organization =
utils::find_closest_variable(&captures, &token, "TOKEN", "AZURE_DEVOPS_ORG")
.unwrap_or_default();
if organization.is_empty() {
organization = extract_azure_devops_org_from_body(&om.validation_response_body)
.unwrap_or_default();
}
if !token.is_empty() && !organization.is_empty() {
collector.record_azure_devops(&token, &organization);
}
}
if is_gitlab_rule {
if let Some((_, value, ..)) = captures.iter().find(|(name, ..)| name == "TOKEN") {
if !value.is_empty() {
collector.record_gitlab(value);
}
}
}
}
}
}
@ -545,3 +658,40 @@ fn extract_akid_from_body(body: &validation_body::ValidationResponseBody) -> Opt
let text = validation_body::clone_as_string(body);
AKID_RE.find(&text).map(|m| m.as_str().to_string())
}
fn extract_azure_storage_account_from_body(
body: &validation_body::ValidationResponseBody,
) -> Option<String> {
static ACCOUNT_RE: once_cell::sync::Lazy<regex::Regex> = once_cell::sync::Lazy::new(|| {
regex::Regex::new(r"(?i)Account:\s*([a-z0-9]{3,24})").expect("valid regex")
});
let text = validation_body::clone_as_string(body);
ACCOUNT_RE.captures(&text).and_then(|caps| caps.get(1).map(|m| m.as_str().to_string()))
}
fn extract_azure_storage_containers_from_body(
body: &validation_body::ValidationResponseBody,
) -> Option<Vec<String>> {
static CONTAINERS_RE: once_cell::sync::Lazy<regex::Regex> = once_cell::sync::Lazy::new(|| {
regex::Regex::new(r"(?i)Containers:\s*(\[[^\]]*\])").expect("valid regex")
});
let text = validation_body::clone_as_string(body);
let capture = CONTAINERS_RE
.captures(&text)
.and_then(|caps| caps.get(1).map(|m| m.as_str().to_string()))?;
serde_json::from_str::<Vec<String>>(&capture).ok()
}
fn extract_azure_devops_org_from_body(
body: &validation_body::ValidationResponseBody,
) -> Option<String> {
static ORG_RE: once_cell::sync::Lazy<regex::Regex> = once_cell::sync::Lazy::new(|| {
regex::Regex::new(r#"(?i)https?://dev\.azure\.com/([a-z0-9][a-z0-9-]{0,61}[a-z0-9])"#)
.expect("valid regex")
});
let text = validation_body::clone_as_string(body);
ORG_RE.captures(&text).and_then(|caps| caps.get(1).map(|m| m.as_str().to_string()))
}
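Note on the new `record_github` / `record_gitlab` methods: they prefix the token with a provider name before hashing, so the same token string recorded for two providers yields two distinct entries, while repeated records for one provider collapse into one. A minimal sketch of that idea follows, assuming `std::collections::HashMap` and `DefaultHasher` as stand-ins for the concurrent map and `xxh3_64` used in the diff. `Collector`, `Request`, and `key_for` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical, reduced version of `AccessMapRequest`.
#[derive(Debug, Clone, PartialEq)]
enum Request {
    Github { token: String },
    Gitlab { token: String },
}

/// Hypothetical single-threaded stand-in for `AccessMapCollector`.
#[derive(Default)]
struct Collector {
    inner: HashMap<u64, Request>,
}

/// Hash "provider|token" so identical tokens for different providers
/// do not collide on the same key. DefaultHasher replaces xxh3 here.
fn key_for(provider: &str, token: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    format!("{provider}|{token}").hash(&mut hasher);
    hasher.finish()
}

impl Collector {
    fn record_github(&mut self, token: &str) {
        self.inner
            .entry(key_for("github", token))
            .or_insert_with(|| Request::Github { token: token.to_string() });
    }

    fn record_gitlab(&mut self, token: &str) {
        self.inner
            .entry(key_for("gitlab", token))
            .or_insert_with(|| Request::Gitlab { token: token.to_string() });
    }

    fn len(&self) -> usize {
        self.inner.len()
    }
}
```

Because `or_insert_with` only runs its closure when the key is absent, the first recorded request for a key wins and later duplicates are ignored.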

View file

@ -245,7 +245,7 @@ async fn render_template(
})
}
/// Validate a single match with a timeout of 60 seconds.
/// Validate a single match with a configurable timeout.
pub async fn validate_single_match(
m: &mut OwnedBlobMatch,
parser: &liquid::Parser,
@ -253,8 +253,10 @@ pub async fn validate_single_match(
dependent_variables: &FxHashMap<String, Vec<(String, OffsetSpan)>>,
missing_dependencies: &FxHashMap<String, Vec<String>>,
cache: &Cache,
validation_timeout: Duration,
validation_retries: u32,
) {
let timeout_result = time::timeout(Duration::from_secs(60), async {
let timeout_result = time::timeout(validation_timeout, async {
timed_validate_single_match(
m,
parser,
@ -262,6 +264,8 @@ pub async fn validate_single_match(
dependent_variables,
missing_dependencies,
cache,
validation_timeout,
validation_retries,
)
.await
})
@ -269,8 +273,10 @@ pub async fn validate_single_match(
if timeout_result.is_err() {
m.validation_success = false;
m.validation_response_body =
validation_body::from_string("Validation timed out after 60 seconds");
m.validation_response_body = validation_body::from_string(format!(
"Validation timed out after {} seconds",
validation_timeout.as_secs()
));
m.validation_response_status = StatusCode::REQUEST_TIMEOUT;
}
}
@ -285,6 +291,8 @@ async fn timed_validate_single_match<'a>(
dependent_variables: &FxHashMap<String, Vec<(String, OffsetSpan)>>,
missing_dependencies: &FxHashMap<String, Vec<String>>,
cache: &Cache,
validation_timeout: Duration,
validation_retries: u32,
) {
// ──────────────────────────────────────────────────────────
// 1. process-wide fingerprint de-dup
@ -383,6 +391,9 @@ async fn timed_validate_single_match<'a>(
match &rule_syntax.validation {
// ---------------------------------------------------- HTTP validator
Some(Validation::Http(http_validation)) => {
let request_timeout = validation_timeout;
let multipart_timeout = validation_timeout;
let max_retries: u32 = validation_retries;
// render URL
let url = match render_and_parse_url(
parser,
@ -409,6 +420,7 @@ async fn timed_validate_single_match<'a>(
&url,
&http_validation.request.headers,
&http_validation.request.body,
request_timeout,
parser,
&globals,
) {
@ -462,7 +474,7 @@ async fn timed_validate_single_match<'a>(
let exec_single = |builder: reqwest::RequestBuilder| async {
httpvalidation::retry_request(
builder,
1,
max_retries,
Duration::from_millis(500),
Duration::from_secs(2),
)
@ -477,7 +489,7 @@ async fn timed_validate_single_match<'a>(
.unwrap_or(reqwest::Method::GET);
let mut fresh_builder =
client.request(method, url.clone()).timeout(Duration::from_secs(5));
client.request(method, url.clone()).timeout(multipart_timeout);
if let Ok(mut headers) = httpvalidation::process_headers(
&http_validation.request.headers,
@ -546,7 +558,7 @@ async fn timed_validate_single_match<'a>(
httpvalidation::retry_multipart_request(
build_request,
1,
max_retries as usize,
Duration::from_millis(500),
Duration::from_secs(2),
)
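Note on the timeout message: the hard-coded "60 seconds" text is replaced by a message built from the configured `Duration`. A small sketch of that formatting, assuming the setting arrives as whole seconds (as `Duration::from_secs(args.validation_timeout)` suggests). `timeout_message` is a hypothetical helper name, not a function from the diff.

```rust
use std::time::Duration;

/// Hypothetical helper mirroring the message built in `validate_single_match`.
/// `as_secs` drops any sub-second part, so 1500 ms reports as "1 seconds".
fn timeout_message(validation_timeout: Duration) -> String {
    format!(
        "Validation timed out after {} seconds",
        validation_timeout.as_secs()
    )
}
```

Since the value is constructed with `Duration::from_secs`, the truncation in `as_secs` should not lose information in practice; it would only matter if a sub-second timeout were ever passed in.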

View file

@ -55,6 +55,7 @@ pub async fn validate_cdp_api_key(
&url,
&headers,
&None,
Duration::from_secs(10),
parser,
&liquid::Object::new(),
)

View file

@ -65,6 +65,7 @@ pub fn build_request_builder(
url: &Url,
headers: &BTreeMap<String, String>,
body: &Option<String>,
timeout: Duration,
parser: &liquid::Parser,
globals: &liquid::Object,
) -> Result<RequestBuilder, String> {
@ -72,7 +73,7 @@ pub fn build_request_builder(
debug!("{}", err_msg);
err_msg
})?;
let mut request_builder = client.request(method, url.clone()).timeout(Duration::from_secs(10));
let mut request_builder = client.request(method, url.clone()).timeout(timeout);
let custom_headers = process_headers(headers, parser, globals, url)
.map_err(|e| format!("Error processing headers: {}", e))?;
@ -199,6 +200,9 @@ where
return result;
}
retries += 1;
if retries > max_retries {
break;
}
let backoff = backoff_min.saturating_mul(2u32.pow(retries as u32)).min(backoff_max);
sleep(backoff).await;
}
@ -445,9 +449,17 @@ mod tests {
("Accept".to_string(), "application/custom".to_string()),
]);
let url = Url::from_str("https://example.com").unwrap();
let result =
build_request_builder(&client, "GET", &url, &headers, &None, &parser, &globals)
.expect("building request");
let result = build_request_builder(
&client,
"GET",
&url,
&headers,
&None,
Duration::from_secs(10),
&parser,
&globals,
)
.expect("building request");
let req = result.build().expect("finalizing request");
assert_eq!(
req.headers().get(header::ACCEPT).and_then(|v| v.to_str().ok()),
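Note on the retry loop: with the added `retries > max_retries` check, the loop sleeps once per allowed retry, and each sleep doubles from `backoff_min` until it is capped at `backoff_max`. The sketch below reproduces only that arithmetic, without any HTTP or async code. `backoff_for` and `backoff_schedule` are hypothetical names, and `saturating_pow` is used here as an assumption to avoid overflow for large retry counts (the diff uses `pow`).

```rust
use std::time::Duration;

/// Delay before retry number `retries` (1-based), doubling each time
/// and capped at `backoff_max`.
fn backoff_for(retries: u32, backoff_min: Duration, backoff_max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(retries);
    backoff_min.saturating_mul(factor).min(backoff_max)
}

/// Delays slept for a request that keeps failing, given `max_retries`.
fn backoff_schedule(
    max_retries: u32,
    backoff_min: Duration,
    backoff_max: Duration,
) -> Vec<Duration> {
    (1..=max_retries)
        .map(|r| backoff_for(r, backoff_min, backoff_max))
        .collect()
}
```

With the 500 ms minimum and 2 s maximum passed at the call sites in this diff, a retry count of 1 would give a single 1 s pause, and higher counts would level off at 2 s per pause.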

View file

@ -133,22 +133,22 @@ mod tests {
#[test]
fn parse_mysql_url_accepts_valid_urls() {
let url = "mysql://user:secret@example.com:3306/app";
let url = "mysql://user:secret@exmple.com:3306/app";
let opts = parse_mysql_url(url).expect("expected valid MySQL URL");
assert_eq!(opts.user(), Some("user"));
assert_eq!(opts.pass(), Some("secret"));
assert_eq!(opts.ip_or_hostname(), "example.com");
assert_eq!(opts.ip_or_hostname(), "exmple.com");
}
#[test]
fn parse_mysql_url_rejects_invalid_urls() {
for candidate in [
"", // empty
"mysql://user@example.com/app", // missing password
"mysql://:secret@example.com/app", // missing username
"mysql://user:secret@:3306/app", // missing host
"postgres://user:secret@example.com", // wrong scheme
"mysql://user:secret@example.com:70000/app", // invalid port
"", // empty
"mysql://user@exmple.com/app", // missing password
"mysql://:secret@exmple.com/app", // missing username
"mysql://user:secret@:3306/app", // missing host
"postgres://user:secret@exmple.com", // wrong scheme
"mysql://user:secret@exmple.com:70000/app", // invalid port
] {
assert!(
parse_mysql_url(candidate).is_err(),
@ -160,7 +160,7 @@ mod tests {
#[test]
fn parse_mysql_url_allows_trimming_whitespace() {
let opts =
parse_mysql_url(" mysql://user:secret@example.com:3306/app ").expect("trimmed URL");
parse_mysql_url(" mysql://user:secret@exmple.com:3306/app ").expect("trimmed URL");
assert_eq!(opts.user(), Some("user"));
assert_eq!(opts.pass(), Some("secret"));
}

View file

@ -242,13 +242,13 @@ mod tests {
#[test]
fn parse_accepts_postgis_scheme() {
let url = "postgis://postgres:secret@example.com:5432";
let url = "postgis://postgres:secret@exmple.com:5432";
assert!(parse_postgres_url(url).is_ok(), "postgis scheme should be accepted");
}
#[test]
fn parse_rejects_invalid_port() {
let url = "postgres://postgres:secret@example.com:70000";
let url = "postgres://postgres:secret@exmple.com:70000";
assert!(parse_postgres_url(url).is_err(), "invalid port should be rejected");
}
}

View file

@ -0,0 +1,63 @@
use clap::Parser;
use tempfile::tempdir;
use kingfisher::cli::{
commands::scan::ScanOperation,
global::{Command, CommandLineArgs},
};
#[test]
fn parse_git_clone_dir_and_keep_clones() -> anyhow::Result<()> {
let dir = tempdir()?;
let args = CommandLineArgs::try_parse_from([
"kingfisher",
"scan",
"--git-url",
"https://github.com/octocat/Hello-World.git",
"--git-clone-dir",
dir.path().to_str().unwrap(),
"--keep-clones",
"--no-update-check",
])?;
let command = match args.command {
Command::Scan(scan_args) => scan_args,
other => panic!("unexpected command parsed: {:?}", other),
};
let scan_args = match command.into_operation()? {
ScanOperation::Scan(scan_args) => scan_args,
op => panic!("expected scan operation, got {:?}", op),
};
assert_eq!(scan_args.input_specifier_args.git_clone_dir.as_deref(), Some(dir.path()));
assert!(scan_args.input_specifier_args.keep_clones);
Ok(())
}
#[test]
fn keep_clones_defaults_to_false() -> anyhow::Result<()> {
let args = CommandLineArgs::try_parse_from([
"kingfisher",
"scan",
"--git-url",
"https://github.com/octocat/Hello-World.git",
"--no-update-check",
])?;
let command = match args.command {
Command::Scan(scan_args) => scan_args,
other => panic!("unexpected command parsed: {:?}", other),
};
let scan_args = match command.into_operation()? {
ScanOperation::Scan(scan_args) => scan_args,
op => panic!("expected scan operation, got {:?}", op),
};
assert!(scan_args.input_specifier_args.git_clone_dir.is_none());
assert!(!scan_args.input_specifier_args.keep_clones);
Ok(())
}

View file

@ -77,7 +77,7 @@ fn dummy_commit(commit_id: &str) -> CommitMetadata {
CommitMetadata {
commit_id: oid,
committer_name: "tester".into(),
committer_email: "tester@example.com".into(),
committer_email: "tester@exmple.com".into(),
committer_timestamp: ts,
}
}

View file

@ -60,6 +60,10 @@ fn run_skiplist(skip_regex: Vec<String>, skip_skipword: Vec<String>) -> Result<u
input_specifier_args: InputSpecifierArgs {
path_inputs: vec![inputs_dir.join("a.txt")],
git_url: Vec::new(),
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -143,6 +147,7 @@ fn run_skiplist(skip_regex: Vec<String>, skip_skipword: Vec<String>) -> Result<u
git_repo_timeout: 1800,
output_args: OutputArgs { output: None, format: ReportOutputFormat::Pretty },
no_dedup: false,
view_report: false,
baseline_file: None,
manage_baseline: false,
skip_regex: skip_regex,
@ -152,6 +157,8 @@ fn run_skiplist(skip_regex: Vec<String>, skip_skipword: Vec<String>) -> Result<u
no_base64: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_retries: 1,
validation_timeout: 10,
};
let global_args = GlobalArgs {

View file

@ -55,6 +55,10 @@ fn test_bitbucket_remote_scan() -> Result<()> {
input_specifier_args: InputSpecifierArgs {
path_inputs: Vec::new(),
git_url: vec![git_url],
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -142,6 +146,7 @@ fn test_bitbucket_remote_scan() -> Result<()> {
git_repo_timeout: 1800,
output_args: OutputArgs { output: None, format: ReportOutputFormat::Pretty },
no_dedup: true,
view_report: false,
baseline_file: None,
manage_baseline: false,
skip_regex: Vec::new(),
@ -152,6 +157,8 @@ fn test_bitbucket_remote_scan() -> Result<()> {
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_retries: 1,
validation_timeout: 10,
};
let global_args = GlobalArgs {

View file

@ -71,6 +71,10 @@ rules:
input_specifier_args: InputSpecifierArgs {
path_inputs: vec![inputs_dir.join("a.txt"), inputs_dir.join("b.txt")],
git_url: Vec::new(),
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -162,6 +166,7 @@ rules:
git_repo_timeout: 1800, // 30 minutes
output_args: OutputArgs { output: None, format: ReportOutputFormat::Pretty },
no_dedup,
view_report: false,
baseline_file: None,
manage_baseline: false,
skip_regex: Vec::new(),
@ -172,6 +177,8 @@ rules:
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_retries: 1,
validation_timeout: 10,
};
let global_args = GlobalArgs {

View file

@ -58,6 +58,10 @@ fn test_github_remote_scan() -> Result<()> {
input_specifier_args: InputSpecifierArgs {
path_inputs: Vec::new(),
git_url: vec![git_url],
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -149,6 +153,7 @@ fn test_github_remote_scan() -> Result<()> {
git_repo_timeout: 1800, // 30 minutes
output_args: OutputArgs { output: None, format: ReportOutputFormat::Pretty },
no_dedup: true,
view_report: false,
baseline_file: None,
manage_baseline: false,
skip_regex: Vec::new(),
@ -159,6 +164,8 @@ fn test_github_remote_scan() -> Result<()> {
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_retries: 1,
validation_timeout: 10,
};
// Create global arguments
let global_args = GlobalArgs {

View file

@ -58,6 +58,10 @@ fn test_gitlab_remote_scan() -> Result<()> {
input_specifier_args: InputSpecifierArgs {
path_inputs: Vec::new(),
git_url: vec![git_url],
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -148,6 +152,7 @@ fn test_gitlab_remote_scan() -> Result<()> {
git_repo_timeout: 1800, // 30 minutes
output_args: OutputArgs { output: None, format: ReportOutputFormat::Pretty },
no_dedup: true,
view_report: false,
baseline_file: None,
manage_baseline: false,
skip_regex: Vec::new(),
@ -157,6 +162,8 @@ fn test_gitlab_remote_scan() -> Result<()> {
no_base64: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_retries: 1,
validation_timeout: 10,
};
let global_args = GlobalArgs {
@ -216,6 +223,10 @@ fn test_gitlab_remote_scan_no_history() -> Result<()> {
input_specifier_args: InputSpecifierArgs {
path_inputs: Vec::new(),
git_url: vec![git_url],
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -313,6 +324,9 @@ fn test_gitlab_remote_scan_no_history() -> Result<()> {
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report: false,
validation_retries: 1,
validation_timeout: 10,
};
let global_args = GlobalArgs {

View file

@ -43,6 +43,10 @@ async fn test_redact_hashes_finding_values() -> Result<()> {
input_specifier_args: InputSpecifierArgs {
path_inputs: vec![PathBuf::from("testdata/generic_secrets.py")],
git_url: Vec::new(),
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -125,6 +129,7 @@ async fn test_redact_hashes_finding_values() -> Result<()> {
git_repo_timeout: 1800,
output_args: OutputArgs { output: None, format: ReportOutputFormat::Pretty },
no_dedup: true,
view_report: false,
baseline_file: None,
manage_baseline: false,
skip_regex: Vec::new(),
@ -135,6 +140,8 @@ async fn test_redact_hashes_finding_values() -> Result<()> {
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_retries: 1,
validation_timeout: 10,
};
let global_args = GlobalArgs {

View file

@ -1,7 +1,4 @@
use std::{
env,
sync::{Arc, Mutex},
};
use std::sync::{Arc, Mutex};
use anyhow::Result;
use kingfisher::{
@ -49,6 +46,10 @@ impl TestContext {
input_specifier_args: InputSpecifierArgs {
path_inputs: Vec::new(),
git_url: Vec::new(),
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -134,6 +135,7 @@ impl TestContext {
git_repo_timeout: 1800,
output_args: OutputArgs { output: None, format: ReportOutputFormat::Pretty },
no_dedup: true,
view_report: false,
baseline_file: None,
manage_baseline: false,
skip_regex: Vec::new(),
@ -143,6 +145,8 @@ impl TestContext {
no_base64: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_retries: 1,
validation_timeout: 10,
};
let loaded = RuleLoader::from_rule_specifiers(&scan_args.rules).load(&scan_args)?;
@ -191,6 +195,10 @@ async fn test_scan_slack_messages() -> Result<()> {
input_specifier_args: InputSpecifierArgs {
path_inputs: Vec::new(),
git_url: Vec::new(),
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -286,6 +294,9 @@ async fn test_scan_slack_messages() -> Result<()> {
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
view_report: false,
validation_retries: 1,
validation_timeout: 10,
};
let global_args = GlobalArgs {

View file

@ -7,8 +7,8 @@ use tempfile::tempdir;
fn filters_invalid_mongodb_uri_even_without_validation() -> anyhow::Result<()> {
let dir = tempdir()?;
let file_path = dir.path().join("mongo.txt");
let valid = "mongodb://usr:pass@example.com:27017/db";
let invalid = "mongodb://usr:pass@example.com:abc/db";
let valid = "mongodb://usr:pass@exmple.com:27017/db";
let invalid = "mongodb://usr:pass@exmple.com:abc/db";
fs::write(&file_path, format!("{valid}\n{invalid}\n"))?;
Command::new(assert_cmd::cargo::cargo_bin!("kingfisher"))
@ -35,8 +35,8 @@ fn filters_invalid_mongodb_uri_even_without_validation() -> anyhow::Result<()> {
fn filters_invalid_postgres_uri_even_without_validation() -> anyhow::Result<()> {
let dir = tempdir()?;
let file_path = dir.path().join("postgres.txt");
let valid = "postgres://postgres:secret@example.com:5432";
let invalid = "postgres://postgres:secret@example.com:70000";
let valid = "postgres://postgres:secret@exmple.com:5432";
let invalid = "postgres://postgres:secret@exmple.com:70000";
fs::write(&file_path, format!("{valid}\n{invalid}\n"))?;
Command::new(assert_cmd::cargo::cargo_bin!("kingfisher"))
@ -63,8 +63,8 @@ fn filters_invalid_postgres_uri_even_without_validation() -> anyhow::Result<()>
fn filters_invalid_mysql_uri_even_without_validation() -> anyhow::Result<()> {
let dir = tempdir()?;
let file_path = dir.path().join("mysql.txt");
let valid = "mysql://user:secret@example.com:3306/app";
let invalid = "mysql://user:secret@example.com:70000/app";
let valid = "mysql://user:secret@exmple.com:3306/app";
let invalid = "mysql://user:secret@exmple.com:70000/app";
fs::write(&file_path, format!("{valid}\n{invalid}\n"))?;
Command::new(assert_cmd::cargo::cargo_bin!("kingfisher"))

View file

@ -113,6 +113,10 @@ async fn test_validation_cache_and_depvars() -> Result<()> {
input_specifier_args: InputSpecifierArgs {
path_inputs: vec![secret_file.clone()],
git_url: Vec::new(),
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -205,6 +209,7 @@ async fn test_validation_cache_and_depvars() -> Result<()> {
git_repo_timeout: 1800, // 30 minutes
output_args: OutputArgs { output: None, format: ReportOutputFormat::Pretty },
no_dedup: true, // keep duplicates so the cache is stressed
view_report: false,
baseline_file: None,
manage_baseline: false,
skip_regex: Vec::new(),
@ -215,6 +220,8 @@ async fn test_validation_cache_and_depvars() -> Result<()> {
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_retries: 1,
validation_timeout: 10,
};
/* --------------------------------------------------------- *

View file

@ -57,6 +57,10 @@ impl TestContext {
input_specifier_args: InputSpecifierArgs {
path_inputs: Vec::new(),
git_url: Vec::new(),
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -148,6 +152,7 @@ impl TestContext {
git_repo_timeout: 1800, // 30 minutes
output_args: OutputArgs { output: None, format: ReportOutputFormat::Pretty },
no_dedup: true,
view_report: false,
baseline_file: None,
manage_baseline: false,
skip_regex: Vec::new(),
@ -158,6 +163,8 @@ impl TestContext {
extra_ignore_comments: Vec::new(),
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_retries: 1,
validation_timeout: 10,
};
let loaded = RuleLoader::from_rule_specifiers(&scan_args.rules)
@ -186,6 +193,10 @@ impl TestContext {
input_specifier_args: InputSpecifierArgs {
path_inputs: vec![file_path.to_path_buf()],
git_url: Vec::new(),
git_clone_dir: None,
keep_clones: false,
repo_clone_limit: None,
include_contributors: false,
github_user: Vec::new(),
github_organization: Vec::new(),
github_exclude: Vec::new(),
@ -279,6 +290,7 @@ impl TestContext {
git_repo_timeout: 1800, // 30 minutes
output_args: OutputArgs { output: None, format: ReportOutputFormat::Pretty },
no_dedup: true,
view_report: false,
baseline_file: None,
manage_baseline: false,
skip_regex: Vec::new(),
@ -288,6 +300,8 @@ impl TestContext {
no_base64: false,
no_inline_ignore: false,
no_ignore_if_contains: false,
validation_retries: 1,
validation_timeout: 10,
};
let global_args = GlobalArgs {

View file

@ -35,7 +35,7 @@ fn scan_by_commit_and_branch_diff() -> anyhow::Result<()> {
let dir = tempdir()?;
let repo_dir = dir.path().join("repo");
let repo = Repository::init(&repo_dir)?;
let signature = Signature::now("tester", "tester@example.com")?;
let signature = Signature::now("tester", "tester@exmple.com")?;
// Commit an initial config file packed with known test secrets. We'll scan
// this commit directly via `--branch <commit-hash>` in the first assertion.
@ -147,7 +147,7 @@ fn setup_linear_repo_with_secrets() -> Result<(TempDir, std::path::PathBuf, Vec<
let dir = tempdir()?;
let repo_dir = dir.path().join("repo");
let repo = Repository::init(&repo_dir)?;
let sig = Signature::now("tester", "tester@example.com")?;
let sig = Signature::now("tester", "tester@exmple.com")?;
let secrets_path = repo_dir.join("secrets.txt");

Some files were not shown because too many files have changed in this diff.