diff --git a/CHANGELOG.md b/CHANGELOG.md index 77cd937..4e1349d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,11 @@ All notable changes to this project will be documented in this file. +## [v1.55.0] +- Added first-class Azure Repos support, including CLI commands, enumeration, and documentation updates +- Improved performance of tree-sitter parsing +- Updated Windows build script to ensure static binary is produced + ## [v1.54.0] - Added first-class Gitea support, including CLI commands, environment-based authentication, documentation, and integration with scans and repository enumeration. - Populate the finding path from git blob metadata so history-derived secrets display their file location instead of an empty path diff --git a/Cargo.toml b/Cargo.toml index b743646..b8dd860 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -10,8 +10,8 @@ publish = false [package] name = "kingfisher" -version = "1.54.0" -description = "MongoDB's blazingly fast secret scanning and validation tool" +version = "1.55.0" +description = "MongoDB's blazingly fast and accurate secret scanning and validation tool" edition.workspace = true rust-version.workspace = true license.workspace = true @@ -32,7 +32,7 @@ assets = [ [package.metadata.generate-rpm] package = "kingfisher" -summary = "MongoDB's blazingly fast secret scanning and validation tool" +summary = "MongoDB's blazingly fast and accurate secret scanning and validation tool" license = "Apache-2.0" url = "https://github.com/mongodb/kingfisher" assets = [ @@ -229,7 +229,7 @@ incremental = false [profile.dev] opt-level = 0 -debug = true +# debug = true incremental = true codegen-units = 256 diff --git a/README.md b/README.md index ec89f63..14faccf 100644 --- a/README.md +++ b/README.md @@ -5,29 +5,26 @@ [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0) -Kingfisher is a blazingly fast secret‑scanning and live validation tool built in Rust. 
It combines Intel’s hardware‑accelerated Hyperscan regex engine with language‑aware parsing via Tree‑Sitter, and **ships with hundreds of built‑in rules** to detect, validate, and triage secrets before they ever reach production +Kingfisher is a blazingly fast secret‑scanning and live validation tool built in Rust. It combines Intel’s hardware‑accelerated Hyperscan regex engine with language‑aware source code parsing, and **ships with hundreds of built‑in rules** to detect, validate, and triage secrets before they ever reach production

Originally forked from Praetorian’s Nosey Parker, Kingfisher **adds** live cloud-API validation; many more targets (GitLab, BitBucket, Gitea, S3, Docker, Jira, Confluence, Slack); compressed-file extraction and scanning; baseline and allowlist controls; language-aware detection (~20 languages); and a native Windows binary. See [Origins and Divergence](#origins-and-divergence) for details. - ## Key Features -- **Multiple Scan Targets**: -

- Files & Dirs - Local Git - GitHub - GitLab - Bitbucket - Gitea -
- Docker - Jira - Confluence - Slack - AWS S3 -

+### Multiple Scan Targets +
+| Files / Dirs | Local Git | GitHub | GitLab | Azure DevOps | Bitbucket | Gitea |
+|:-------------:|:----------:|:------:|:------:|:-------------:|:----------:|:------:|
+| Files / Dirs | Local Git | GitHub | GitLab | Azure DevOps | Bitbucket | Gitea |
+
+| Docker | Jira | Confluence | Slack | AWS S3 |
+|:------:|:----:|:-----------:|:-----:|:------:|
+| Docker | Jira | Confluence | Slack | AWS S3 |
+
+
+ +### Performance, Accuracy, and Hundreds of Rules - **Performance**: multithreaded, Hyperscan‑powered scanning built for huge codebases - **Extensible rules**: hundreds of built-in detectors plus YAML-defined custom rules ([docs/RULES.md](/docs/RULES.md)) - **Broad AI SaaS coverage**: finds and validates tokens for OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Stability AI, Replicate, xAI (Grok), Ollama, Langchain, Perplexity, Weights & Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, Together.ai, Zhipu, and many more @@ -46,6 +43,8 @@ See ([docs/COMPARISON.md](docs/COMPARISON.md)) - [Kingfisher](#kingfisher) - [Key Features](#key-features) + - [Multiple Scan Targets](#multiple-scan-targets) + - [Performance, Accuracy, and Hundreds of Rules](#performance-accuracy-and-hundreds-of-rules) - [Benchmark Results](#benchmark-results) - [Getting Started](#getting-started) - [Installation](#installation) @@ -67,25 +66,30 @@ See ([docs/COMPARISON.md](docs/COMPARISON.md)) - [Scan while ignoring likely test files](#scan-while-ignoring-likely-test-files) - [Exclude specific paths](#exclude-specific-paths) - [Scan changes in CI pipelines](#scan-changes-in-ci-pipelines) - - [Scan an S3 bucket](#scan-an-s3-bucket) - - [Scanning Docker Images](#scanning-docker-images) - - [Scanning GitHub](#scanning-github) - - [Scan GitHub organisation (requires `KF_GITHUB_TOKEN`)](#scan-github-organisation-requires-kf_github_token) + - [ Scanning an AWS S3 Bucket](#-scanning-an-aws-s3-bucket) + - [ Scanning Docker Images](#-scanning-docker-images) + - [ Scanning GitHub](#-scanning-github) + - [Scan GitHub organization (requires `KF_GITHUB_TOKEN`)](#scan-github-organization-requires-kf_github_token) - [Skip specific GitHub repositories during enumeration](#skip-specific-github-repositories-during-enumeration) - [Scan remote GitHub repository](#scan-remote-github-repository) - - [Scanning GitLab](#scanning-gitlab) + - [ Scanning GitLab](#-scanning-gitlab) - [Scan GitLab group (requires 
`KF_GITLAB_TOKEN`)](#scan-gitlab-group-requires-kf_gitlab_token) - [Scan GitLab user](#scan-gitlab-user) - [Skip specific GitLab projects during enumeration](#skip-specific-gitlab-projects-during-enumeration) - [Scan remote GitLab repository by URL](#scan-remote-gitlab-repository-by-url) - [List GitLab repositories](#list-gitlab-repositories) - - [Scanning Gitea](#scanning-gitea) + - [ Scanning Azure Repos](#-scanning-azure-repos) + - [Scan Azure DevOps organization or collection (requires `KF_AZURE_TOKEN` or `KF_AZURE_PAT`)](#scan-azure-devops-organization-or-collection-requires-kf_azure_token-or-kf_azure_pat) + - [Scan specific Azure DevOps projects](#scan-specific-azure-devops-projects) + - [Skip specific Azure repositories during enumeration](#skip-specific-azure-repositories-during-enumeration) + - [List Azure repositories](#list-azure-repositories) + - [ Scanning Gitea](#-scanning-gitea) - [Scan Gitea organization (requires `KF_GITEA_TOKEN`)](#scan-gitea-organization-requires-kf_gitea_token) - [Scan Gitea user](#scan-gitea-user) - [Skip specific Gitea repositories during enumeration](#skip-specific-gitea-repositories-during-enumeration) - [Scan remote Gitea repository by URL](#scan-remote-gitea-repository-by-url) - [List Gitea repositories](#list-gitea-repositories) - - [Scanning Bitbucket](#scanning-bitbucket) + - [ Scanning Bitbucket](#-scanning-bitbucket) - [Scan Bitbucket workspace](#scan-bitbucket-workspace) - [Scan Bitbucket user](#scan-bitbucket-user) - [Skip specific Bitbucket repositories during enumeration](#skip-specific-bitbucket-repositories-during-enumeration) @@ -93,12 +97,12 @@ See ([docs/COMPARISON.md](docs/COMPARISON.md)) - [List Bitbucket repositories](#list-bitbucket-repositories) - [Authenticate to Bitbucket](#authenticate-to-bitbucket) - [Self-hosted Bitbucket Server](#self-hosted-bitbucket-server) - - [Scanning Jira](#scanning-jira) + - [ Scanning Jira](#-scanning-jira) - [Scan Jira issues matching a JQL 
query](#scan-jira-issues-matching-a-jql-query) - [Scan the last 1,000 Jira issues:](#scan-the-last-1000-jira-issues) - - [Scanning Confluence](#scanning-confluence) + - [ Scanning Confluence](#-scanning-confluence) - [Scan Confluence pages matching a CQL query](#scan-confluence-pages-matching-a-cql-query) - - [Scanning Slack](#scanning-slack) + - [ Scanning Slack](#-scanning-slack) - [Scan Slack messages matching a search query](#scan-slack-messages-matching-a-search-query) - [Environment Variables for Tokens](#environment-variables-for-tokens) - [Exit Codes](#exit-codes) @@ -394,7 +398,8 @@ kingfisher scan ./my-project \ --exclude tests \ -v ``` -## Scan an S3 bucket + +## Scanning an AWS S3 Bucket You can scan S3 objects directly: ```bash @@ -445,7 +450,8 @@ docker run --rm \ ghcr.io/mongodb/kingfisher:latest \ scan --s3-bucket bucket-name ``` -## Scanning Docker Images + +## Scanning Docker Images Kingfisher will first try to use any locally available image, then fall back to pulling via OCI. 
@@ -475,9 +481,9 @@ kingfisher scan --docker-image some-private-registry.dkr.ecr.us-east-1.amazonaws kingfisher scan --docker-image private.registry.example.com/my-image:tag ``` -## Scanning GitHub +## Scanning GitHub -### Scan GitHub organisation (requires `KF_GITHUB_TOKEN`) +### Scan GitHub organization (requires `KF_GITHUB_TOKEN`) ```bash kingfisher scan --github-organization my-org @@ -517,7 +523,7 @@ KF_GITHUB_TOKEN="ghp_…" kingfisher scan --git-url https://github.com/org/priva --- -## Scanning GitLab +## Scanning GitLab ### Scan GitLab group (requires `KF_GITLAB_TOKEN`) @@ -573,8 +579,48 @@ kingfisher gitlab repos list --group my-group --include-subgroups # skip specific projects when listing or scanning (supports glob patterns) kingfisher gitlab repos list --group my-group --gitlab-exclude my-group/**/legacy-* ``` +## Scanning Azure Repos -## Scanning Gitea +### Scan Azure DevOps organization or collection (requires `KF_AZURE_TOKEN` or `KF_AZURE_PAT`) + +```bash +kingfisher scan --azure-organization my-org + +# Azure DevOps Server example +KF_AZURE_PAT="pat" kingfisher scan --azure-organization DefaultCollection --azure-base-url https://ado.internal.example/tfs/ +``` + +### Scan specific Azure DevOps projects + +Projects are specified as `ORGANIZATION/PROJECT`. Repeat the flag for multiple projects. + +```bash +kingfisher scan --azure-project my-org/payments --azure-project my-org/core-platform +``` + +### Skip specific Azure repositories during enumeration + +Repeat `--azure-exclude` to ignore repositories when scanning organizations or projects. +Use identifiers like `ORGANIZATION/PROJECT/REPOSITORY`. Repositories that share the same +name as their project can be excluded with `ORGANIZATION/PROJECT`, and gitignore-style +patterns such as `my-org/*/archive-*` are also supported. 
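The identifier rules described above can be sketched in a few lines of Rust. This is a minimal, std-only illustration under stated assumptions, not Kingfisher's actual implementation: the function name `normalize_exclusion` is hypothetical, URL-style inputs containing `_git` are not handled, and the glob matching itself is left to whatever matcher consumes the normalized pattern.

```rust
/// Returns true when the pattern contains glob metacharacters.
fn looks_like_glob(pattern: &str) -> bool {
    pattern.contains('*') || pattern.contains('?') || pattern.contains('[')
}

/// Hypothetical helper: normalizes an exclusion identifier to a lowercase
/// `organization/project/repository` string (or glob pattern).
fn normalize_exclusion(raw: &str) -> Option<String> {
    // Split on '/', drop empty segments, and compare case-insensitively.
    let segments: Vec<String> = raw
        .trim()
        .trim_matches('/')
        .split('/')
        .filter(|s| !s.is_empty())
        .map(|s| s.to_lowercase())
        .collect();

    match segments.as_slice() {
        // ORGANIZATION/PROJECT with a glob: cover every repository below it.
        [org, project] if looks_like_glob(project) => Some(format!("{org}/{project}/**")),
        // ORGANIZATION/PROJECT: the repository shares the project's name.
        [org, project] => {
            let name = project.trim_end_matches(".git");
            Some(format!("{org}/{name}/{name}"))
        }
        // ORGANIZATION/PROJECT/REPOSITORY, possibly with extra owner segments.
        [owner @ .., project, repo] if !owner.is_empty() => {
            let repo = repo.trim_end_matches(".git");
            Some(format!("{}/{project}/{repo}", owner.join("/")))
        }
        // A bare organization (or empty input) is not a valid exclusion.
        _ => None,
    }
}
```

Under these assumptions, `my-org/payments` expands to `my-org/payments/payments`, while `my-org/**/archive-*` passes through unchanged as a pattern for a glob matcher.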
+ +```bash +kingfisher scan --azure-organization my-org \ + --azure-exclude my-org/payments/legacy-service \ + --azure-exclude my-org/**/archive-* +``` + +### List Azure repositories + +```bash +kingfisher azure repos list --organization my-org +# list repositories for specific projects +kingfisher azure repos list --project my-org/app --project my-org/api +# skip specific repositories while listing (supports glob patterns) +kingfisher azure repos list --organization my-org --azure-exclude my-org/**/experimental-* +``` +## Scanning Gitea ### Scan Gitea organization (requires `KF_GITEA_TOKEN`) @@ -626,9 +672,7 @@ KF_GITEA_TOKEN="gtoken" kingfisher gitea repos list --all-gitea-organizations # self-hosted example KF_GITEA_TOKEN="gtoken" kingfisher gitea repos list --user johndoe --gitea-api-url https://gitea.internal.example/api/v1/ ``` - -## Scanning Bitbucket - +## Scanning Bitbucket ### Scan Bitbucket workspace ```bash @@ -700,8 +744,7 @@ Use `--bitbucket-api-url` to point Kingfisher at your server's REST endpoint, fo `https://bitbucket.example.com/rest/api/1.0/`. Provide credentials with `--bitbucket-username` and `--bitbucket-token`, and pass `--ignore-certs` when connecting to HTTP or otherwise insecure instances. - -## Scanning Jira +## Scanning Jira ### Scan Jira issues matching a JQL query @@ -720,8 +763,7 @@ KF_JIRA_TOKEN="token" kingfisher scan \ --max-results 1000 ``` -## Scanning Confluence - +## Scanning Confluence ### Scan Confluence pages matching a CQL query ```bash @@ -746,8 +788,7 @@ Generate a personal access token and set it in the `KF_CONFLUENCE_TOKEN` environ To use basic authentication instead, also set `KF_CONFLUENCE_USER` to your Confluence email address; Kingfisher will then send the username and `KF_CONFLUENCE_TOKEN` as a Basic auth header. If the server responds with a redirect to a login page, the credentials are invalid or lack the required permissions. 
-## Scanning Slack - +## Scanning Slack ### Scan Slack messages matching a search query ```bash @@ -769,6 +810,8 @@ KF_SLACK_TOKEN="xoxp-1234..." kingfisher scan \ | `KF_GITLAB_TOKEN` | GitLab Personal Access Token | | `KF_GITEA_TOKEN` | Gitea Personal Access Token | | `KF_GITEA_USERNAME` | Username for private Gitea clones (used with `KF_GITEA_TOKEN`) | +| `KF_AZURE_TOKEN` / `KF_AZURE_PAT` | Azure DevOps Personal Access Token | +| `KF_AZURE_USERNAME` | Username to use with Azure DevOps PATs (defaults to `pat` when unset) | | `KF_BITBUCKET_USERNAME` | Bitbucket username for basic authentication | | `KF_BITBUCKET_APP_PASSWORD` / `KF_BITBUCKET_TOKEN` | Bitbucket app password or server token | | `KF_BITBUCKET_OAUTH_TOKEN` | Bitbucket OAuth or PAT token | @@ -971,14 +1014,16 @@ kingfisher scan --help Kingfisher began as a fork of Praetorian’s Nosey Parker, as our experiment with adding live validation support and embedding that validation directly inside each rule. Since that initial fork, it has diverged heavily from Nosey Parker: -- Replaced the SQLite datastore with an in-memory store + Bloom filter -- Collapsed the workflow into a single scan-and-report phase with direct JSON/BSON/SARIF outputs -- Added Tree-Sitter parsing on top of Hyperscan for deeper language-aware detection -- Removed datastore-driven reporting/annotations in favor of live validation, baselines, allowlists, and compressed-file extraction +- Added support for live validation of discovered secrets +- Added hundreds of new rules +- Added support for analyzing compressed files +- Added support for building "baselines" to allow for only reporting on newly discovered secrets +- Added Tree-Sitter based source code parsing on top of Hyperscan for deeper language-aware detection - Expanded support for new targets (GitLab, BitBucket, Gitea, Jira, Confluence, Slack, S3, Docker, etc.) 
+- Replaced the SQLite datastore with an in-memory store + Bloom filter +- Collapsed the workflow into a single scan-and-report phase with direct JSON/BSON/SARIF outputs - Delivered cross-platform builds, including native Windows - # Roadmap - More rules diff --git a/buildwin.bat b/buildwin.bat index deed257..ded614e 100644 --- a/buildwin.bat +++ b/buildwin.bat @@ -86,9 +86,10 @@ if /I not "%LOCALAPPDATA:~1,1%"==":" ( ) REM ── Install Hyperscan ------------------------------------------------------ -echo Installing Hyperscan via vcpkg... +set "VCPKG_TRIPLET=x64-windows-static" +echo Installing Hyperscan (%VCPKG_TRIPLET%) via vcpkg... pushd "%HOMEDRIVE%\vcpkg" REM ► work inside the vcpkg root -"%VCPKG_EXE%" install hyperscan:x64-windows || ( +"%VCPKG_EXE%" install hyperscan:%VCPKG_TRIPLET% || ( echo ERROR: vcpkg install failed. popd exit /b 1 @@ -97,7 +98,7 @@ popd set "LIBHS_NO_PKG_CONFIG=1" REM Point vectorscan‑rs‑sys at the Hyperscan install -set "HYPERSCAN_ROOT=%HOMEDRIVE%\vcpkg\installed\x64-windows" +set "HYPERSCAN_ROOT=%HOMEDRIVE%\vcpkg\installed\%VCPKG_TRIPLET%" set "LIB=%HYPERSCAN_ROOT%\lib;%LIB%" set "INCLUDE=%HYPERSCAN_ROOT%\include;%INCLUDE%" @@ -113,7 +114,9 @@ if %ERRORLEVEL% NEQ 0 ( echo Rust is already installed. ) -echo Building for Windows x64... +set "RUSTFLAGS=%RUSTFLAGS% -C target-feature=+crt-static" + +echo Building static Windows x64 binary... cargo build --release --target x86_64-pc-windows-msvc || ( echo Cargo build failed. exit /b 1 @@ -144,4 +147,4 @@ echo Archives in target\release: dir /b *.zip 2>nul || echo None found. endlocal -exit /b 0 \ No newline at end of file +exit /b 0 diff --git a/data/rules/azuredevops.yml b/data/rules/azuredevops.yml index 4188999..a607bc9 100644 --- a/data/rules/azuredevops.yml +++ b/data/rules/azuredevops.yml @@ -1,13 +1,27 @@ rules: - - name: Azure DevOps Personal Access Token + - name: Azure DevOps Organization id: kingfisher.azure.devops.1 pattern: | (?xi) \b - azure - (?:.|[\n\r]){0,32}? 
+ dev\.azure\.com/ ( - [a-z0-9]{75}AZDO[a-z0-9]{5} + [a-z0-9][a-z0-9-]{0,61}[a-z0-9] + ) + confidence: medium + min_entropy: 2.5 + visible: false + examples: + - https://dev.azure.com/contoso + - dev.azure.com/somebody123 + + - name: Azure DevOps Personal Access Token + id: kingfisher.azure.devops.2 + pattern: | + (?xi) + \b + ( + [a-z0-9]{75,76}AZDO[a-z0-9]{4,5} ) \b min_entropy: 3 @@ -17,16 +31,20 @@ rules: references: - https://learn.microsoft.com/en-us/rest/api/azure/devops/profile/profiles/get?view=azure-devops-rest-7.1&tabs=HTTP - https://learn.microsoft.com/en-us/azure/devops/release-notes/2024/general/sprint-241-update + depends_on_rule: + - rule_id: kingfisher.azure.devops.1 + variable: AZURE_DEVOPS_ORG validation: type: Http content: request: headers: Authorization: 'Basic {{ ":" | append: TOKEN | b64enc }}' + Accept: application/json method: GET - url: https://app.vssps.visualstudio.com/_apis/profile/profiles/me?api-version=7.1-preview.1 + url: "https://dev.azure.com/{{ AZURE_DEVOPS_ORG | split: '/' | last }}/_apis/projects?api-version=7.1-preview.1" response_matcher: - report_response: true - type: StatusMatch status: - - 200 + - 200 \ No newline at end of file diff --git a/data/rules/datadog.yml b/data/rules/datadog.yml index 718b282..79ff171 100644 --- a/data/rules/datadog.yml +++ b/data/rules/datadog.yml @@ -4,8 +4,8 @@ rules: pattern: | (?xi) \b - (?:datadog|dd-|dd_) - (?:.|[\n\r]){0,16}? + datadog + (?:.|[\n\r]){0,64}? (?:SECRET|PRIVATE|ACCESS|KEY|TOKEN) (?:.|[\n\r]){0,32}? \b @@ -16,7 +16,6 @@ rules: min_entropy: 3.3 confidence: medium examples: - - dd-apikey-dd52c29224affe29d163c6bf99e5c34f - datadog-secrettoken-0024a29224affe29d173c0bf99e5a89d references: - https://docs.datadoghq.com/account_management/api-app-keys/ @@ -45,7 +44,7 @@ rules: (?xi) \b datadog - (?:.|[\n\r]){0,16}? + (?:.|[\n\r]){0,64}? (?:SECRET|PRIVATE|ACCESS|KEY|TOKEN) (?:.|[\n\r]){0,16}? 
\b diff --git a/docs/assets/icons/aws-s3.svg b/docs/assets/icons/aws-s3.svg new file mode 100644 index 0000000..3f63be5 --- /dev/null +++ b/docs/assets/icons/aws-s3.svg @@ -0,0 +1,34 @@ + + + + + + + + + + + + + + + + + + diff --git a/docs/assets/icons/azure-devops.svg b/docs/assets/icons/azure-devops.svg new file mode 100644 index 0000000..d5db277 --- /dev/null +++ b/docs/assets/icons/azure-devops.svg @@ -0,0 +1,2 @@ + + \ No newline at end of file diff --git a/docs/assets/icons/bitbucket.svg b/docs/assets/icons/bitbucket.svg new file mode 100644 index 0000000..38af1ce --- /dev/null +++ b/docs/assets/icons/bitbucket.svg @@ -0,0 +1,15 @@ + + + + + + + + Bitbucket-blue + + + + + + + \ No newline at end of file diff --git a/docs/assets/icons/confluence.svg b/docs/assets/icons/confluence.svg new file mode 100644 index 0000000..22249e1 --- /dev/null +++ b/docs/assets/icons/confluence.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/assets/icons/docker.svg b/docs/assets/icons/docker.svg new file mode 100644 index 0000000..0a9c6b0 --- /dev/null +++ b/docs/assets/icons/docker.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/assets/icons/files.svg b/docs/assets/icons/files.svg new file mode 100644 index 0000000..1ebd008 --- /dev/null +++ b/docs/assets/icons/files.svg @@ -0,0 +1,67 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/docs/assets/icons/gitea.svg b/docs/assets/icons/gitea.svg new file mode 100644 index 0000000..7ed0012 --- /dev/null +++ b/docs/assets/icons/gitea.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/assets/icons/github.svg b/docs/assets/icons/github.svg new file mode 100644 index 0000000..a8d1174 --- /dev/null +++ b/docs/assets/icons/github.svg @@ -0,0 +1,3 @@ + + + diff --git a/docs/assets/icons/gitlab.svg b/docs/assets/icons/gitlab.svg new file mode 100644 index 0000000..abe3f37 --- /dev/null +++ 
b/docs/assets/icons/gitlab.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/assets/icons/jira.svg b/docs/assets/icons/jira.svg new file mode 100644 index 0000000..57a68f0 --- /dev/null +++ b/docs/assets/icons/jira.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/assets/icons/local-git.svg b/docs/assets/icons/local-git.svg new file mode 100644 index 0000000..994fb2c --- /dev/null +++ b/docs/assets/icons/local-git.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/assets/icons/slack.svg b/docs/assets/icons/slack.svg new file mode 100644 index 0000000..fb55f72 --- /dev/null +++ b/docs/assets/icons/slack.svg @@ -0,0 +1,6 @@ + + + + + + diff --git a/src/azure.rs b/src/azure.rs new file mode 100644 index 0000000..9a3b6d5 --- /dev/null +++ b/src/azure.rs @@ -0,0 +1,675 @@ +use std::{ + collections::{HashMap, HashSet}, + env, + path::{Path, PathBuf}, + sync::{Arc, Mutex}, + time::Duration, +}; + +// NOTE: We continue to issue the small number of Azure DevOps Git REST calls we need +// directly through `reqwest` instead of depending on the `azure_devops_rust_api` +// crate. The SDK does not yet expose stable coverage for wiki repositories or the +// preview API surfaces we rely on, while the raw requests keep the binary lean and +// let us opt into newer API versions as Microsoft rolls them out. 
+ +use anyhow::{anyhow, Context, Result}; +use globset::{Glob, GlobSet, GlobSetBuilder}; +use indicatif::{ProgressBar, ProgressStyle}; +use serde::Deserialize; +use tracing::warn; +use url::{form_urlencoded, Url}; + +use crate::{findings_store, git_url::GitUrl}; + +const API_VERSION: &str = "7.1-preview.1"; + +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub enum RepoType { + All, + Source, + Fork, +} + +impl RepoType { + fn allows(self, is_fork: bool) -> bool { + match self { + RepoType::All => true, + RepoType::Source => !is_fork, + RepoType::Fork => is_fork, + } + } +} + +#[derive(Debug, Clone)] +pub struct RepoSpecifiers { + pub organization: Vec<String>, + pub project: Vec<String>, + pub all_projects: bool, + pub repo_filter: RepoType, + pub exclude_repos: Vec<String>, +} + +impl RepoSpecifiers { + pub fn is_empty(&self) -> bool { + self.organization.is_empty() && self.project.is_empty() + } +} + +#[derive(Debug)] +struct ExcludeMatcher { + exact: HashSet<String>, + globs: Option<GlobSet>, +} + +impl ExcludeMatcher { + fn matches(&self, name: &str) -> bool { + let candidate = name.to_lowercase(); + if self.exact.contains(&candidate) { + return true; + } + if let Some(globs) = &self.globs { + return globs.is_match(&candidate); + } + false + } + + fn is_empty(&self) -> bool { + self.exact.is_empty() && self.globs.is_none() + } +} + +fn looks_like_glob(pattern: &str) -> bool { + pattern.contains('*') || pattern.contains('?') || pattern.contains('[') +} + +fn encode_segment(segment: &str) -> String { + form_urlencoded::byte_serialize(segment.as_bytes()).collect::<String>() +} + +fn normalize_repo_identifier(parts: &[String]) -> Option<String> { + if parts.len() < 3 { + return None; + } + let repo = parts.last()?.trim().trim_matches('/'); + let project = parts.get(parts.len() - 2)?.trim().trim_matches('/'); + if repo.is_empty() || project.is_empty() { + return None; + } + let owner_segments = &parts[..parts.len() - 2]; + let mut normalized: Vec<String> = + 
owner_segments.iter().map(|s| 
s.trim().trim_matches('/').to_lowercase()).collect(); + normalized.retain(|s| !s.is_empty()); + normalized.push(project.to_lowercase()); + normalized.push(repo.trim_end_matches(".git").to_lowercase()); + if normalized.is_empty() { + None + } else { + Some(normalized.join("/")) + } +} + +fn parse_repo_identifier_from_path(path: &str) -> Option<String> { + let segments: Vec<String> = path + .trim_matches('/') + .split('/') + .filter(|s| !s.is_empty()) + .map(|s| s.to_string()) + .collect(); + + if segments.is_empty() { + return None; + } + + if segments.len() == 2 { + let org = segments.first()?.trim().trim_matches('/'); + let project = segments.last()?.trim().trim_matches('/'); + if org.is_empty() || project.is_empty() { + return None; + } + + let org = org.to_lowercase(); + let project_raw = project.to_string(); + if looks_like_glob(&project_raw) { + let pattern = format!("{org}/{}/**", project_raw.to_lowercase()); + return Some(pattern); + } + + let project_normalized = project_raw.trim_end_matches(".git").to_lowercase(); + let repo = project_normalized.clone(); + return Some(format!("{org}/{project_normalized}/{repo}")); + } + + if segments.len() < 3 { + return None; + } + + // Case 1: Azure URL-style with "_git" marker: .../<project>/_git/<repo> + if segments[segments.len().saturating_sub(2)] == "_git" { + let mut trimmed = segments.clone(); + let repo = trimmed.pop()?; // <repo> + trimmed.pop()?; // drop "_git" + trimmed.push(repo); // .../<project>/<repo> + return normalize_repo_identifier(&trimmed); + } + + // Case 2: Simple path (and glob-friendly): .../<project>/<repo> + // Accept as-is so things like "org/*/repo" work. 
+ normalize_repo_identifier(&segments) +} + +fn parse_repo_identifier_from_url(remote_url: &str) -> Option<String> { + let url = Url::parse(remote_url).ok()?; + if let Some(path) = url.path_segments() { + let segments: Vec<String> = + path.filter(|segment| !segment.is_empty()).map(|segment| segment.to_string()).collect(); + if segments.len() < 3 { + return None; + } + let mut trimmed = segments.clone(); + let repo = trimmed.pop()?; + let marker = trimmed.pop()?; + if marker != "_git" { + return None; + } + trimmed.push(repo); + normalize_repo_identifier(&trimmed) + } else { + None + } +} + +fn parse_excluded_repo(raw: &str) -> Option<String> { + let trimmed = raw.trim(); + if trimmed.is_empty() { + return None; + } + + if let Some(name) = parse_repo_identifier_from_url(trimmed) { + return Some(name); + } + + if let Some(idx) = trimmed.rfind(':') { + if let Some(name) = parse_repo_identifier_from_path(&trimmed[idx + 1..]) { + return Some(name); + } + } + + parse_repo_identifier_from_path(trimmed) +} + +fn build_exclude_matcher(exclude_repos: &[String]) -> ExcludeMatcher { + let mut exact = HashSet::new(); + let mut glob_builder = GlobSetBuilder::new(); + let mut has_glob = false; + + for raw in exclude_repos { + match parse_excluded_repo(raw) { + Some(name) => { + let normalized = name.to_lowercase(); + if looks_like_glob(&normalized) { + match Glob::new(&normalized) { + Ok(glob) => { + glob_builder.add(glob); + has_glob = true; + } + Err(err) => { + warn!("Ignoring invalid Azure exclusion pattern '{raw}': {err}"); + exact.insert(normalized); + } + } + } else { + exact.insert(normalized); + } + } + None => { + warn!("Ignoring invalid Azure exclusion '{raw}' (expected organization/project[/repository])"); + } + } + } + + let globs = if has_glob { + match glob_builder.build() { + Ok(set) => Some(set), + Err(err) => { + warn!("Failed to build Azure exclusion patterns: {err}"); + None + } + } + } else { + None + }; + + ExcludeMatcher { exact, globs } +} + +fn 
should_exclude_repo(repo_url: 
&str, excludes: &ExcludeMatcher) -> bool { + if excludes.is_empty() { + return false; + } + if let Some(name) = parse_repo_identifier_from_url(repo_url) { + return excludes.matches(&name); + } + false +} + +#[derive(Debug, Deserialize, Default)] +struct AzureRepository { + #[serde(rename = "remoteUrl")] + remote_url: Option<String>, + #[serde(rename = "webUrl")] + web_url: Option<String>, + #[serde(rename = "isFork", default)] + is_fork: bool, + #[serde(default)] + project: AzureProjectRef, +} + +#[derive(Debug, Deserialize, Default)] +struct AzureProjectRef { + name: Option<String>, +} + +#[derive(Debug, Deserialize, Default)] +struct AzureListResponse { + value: Vec<AzureRepository>, +} + +struct AzureAuth { + username: Option<String>, + token: Option<String>, +} + +impl AzureAuth { + fn from_environment() -> Self { + let token = env::var("KF_AZURE_TOKEN").or_else(|_| env::var("KF_AZURE_PAT")).ok(); + let username = env::var("KF_AZURE_USERNAME").ok(); + Self { username, token } + } + + fn apply(&self, request: reqwest::RequestBuilder) -> reqwest::RequestBuilder { + if let Some(token) = &self.token { + let username = self.username.as_deref().unwrap_or("pat"); + request.basic_auth(username, Some(token)) + } else { + request + } + } +} + +fn sanitize_remote_url(raw: &str) -> Option<String> { + let mut url = Url::parse(raw).ok()?; + if !url.username().is_empty() { + url.set_username("").ok()?; + } + if url.password().is_some() { + url.set_password(None).ok()?; + } + Some(url.to_string()) +} + +async fn fetch_repositories_for_org( + client: &reqwest::Client, + base_url: &Url, + organization: &str, + auth: &AzureAuth, +) -> Result<Vec<AzureRepository>> { + let base = base_url.as_str().trim_end_matches('/'); + let encoded_org = encode_segment(organization); + let url = format!("{base}/{encoded_org}/_apis/git/repositories?api-version={API_VERSION}"); + let request = auth.apply(client.get(&url)); + let response = request.send().await?; + let status = response.status(); + let 
headers = response.headers().clone(); + let body_bytes = response.bytes().await?; 
+ + if !status.is_success() { + let body = String::from_utf8_lossy(&body_bytes).trim().to_string(); + let auth_hint = if matches!( + status, + reqwest::StatusCode::UNAUTHORIZED | reqwest::StatusCode::FORBIDDEN + ) { + if auth.token.is_some() { + "Verify that the Azure token or PAT has access to the requested organization and has not expired." + } else { + "Set KF_AZURE_TOKEN or KF_AZURE_PAT with an Azure DevOps Personal Access Token that can read repositories." + } + } else { + "" + }; + + let mut message = format!( + "Azure Repos API request failed for organization '{organization}' ({status}): {body}" + ); + if !auth_hint.is_empty() { + message.push_str(&format!("\n{auth_hint}")); + } + return Err(anyhow!(message)); + } + + let is_json = headers + .get(reqwest::header::CONTENT_TYPE) + .and_then(|value| value.to_str().ok()) + .map(|value| { + value.split(';').next().unwrap_or("").trim().eq_ignore_ascii_case("application/json") + }) + .unwrap_or(false); + + if !is_json { + let body = String::from_utf8_lossy(&body_bytes); + return Err(anyhow!( + "Azure Repos API response for organization '{organization}' did not include JSON: {body}" + )); + } + + let payload: AzureListResponse = serde_json::from_slice(&body_bytes)?; + Ok(payload.value) +} + +fn parse_project_specifiers(projects: &[String]) -> HashMap<String, HashSet<String>> { + let mut map: HashMap<String, HashSet<String>> = HashMap::new(); + for raw in projects { + let trimmed = raw.trim(); + if trimmed.is_empty() { + continue; + } + let parts: Vec<&str> = trimmed.split('/').filter(|segment| !segment.is_empty()).collect(); + if parts.len() < 2 { + warn!( + "Ignoring Azure project specifier '{raw}' (expected format ORGANIZATION/PROJECT)" + ); + continue; + } + let project = parts.last().unwrap().to_lowercase(); + let organization = parts[..parts.len() - 1].join("/").to_lowercase(); + map.entry(organization).or_default().insert(project); + } + map +} + +fn canonicalize_organizations(spec: &RepoSpecifiers) -> HashMap<String, 
String> { + let mut org_lookup: HashMap<String, String> = 
HashMap::new(); + for org in &spec.organization { + let key = org.to_lowercase(); + org_lookup.entry(key).or_insert_with(|| org.clone()); + } + let project_map = parse_project_specifiers(&spec.project); + for (org_lower, _projects) in project_map { + org_lookup.entry(org_lower.clone()).or_insert(org_lower); + } + org_lookup +} + +pub async fn enumerate_repo_urls( + repo_specifiers: &RepoSpecifiers, + base_url: Url, + ignore_certs: bool, + mut progress: Option<&mut ProgressBar>, +) -> Result<Vec<String>> { + let auth = AzureAuth::from_environment(); + let client = reqwest::Client::builder() + .danger_accept_invalid_certs(ignore_certs) + .timeout(Duration::from_secs(30)) + .build()?; + + let exclude_matcher = build_exclude_matcher(&repo_specifiers.exclude_repos); + let project_filters = parse_project_specifiers(&repo_specifiers.project); + let has_project_filters = !project_filters.is_empty(); + + let org_lookup = canonicalize_organizations(repo_specifiers); + if org_lookup.is_empty() { + return Ok(Vec::new()); + } + + let mut organizations: Vec<String> = org_lookup.values().cloned().collect(); + organizations.sort(); + organizations.dedup(); + + let mut repo_urls = Vec::new(); + + for org in organizations { + if let Some(pb) = &mut progress { + pb.set_message(format!("Fetching Azure repositories for {org}...")); + } + let repos = + fetch_repositories_for_org(&client, &base_url, &org, &auth).await.with_context( + || format!("Failed to fetch repositories for Azure organization '{org}'"), + )?; + + let org_key = org.to_lowercase(); + let project_filter = project_filters.get(&org_key); + + for repo in repos { + if !repo_specifiers.repo_filter.allows(repo.is_fork) { + continue; + } + + let project_name = repo + .project + .name + .as_deref() + .map(|s| s.trim()) + .filter(|s| !s.is_empty()) + .unwrap_or(""); + + if !repo_specifiers.all_projects { + if let Some(filters) = project_filter { + if project_name.is_empty() || !filters.contains(&project_name.to_lowercase()) { + 
continue; + } + } 
else if has_project_filters + && !repo_specifiers + .organization + .iter() + .any(|candidate| candidate.eq_ignore_ascii_case(&org)) + { + // Organization derived solely from project filters without an explicit match + continue; + } + } + + let remote = repo + .remote_url + .as_deref() + .or(repo.web_url.as_deref()) + .ok_or_else(|| anyhow!("Missing remote URL for Azure repository"))?; + let sanitized = match sanitize_remote_url(remote) { + Some(url) => url, + None => { + warn!("Skipping Azure repository with unparsable URL: {remote}"); + continue; + } + }; + if should_exclude_repo(&sanitized, &exclude_matcher) { + continue; + } + repo_urls.push(sanitized); + } + } + + repo_urls.sort(); + repo_urls.dedup(); + Ok(repo_urls) +} + +pub async fn list_repositories( + base_url: Url, + ignore_certs: bool, + progress_enabled: bool, + organizations: &[String], + projects: &[String], + all_projects: bool, + exclude_repos: &[String], + repo_filter: RepoType, +) -> Result<()> { + let repo_specifiers = RepoSpecifiers { + organization: organizations.to_vec(), + project: projects.to_vec(), + all_projects, + repo_filter, + exclude_repos: exclude_repos.to_vec(), + }; + + if repo_specifiers.is_empty() { + anyhow::bail!("Provide at least one --organization or --project to enumerate Azure Repos"); + } + + let mut progress = if progress_enabled { + let style = ProgressStyle::with_template("{spinner} {msg} [{elapsed_precise}]") + .expect("progress bar style template should compile"); + let pb = ProgressBar::new_spinner() + .with_style(style) + .with_message("Fetching Azure repositories"); + pb.enable_steady_tick(Duration::from_millis(500)); + pb + } else { + ProgressBar::hidden() + }; + + let repo_urls = + enumerate_repo_urls(&repo_specifiers, base_url, ignore_certs, Some(&mut progress)).await?; + + for url in repo_urls { + println!("{}", url); + } + + Ok(()) +} + +fn parse_repo(repo_url: &GitUrl) -> Option { + Url::parse(repo_url.as_str()).ok() +} + +pub fn wiki_url(repo_url: &GitUrl) 
-> Option<GitUrl> { + let url = parse_repo(repo_url)?; + let mut segments: Vec<String> = url + .path_segments() + .map(|segments| segments.filter(|s| !s.is_empty()).map(|s| s.to_string()).collect()) + .unwrap_or_default(); + if segments.len() < 3 { + return None; + } + let mut repo_name = segments.pop()?; + if repo_name.ends_with(".wiki") { + return None; + } + let marker = segments.pop()?; + if marker != "_git" { + return None; + } + repo_name.push_str(".wiki"); + segments.push("_git".to_string()); + segments.push(repo_name); + let mut new_url = url.clone(); + { + let mut path_segments = new_url.path_segments_mut().ok()?; + path_segments.clear(); + for segment in segments { + path_segments.push(&segment); + } + } + GitUrl::try_from(new_url).ok() +} + +pub async fn fetch_repo_items( + _repo_url: &GitUrl, + _ignore_certs: bool, + _output_root: &Path, + _datastore: &Arc>, +) -> Result> { + // Azure DevOps exposes work items and wiki content via additional APIs. For now we + // skip fetching extra artifacts and simply return an empty set so callers can rely + // on the function existing just like the other git host modules. 
+ Ok(Vec::new()) +} + +#[cfg(test)] +mod tests { + use super::*; + use std::str::FromStr; + + #[test] + fn sanitize_remote_url_strips_username() { + let raw = "https://example@dev.azure.com/example/project/_git/repo"; + let sanitized = sanitize_remote_url(raw).expect("sanitize"); + assert_eq!(sanitized, "https://dev.azure.com/example/project/_git/repo"); + } + + #[test] + fn parse_repo_identifier_from_url_handles_basic_path() { + let remote = "https://dev.azure.com/org/project/_git/repo"; + let ident = parse_repo_identifier_from_url(remote).expect("identifier"); + assert_eq!(ident, "org/project/repo"); + } + + #[test] + fn parse_repo_identifier_from_url_handles_nested_org() { + let remote = "https://ado.example.com/collection/team/project/_git/repo"; + let ident = parse_repo_identifier_from_url(remote).expect("identifier"); + assert_eq!(ident, "collection/team/project/repo"); + } + + #[test] + fn parse_excluded_repo_accepts_url() { + let raw = "https://dev.azure.com/org/project/_git/repo"; + let ident = parse_excluded_repo(raw).expect("identifier"); + assert_eq!(ident, "org/project/repo"); + } + + #[test] + fn parse_excluded_repo_accepts_path() { + let raw = "org/project/repo"; + let ident = parse_excluded_repo(raw).expect("identifier"); + assert_eq!(ident, "org/project/repo"); + } + + #[test] + fn parse_excluded_repo_allows_project_alias() { + let raw = "Org/Project"; + let ident = parse_excluded_repo(raw).expect("identifier"); + assert_eq!(ident, "org/project/project"); + } + + #[test] + fn parse_excluded_repo_allows_project_glob() { + let raw = "org/*"; + let ident = parse_excluded_repo(raw).expect("identifier"); + assert_eq!(ident, "org/*/**"); + } + + #[test] + fn exclude_matcher_matches_glob() { + let matcher = build_exclude_matcher(&["org/*/repo".to_string()]); + assert!(should_exclude_repo("https://dev.azure.com/org/project/_git/repo", &matcher)); + } + + #[test] + fn exclude_matcher_matches_project_alias() { + let matcher = 
build_exclude_matcher(&["org/project".to_string()]); + assert!(should_exclude_repo("https://dev.azure.com/org/project/_git/project", &matcher)); + } + + #[test] + fn exclude_matcher_matches_project_glob() { + let matcher = build_exclude_matcher(&["org/*".to_string()]); + assert!(should_exclude_repo("https://dev.azure.com/org/project/_git/repo", &matcher)); + } + + #[test] + fn exclude_matcher_is_case_insensitive_for_exact_matches() { + let matcher = build_exclude_matcher(&["Org/Project/Repo".to_string()]); + assert!(should_exclude_repo("https://dev.azure.com/org/project/_git/repo", &matcher)); + } + + #[test] + fn exclude_matcher_is_case_insensitive_for_globs() { + let matcher = build_exclude_matcher(&["ORG/*".to_string()]); + assert!(should_exclude_repo("https://dev.azure.com/org/project/_git/repo", &matcher)); + } + + #[test] + fn wiki_url_appends_suffix() { + let url = GitUrl::from_str("https://dev.azure.com/org/project/_git/repo").unwrap(); + let wiki = wiki_url(&url).expect("wiki url"); + assert_eq!(wiki.as_str(), "https://dev.azure.com/org/project/_git/repo.wiki"); + } +} diff --git a/src/cli/commands/azure.rs b/src/cli/commands/azure.rs new file mode 100644 index 0000000..28e240e --- /dev/null +++ b/src/cli/commands/azure.rs @@ -0,0 +1,98 @@ +use clap::{Args, Subcommand, ValueEnum, ValueHint}; +use strum_macros::Display; +use url::Url; + +use crate::cli::commands::output::OutputArgs; + +#[derive(Args, Debug)] +pub struct AzureArgs { + #[command(subcommand)] + pub command: AzureCommand, + + /// Override Azure DevOps base URL (e.g. 
for Azure DevOps Server) + #[arg(global = true, long, default_value = "https://dev.azure.com/", value_hint = ValueHint::Url)] + pub azure_base_url: Url, +} + +#[derive(Subcommand, Debug)] +pub enum AzureCommand { + /// Interact with Azure DevOps repositories + #[command(subcommand)] + Repos(AzureReposCommand), +} + +#[derive(Subcommand, Debug)] +pub enum AzureReposCommand { + /// List repositories for organizations or projects + List(AzureReposListArgs), +} + +#[derive(Args, Debug, Clone)] +pub struct AzureReposListArgs { + #[command(flatten)] + pub repo_specifiers: AzureRepoSpecifiers, + + #[command(flatten)] + pub output_args: OutputArgs, +} + +#[derive(Args, Debug, Clone)] +pub struct AzureRepoSpecifiers { + /// Repositories belonging to these Azure DevOps organizations or collections + #[arg(long = "azure-organization", alias = "organization", value_name = "ORGANIZATION")] + pub organization: Vec<String>, + + /// Repositories belonging to the specified Azure DevOps projects (format: ORGANIZATION/PROJECT) + #[arg(long = "azure-project", alias = "project", value_name = "ORGANIZATION/PROJECT")] + pub project: Vec<String>, + + /// Include repositories from all projects within the specified organizations + #[arg(long = "azure-all-projects", alias = "all-azure-projects")] + pub all_projects: bool, + + /// Skip repositories when enumerating Azure sources (format: ORGANIZATION/PROJECT/REPOSITORY) + #[arg( + long = "azure-exclude", + alias = "azure-exclude-repo", + value_name = "ORGANIZATION/PROJECT/REPOSITORY" + )] + pub exclude_repos: Vec<String>, + + /// Filter by repository type + #[arg(long = "azure-repo-type", default_value_t = AzureRepoType::Source)] + pub repo_type: AzureRepoType, +} + +impl AzureRepoSpecifiers { + pub fn is_empty(&self) -> bool { + self.organization.is_empty() && self.project.is_empty() + } +} + +#[derive(Copy, Clone, Debug, Display, PartialEq, Eq, PartialOrd, Ord, ValueEnum)] +#[strum(serialize_all = "kebab-case")] +pub enum AzureRepoType { + Source, + Fork, + All, 
+} + +impl From<AzureRepoType> for crate::azure::RepoType { + fn from(value: AzureRepoType) -> Self { + match value { + AzureRepoType::Source => crate::azure::RepoType::Source, + AzureRepoType::Fork => crate::azure::RepoType::Fork, + AzureRepoType::All => crate::azure::RepoType::All, + } + } +} + +#[derive(Copy, Clone, Debug, ValueEnum, Display)] +#[strum(serialize_all = "kebab-case")] +pub enum AzureOutputFormat { + Pretty, + Json, + Jsonl, + Bson, + Sarif, +} diff --git a/src/cli/commands/inputs.rs b/src/cli/commands/inputs.rs index 6c6f81b..4bab9d1 100644 --- a/src/cli/commands/inputs.rs +++ b/src/cli/commands/inputs.rs @@ -5,6 +5,7 @@ use url::Url; use crate::{ cli::commands::{ + azure::AzureRepoType, bitbucket::{BitbucketAuthArgs, BitbucketRepoType}, gitea::GiteaRepoType, github::{GitCloneMode, GitHistoryMode, GitHubRepoType}, @@ -30,11 +31,14 @@ pub struct InputSpecifierArgs { "bitbucket_user", "bitbucket_workspace", "bitbucket_project", + "azure_organization", + "azure_project", "git_url", "all_github_organizations", "all_gitlab_groups", "all_gitea_organizations", "all_bitbucket_workspaces", + "all_azure_projects", "jira_url", "confluence_url", "docker_image", @@ -176,6 +180,38 @@ pub struct InputSpecifierArgs { #[command(flatten)] pub bitbucket_auth: BitbucketAuthArgs, + // Azure DevOps Options + /// Scan repositories belonging to the specified Azure DevOps organizations or collections + #[arg(long = "azure-organization")] + pub azure_organization: Vec<String>, + + /// Scan repositories belonging to the specified Azure DevOps projects (format: ORGANIZATION/PROJECT) + #[arg(long = "azure-project", value_name = "ORGANIZATION/PROJECT")] + pub azure_project: Vec<String>, + + /// Skip repositories when enumerating Azure Repos sources (format: ORGANIZATION/PROJECT/REPOSITORY) + #[arg( + long = "azure-exclude", + alias = "azure-exclude-repo", + value_name = "ORGANIZATION/PROJECT/REPOSITORY" + )] + pub azure_exclude: Vec<String>, + + /// Include repositories from every project within the specified Azure 
organizations + #[arg(long = "all-azure-projects")] + pub all_azure_projects: bool, + + /// Use the specified base URL for Azure DevOps (e.g. Azure DevOps Server) + #[arg( + long = "azure-base-url", + default_value = "https://dev.azure.com/", + value_hint = ValueHint::Url + )] + pub azure_base_url: Url, + + #[arg(long = "azure-repo-type", default_value_t = AzureRepoType::Source)] + pub azure_repo_type: AzureRepoType, + /// Jira base URL (e.g. https://jira.example.com) #[arg(long, value_hint = ValueHint::Url, requires = "jql")] pub jira_url: Option, diff --git a/src/cli/commands/mod.rs b/src/cli/commands/mod.rs index b7717bd..0434af9 100644 --- a/src/cli/commands/mod.rs +++ b/src/cli/commands/mod.rs @@ -1,3 +1,4 @@ +pub mod azure; pub mod bitbucket; pub mod gitea; pub mod github; diff --git a/src/cli/global.rs b/src/cli/global.rs index edd79dc..a03d3d4 100644 --- a/src/cli/global.rs +++ b/src/cli/global.rs @@ -7,8 +7,8 @@ use sysinfo::{MemoryRefreshKind, RefreshKind, System}; use tracing::Level; use crate::cli::commands::{ - bitbucket::BitbucketArgs, gitea::GiteaArgs, github::GitHubArgs, gitlab::GitLabArgs, - rules::RulesArgs, scan::ScanArgs, + azure::AzureArgs, bitbucket::BitbucketArgs, gitea::GiteaArgs, github::GitHubArgs, + gitlab::GitLabArgs, rules::RulesArgs, scan::ScanArgs, }; #[deny(missing_docs)] @@ -77,6 +77,10 @@ pub enum Command { #[command(name = "bitbucket")] Bitbucket(BitbucketArgs), + /// Interact with the Azure DevOps API + #[command(name = "azure")] + Azure(AzureArgs), + /// Manage rules #[command(alias = "rule")] Rules(RulesArgs), diff --git a/src/git_binary.rs b/src/git_binary.rs index 09f6658..82fd990 100644 --- a/src/git_binary.rs +++ b/src/git_binary.rs @@ -31,6 +31,15 @@ const GITEA_CREDENTIAL_HELPER: &str = r#"credential.helper=!_gteacreds() { fi }; _gteacreds"#; +const AZURE_CREDENTIAL_HELPER: &str = r#"credential.helper=!_azcreds() { + token="${KF_AZURE_TOKEN:-${KF_AZURE_PAT:-}}"; + if [ -n "$token" ]; then + 
user="${KF_AZURE_USERNAME:-pat}"; + echo username="$user"; + echo password="$token"; + fi +}; _azcreds"#; + /// Represents errors that can occur when interacting with the `git` CLI. #[derive(Debug, thiserror::Error)] pub enum GitError { @@ -79,9 +88,17 @@ impl Git { matches!(std::env::var("KF_BITBUCKET_OAUTH_TOKEN"), Ok(value) if !value.is_empty()); let has_bitbucket_credentials = has_bitbucket_oauth_token || (has_bitbucket_username && has_bitbucket_password); + let has_azure_token = ["KF_AZURE_TOKEN", "KF_AZURE_PAT"] + .iter() + .any(|key| matches!(std::env::var(key), Ok(value) if !value.is_empty())); // If credentials are provided via environment variables, clear existing helpers first. - if has_github_token || has_gitlab_token || has_gitea_token || has_bitbucket_credentials { + if has_github_token + || has_gitlab_token + || has_gitea_token + || has_bitbucket_credentials + || has_azure_token + { credentials.push("-c".into()); credentials.push(r#"credential.helper="#.into()); } @@ -114,6 +131,11 @@ impl Git { credentials.push(BITBUCKET_CREDENTIAL_HELPER.into()); } + if has_azure_token { + credentials.push("-c".into()); + credentials.push(AZURE_CREDENTIAL_HELPER.into()); + } + Self { credentials, ignore_certs } } diff --git a/src/lib.rs b/src/lib.rs index 598c278..3ceed02 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -1,3 +1,4 @@ +pub mod azure; pub mod baseline; pub mod binary; pub mod bitbucket; diff --git a/src/main.rs b/src/main.rs index d73bcc1..b6bb1fd 100644 --- a/src/main.rs +++ b/src/main.rs @@ -33,7 +33,7 @@ use std::{ use anyhow::{Context, Result}; use kingfisher::{ - bitbucket, + azure, bitbucket, cli::{ self, commands::{ @@ -71,6 +71,7 @@ use tracing_subscriber::{ use url::Url; use crate::cli::commands::{ + azure::{AzureCommand, AzureRepoType, AzureReposCommand}, bitbucket::{BitbucketAuthArgs, BitbucketCommand, BitbucketRepoType, BitbucketReposCommand}, gitea::{GiteaCommand, GiteaRepoType, GiteaReposCommand}, gitlab::{GitLabCommand, GitLabRepoType, 
GitLabReposCommand}, @@ -91,6 +92,7 @@ fn main() -> anyhow::Result<()> { Command::GitLab(_) => num_cpus::get(), // Default for GitLab commands Command::Bitbucket(_) => num_cpus::get(), // Default for Bitbucket commands Command::Gitea(_) => num_cpus::get(), // Default for Gitea commands + Command::Azure(_) => num_cpus::get(), // Default for Azure commands Command::Rules(_) => num_cpus::get(), // Default for Rules commands }; @@ -267,6 +269,23 @@ async fn async_main(args: CommandLineArgs) -> Result<()> { } }, }, + Command::Azure(azure_args) => match azure_args.command { + AzureCommand::Repos(repos_command) => match repos_command { + AzureReposCommand::List(list_args) => { + azure::list_repositories( + azure_args.azure_base_url.clone(), + global_args.ignore_certs, + global_args.use_progress(), + &list_args.repo_specifiers.organization, + &list_args.repo_specifiers.project, + list_args.repo_specifiers.all_projects, + &list_args.repo_specifiers.exclude_repos, + list_args.repo_specifiers.repo_type.into(), + ) + .await?; + } + }, + }, Command::Gitea(gitea_args) => match gitea_args.command { GiteaCommand::Repos(repos_command) => match repos_command { GiteaReposCommand::List(list_args) => { @@ -364,6 +383,13 @@ fn create_default_scan_args() -> cli::commands::scan::ScanArgs { bitbucket_repo_type: BitbucketRepoType::Source, bitbucket_auth: BitbucketAuthArgs::default(), + azure_organization: Vec::new(), + azure_project: Vec::new(), + azure_exclude: Vec::new(), + all_azure_projects: false, + azure_base_url: Url::parse("https://dev.azure.com/").unwrap(), + azure_repo_type: AzureRepoType::Source, + jira_url: None, jql: None, confluence_url: None, diff --git a/src/matcher.rs b/src/matcher.rs index 3112a9e..08045d8 100644 --- a/src/matcher.rs +++ b/src/matcher.rs @@ -1,5 +1,4 @@ use std::{ - borrow::Cow, hash::{Hash, Hasher}, str, sync::{Arc, Mutex}, @@ -40,6 +39,7 @@ use crate::{ const MAX_CHUNK_SIZE: usize = 1 << 30; // 1 GiB per scan segment const CHUNK_OVERLAP: usize = 64 * 
1024; // 64 KiB overlap to catch boundary matches const BASE64_SCAN_LIMIT: usize = 64 * 1024 * 1024; // skip expensive Base64 pass on huge blobs +const TREE_SITTER_SCAN_LIMIT: usize = 64 * 1024; // only run tree-sitter on blobs ≤64 KiB // ------------------------------------------------------------------------------------------------- // RawMatch @@ -320,18 +320,22 @@ impl<'a> Matcher<'a> { get_base64_strings(blob.bytes()) }; - if self.user_data.raw_matches_scratch.is_empty() && b64_items.is_empty() { + let lang_hint = lang.as_deref(); + let has_raw_matches = !self.user_data.raw_matches_scratch.is_empty(); + let has_base64_items = !b64_items.is_empty(); + + if !has_raw_matches && !has_base64_items && !(no_base64 && lang_hint.is_some()) { return Ok(ScanResult::New(Vec::new())); } let rules_db = self.rules_db; let mut seen_matches = FxHashSet::default(); let mut previous_matches: FxHashMap> = FxHashMap::default(); - let tree_sitter_result = if self.user_data.raw_matches_scratch.is_empty() { - None - } else { - lang.and_then(|lang_str| { - get_language_and_queries(&lang_str).and_then(|(language, queries)| { + let should_run_tree_sitter = blob.len() <= TREE_SITTER_SCAN_LIMIT + && (has_raw_matches || (no_base64 && lang_hint.is_some())); + let tree_sitter_result = if should_run_tree_sitter { + lang_hint.and_then(|lang_str| { + get_language_and_queries(lang_str).and_then(|(language, queries)| { let checker = Checker { language, rules: queries }; match checker.check(&blob.bytes()) { Ok(results) => Some(results), @@ -342,6 +346,8 @@ impl<'a> Matcher<'a> { } }) }) + } else { + None }; // Process matches let mut matches = Vec::new(); @@ -407,7 +413,7 @@ impl<'a> Matcher<'a> { rule_id_usize, &mut seen_matches, origin, - Some(ts_match.clone()), + Some(ts_match.as_bytes()), *is_base64_decoded, redact, &filename, @@ -437,7 +443,7 @@ impl<'a> Matcher<'a> { rule_id_usize, &mut seen_matches, origin, - Some(item.decoded.clone()), + Some(item.decoded.as_bytes()), true, redact, 
&filename, @@ -540,7 +546,7 @@ fn filter_match<'b>( rule_id: usize, seen_matches: &mut FxHashSet, _origin: &OriginSet, - ts_match: Option, + ts_match: Option<&[u8]>, is_base64: bool, redact: bool, filename: &str, @@ -551,12 +557,11 @@ fn filter_match<'b>( let initial_len = matches.len(); - // Use Cow to avoid unnecessary copying when ts_match is None - let byte_slice: Cow<[u8]> = match ts_match { - Some(ts_match_value) => Cow::Owned(ts_match_value.into_bytes()), - None => Cow::Borrowed(&blob.bytes()[start..end]), - }; - for captures in re.captures_iter(byte_slice.as_ref()) { + let blob_bytes = blob.bytes(); + let default_slice = &blob_bytes[start..end]; + let haystack = ts_match.unwrap_or(default_slice); + + for captures in re.captures_iter(haystack) { let full_capture = captures.get(0).unwrap(); let matching_input = captures.get(1).unwrap_or(full_capture); let min_entropy = rule.min_entropy(); @@ -590,8 +595,7 @@ fn filter_match<'b>( } let only_matching_input = &blob.bytes()[matching_input_offset_span.start..matching_input_offset_span.end]; - let groups = - SerializableCaptures::from_captures(&captures, byte_slice.as_ref(), re, redact); + let groups = SerializableCaptures::from_captures(&captures, haystack, re, redact); matches.push(BlobMatch { rule: Arc::clone(&rule), blob_id: blob.id_ref(), diff --git a/src/reporter.rs b/src/reporter.rs index caa6aa8..73bc541 100644 --- a/src/reporter.rs +++ b/src/reporter.rs @@ -48,6 +48,19 @@ const BITBUCKET_FRAGMENT_ENCODE_SET: &AsciiSet = &CONTROLS .add(b'}') .add(b'|'); +const AZURE_QUERY_ENCODE_SET: &AsciiSet = &CONTROLS + .add(b' ') + .add(b'"') + .add(b'#') + .add(b'%') + .add(b'<') + .add(b'>') + .add(b'?') + .add(b'`') + .add(b'{') + .add(b'}') + .add(b'|'); + fn build_git_urls( repo_url: &str, commit_id: &str, @@ -94,6 +107,19 @@ fn build_git_urls( commit_url = format!("{base}/commits/{commit_id}"); file_url = format!("{base}/commits/{commit_id}#L{anchor}F{line}"); } + } else if 
host.eq_ignore_ascii_case("dev.azure.com") || host.ends_with(".visualstudio.com") + { + let normalized = file_path.replace('\\', "/"); + let trimmed = normalized.trim_start_matches('/'); + let encoded_path = utf8_percent_encode(trimmed, AZURE_QUERY_ENCODE_SET).to_string(); + repository_url = repo_url.to_string(); + commit_url = format!("{repo_url}/commit/{commit_id}"); + if line > 0 { + file_url = + format!("{repo_url}/commit/{commit_id}?path=/{}&line={line}", encoded_path); + } else { + file_url = format!("{repo_url}/commit/{commit_id}?path=/{}", encoded_path); + } } } @@ -667,6 +693,7 @@ mod tests { cli::commands::output::OutputArgs, cli::commands::scan::{ConfidenceLevel, ScanArgs}, cli::commands::{ + azure::AzureRepoType, bitbucket::{BitbucketAuthArgs, BitbucketRepoType}, gitea::GiteaRepoType, github::{GitCloneMode, GitHistoryMode, GitHubRepoType}, @@ -789,6 +816,12 @@ mod tests { bitbucket_api_url: Url::parse("https://api.bitbucket.org/2.0/").unwrap(), bitbucket_repo_type: BitbucketRepoType::Source, bitbucket_auth: BitbucketAuthArgs::default(), + azure_organization: Vec::new(), + azure_project: Vec::new(), + azure_exclude: Vec::new(), + all_azure_projects: false, + azure_base_url: Url::parse("https://dev.azure.com/").unwrap(), + azure_repo_type: AzureRepoType::Source, jira_url: None, jql: None, confluence_url: None, @@ -844,6 +877,28 @@ mod tests { .unwrap(); assert_eq!(git_file_path, "path/in/history.txt"); } + + use super::build_git_urls; + + #[test] + fn azure_commit_links_use_query_paths() { + let (repo_url, commit_url, file_url) = build_git_urls( + "https://dev.azure.com/org/project/_git/repo", + "0123456789abcdef", + "dir/file.txt", + 7, + ); + + assert_eq!(repo_url, "https://dev.azure.com/org/project/_git/repo"); + assert_eq!( + commit_url, + "https://dev.azure.com/org/project/_git/repo/commit/0123456789abcdef" + ); + assert_eq!( + file_url, + "https://dev.azure.com/org/project/_git/repo/commit/0123456789abcdef?path=/dir/file.txt&line=7" + ); + } } impl 
From for ReportMatch { diff --git a/src/reporter/json_format.rs b/src/reporter/json_format.rs index 4149469..8b4f59c 100644 --- a/src/reporter/json_format.rs +++ b/src/reporter/json_format.rs @@ -39,6 +39,7 @@ mod tests { use crate::util::intern; use crate::{ blob::BlobId, + cli::commands::azure::AzureRepoType, cli::commands::bitbucket::{BitbucketAuthArgs, BitbucketRepoType}, cli::commands::gitea::GiteaRepoType, cli::commands::github::GitHubRepoType, @@ -109,6 +110,13 @@ mod tests { bitbucket_api_url: Url::parse("https://api.bitbucket.org/2.0/").unwrap(), bitbucket_repo_type: BitbucketRepoType::Source, bitbucket_auth: BitbucketAuthArgs::default(), + // Azure DevOps + azure_organization: Vec::new(), + azure_project: Vec::new(), + azure_exclude: Vec::new(), + all_azure_projects: false, + azure_base_url: Url::parse("https://dev.azure.com/").unwrap(), + azure_repo_type: AzureRepoType::Source, // Jira options jira_url: None, jql: None, diff --git a/src/scanner/mod.rs b/src/scanner/mod.rs index d80160c..a6e0b6a 100644 --- a/src/scanner/mod.rs +++ b/src/scanner/mod.rs @@ -2,7 +2,8 @@ pub(crate) use docker::save_docker_images; pub(crate) use enumerate::enumerate_filesystem_inputs; pub(crate) use repos::{ - clone_or_update_git_repos, enumerate_bitbucket_repos, enumerate_github_repos, + clone_or_update_git_repos, enumerate_azure_repos, enumerate_bitbucket_repos, + enumerate_github_repos, }; pub use runner::{load_and_record_rules, run_async_scan, run_scan}; pub(crate) use validation::run_secret_validation; diff --git a/src/scanner/processing.rs b/src/scanner/processing.rs index 5132209..3461eed 100644 --- a/src/scanner/processing.rs +++ b/src/scanner/processing.rs @@ -31,7 +31,12 @@ impl<'a> BlobProcessor<'a> { ) -> Result> { let _span = debug_span!("matcher", temp_id = blob.temp_id()).entered(); let t1 = Instant::now(); - let res = self.matcher.scan_blob(&blob, &origin, None, redact, no_dedup, no_base64)?; + let language_hint = origin + .iter() + .find_map(|p| p.blob_path()) 
+ .and_then(|path| ContentInspector::default().guess_language(path, blob.bytes())); + let res = + self.matcher.scan_blob(&blob, &origin, language_hint, redact, no_dedup, no_base64)?; let scan_us = t1.elapsed().as_micros(); match res { // blob already seen, but with no matches; nothing to do! diff --git a/src/scanner/repos.rs b/src/scanner/repos.rs index 95144a7..eb4ad10 100644 --- a/src/scanner/repos.rs +++ b/src/scanner/repos.rs @@ -11,7 +11,7 @@ use url::Url; use crate::blob::BlobIdMap; use crate::{ - bitbucket, + azure, bitbucket, blob::BlobMetadata, cli::{ commands::{github::GitCloneMode, github::GitHistoryMode, scan}, @@ -370,6 +370,69 @@ pub async fn enumerate_bitbucket_repos( Ok(repo_urls) } +pub async fn enumerate_azure_repos( + args: &scan::ScanArgs, + global_args: &global::GlobalArgs, +) -> Result> { + let repo_specifiers = azure::RepoSpecifiers { + organization: args.input_specifier_args.azure_organization.clone(), + project: args.input_specifier_args.azure_project.clone(), + all_projects: args.input_specifier_args.all_azure_projects, + repo_filter: args.input_specifier_args.azure_repo_type.into(), + exclude_repos: args.input_specifier_args.azure_exclude.clone(), + }; + + let mut repo_urls = args.input_specifier_args.git_url.clone(); + if !repo_specifiers.is_empty() { + let mut progress = if global_args.use_progress() { + let style = + ProgressStyle::with_template("{spinner} {msg} {human_len} [{elapsed_precise}]") + .expect("progress bar style template should compile"); + let pb = ProgressBar::new_spinner() + .with_style(style) + .with_message("Enumerating Azure Repos repositories..."); + pb.enable_steady_tick(Duration::from_millis(500)); + pb + } else { + ProgressBar::hidden() + }; + + let mut num_found: u64 = 0; + let base_url = args.input_specifier_args.azure_base_url.clone(); + let repo_strings = azure::enumerate_repo_urls( + &repo_specifiers, + base_url, + global_args.ignore_certs, + Some(&mut progress), + ) + .await + .context("Failed to enumerate 
Azure repositories")?; + + for repo_string in repo_strings { + match GitUrl::from_str(&repo_string) { + Ok(repo_url) => { + repo_urls.push(repo_url); + num_found += 1; + } + Err(e) => { + progress.suspend(|| { + error!("Failed to parse repo URL from {repo_string}: {e}"); + }); + } + } + } + + progress.finish_with_message(format!( + "Found {} repositories from Azure Repos", + HumanCount(num_found) + )); + } + + repo_urls.sort(); + repo_urls.dedup(); + Ok(repo_urls) +} + pub async fn fetch_jira_issues( args: &scan::ScanArgs, global_args: &global::GlobalArgs, @@ -519,6 +582,16 @@ pub async fn fetch_git_host_artifacts( ) .await?, ); + } else if host.contains("dev.azure") || host.contains("visualstudio.com") { + dirs.extend( + azure::fetch_repo_items( + repo_url, + global_args.ignore_certs, + &output_root, + datastore, + ) + .await?, + ); } } Ok(dirs) diff --git a/src/scanner/runner.rs b/src/scanner/runner.rs index 9d394dc..9de4a00 100644 --- a/src/scanner/runner.rs +++ b/src/scanner/runner.rs @@ -7,7 +7,7 @@ use tokio::time::{Duration, Instant}; use tracing::{debug, error, error_span, info, trace}; use crate::{ - bitbucket, + azure, bitbucket, cli::{commands::scan, global}, findings_store, findings_store::{FindingsStore, FindingsStoreMessage}, @@ -20,8 +20,8 @@ use crate::{ rules_database::RulesDatabase, safe_list, scanner::{ - clone_or_update_git_repos, enumerate_bitbucket_repos, enumerate_filesystem_inputs, - enumerate_github_repos, + clone_or_update_git_repos, enumerate_azure_repos, enumerate_bitbucket_repos, + enumerate_filesystem_inputs, enumerate_github_repos, repos::{ enumerate_gitea_repos, enumerate_gitlab_repos, fetch_confluence_pages, fetch_git_host_artifacts, fetch_jira_issues, fetch_s3_objects, fetch_slack_messages, @@ -75,11 +75,13 @@ pub async fn run_async_scan( let gitlab_repo_urls = enumerate_gitlab_repos(args, global_args).await?; let gitea_repo_urls = enumerate_gitea_repos(args, global_args).await?; let bitbucket_repo_urls = 
enumerate_bitbucket_repos(args, global_args).await?;
+    let azure_repo_urls = enumerate_azure_repos(args, global_args).await?;
     // Combine repository URLs
     repo_urls.extend(gitlab_repo_urls);
     repo_urls.extend(gitea_repo_urls);
     repo_urls.extend(bitbucket_repo_urls);
+    repo_urls.extend(azure_repo_urls);
     repo_urls.sort();
     repo_urls.dedup();
@@ -99,6 +101,9 @@ pub async fn run_async_scan(
         if let Some(w) = bitbucket::wiki_url(url) {
             wiki_urls.push(w);
         }
+        if let Some(w) = azure::wiki_url(url) {
+            wiki_urls.push(w);
+        }
     }
     repo_urls.extend(wiki_urls);
     repo_urls.sort();
diff --git a/tests/int_allowlist.rs b/tests/int_allowlist.rs
index 5e119f3..72bd950 100644
--- a/tests/int_allowlist.rs
+++ b/tests/int_allowlist.rs
@@ -7,6 +7,7 @@ use anyhow::Result;
 use kingfisher::{
     cli::{
         commands::{
+            azure::AzureRepoType,
             bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
             gitea::GiteaRepoType,
             github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
@@ -85,6 +86,12 @@ fn run_skiplist(skip_regex: Vec, skip_skipword: Vec) -> Result anyhow::Result<()> {
     dir.close()?;
     Ok(())
 }
+
+// Ensure tree-sitter based decoding works even when the standalone base64 scanner is disabled
+#[test]
+fn detects_base64_in_code_with_tree_sitter() -> anyhow::Result<()> {
+    let dir = tempdir()?;
+    let file_path = dir.path().join("secret.py");
+    // Base64 for ghp_1wuHFikBKQtCcH3EB2FBUkyn8krXhP2qLqPa
+    let encoded = "Z2hwXzF3dUhGaWtCS1F0Q2NIM0VCMkZCVWt5bjhrclhoUDJxTHFQYQ==";
+    fs::write(&file_path, format!("token = \"{}\"\n", encoded))?;
+
+    Command::cargo_bin("kingfisher")?
+        .args([
+            "scan",
+            dir.path().to_str().unwrap(),
+            "--no-binary",
+            "--confidence=low",
+            "--format",
+            "json",
+            "--no-validate",
+            "--no-update-check",
+        ])
+        .assert()
+        .code(200)
+        .stdout(
+            predicate::str::contains("ghp_1wuHFikBKQtCcH3EB2FBUkyn8krXhP2qLqPa")
+                .and(predicate::str::contains("\"encoding\": \"base64\"")),
+        );
+
+    dir.close()?;
+    Ok(())
+}
diff --git a/tests/int_bitbucket.rs b/tests/int_bitbucket.rs
index 092ab19..373f11b 100644
--- a/tests/int_bitbucket.rs
+++ b/tests/int_bitbucket.rs
@@ -7,6 +7,7 @@ use anyhow::{Context, Result};
 use kingfisher::{
     cli::{
         commands::{
+            azure::AzureRepoType,
             bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
             gitea::GiteaRepoType,
             github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
@@ -83,6 +84,13 @@ fn test_bitbucket_remote_scan() -> Result<()> {
         bitbucket_repo_type: BitbucketRepoType::Source,
         bitbucket_auth: BitbucketAuthArgs::default(),

+        azure_organization: Vec::new(),
+        azure_project: Vec::new(),
+        azure_exclude: Vec::new(),
+        all_azure_projects: false,
+        azure_base_url: Url::parse("https://dev.azure.com/")?,
+        azure_repo_type: AzureRepoType::Source,
+
         jira_url: None,
         jql: None,
         confluence_url: None,
diff --git a/tests/int_dedup.rs b/tests/int_dedup.rs
index 0e243f8..cd83a7f 100644
--- a/tests/int_dedup.rs
+++ b/tests/int_dedup.rs
@@ -11,6 +11,7 @@ use anyhow::Result;
 use kingfisher::{
     cli::{
         commands::{
+            azure::AzureRepoType,
             bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
             gitea::GiteaRepoType,
             github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
@@ -100,6 +101,13 @@ rules:
         bitbucket_repo_type: BitbucketRepoType::Source,
         bitbucket_auth: BitbucketAuthArgs::default(),

+        azure_organization: Vec::new(),
+        azure_project: Vec::new(),
+        azure_exclude: Vec::new(),
+        all_azure_projects: false,
+        azure_base_url: Url::parse("https://dev.azure.com/").unwrap(),
+        azure_repo_type: AzureRepoType::Source,
+
         jira_url: None,
         jql: None,
         confluence_url: None,
diff --git a/tests/int_github.rs b/tests/int_github.rs
index 180a441..06c67a7 100644
--- a/tests/int_github.rs
+++ b/tests/int_github.rs
@@ -8,6 +8,7 @@ use anyhow::{Context, Result};
 use kingfisher::{
     cli::{
         commands::{
+            azure::AzureRepoType,
             bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
             gitea::GiteaRepoType,
             github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
@@ -87,6 +88,13 @@ fn test_github_remote_scan() -> Result<()> {
         bitbucket_repo_type: BitbucketRepoType::Source,
         bitbucket_auth: BitbucketAuthArgs::default(),

+        azure_organization: Vec::new(),
+        azure_project: Vec::new(),
+        azure_exclude: Vec::new(),
+        all_azure_projects: false,
+        azure_base_url: Url::parse("https://dev.azure.com/").unwrap(),
+        azure_repo_type: AzureRepoType::Source,
+
         jira_url: None,
         jql: None,
         confluence_url: None,
diff --git a/tests/int_gitlab.rs b/tests/int_gitlab.rs
index d295660..e55655a 100644
--- a/tests/int_gitlab.rs
+++ b/tests/int_gitlab.rs
@@ -8,6 +8,7 @@ use anyhow::{Context, Result};
 use kingfisher::{
     cli::{
         commands::{
+            azure::AzureRepoType,
             bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
             gitea::GiteaRepoType,
             github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
@@ -86,6 +87,13 @@ fn test_gitlab_remote_scan() -> Result<()> {
         bitbucket_repo_type: BitbucketRepoType::Source,
         bitbucket_auth: BitbucketAuthArgs::default(),

+        azure_organization: Vec::new(),
+        azure_project: Vec::new(),
+        azure_exclude: Vec::new(),
+        all_azure_projects: false,
+        azure_base_url: Url::parse("https://dev.azure.com/")?,
+        azure_repo_type: AzureRepoType::Source,
+
         jira_url: None,
         jql: None,
         confluence_url: None,
@@ -216,6 +224,13 @@ fn test_gitlab_remote_scan_no_history() -> Result<()> {
         bitbucket_repo_type: BitbucketRepoType::Source,
         bitbucket_auth: BitbucketAuthArgs::default(),

+        azure_organization: Vec::new(),
+        azure_project: Vec::new(),
+        azure_exclude: Vec::new(),
+        all_azure_projects: false,
+        azure_base_url: Url::parse("https://dev.azure.com/")?,
+        azure_repo_type: AzureRepoType::Source,
+
         jira_url: None,
         jql: None,
         confluence_url: None,
diff --git a/tests/int_redact.rs b/tests/int_redact.rs
index 1e7f9b5..48247af 100644
--- a/tests/int_redact.rs
+++ b/tests/int_redact.rs
@@ -8,6 +8,7 @@ use anyhow::Result;
 use kingfisher::{
     cli::{
         commands::{
+            azure::AzureRepoType,
             bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
             gitea::GiteaRepoType,
             github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
@@ -68,6 +69,12 @@ async fn test_redact_hashes_finding_values() -> Result<()> {
         bitbucket_api_url: Url::parse("https://api.bitbucket.org/2.0/").unwrap(),
         bitbucket_repo_type: BitbucketRepoType::Source,
         bitbucket_auth: BitbucketAuthArgs::default(),
+        azure_organization: Vec::new(),
+        azure_project: Vec::new(),
+        azure_exclude: Vec::new(),
+        all_azure_projects: false,
+        azure_base_url: Url::parse("https://dev.azure.com/").unwrap(),
+        azure_repo_type: AzureRepoType::Source,
         jira_url: None,
         jql: None,
         confluence_url: None,
diff --git a/tests/int_slack.rs b/tests/int_slack.rs
index d7b3118..2575a3c 100644
--- a/tests/int_slack.rs
+++ b/tests/int_slack.rs
@@ -7,6 +7,7 @@ use anyhow::Result;
 use kingfisher::{
     cli::{
         commands::{
+            azure::AzureRepoType,
             bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
             gitea::GiteaRepoType,
             github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
@@ -75,6 +76,12 @@ impl TestContext {
         bitbucket_api_url: Url::parse("https://api.bitbucket.org/2.0/").unwrap(),
         bitbucket_repo_type: BitbucketRepoType::Source,
         bitbucket_auth: BitbucketAuthArgs::default(),
+        azure_organization: Vec::new(),
+        azure_project: Vec::new(),
+        azure_exclude: Vec::new(),
+        all_azure_projects: false,
+        azure_base_url: Url::parse("https://dev.azure.com/").unwrap(),
+        azure_repo_type: AzureRepoType::Source,
         jira_url: None,
         jql: None,
         confluence_url: None,
@@ -191,6 +198,12 @@ async fn test_scan_slack_messages() -> Result<()> {
         bitbucket_api_url: Url::parse("https://api.bitbucket.org/2.0/").unwrap(),
         bitbucket_repo_type: BitbucketRepoType::Source,
         bitbucket_auth: BitbucketAuthArgs::default(),
+        azure_organization: Vec::new(),
+        azure_project: Vec::new(),
+        azure_exclude: Vec::new(),
+        all_azure_projects: false,
+        azure_base_url: Url::parse("https://dev.azure.com/").unwrap(),
+        azure_repo_type: AzureRepoType::Source,
         jira_url: None,
         jql: None,
         confluence_url: None,
diff --git a/tests/int_validation_cache.rs b/tests/int_validation_cache.rs
index 28c7bda..ea1c809 100644
--- a/tests/int_validation_cache.rs
+++ b/tests/int_validation_cache.rs
@@ -11,6 +11,7 @@ use anyhow::Result;
 use kingfisher::{
     cli::{
         commands::{
+            azure::AzureRepoType,
             bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
             gitea::GiteaRepoType,
             github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
@@ -143,6 +144,13 @@ async fn test_validation_cache_and_depvars() -> Result<()> {
         bitbucket_repo_type: BitbucketRepoType::Source,
         bitbucket_auth: BitbucketAuthArgs::default(),

+        azure_organization: Vec::new(),
+        azure_project: Vec::new(),
+        azure_exclude: Vec::new(),
+        all_azure_projects: false,
+        azure_base_url: Url::parse("https://dev.azure.com/").unwrap(),
+        azure_repo_type: AzureRepoType::Source,
+
         jira_url: None,
         jql: None,
         confluence_url: None,
diff --git a/tests/int_vulnerable_files.rs b/tests/int_vulnerable_files.rs
index 6141037..b87d721 100644
--- a/tests/int_vulnerable_files.rs
+++ b/tests/int_vulnerable_files.rs
@@ -9,6 +9,7 @@ use anyhow::{Context, Result};
 use kingfisher::{
     cli::{
         commands::{
+            azure::AzureRepoType,
             bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
             gitea::GiteaRepoType,
             github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
@@ -86,6 +87,13 @@ impl TestContext {
         bitbucket_repo_type: BitbucketRepoType::Source,
         bitbucket_auth: BitbucketAuthArgs::default(),

+        azure_organization: Vec::new(),
+        azure_project: Vec::new(),
+        azure_exclude: Vec::new(),
+        all_azure_projects: false,
+        azure_base_url: Url::parse("https://dev.azure.com/").unwrap(),
+        azure_repo_type: AzureRepoType::Source,
+
         jira_url: None,
         jql: None,
         confluence_url: None,
@@ -189,6 +197,13 @@ impl TestContext {
         bitbucket_repo_type: BitbucketRepoType::Source,
         bitbucket_auth: BitbucketAuthArgs::default(),

+        azure_organization: Vec::new(),
+        azure_project: Vec::new(),
+        azure_exclude: Vec::new(),
+        all_azure_projects: false,
+        azure_base_url: Url::parse("https://dev.azure.com/").unwrap(),
+        azure_repo_type: AzureRepoType::Source,
+
         jira_url: None,
         jql: None,
         confluence_url: None,