Added support for Gitea

Mick Grove 2025-09-23 13:07:45 -07:00
commit 04bb3b74d0
24 changed files with 865 additions and 15 deletions

View file

@ -2,6 +2,9 @@
All notable changes to this project will be documented in this file.
## [v1.54.0]
- Added first-class Gitea support, including CLI commands, environment-based authentication, documentation, and integration with scans and repository enumeration.
## [v1.53.0]
- Added first-class Bitbucket support, including CLI commands, authentication helpers, documentation, and integration testing.

View file

@ -10,7 +10,7 @@ publish = false
[package]
name = "kingfisher"
version = "1.53.0"
version = "1.54.0"
description = "MongoDB's blazingly fast secret scanning and validation tool"
edition.workspace = true
rust-version.workspace = true

View file

@ -8,15 +8,15 @@
Kingfisher is a blazingly fast secret-scanning and live validation tool built in Rust. It combines Intel's hardware-accelerated Hyperscan regex engine with language-aware parsing via Tree-Sitter, and **ships with hundreds of built-in rules** to detect, validate, and triage secrets before they ever reach production.
</p>
Originally forked from Praetorian's Nosey Parker, Kingfisher adds live cloud-API validation; many more targets (GitLab, S3, Docker, Jira, Confluence, Slack); compressed-file extraction and scanning; baseline and allowlist controls; language-aware detection (~20 languages); and a native Windows binary. See [Origins and Divergence](#origins-and-divergence) for details.
Originally forked from Praetorian's Nosey Parker, Kingfisher **adds** live cloud-API validation; many more targets (GitLab, Bitbucket, Gitea, S3, Docker, Jira, Confluence, Slack); compressed-file extraction and scanning; baseline and allowlist controls; language-aware detection (~20 languages); and a native Windows binary. See [Origins and Divergence](#origins-and-divergence) for details.
## Key Features
- **Performance**: multithreaded, Hyperscan-powered scanning built for huge codebases
- **Extensible rules**: hundreds of built-in detectors plus YAML-defined custom rules ([docs/RULES.md](/docs/RULES.md))
- **Broad AI SaaS coverage**: finds and validates tokens for OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Stability AI, Replicate, xAI (Grok), Ollama, Langchain, Perplexity, Weights & Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, Together.ai, Zhipu, and many more
- **Multiple targets**:
- **Git history**: local repos or GitHub/GitLab/Bitbucket orgs, users, and workspaces
- **Repository artifacts**: with `--repo-artifacts`, scan GitHub/GitLab/Bitbucket repository artifacts such as issues, pull/merge requests, wikis, snippets, and owner gists in addition to code
- **Git history**: local repos or GitHub/GitLab/Gitea/Bitbucket orgs, users, and workspaces
- **Repository artifacts**: with `--repo-artifacts`, scan GitHub/GitLab/Bitbucket repository artifacts such as issues, pull/merge requests, wikis, snippets, and owner gists in addition to code (Gitea wikis are also cloned when available)
- **Docker images**: public or private via `--docker-image`
- **Jira issues**: JQL-driven scans with `--jira-url` and `--jql`
- **Confluence pages**: CQL-driven scans with `--confluence-url` and `--cql`
@ -71,6 +71,12 @@ See ([docs/COMPARISON.md](docs/COMPARISON.md))
- [Skip specific GitLab projects during enumeration](#skip-specific-gitlab-projects-during-enumeration)
- [Scan remote GitLab repository by URL](#scan-remote-gitlab-repository-by-url)
- [List GitLab repositories](#list-gitlab-repositories)
- [Scanning Gitea](#scanning-gitea)
- [Scan Gitea organization (requires `KF_GITEA_TOKEN`)](#scan-gitea-organization-requires-kf_gitea_token)
- [Scan Gitea user](#scan-gitea-user)
- [Skip specific Gitea repositories during enumeration](#skip-specific-gitea-repositories-during-enumeration)
- [Scan remote Gitea repository by URL](#scan-remote-gitea-repository-by-url)
- [List Gitea repositories](#list-gitea-repositories)
- [Scanning Bitbucket](#scanning-bitbucket)
- [Scan Bitbucket workspace](#scan-bitbucket-workspace)
- [Scan Bitbucket user](#scan-bitbucket-user)
@ -560,6 +566,59 @@ kingfisher gitlab repos list --group my-group --include-subgroups
kingfisher gitlab repos list --group my-group --gitlab-exclude my-group/**/legacy-*
```
## Scanning Gitea
### Scan Gitea organization (requires `KF_GITEA_TOKEN`)
```bash
kingfisher scan --gitea-organization my-org
# self-hosted example
KF_GITEA_TOKEN="gtoken" kingfisher scan --gitea-organization platform --gitea-api-url https://gitea.internal.example/api/v1/
```
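The trailing slash in `--gitea-api-url` matters, because `src/gitea.rs` in this commit builds endpoints with `api_url.join("orgs/{org}/repos")` and URL joining replaces everything after the last `/` of the base path. The sketch below imitates that behavior with plain string handling so it runs without the `url` crate; `join_endpoint` is a hypothetical helper for illustration, not part of Kingfisher.

```rust
/// Simplified stand-in for relative URL resolution as performed by `Url::join`:
/// everything after the last '/' of the base path is replaced by the endpoint.
/// (Hypothetical helper; assumes an absolute base URL and a relative endpoint.)
fn join_endpoint(base: &str, endpoint: &str) -> String {
    // Skip past "scheme://" so its slashes are not mistaken for path separators.
    let authority_start = base.find("://").map(|i| i + 3).unwrap_or(0);
    match base[authority_start..].rfind('/') {
        // Keep the base up to and including its last '/', then append the endpoint.
        Some(i) => format!("{}{}", &base[..authority_start + i + 1], endpoint),
        // No path at all: treat the base as if it ended in '/'.
        None => format!("{base}/{endpoint}"),
    }
}

fn main() {
    let endpoint = "orgs/platform/repos";
    // With the trailing slash, the endpoint lands under /api/v1/.
    println!("{}", join_endpoint("https://gitea.internal.example/api/v1/", endpoint));
    // Without it, the final "v1" segment is dropped.
    println!("{}", join_endpoint("https://gitea.internal.example/api/v1", endpoint));
}
```

In practice this means a self-hosted API URL should be passed as `https://host/api/v1/`, as in the example above.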
### Scan Gitea user
```bash
kingfisher scan --gitea-user johndoe
```
### Skip specific Gitea repositories during enumeration
Repeat `--gitea-exclude` for each repository you want to ignore when scanning users
or organizations. Accepts `owner/repo` identifiers or gitignore-style glob patterns
like `team/**/archive-*`.
```bash
kingfisher scan --gitea-organization my-org \
--gitea-exclude my-org/legacy-repo \
--gitea-exclude my-org/**/archive-*
```
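For exact `owner/repo` exclusions, matching is case-insensitive and ignores a trailing `.git`. The following is a minimal, standard-library-only sketch modeled on `normalize_repo_identifier` in `src/gitea.rs` from this commit. The `is_excluded` helper is hypothetical, and glob patterns (which the real code hands to the `globset` crate) are deliberately left out.

```rust
use std::collections::HashSet;

/// Reduce an `owner/repo` identifier to the lowercase form used for comparison.
fn normalize_repo_identifier(raw: &str) -> Option<String> {
    let trimmed = raw.trim().trim_matches('/');
    let without_git = trimmed.strip_suffix(".git").unwrap_or(trimmed);
    let mut parts = without_git.split('/').filter(|segment| !segment.is_empty());
    let owner = parts.next()?;
    let repo = parts.next()?;
    Some(format!("{}/{}", owner.to_lowercase(), repo.to_lowercase()))
}

/// Hypothetical helper: true when `full_name` (as reported by the API)
/// equals one of the exact exclusions after normalization.
fn is_excluded(full_name: &str, excludes: &[&str]) -> bool {
    let wanted: HashSet<String> =
        excludes.iter().copied().filter_map(normalize_repo_identifier).collect();
    wanted.contains(&full_name.to_lowercase())
}

fn main() {
    let excludes = ["My-Org/Legacy-Repo.git"];
    for name in ["my-org/legacy-repo", "my-org/service"] {
        println!("{name}: excluded = {}", is_excluded(name, &excludes));
    }
}
```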
### Scan remote Gitea repository by URL
`--git-url` clones the repository and scans its history. Adding `--repo-artifacts`
also clones the repository wiki if one exists. Private repositories and wikis
require `KF_GITEA_TOKEN` (and `KF_GITEA_USERNAME` when cloning via HTTPS).
```bash
# Scan the repository only
kingfisher scan --git-url https://gitea.com/org/repo.git
# Include the repository wiki (if present)
KF_GITEA_TOKEN="gtoken" KF_GITEA_USERNAME="org" \
kingfisher scan --git-url https://gitea.com/org/repo.git --repo-artifacts
```
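The wiki clone URL is derived from the repository URL by swapping the `.git` suffix for `.wiki.git`, as `wiki_url` in `src/gitea.rs` does in this commit. Here is a rough sketch of that derivation using only the standard library. It assumes an `https://` input, whereas the real function parses the URL with the `url` crate.

```rust
/// Derive the wiki clone URL for an HTTPS repository URL.
/// Sketch only: assumes the form https://host/owner/repo[.git].
fn wiki_url(repo_url: &str) -> Option<String> {
    let rest = repo_url.strip_prefix("https://")?;
    let mut segments = rest.split('/');
    let host = segments.next().filter(|s| !s.is_empty())?;
    let owner = segments.next().filter(|s| !s.is_empty())?;
    let repo = segments.next().filter(|s| !s.is_empty())?;
    // Drop a trailing ".git" so the suffix is not duplicated.
    let repo = repo.strip_suffix(".git").unwrap_or(repo);
    Some(format!("https://{host}/{owner}/{repo}.wiki.git"))
}

fn main() {
    match wiki_url("https://gitea.com/org/repo.git") {
        Some(url) => println!("wiki clone URL: {url}"),
        None => println!("not a repository URL"),
    }
}
```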
### List Gitea repositories
```bash
kingfisher gitea repos list --gitea-organization my-org
# enumerate every organization visible to the authenticated user
KF_GITEA_TOKEN="gtoken" kingfisher gitea repos list --all-gitea-organizations
# self-hosted example
KF_GITEA_TOKEN="gtoken" kingfisher gitea repos list --user johndoe --gitea-api-url https://gitea.internal.example/api/v1/
```
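Repository listing is paginated: `fetch_paginated_repos` in this commit requests pages of 50 and stops at the first empty page, after which results are sorted and de-duplicated. The sketch below shows that loop in isolation. `collect_all_pages` and the in-memory backend are illustrative stand-ins for the HTTP calls, not Kingfisher APIs.

```rust
/// Collect items from a paginated source until a page comes back empty.
/// `fetch_page(page, limit)` is a stand-in for the HTTP request.
fn collect_all_pages<F>(mut fetch_page: F, limit: usize) -> Vec<String>
where
    F: FnMut(u32, usize) -> Vec<String>,
{
    let mut all = Vec::new();
    let mut page = 1u32; // pages are 1-based
    loop {
        let items = fetch_page(page, limit);
        if items.is_empty() {
            break; // an empty page marks the end of the listing
        }
        all.extend(items);
        page += 1;
    }
    all.sort();
    all.dedup();
    all
}

fn main() {
    // Fake backend holding 120 clone URLs.
    let backend: Vec<String> =
        (0..120).map(|i| format!("https://gitea.example/org/repo-{i:03}.git")).collect();
    let mut requests = 0;
    let repos = collect_all_pages(
        |page, limit| {
            requests += 1;
            let start = (page as usize - 1) * limit;
            backend.iter().skip(start).take(limit).cloned().collect()
        },
        50,
    );
    println!("{} repositories in {} requests", repos.len(), requests);
}
```

With 120 repositories and a page size of 50, the loop issues four requests: three that return data and a final one that returns the empty page.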
## Scanning Bitbucket
### Scan Bitbucket workspace
@ -700,6 +759,8 @@ KF_SLACK_TOKEN="xoxp-1234..." kingfisher scan \
| ----------------- | ---------------------------- |
| `KF_GITHUB_TOKEN` | GitHub Personal Access Token |
| `KF_GITLAB_TOKEN` | GitLab Personal Access Token |
| `KF_GITEA_TOKEN` | Gitea Personal Access Token |
| `KF_GITEA_USERNAME` | Username for private Gitea clones (used with `KF_GITEA_TOKEN`) |
| `KF_BITBUCKET_USERNAME` | Bitbucket username for basic authentication |
| `KF_BITBUCKET_APP_PASSWORD` / `KF_BITBUCKET_TOKEN` | Bitbucket app password or server token |
| `KF_BITBUCKET_OAUTH_TOKEN` | Bitbucket OAuth or PAT token |
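
The two Gitea variables combine as follows, according to the `GITEA_CREDENTIAL_HELPER` shell snippet added in this commit: the token is sent as the password, and the username falls back to `gitea` when `KF_GITEA_USERNAME` is unset or empty. Below is a minimal Rust rendering of that logic for illustration. `gitea_git_credentials` is a hypothetical name, and the real behavior lives in the shell helper passed to `git`.

```rust
use std::env;

/// Returns (username, password) for git, or None when no token is configured.
/// Mirrors the shell expansion `${KF_GITEA_USERNAME:-gitea}`, which falls back
/// when the variable is unset or empty.
fn gitea_git_credentials(
    token: Option<&str>,
    username: Option<&str>,
) -> Option<(String, String)> {
    let token = token.filter(|t| !t.is_empty())?;
    let user = username.filter(|u| !u.is_empty()).unwrap_or("gitea");
    Some((user.to_string(), token.to_string()))
}

fn main() {
    let token = env::var("KF_GITEA_TOKEN").ok();
    let username = env::var("KF_GITEA_USERNAME").ok();
    match gitea_git_credentials(token.as_deref(), username.as_deref()) {
        // Never print the token itself.
        Some((user, _)) => println!("git would authenticate as {user}"),
        None => println!("no Gitea credentials configured"),
    }
}
```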

src/cli/commands/gitea.rs (new file, 96 lines)
View file

@ -0,0 +1,96 @@
use clap::{Args, Subcommand, ValueEnum, ValueHint};
use strum_macros::Display;
use url::Url;
use crate::cli::commands::output::OutputArgs;
use super::github::GitHubOutputFormat;
/// Top-level Gitea command group
#[derive(Args, Debug)]
pub struct GiteaArgs {
#[command(subcommand)]
pub command: GiteaCommand,
/// Override Gitea API URL (e.g. self-hosted)
#[arg(global = true, long, default_value = "https://gitea.com/api/v1/", value_hint = ValueHint::Url)]
pub gitea_api_url: Url,
}
#[derive(Subcommand, Debug)]
pub enum GiteaCommand {
/// Interact with Gitea repositories
#[command(subcommand)]
Repos(GiteaReposCommand),
}
#[derive(Subcommand, Debug)]
pub enum GiteaReposCommand {
/// List repositories for a user or organization
List(GiteaReposListArgs),
}
/// `kingfisher gitea repos list`
#[derive(Args, Debug, Clone)]
pub struct GiteaReposListArgs {
#[command(flatten)]
pub repo_specifiers: GiteaRepoSpecifiers,
#[command(flatten)]
pub output_args: OutputArgs<GiteaOutputFormat>,
}
/// Options for selecting Gitea repos
#[derive(Args, Debug, Clone)]
pub struct GiteaRepoSpecifiers {
/// Repositories belonging to these users
#[arg(long, alias = "gitea-user")]
pub user: Vec<String>,
/// Repositories belonging to these organizations
#[arg(long, alias = "org", alias = "gitea-organization", alias = "gitea-org")]
pub organization: Vec<String>,
/// Skip repositories when enumerating Gitea users or organizations (format: owner/repo or a gitignore-style glob)
#[arg(long = "gitea-exclude", alias = "gitea-exclude-repo", value_name = "OWNER/REPO")]
pub exclude_repos: Vec<String>,
/// Repositories for all organizations accessible to the authenticated user
#[arg(long, alias = "all-gitea-organizations", alias = "all-gitea-orgs")]
pub all_organizations: bool,
/// Filter by repository type
#[arg(long, default_value_t = GiteaRepoType::Source, alias = "gitea-repo-type")]
pub repo_type: GiteaRepoType,
}
impl GiteaRepoSpecifiers {
pub fn is_empty(&self) -> bool {
self.user.is_empty() && self.organization.is_empty() && !self.all_organizations
}
}
/// Gitea repository type filter
#[derive(Copy, Clone, Debug, Display, PartialEq, Eq, PartialOrd, Ord, ValueEnum)]
#[strum(serialize_all = "kebab-case")]
pub enum GiteaRepoType {
/// Only source repositories (not forks)
Source,
/// Only fork repositories
#[value(alias = "forks")]
Fork,
/// Include all repositories
All,
}
pub type GiteaOutputFormat = GitHubOutputFormat;
impl From<GiteaRepoType> for crate::gitea::RepoType {
fn from(val: GiteaRepoType) -> Self {
match val {
GiteaRepoType::Source => crate::gitea::RepoType::Source,
GiteaRepoType::Fork => crate::gitea::RepoType::Fork,
GiteaRepoType::All => crate::gitea::RepoType::All,
}
}
}

View file

@ -6,6 +6,7 @@ use url::Url;
use crate::{
cli::commands::{
bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
gitea::GiteaRepoType,
github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
gitlab::GitLabRepoType,
},
@ -24,12 +25,15 @@ pub struct InputSpecifierArgs {
"github_organization",
"gitlab_user",
"gitlab_group",
"gitea_user",
"gitea_organization",
"bitbucket_user",
"bitbucket_workspace",
"bitbucket_project",
"git_url",
"all_github_organizations",
"all_gitlab_groups",
"all_gitea_organizations",
"all_bitbucket_workspaces",
"jira_url",
"confluence_url",
@ -112,6 +116,35 @@ pub struct InputSpecifierArgs {
#[arg(long, alias = "include-subgroups")]
pub gitlab_include_subgroups: bool,
// Gitea Options
/// Scan repositories belonging to the specified Gitea user
#[arg(long)]
pub gitea_user: Vec<String>,
/// Scan repositories belonging to the specified Gitea organization
#[arg(long, alias = "gitea-org")]
pub gitea_organization: Vec<String>,
/// Skip repositories when enumerating Gitea users or organizations (format: owner/repo or a gitignore-style glob)
#[arg(long = "gitea-exclude", alias = "gitea-exclude-repo", value_name = "OWNER/REPO")]
pub gitea_exclude: Vec<String>,
/// Scan repositories from all accessible Gitea organizations (requires KF_GITEA_TOKEN)
#[arg(long, alias = "all-gitea-orgs")]
pub all_gitea_organizations: bool,
/// Use the specified URL for Gitea API access (e.g. for self-hosted instances)
#[arg(
long,
alias="gitea-api-url",
default_value = "https://gitea.com/api/v1/",
value_hint = ValueHint::Url
)]
pub gitea_api_url: Url,
/// Filter Gitea repositories by type (source, fork, or all)
#[arg(long, default_value_t = GiteaRepoType::Source)]
pub gitea_repo_type: GiteaRepoType,
// Bitbucket Options
/// Scan repositories belonging to the specified Bitbucket users
#[arg(long)]

View file

@ -1,4 +1,5 @@
pub mod bitbucket;
pub mod gitea;
pub mod github;
pub mod gitlab;
pub mod inputs;

View file

@ -7,8 +7,8 @@ use sysinfo::{MemoryRefreshKind, RefreshKind, System};
use tracing::Level;
use crate::cli::commands::{
bitbucket::BitbucketArgs, github::GitHubArgs, gitlab::GitLabArgs, rules::RulesArgs,
scan::ScanArgs,
bitbucket::BitbucketArgs, gitea::GiteaArgs, github::GitHubArgs, gitlab::GitLabArgs,
rules::RulesArgs, scan::ScanArgs,
};
#[deny(missing_docs)]
@ -69,6 +69,10 @@ pub enum Command {
#[command(name = "gitlab")]
GitLab(GitLabArgs),
/// Interact with the Gitea API
#[command(name = "gitea")]
Gitea(GiteaArgs),
/// Interact with the Bitbucket API
#[command(name = "bitbucket")]
Bitbucket(BitbucketArgs),

View file

@ -23,6 +23,14 @@ const BITBUCKET_CREDENTIAL_HELPER: &str = r#"credential.helper=!_bbcreds() {
fi
}; _bbcreds"#;
const GITEA_CREDENTIAL_HELPER: &str = r#"credential.helper=!_gteacreds() {
if [ -n "$KF_GITEA_TOKEN" ]; then
user="${KF_GITEA_USERNAME:-gitea}";
echo username="$user";
echo password="$KF_GITEA_TOKEN";
fi
}; _gteacreds"#;
/// Represents errors that can occur when interacting with the `git` CLI.
#[derive(Debug, thiserror::Error)]
pub enum GitError {
@ -40,7 +48,7 @@ pub enum GitError {
/// A helper struct for running `git` commands.
///
/// It supports optional GitHub, GitLab, and Bitbucket credentials passed via
/// It supports optional GitHub, GitLab, Gitea, and Bitbucket credentials passed via
/// environment variables and optionally ignores TLS certificate validation if
/// requested.
pub struct Git {
@ -59,6 +67,8 @@ impl Git {
matches!(std::env::var("KF_GITHUB_TOKEN"), Ok(token) if !token.is_empty());
let has_gitlab_token =
matches!(std::env::var("KF_GITLAB_TOKEN"), Ok(token) if !token.is_empty());
let has_gitea_token =
matches!(std::env::var("KF_GITEA_TOKEN"), Ok(token) if !token.is_empty());
let has_bitbucket_username =
matches!(std::env::var("KF_BITBUCKET_USERNAME"), Ok(value) if !value.is_empty());
let has_bitbucket_password =
@ -71,7 +81,7 @@ impl Git {
has_bitbucket_oauth_token || (has_bitbucket_username && has_bitbucket_password);
// If credentials are provided via environment variables, clear existing helpers first.
if has_github_token || has_gitlab_token || has_bitbucket_credentials {
if has_github_token || has_gitlab_token || has_gitea_token || has_bitbucket_credentials {
credentials.push("-c".into());
credentials.push(r#"credential.helper="#.into());
}
@ -92,6 +102,12 @@ impl Git {
);
}
// Inject Gitea token helper
if has_gitea_token {
credentials.push("-c".into());
credentials.push(GITEA_CREDENTIAL_HELPER.into());
}
// Inject Bitbucket credential helper for OAuth tokens or basic auth.
if has_bitbucket_credentials {
credentials.push("-c".into());

View file

@ -64,8 +64,8 @@ impl TryFrom<Url> for GitUrl {
type Error = &'static str;
fn try_from(url: Url) -> Result<Self, Self::Error> {
if url.scheme() != "https"
|| url.host().is_none()
// if url.scheme() != "https"
if url.host().is_none()
|| !url.username().is_empty()
|| url.password().is_some()
|| url.query().is_some()

src/gitea.rs (new file, 440 lines)
View file

@ -0,0 +1,440 @@
use std::{collections::HashSet, env, str::FromStr, time::Duration};
use anyhow::{anyhow, Result};
use globset::{Glob, GlobSet, GlobSetBuilder};
use indicatif::{ProgressBar, ProgressStyle};
use reqwest::StatusCode;
use serde::Deserialize;
use tracing::warn;
use url::Url;
use crate::{git_url::GitUrl, validation::GLOBAL_USER_AGENT};
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RepoType {
All,
Source,
Fork,
}
impl RepoType {
fn allows(self, is_fork: bool) -> bool {
match self {
RepoType::All => true,
RepoType::Source => !is_fork,
RepoType::Fork => is_fork,
}
}
}
#[derive(Debug, Clone)]
pub struct RepoSpecifiers {
pub user: Vec<String>,
pub organization: Vec<String>,
pub all_organizations: bool,
pub repo_filter: RepoType,
pub exclude_repos: Vec<String>,
}
impl RepoSpecifiers {
pub fn is_empty(&self) -> bool {
self.user.is_empty() && self.organization.is_empty() && !self.all_organizations
}
}
#[derive(Debug, Deserialize)]
struct GiteaRepository {
full_name: String,
clone_url: String,
#[serde(default)]
fork: bool,
}
#[derive(Debug, Deserialize)]
struct GiteaOrganization {
username: String,
}
struct ExcludeMatcher {
exact: HashSet<String>,
globs: Option<GlobSet>,
}
impl ExcludeMatcher {
fn matches(&self, name: &str) -> bool {
if self.exact.contains(name) {
return true;
}
if let Some(globs) = &self.globs {
return globs.is_match(name);
}
false
}
fn is_empty(&self) -> bool {
self.exact.is_empty() && self.globs.is_none()
}
}
fn looks_like_glob(pattern: &str) -> bool {
pattern.contains('*') || pattern.contains('?') || pattern.contains('[')
}
fn normalize_repo_identifier(raw: &str) -> Option<String> {
let trimmed = raw.trim().trim_matches('/');
if trimmed.is_empty() {
return None;
}
let without_git = trimmed.strip_suffix(".git").unwrap_or(trimmed);
let mut parts = without_git.split('/').filter(|segment| !segment.is_empty());
let owner = parts.next()?;
let repo = parts.next()?;
Some(format!("{}/{}", owner.to_lowercase(), repo.to_lowercase()))
}
fn parse_excluded_repo(raw: &str) -> Option<String> {
let trimmed = raw.trim();
if trimmed.is_empty() {
return None;
}
if let Ok(url) = Url::parse(trimmed) {
if let Some(name) = normalize_repo_identifier(url.path()) {
return Some(name);
}
}
// Keep glob patterns intact: normalize_repo_identifier keeps only the first two
// path segments and would reduce `team/**/archive-*` to `team/**`.
if looks_like_glob(trimmed) {
return Some(trimmed.trim_matches('/').to_lowercase());
}
if let Some(idx) = trimmed.rfind(':') {
if let Some(name) = normalize_repo_identifier(&trimmed[idx + 1..]) {
return Some(name);
}
}
normalize_repo_identifier(trimmed)
}
fn build_exclude_matcher(excludes: &[String]) -> ExcludeMatcher {
let mut exact = HashSet::new();
let mut glob_builder = GlobSetBuilder::new();
let mut has_glob = false;
for raw in excludes {
match parse_excluded_repo(raw) {
Some(name) => {
if looks_like_glob(&name) {
match Glob::new(&name) {
Ok(glob) => {
glob_builder.add(glob);
has_glob = true;
}
Err(err) => {
warn!("Ignoring invalid Gitea exclusion pattern '{raw}': {err}");
exact.insert(name);
}
}
} else {
exact.insert(name);
}
}
None => {
warn!("Ignoring invalid Gitea exclusion '{raw}' (expected owner/repo)");
}
}
}
let globs = if has_glob {
match glob_builder.build() {
Ok(set) => Some(set),
Err(err) => {
warn!("Failed to build Gitea exclusion patterns: {err}");
None
}
}
} else {
None
};
ExcludeMatcher { exact, globs }
}
fn should_exclude_repo(repo: &GiteaRepository, excludes: &ExcludeMatcher) -> bool {
if excludes.is_empty() {
return false;
}
excludes.matches(&repo.full_name.to_lowercase())
}
async fn fetch_paginated_repos(
client: &reqwest::Client,
token: Option<&str>,
mut url: Url,
repo_filter: RepoType,
excludes: &ExcludeMatcher,
progress: Option<&ProgressBar>,
) -> Result<Vec<String>> {
let mut page = 1u32;
let mut repos = Vec::new();
loop {
url.query_pairs_mut()
.clear()
.append_pair("page", &page.to_string())
.append_pair("limit", "50");
if let Some(pb) = progress {
pb.set_message(format!("Fetching Gitea repositories (page {page})"));
}
let mut req = client.get(url.clone()).header("User-Agent", GLOBAL_USER_AGENT.as_str());
if let Some(token) = token {
req = req.header("Authorization", format!("token {token}"));
}
let resp = req.send().await?;
match resp.status() {
StatusCode::OK => {}
StatusCode::NOT_FOUND => {
warn!("Gitea endpoint {} returned 404", url);
break;
}
status => {
return Err(anyhow!("Failed to fetch repositories from {} (status {status})", url));
}
}
let page_repos: Vec<GiteaRepository> = resp.json().await?;
if page_repos.is_empty() {
break;
}
for repo in page_repos {
if !repo_filter.allows(repo.fork) {
continue;
}
if should_exclude_repo(&repo, excludes) {
continue;
}
repos.push(repo.clone_url);
}
page += 1;
}
Ok(repos)
}
async fn fetch_user_repos(
client: &reqwest::Client,
token: Option<&str>,
api_url: &Url,
username: &str,
repo_filter: RepoType,
excludes: &ExcludeMatcher,
progress: Option<&ProgressBar>,
) -> Result<Vec<String>> {
let endpoint = format!("users/{}/repos", username);
let url = api_url.join(&endpoint)?;
fetch_paginated_repos(client, token, url, repo_filter, excludes, progress).await
}
async fn fetch_org_repos(
client: &reqwest::Client,
token: Option<&str>,
api_url: &Url,
org: &str,
repo_filter: RepoType,
excludes: &ExcludeMatcher,
progress: Option<&ProgressBar>,
) -> Result<Vec<String>> {
let endpoint = format!("orgs/{}/repos", org);
let url = api_url.join(&endpoint)?;
fetch_paginated_repos(client, token, url, repo_filter, excludes, progress).await
}
async fn fetch_authenticated_orgs(
client: &reqwest::Client,
token: Option<&str>,
api_url: &Url,
) -> Result<Vec<String>> {
let Some(token) = token else {
return Err(anyhow!("KF_GITEA_TOKEN must be set to enumerate all organizations"));
};
let url = api_url.join("user/orgs")?;
let resp = client
.get(url.clone())
.header("User-Agent", GLOBAL_USER_AGENT.as_str())
.header("Authorization", format!("token {token}"))
.send()
.await?;
match resp.status() {
StatusCode::OK => {}
StatusCode::NOT_FOUND => {
warn!("Gitea endpoint {} returned 404", url);
return Ok(Vec::new());
}
status => {
return Err(anyhow!(
"Failed to enumerate organizations from {} (status {status})",
url
));
}
}
let orgs: Vec<GiteaOrganization> = resp.json().await?;
Ok(orgs.into_iter().map(|org| org.username).collect())
}
pub async fn enumerate_repo_urls(
specifiers: &RepoSpecifiers,
api_url: Url,
ignore_certs: bool,
mut progress: Option<&mut ProgressBar>,
) -> Result<Vec<String>> {
let excludes = build_exclude_matcher(&specifiers.exclude_repos);
let client = reqwest::Client::builder()
.timeout(Duration::from_secs(30))
.danger_accept_invalid_certs(ignore_certs)
.build()?;
let token = env::var("KF_GITEA_TOKEN").ok().filter(|t| !t.is_empty());
let mut repos = Vec::new();
let mut seen = HashSet::new();
for user in &specifiers.user {
if let Some(pb) = progress.as_mut() {
pb.set_message(format!("Enumerating Gitea user {user}"));
}
match fetch_user_repos(
&client,
token.as_deref(),
&api_url,
user,
specifiers.repo_filter,
&excludes,
progress.as_deref(),
)
.await
{
Ok(mut urls) => {
for url in urls.drain(..) {
if seen.insert(url.clone()) {
repos.push(url);
}
}
}
Err(err) => {
warn!("Failed to enumerate Gitea repositories for user {user}: {err}");
}
}
}
let mut organizations = specifiers.organization.clone();
if specifiers.all_organizations {
match fetch_authenticated_orgs(&client, token.as_deref(), &api_url).await {
Ok(mut orgs) => organizations.append(&mut orgs),
Err(err) => warn!("Failed to enumerate Gitea organizations: {err}"),
}
}
organizations.sort();
organizations.dedup();
for org in organizations {
if let Some(pb) = progress.as_mut() {
pb.set_message(format!("Enumerating Gitea organization {org}"));
}
match fetch_org_repos(
&client,
token.as_deref(),
&api_url,
&org,
specifiers.repo_filter,
&excludes,
progress.as_deref(),
)
.await
{
Ok(mut urls) => {
for url in urls.drain(..) {
if seen.insert(url.clone()) {
repos.push(url);
}
}
}
Err(err) => {
warn!("Failed to enumerate Gitea repositories for organization {org}: {err}");
}
}
}
repos.sort();
repos.dedup();
Ok(repos)
}
pub async fn list_repositories(
api_url: Url,
ignore_certs: bool,
progress_enabled: bool,
users: &[String],
orgs: &[String],
all_orgs: bool,
exclude_repos: &[String],
repo_filter: RepoType,
) -> Result<()> {
let mut progress = if progress_enabled {
let style = ProgressStyle::with_template("{spinner} {msg} [{elapsed_precise}]")
.expect("progress bar style template should compile");
let pb = ProgressBar::new_spinner().with_style(style).with_message("Fetching repositories");
pb.enable_steady_tick(Duration::from_millis(500));
pb
} else {
ProgressBar::hidden()
};
let specifiers = RepoSpecifiers {
user: users.to_vec(),
organization: orgs.to_vec(),
all_organizations: all_orgs,
repo_filter,
exclude_repos: exclude_repos.to_vec(),
};
let urls = enumerate_repo_urls(&specifiers, api_url, ignore_certs, Some(&mut progress)).await?;
for url in urls {
println!("{}", url);
}
progress.finish_and_clear();
Ok(())
}
fn parse_repo(repo_url: &GitUrl) -> Option<(String, String, String)> {
let url = Url::parse(repo_url.as_str()).ok()?;
let host = url.host_str()?.to_string();
let mut segments = url.path_segments()?;
let owner = segments.next()?.to_string();
let mut repo = segments.next()?.to_string();
if let Some(stripped) = repo.strip_suffix(".git") {
repo = stripped.to_string();
}
Some((host, owner, repo))
}
pub fn wiki_url(repo_url: &GitUrl) -> Option<GitUrl> {
let (host, owner, repo) = parse_repo(repo_url)?;
let url = format!("https://{host}/{owner}/{repo}.wiki.git");
GitUrl::from_str(&url).ok()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn parse_excluded_repo_variants() {
assert_eq!(parse_excluded_repo("Owner/Repo").as_deref(), Some("owner/repo"));
assert_eq!(
parse_excluded_repo("https://gitea.example.com/Owner/Repo.git").as_deref(),
Some("owner/repo")
);
assert_eq!(
parse_excluded_repo("ssh://git@example.com:3000/Owner/Repo.git").as_deref(),
Some("owner/repo")
);
}
#[test]
fn normalize_repo_identifier_handles_git_suffix() {
assert_eq!(normalize_repo_identifier("owner/repo.git"), Some("owner/repo".into()));
}
}

View file

@ -17,6 +17,7 @@ pub mod git_commit_metadata;
pub mod git_metadata_graph;
mod git_repo_enumerator;
pub mod git_url;
pub mod gitea;
pub mod github;
pub mod gitlab;
pub mod jira;

View file

@ -52,7 +52,7 @@ use kingfisher::{
},
findings_store,
findings_store::FindingsStore,
github,
gitea, github,
rule_loader::RuleLoader,
rules_database::RulesDatabase,
scanner::{load_and_record_rules, run_scan},
@ -72,6 +72,7 @@ use url::Url;
use crate::cli::commands::{
bitbucket::{BitbucketAuthArgs, BitbucketCommand, BitbucketRepoType, BitbucketReposCommand},
gitea::{GiteaCommand, GiteaRepoType, GiteaReposCommand},
gitlab::{GitLabCommand, GitLabRepoType, GitLabReposCommand},
};
@ -89,6 +90,7 @@ fn main() -> anyhow::Result<()> {
Command::GitHub(_) => num_cpus::get(), // Default for GitHub commands
Command::GitLab(_) => num_cpus::get(), // Default for GitLab commands
Command::Bitbucket(_) => num_cpus::get(), // Default for Bitbucket commands
Command::Gitea(_) => num_cpus::get(), // Default for Gitea commands
Command::Rules(_) => num_cpus::get(), // Default for Rules commands
};
@ -265,6 +267,23 @@ async fn async_main(args: CommandLineArgs) -> Result<()> {
}
},
},
Command::Gitea(gitea_args) => match gitea_args.command {
GiteaCommand::Repos(repos_command) => match repos_command {
GiteaReposCommand::List(list_args) => {
gitea::list_repositories(
gitea_args.gitea_api_url,
global_args.ignore_certs,
global_args.use_progress(),
&list_args.repo_specifiers.user,
&list_args.repo_specifiers.organization,
list_args.repo_specifiers.all_organizations,
&list_args.repo_specifiers.exclude_repos,
list_args.repo_specifiers.repo_type.into(),
)
.await?;
}
},
},
Command::Bitbucket(bitbucket_args) => match bitbucket_args.command {
BitbucketCommand::Repos(repos_command) => match repos_command {
BitbucketReposCommand::List(list_args) => {
@ -329,6 +348,13 @@ fn create_default_scan_args() -> cli::commands::scan::ScanArgs {
gitlab_repo_type: GitLabRepoType::All,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/").unwrap(),
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),

View file

@ -40,6 +40,7 @@ mod tests {
use crate::{
blob::BlobId,
cli::commands::bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
cli::commands::gitea::GiteaRepoType,
cli::commands::github::GitHubRepoType,
cli::commands::inputs::ContentFilteringArgs,
cli::commands::inputs::InputSpecifierArgs,
@ -90,6 +91,15 @@ mod tests {
gitlab_api_url: Url::parse("https://gitlab.com/").unwrap(),
gitlab_repo_type: GitLabRepoType::All,
gitlab_include_subgroups: false,
// Gitea
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/").unwrap(),
gitea_repo_type: GiteaRepoType::Source,
// Bitbucket
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),

View file

@ -20,7 +20,7 @@ use crate::{
confluence, findings_store,
git_binary::{CloneMode, Git},
git_url::GitUrl,
github, gitlab, jira,
gitea, github, gitlab, jira,
matcher::{Match, Matcher, MatcherStats},
origin::{Origin, OriginSet},
rules_database::RulesDatabase,
@ -243,6 +243,68 @@ pub async fn enumerate_gitlab_repos(
Ok(repo_urls)
}
pub async fn enumerate_gitea_repos(
args: &scan::ScanArgs,
global_args: &global::GlobalArgs,
) -> Result<Vec<GitUrl>> {
let repo_specifiers = gitea::RepoSpecifiers {
user: args.input_specifier_args.gitea_user.clone(),
organization: args.input_specifier_args.gitea_organization.clone(),
all_organizations: args.input_specifier_args.all_gitea_organizations,
repo_filter: args.input_specifier_args.gitea_repo_type.into(),
exclude_repos: args.input_specifier_args.gitea_exclude.clone(),
};
let mut repo_urls = args.input_specifier_args.git_url.clone();
if !repo_specifiers.is_empty() {
let mut progress = if global_args.use_progress() {
let style =
ProgressStyle::with_template("{spinner} {msg} {human_len} [{elapsed_precise}]")
.expect("progress bar style template should compile");
let pb = ProgressBar::new_spinner()
.with_style(style)
.with_message("Enumerating Gitea repositories...");
pb.enable_steady_tick(Duration::from_millis(500));
pb
} else {
ProgressBar::hidden()
};
let mut num_found: u64 = 0;
let api_url = args.input_specifier_args.gitea_api_url.clone();
let repo_strings = gitea::enumerate_repo_urls(
&repo_specifiers,
api_url,
global_args.ignore_certs,
Some(&mut progress),
)
.await
.context("Failed to enumerate Gitea repositories")?;
for repo_string in repo_strings {
match GitUrl::from_str(&repo_string) {
Ok(repo_url) => {
repo_urls.push(repo_url);
num_found += 1;
}
Err(e) => {
progress.suspend(|| {
error!("Failed to parse repo URL from {repo_string}: {e}");
});
}
}
}
progress.finish_with_message(format!(
"Found {} repositories from Gitea",
HumanCount(num_found)
));
}
repo_urls.sort();
repo_urls.dedup();
Ok(repo_urls)
}
pub async fn enumerate_bitbucket_repos(
args: &scan::ScanArgs,
global_args: &global::GlobalArgs,

View file

@ -11,7 +11,7 @@ use crate::{
cli::{commands::scan, global},
findings_store,
findings_store::{FindingsStore, FindingsStoreMessage},
github, gitlab,
gitea, github, gitlab,
liquid_filters::register_all,
matcher::MatcherStats,
reporter::styles::Styles,
@ -23,8 +23,8 @@ use crate::{
clone_or_update_git_repos, enumerate_bitbucket_repos, enumerate_filesystem_inputs,
enumerate_github_repos,
repos::{
enumerate_gitlab_repos, fetch_confluence_pages, fetch_git_host_artifacts,
fetch_jira_issues, fetch_s3_objects, fetch_slack_messages,
enumerate_gitea_repos, enumerate_gitlab_repos, fetch_confluence_pages,
fetch_git_host_artifacts, fetch_jira_issues, fetch_s3_objects, fetch_slack_messages,
},
run_secret_validation, save_docker_images,
summary::print_scan_summary,
@ -73,10 +73,12 @@ pub async fn run_async_scan(
let mut repo_urls = enumerate_github_repos(args, global_args).await?;
let gitlab_repo_urls = enumerate_gitlab_repos(args, global_args).await?;
let gitea_repo_urls = enumerate_gitea_repos(args, global_args).await?;
let bitbucket_repo_urls = enumerate_bitbucket_repos(args, global_args).await?;
// Combine repository URLs
repo_urls.extend(gitlab_repo_urls);
repo_urls.extend(gitea_repo_urls);
repo_urls.extend(bitbucket_repo_urls);
repo_urls.sort();
repo_urls.dedup();
@ -91,6 +93,9 @@ pub async fn run_async_scan(
if let Some(w) = gitlab::wiki_url(url) {
wiki_urls.push(w);
}
if let Some(w) = gitea::wiki_url(url) {
wiki_urls.push(w);
}
if let Some(w) = bitbucket::wiki_url(url) {
wiki_urls.push(w);
}

View file

@ -8,6 +8,7 @@ use kingfisher::{
cli::{
commands::{
bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
gitea::GiteaRepoType,
github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
gitlab::GitLabRepoType,
inputs::{ContentFilteringArgs, InputSpecifierArgs},
@ -70,6 +71,12 @@ fn run_skiplist(skip_regex: Vec<String>, skip_skipword: Vec<String>) -> Result<u
gitlab_api_url: Url::parse("https://gitlab.com/").unwrap(),
gitlab_repo_type: GitLabRepoType::Owner,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/").unwrap(),
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),

View file

@ -8,6 +8,7 @@ use kingfisher::{
cli::{
commands::{
bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
gitea::GiteaRepoType,
github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
gitlab::GitLabRepoType,
inputs::{ContentFilteringArgs, InputSpecifierArgs},
@ -66,6 +67,13 @@ fn test_bitbucket_remote_scan() -> Result<()> {
gitlab_repo_type: GitLabRepoType::Owner,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/")?,
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),

@ -12,6 +12,7 @@ use kingfisher::{
cli::{
commands::{
bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
gitea::GiteaRepoType,
github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
gitlab::GitLabRepoType,
inputs::{ContentFilteringArgs, InputSpecifierArgs},
@ -83,6 +84,13 @@ rules:
gitlab_repo_type: GitLabRepoType::Owner,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/").unwrap(),
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),

@ -9,6 +9,7 @@ use kingfisher::{
cli::{
commands::{
bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
gitea::GiteaRepoType,
github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
gitlab::GitLabRepoType,
inputs::{ContentFilteringArgs, InputSpecifierArgs},
@ -70,6 +71,13 @@ fn test_github_remote_scan() -> Result<()> {
gitlab_repo_type: GitLabRepoType::Owner,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/").unwrap(),
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),

@ -9,6 +9,7 @@ use kingfisher::{
cli::{
commands::{
bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
gitea::GiteaRepoType,
github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
gitlab::GitLabRepoType,
inputs::{ContentFilteringArgs, InputSpecifierArgs},
@ -69,6 +70,13 @@ fn test_gitlab_remote_scan() -> Result<()> {
gitlab_repo_type: GitLabRepoType::Owner,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/")?,
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),
@ -192,6 +200,13 @@ fn test_gitlab_remote_scan_no_history() -> Result<()> {
gitlab_repo_type: GitLabRepoType::Owner,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/")?,
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),

@ -9,6 +9,7 @@ use kingfisher::{
cli::{
commands::{
bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
gitea::GiteaRepoType,
github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
gitlab::GitLabRepoType,
inputs::{ContentFilteringArgs, InputSpecifierArgs},
@ -53,6 +54,12 @@ async fn test_redact_hashes_finding_values() -> Result<()> {
gitlab_api_url: Url::parse("https://gitlab.com/").unwrap(),
gitlab_repo_type: GitLabRepoType::Owner,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/").unwrap(),
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),

@ -8,6 +8,7 @@ use kingfisher::{
cli::{
commands::{
bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
gitea::GiteaRepoType,
github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
gitlab::GitLabRepoType,
inputs::{ContentFilteringArgs, InputSpecifierArgs},
@ -59,6 +60,13 @@ impl TestContext {
gitlab_api_url: Url::parse("https://gitlab.com/").unwrap(),
gitlab_repo_type: GitLabRepoType::Owner,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/").unwrap(),
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),
@ -168,6 +176,13 @@ async fn test_scan_slack_messages() -> Result<()> {
gitlab_api_url: Url::parse("https://gitlab.com/").unwrap(),
gitlab_repo_type: GitLabRepoType::Owner,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/").unwrap(),
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),

@ -12,6 +12,7 @@ use kingfisher::{
cli::{
commands::{
bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
gitea::GiteaRepoType,
github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
gitlab::GitLabRepoType,
inputs::{ContentFilteringArgs, InputSpecifierArgs},
@ -125,6 +126,14 @@ async fn test_validation_cache_and_depvars() -> Result<()> {
gitlab_api_url: Url::parse("https://gitlab.com/").unwrap(),
gitlab_repo_type: GitLabRepoType::Owner,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/").unwrap(),
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),

@ -10,6 +10,7 @@ use kingfisher::{
cli::{
commands::{
bitbucket::{BitbucketAuthArgs, BitbucketRepoType},
gitea::GiteaRepoType,
github::{GitCloneMode, GitHistoryMode, GitHubRepoType},
gitlab::GitLabRepoType,
inputs::{ContentFilteringArgs, InputSpecifierArgs},
@ -69,6 +70,13 @@ impl TestContext {
gitlab_repo_type: GitLabRepoType::Owner,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/").unwrap(),
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),
@ -165,6 +173,13 @@ impl TestContext {
gitlab_repo_type: GitLabRepoType::Owner,
gitlab_include_subgroups: false,
gitea_user: Vec::new(),
gitea_organization: Vec::new(),
gitea_exclude: Vec::new(),
all_gitea_organizations: false,
gitea_api_url: Url::parse("https://gitea.com/api/v1/").unwrap(),
gitea_repo_type: GiteaRepoType::Source,
bitbucket_user: Vec::new(),
bitbucket_workspace: Vec::new(),
bitbucket_project: Vec::new(),
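
Every test fixture in this diff sets the same six Gitea fields to the same values. As a reading aid, those values can be grouped into one place. The sketch below is only an illustration: `GiteaFixture` and `RepoType` are hypothetical types that mirror the field names seen above, the API URL is held as a plain `String` instead of a parsed `Url`, and only the `Source` variant that appears in the diff is modelled.

```rust
/// Hypothetical mirror of the repository-type enum; only the variant
/// that appears in the fixtures is modelled here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RepoType {
    Source,
}

/// Hypothetical grouping of the six Gitea fields each fixture sets.
#[derive(Debug, Clone, PartialEq, Eq)]
struct GiteaFixture {
    user: Vec<String>,
    organization: Vec<String>,
    exclude: Vec<String>,
    all_organizations: bool,
    api_url: String,
    repo_type: RepoType,
}

impl Default for GiteaFixture {
    /// Values copied from the fixtures: no users, organizations, or
    /// exclusions, no "all organizations" sweep, the public API base URL,
    /// and source repositories only.
    fn default() -> Self {
        GiteaFixture {
            user: Vec::new(),
            organization: Vec::new(),
            exclude: Vec::new(),
            all_organizations: false,
            api_url: "https://gitea.com/api/v1/".to_string(),
            repo_type: RepoType::Source,
        }
    }
}
```

With these defaults nothing is enumerated from Gitea, which matches the intent of the existing tests: they exercise other targets and only need the new fields to be present.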