Commit graph

466 commits

Author SHA1 Message Date
a75f28e073 Fix fly.io proxy rate limit to key on real client IP
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 2m24s
The general rate limit zone used $binary_remote_addr (Fly's internal
proxy IP), causing all external clients to share one bucket. Switch to
$http_fly_client_ip to match forge_auth's correct behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:00:33 -07:00
40556e5a2d Review gandi.md: add missing forge.eblu.me CNAME record
The Pulumi code has had a forge.eblu.me CNAME since it was added, but the
doc's DNS table only listed docs and cv. Also fixed the __main__.py
description to mention CNAMEs alongside A records.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 09:54:46 -07:00
5757df115d Upgrade ollama from 0.17.5 to 0.20.4
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 06:42:05 -07:00
07f52e9488 Deploy Paperless-ngx document management (#328)
All checks were successful
Build Container / detect (push) Successful in 2s
Build Container / build-dockerfile (paperless) (push) Successful in 9s
## Summary

- Add paperless-ngx (v2.20.13) as a new ArgoCD-managed service on indri
- Dockerfile built from forge mirror (`mirrors/paperless-ngx`), multi-stage with s6-overlay
- PostgreSQL database via `blumeops-pg` CNPG cluster, Redis sidecar for Celery
- NFS document storage on sifaka (`/volume1/paperless`)
- Authentik OIDC SSO via baked JSON blob from 1Password
- Caddy route at `paperless.ops.eblu.me`
- 1Password item "Paperless (blumeops)" created with all secrets

## Files

- `containers/paperless/Dockerfile` — multi-stage build
- `argocd/manifests/paperless/` — full k8s manifest set
- `argocd/apps/paperless.yaml` — ArgoCD application
- `argocd/manifests/databases/` — CNPG role + ExternalSecret
- `ansible/roles/caddy/defaults/main.yml` — Caddy route
- `service-versions.yaml` — version tracking entry
- `docs/reference/services/paperless.md` — reference card

## Remaining deploy steps

1. Build container: `mise run container-build-and-release paperless`
2. Update kustomization.yaml `newTag` with actual image tag
3. Create Authentik application/provider for paperless
4. Create `paperless` database on blumeops-pg
5. Sync ArgoCD apps, then sync paperless from branch
6. Provision Caddy: `mise run provision-indri -- --tags caddy`
7. Verify at https://paperless.ops.eblu.me

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #328
2026-04-08 17:54:12 -07:00
e04455c911 Add changelog fragment for adding-a-service tutorial review
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:29:54 -07:00
d3235c5ca9 Review adding-a-service tutorial: fix ingress, repoURL, add kustomize and reference card steps
- Fix Tailscale Ingress: move hostname to tls.hosts, remove from rules (ProxyGroup compat)
- Update ArgoCD repoURL to forge.ops.eblu.me:2222
- Add kustomization.yaml section with :kustomized sentinel tag pattern
- Add Step 5: Create a Reference Card (keep under 30s reading time)
- Set last-reviewed: 2026-04-08

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:28:46 -07:00
2eb28301e4 Upgrade authentik 2026.2.0 → 2026.2.2 (patch release)
All checks were successful
Build Container / detect (push) Successful in 2s
Build Container / build-nix (authentik) (push) Successful in 1m6s
Bug-fix release with web UI fixes, LDAP page size, and SAML SLO
redirect. Also bumps client-go to v3.2026.2.1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:53:03 -07:00
0366a0346b Set Frigate preview quality to CRF 8 for faster timeline loading
Previews are ~4MB/hour at default quality (CRF 1), served over NFS from
sifaka. Reducing to CRF 8 shrinks preview files to improve review page
load times when scrubbing older footage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 08:43:43 -07:00
936d29bbe1 Fix UnPoller dashboard UIDs exceeding Grafana 12's 40-char limit
Strip redundant "unifi-poller-" prefix from generated slugs, bringing
UIDs from 45-48 chars down to 32-35 chars.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 07:03:39 -07:00
f59f8859dc Localize kube-state-metrics container (Dockerfile + nix) (#327)
All checks were successful
Build Container / detect (push) Successful in 2s
Build Container / build-dockerfile (kube-state-metrics) (push) Successful in 5s
Build Container / build-nix (kube-state-metrics) (push) Successful in 7s
## Summary

- Build kube-state-metrics v2.18.0 locally from forge mirror, replacing upstream `registry.k8s.io` image
- Dockerfile (two-stage Go build) for indri/minikube
- default.nix (buildGoModule + buildLayeredImage) for ringtail/k3s
- Both kustomization files updated with `newName` pointing to local registry

## Verification

- [x] Nix build succeeded on ringtail (`nix-build` → 10-layer image)
- [x] Dockerfile build succeeded locally (`dagger call build` → ~2min)
- [x] `container-version-check --all-files` passes (2.18.0 consistent across Dockerfile, nix, service-versions.yaml)
- [ ] CI builds container images from this branch
- [ ] Update kustomization `newTag` with SHA-tagged version from CI
- [ ] ArgoCD sync on both clusters

## Test plan

- Trigger CI build: `mise run container-build-and-release kube-state-metrics`
- Verify tags: `mise run container-list kube-state-metrics`
- Update newTag in kustomization files with CI-produced tag
- Sync ArgoCD on indri: `argocd app sync kube-state-metrics`
- Sync ArgoCD on ringtail: `argocd app sync kube-state-metrics --context=k3s-ringtail` (note: argocd uses its own auth, not kubectl context)
- Verify metrics still flowing to Prometheus

Reviewed-on: #327
2026-04-07 16:09:25 -07:00
efae404d1e Remove superuser from teslamate PG role, transfer extension ownership
teslamate had superuser on the shared blumeops-pg cluster (which also
hosts miniflux and authentik). Downgraded to plain database owner with
extension ownership (cube, earthdistance) transferred manually so it
can still ALTER EXTENSION UPDATE. earthdistance is untrusted in PG so
DROP+CREATE would need temporary superuser escalation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:36:39 -07:00
fc34a7da5b Review postgresql.md: add authentik user/db, immich-pg borgmatic secret
Doc review found the authentik database, user, and external secret were
missing, along with the immich-pg borgmatic secret. Added Cluster column
to Users table for clarity. Set last-reviewed: 2026-04-07.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:21:48 -07:00
1fd8aae8f6 Upgrade ArgoCD v3.3.2 → v3.3.6, SHA-pin install manifest
Patch upgrade with bug fixes (diff normalization, installation ID cache).
Pin the upstream manifest URL to commit SHA for supply chain integrity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:21:11 -07:00
e85c71e73f Add changelog fragments for seccomp hardening and bracket fix
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:38:41 -07:00
a059d81314 Add review-compliance-reports task and reorganize report storage
New mise task fetches Prowler reports from sifaka, parses with proper
muted/unmuted distinction, shows week-over-week delta, and includes
a scaffold for Kingfisher once JSON/CSV output is available upstream.

Moved all legacy top-level reports on sifaka into date subdirectories
to match the current CronJob output structure. Updated
read-compliance-reports doc with task reference and links.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:16:46 -07:00
54213ab810 Fix flake-update pipeline and update ringtail flake inputs
The `--exclude` flag added in #321 never existed in nix — it was
introduced broken and never tested. Replace with dynamic input
discovery: query `nix flake metadata --json` for all input names,
filter out skip_inputs (default: nixpkgs-services), pass the rest
as positional args. Also bump NIX_IMAGE 2.33.3 → 2.34.4.

Updated inputs: nixpkgs, home-manager, disko.
nixpkgs-services stays pinned.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 08:27:31 -07:00
Forgejo Actions
370a3574b2 Update docs release to v1.15.4
- Built changelog from towncrier fragments

[skip ci]
2026-04-06 07:53:54 -07:00
0eaf8680fd Rewrite observability stack tutorial to match actual practices
Replace generic Helm install instructions with kustomize/ArgoCD patterns
that reflect how BlumeOps actually deploys Prometheus, Loki, Grafana, and
Alloy. Fix "BluemeOps" typos, document Alloy as a core (not optional)
component, remove hardcoded admin password, add proper prerequisites and
cross-references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 07:52:35 -07:00
f42fa2d558 Remove stale Helm chart mirror references from forgejo docs
All Helm chart mirrors (grafana-helm-charts, connect-helm-charts,
cloudnative-pg-charts) have been deleted from forge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 07:37:21 -07:00
c7e5af6d51 Migrate 1Password Connect from Helm to kustomize (1.8.1 → 1.8.2) (#326)
## Summary

- Renders manifests from `connect-helm-charts v2.4.1` as plain kustomize (deployment + service)
- Bumps 1Password Connect from 1.8.1 → 1.8.2
- Completes the no-helm-policy migration — all services now use kustomize
- Retains all production hardening from the Helm chart (securityContext, runAsNonRoot, drop ALL, seccomp, resource limits)

## Changes

- **New:** `deployment.yaml`, `service.yaml`, `kustomization.yaml` in `argocd/manifests/1password-connect/`
- **Rewritten:** Both ArgoCD app definitions (indri + ringtail) — single source kustomize instead of multi-source Helm
- **Deleted:** `values.yaml` (Helm values no longer needed)
- **Updated:** `no-helm-policy.md`, `service-versions.yaml`, `README.md`

## Deployment plan

1. Sync `apps` app to pick up the new app definitions
2. `argocd app set 1password-connect --revision 1password-connect-kustomize`
3. `argocd app sync 1password-connect` — verify on indri
4. Repeat for ringtail
5. After merge: reset revision to main, re-sync both

## Test plan

- [ ] `kubectl kustomize` renders cleanly (verified locally)
- [ ] ArgoCD diff shows expected changes (Helm labels removed, images bumped)
- [ ] Pods come up healthy on indri
- [ ] External Secrets still resolves 1Password items
- [ ] Repeat on ringtail

Reviewed-on: #326
2026-04-06 07:31:40 -07:00
Forgejo Actions
facb803010 Update docs release to v1.15.3
- Built changelog from towncrier fragments

[skip ci]
2026-04-05 21:24:25 -07:00
f9397b7fa0 Review core-services tutorial: add SSH rationale, runner example, TODO notice
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 21:22:00 -07:00
3c819cf16e Fix Homepage migration date: 2025-12 → 2026-02 (per git history)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 09:53:48 -07:00
6e06efa6d0 Add AI-drafted disclaimer to no-helm-policy explanation doc
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 09:50:53 -07:00
64200a55c5 Migrate Immich from Helm chart to kustomize manifests (v2.5.6 → v2.6.3)
Replace the Helm chart deployment with plain kustomize manifests following
the Authentik pattern (separate deployments per component). Consolidate
the immich-storage ArgoCD app into the main immich app. Add no-helm-policy
doc establishing kustomize as the standard deployment mechanism.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 09:42:25 -07:00
464e3222d2 Document upstream fix for Prowler --registry bug (pending release)
PR #10470 merged 2026-03-30; initContainer workaround stays until a
Prowler release includes the fix (latest is 5.22.0).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 20:21:19 -07:00
75f9ba4943 Build Tempo container from source (2.10.3) (#323)
All checks were successful
Build Container / detect (push) Successful in 2s
Build Container / build-dockerfile (tempo) (push) Successful in 6s
## Summary
- Add `containers/tempo/Dockerfile` — two-stage Go build from forge mirror, modeled on loki
- Switch kustomization from upstream `grafana/tempo` to `registry.ops.eblu.me/blumeops/tempo`
- Bump Tempo 2.10.1 → 2.10.3

## Test plan
- [ ] Kick off container build via `mise run container-build-and-release tempo`
- [ ] Update kustomization `newTag` with built image tag
- [ ] Deploy from branch: `argocd app set tempo --revision local-tempo-container && argocd app sync tempo`
- [ ] Verify Tempo health: `curl tempo.ops.eblu.me/ready`
- [ ] Verify traces flowing in Grafana Tempo datasource

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #323
2026-04-02 13:45:02 -07:00
b1e2811077 Upgrade Grafana 12.3.3 → 12.4.2 (#322)
All checks were successful
Build Container / detect (push) Successful in 2s
Build Container / build-dockerfile (grafana) (push) Successful in 7s
## Summary

- Bumps Grafana from 12.3.3 to 12.4.2
- Patches 7 CVEs, notably CVE-2026-27880 (unauthenticated OOM DoS, CVSS 7.5) and CVE-2026-27879 (authenticated OOM via resample queries)
- No config changes required — reviewed alerting, datasources, OIDC, and feature toggles against 12.4.x breaking changes

## Breaking changes reviewed

| Change | Impact |
|--------|--------|
| Alerting: pending period applies to NoData/Error | Net positive — reduces noise from transient blips |
| Default notification uses empty receiver | No impact — we explicitly set `ntfy-infra` |
| Removed feature toggles (4) | No impact — none configured |
| OAuth ID token signature validation | Low risk — verify OIDC login post-deploy |
| OpsGenie deprecated | No impact — using webhook |

## Test plan

- [ ] Container build completes at forge
- [ ] Update kustomization.yaml with new image tag
- [ ] `argocd app set grafana --revision upgrade/grafana-12.4.2 && argocd app sync grafana`
- [ ] Verify Grafana UI loads at grafana.ops.eblu.me
- [ ] Verify OIDC login via Authentik
- [ ] Verify dashboards and datasources load
- [ ] Check alerting rules are intact

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #322
2026-04-02 11:33:19 -07:00
08d57ef4d4 Review pulumi reference doc: fix Tailscale auth, add how-to links
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 10:55:57 -07:00
1e67975acb Add changelog fragment for compensating control review
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 22:04:38 -07:00
baee7ae54b Review single-user-cluster control and add evidence collection card
Stamp single-user-cluster last-reviewed to 2026-04-01 after verifying
Tailscale ACLs and kubeconfig distribution. Add aspirational how-to card
documenting what PCI DSS evidence collection would look like (CCW,
artifacts, Drata workflow). Link from existing review process card.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 22:01:57 -07:00
a18a424866 Pin NixOS service versions via nixpkgs-services overlay (#321)
## Summary

- Add `nixpkgs-services` flake input pinned to a specific nixpkgs commit, with an overlay that pulls `forgejo-runner`, `snowflake`, and `k3s` from it instead of the rolling `nixpkgs`
- Dagger `flake-update` pipeline now excludes `nixpkgs-services` via `--exclude`
- Fix stale nix-container-builder version in service-versions.yaml (was 12.6.4, actually running 12.7.2)
- Add k3s and minikube to service-versions.yaml tracking
- Document the pinning approach in review-services how-to and ringtail reference

## Motivation

During service review, discovered that flake updates had silently upgraded forgejo-runner from 12.6.4 → 12.7.2 without updating service-versions.yaml. This "sneak-in upgrade" bypasses the service review process. The overlay ensures these three services only change versions deliberately.

## Test plan

- [ ] Verify `nix flake update` from `nixos/ringtail/` does not change `nixpkgs-services` lock entry
- [ ] Verify `mise run provision-ringtail` builds successfully with the overlay
- [ ] Confirm running service versions unchanged after deploy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #321
2026-04-01 21:37:57 -07:00
cfbf4cadbd Review argocd-cli reference doc: stamp last-reviewed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:35:52 -07:00
Forgejo Actions
2b7b21dc9b Update docs release to v1.15.2
- Built changelog from towncrier fragments

[skip ci]
2026-03-30 17:48:40 -07:00
4059b3d27b Add compensating controls framework and date-based report dirs (#320)
## Summary

- Add `compensating-controls.yaml` tracking 9 named controls that justify suppressed security findings
- Update all Prowler mutelist descriptions with `CC: <id>` references to named controls
- Add `mise run review-compensating-controls` task — surfaces stalest control with all codebase references
- Add [[review-compensating-controls]] how-to doc
- Organize Prowler and Kingfisher reports into `YYYY-MM-DD` subdirectories

### Compensating controls

| ID | Mitigates |
|----|-----------|
| `single-user-cluster` | Image cache abuse, RBAC breadth, system pod privileges |
| `tailscale-network-isolation` | Profiling endpoints, weak TLS, debug ports |
| `local-registry` | AlwaysPullImages gap |
| `sso-gated-admin-tools` | ArgoCD wildcard RBAC |
| `operator-managed-pods` | Tailscale proxy pod security settings |
| `ephemeral-privileged-jobs` | Prowler hostPID exposure |
| `trusted-ci-only` | Forgejo runner DinD |
| `init-container-isolation` | Grafana root init container |
| `observability-stack-audit` | Missing apiserver audit logging |

## Test plan

- [ ] `mise run review-compensating-controls` shows table and references
- [ ] `kubectl kustomize argocd/manifests/prowler/` renders correctly
- [ ] Sync prowler and kingfisher, verify next scan writes to dated subdirectory
- [ ] Grep for `CC:` in mutelist files — every muted finding should have at least one

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #320
2026-03-30 17:44:11 -07:00
a76e471d54 Add Prowler mutelist and fix kube-state-metrics seccomp (#319)
## Summary

- Add mutelist files to suppress expected/accepted Prowler CIS findings from components we don't control
- Mutelist files stored in `mutelist/` directory, grouped by category, merged at runtime via initContainer
- Fix missing seccomp `RuntimeDefault` profile on kube-state-metrics deployment

### Mutelist categories

| File | Checks | Covers |
|------|--------|--------|
| `apiserver.yaml` | 12 | Minikube apiserver flags |
| `control-plane.yaml` | 3 | Scheduler, controller-manager, kubelet |
| `core-pod-security.yaml` | 7 | System pods, Tailscale operator, Grafana init, Prowler hostPID, forgejo-runner |
| `rbac.yaml` | 3 | Built-in K8s roles, ArgoCD, CNPG |

Muted findings appear as `status=MUTED` in reports (not hidden), preserving audit trail.

### Not muted (follow-up)

- Alloy, Immich pods missing seccomp — need separate investigation (Helm/operator-managed)

## Test plan

- [ ] `kubectl kustomize argocd/manifests/prowler/` renders cleanly
- [ ] Trigger manual scan: `kubectl --context=minikube-indri -n prowler create job prowler-mutelist-test --from=cronjob/prowler`
- [ ] Verify initContainer merges successfully (check pod logs)
- [ ] Verify muted findings show as `MUTED` in report
- [ ] Sync kube-state-metrics and verify pod starts with seccomp profile

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #319
2026-03-30 17:22:31 -07:00
1e391f96bb Upgrade forgejo-runner 12.7.0 → 12.7.3, add service card
Patch upgrade picks up idempotent FetchTask API, offline registration
fix, cloudflare/circl security dep update, and custom gRPC user-agent.
No config defaults changed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 16:31:06 -07:00
77eebe507e Review Ansible reference doc: add missing roles, clarify IaC positioning
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 16:10:24 -07:00
c069f889d2 Harden borgmatic photos backup: restrict dirs, add keepalives + checkpoints
Restrict backup to library/ and upload/ only (skip regenerable encoded-video/,
thumbs/, backups/). Add SSH ServerAliveInterval to prevent broken pipe on long
transfers, and checkpoint_interval so interrupted backups save progress.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 10:30:28 -07:00
f9206bf10b Build custom Kingfisher container from sporked deploy branch (#318)
All checks were successful
Build Container / detect (push) Successful in 2s
Build Container / build-nix (kingfisher) (push) Successful in 12s
## Summary

- Add Dockerfile for Kingfisher built from source (sporked deploy branch)
- Multi-stage: Rust build with Boost/vectorscan, debian-slim runtime
- Switch CronJob from upstream `ghcr.io/mongodb/kingfisher` to `registry.ops.eblu.me/blumeops/kingfisher`
- Add kingfisher to service-versions.yaml (version tracks upstream main SHA)
- Document spork workflow in CLAUDE.md

## Test plan

- [ ] Build container: `mise run container-build-and-release kingfisher 1d37d29`
- [ ] Verify image on registry: `mise run container-list`
- [ ] Update kustomization newTag
- [ ] Sync ArgoCD kingfisher app from branch
- [ ] Trigger manual CronJob and verify scan completes
- [ ] Verify reports on sifaka

Reviewed-on: #318
2026-03-30 06:34:49 -07:00
99df78664e Note upstream history rewrite as a spork sync failure mode
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 08:22:11 -07:00
e1429fc3e7 Document Spork Attack supply-chain risk
Upstream can push workflows (in .github/ or .forgejo/) that execute
on our runners via any trigger mechanism including cron. Runner label
mismatch is the current defense but is fragile. No complete fix exists
short of disabling Actions entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 08:16:09 -07:00
6ecfaf02b6 Add spork strategy: tooling and documentation
Spork-create mise task sets up a floating-branch soft-fork of a
mirrored upstream project with daily mirror-sync via Forgejo Actions.
Includes explanation card, how-to guides for setup and branch
management, and the spork-create uv script.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 22:58:10 -07:00
bb60369956 Simplify Kingfisher CronJob to HTML-only output
Remove the second scan pass for JSON — one format is enough for now.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:50:54 -07:00
2808ffd450 Document Kingfisher secret scanner service
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:47:37 -07:00
35705faca2 Add Kingfisher secret scanner CronJob (#317)
## Summary

- Deploys MongoDB Kingfisher as a weekly CronJob on minikube-indri
- Scans all Forgejo repos (eblume + all orgs) for leaked secrets with live validation
- Produces timestamped HTML and JSON reports on sifaka NFS (`/volume1/reports/kingfisher/`)
- Forgejo API token sourced from 1Password via ExternalSecret
- Uses official `ghcr.io/mongodb/kingfisher:1.91.0` container image
- Runs Sunday 4am (after Prowler's 3am k8s scan)

## Resources

- CronJob, PV/PVC (sifaka NFS), ExternalSecret
- ArgoCD Application with manual sync + CreateNamespace

## Test plan

- [x] Sync ArgoCD `apps` app to pick up new kingfisher Application
- [x] Set `--revision feature/kingfisher-cronjob` on kingfisher app
- [x] Verify ExternalSecret creates the `kingfisher-forgejo-token` Secret
- [x] Trigger manual job: `kubectl create job --from=cronjob/kingfisher kingfisher-manual -n kingfisher --context=minikube-indri`
- [ ] Verify reports appear on sifaka at `/volume1/reports/kingfisher/`
- [ ] After merge: set `--revision main` and re-sync

Reviewed-on: #317
2026-03-28 21:39:55 -07:00
6b1717bf28 Add Kingfisher secret scanner to prek hooks
Running alongside TruffleHog to compare coverage. Kingfisher uses
staged-only mode with validation disabled for fast, offline-safe
pre-commit checks. Validation will be enabled in the planned cron job.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:07:07 -07:00
Forgejo Actions
7fb6eff388 Update docs release to v1.15.1
- Built changelog from towncrier fragments

[skip ci]
2026-03-28 09:15:21 -07:00
2bd1611ac1 Document sifaka NFS/Tailscale TUN troubleshooting
Sifaka's Tailscale can revert to userspace networking after package
updates, causing NFS mounts to fail because the NFS daemon sees
127.0.0.1 instead of the client's Tailscale IP. Added troubleshooting
how-to doc and updated sifaka reference card with frigate export and
TUN requirement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 09:12:00 -07:00
3017f759a7 Migrate Forgejo from Homebrew to source build (#316)
## Summary

- Migrate Forgejo from Homebrew to source-built binary with mcquack LaunchAgent
- Matches the established pattern used by zot, caddy, and alloy
- Upgrades to v14.0.3 (7 security fixes: PKCE bypass, OAuth scope bypass, open redirect, and more)

## Changes

- **Ansible role**: Replace brew install/services with binary stat check + LaunchAgent
- **Paths**: `/opt/homebrew/var/forgejo` → `~/forgejo`, binary at `~/code/3rd/forgejo/forgejo`
- **Run user**: `forgejo` → `erichblume` (LaunchAgent user; SSH git user stays `forgejo`)
- **Docs**: Updated Forgejo reference card, restart-indri guide
- **Service review**: Stamped frigate-notify, cloudnative-pg, blumeops-pg as current

## One-time migration steps (manual, on indri)

1. Clone from Codeberg, add forge mirror remote
2. Check out v14.0.3, build with `make build && make forgejo`
3. Stop brew, `cp -a` data to `~/forgejo`, fix ownership
4. Run `provision-indri --tags forgejo`
5. Verify, then `brew uninstall forgejo`

## Data safety

- `cp -a` preserves everything (repos, SQLite DB, LFS, sessions, OAuth config)
- Brew version stays installed as rollback until verification passes
- No schema changes between 14.0.2 → 14.0.3

Reviewed-on: #316
2026-03-28 08:19:23 -07:00