Move the Dagger module from .dagger/ to the repo root (src/blumeops/),
rename it from blumeops-ci to blumeops, and introduce native Dagger
pipelines that replace docker_build() for container builds.
docker_build() swallowed build errors — native pipelines surface full
output per step. Navidrome is the first container migrated as a proof of
concept (containers/navidrome/container.py).
- Containers with container.py use native Dagger builds
- Containers with only Dockerfile fall back to docker_build()
- dagger call container-version extracts VERSION from container.py
- CI workflow, container-list, container-version-check, and
container-build-and-release all updated for hybrid mode
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
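As an illustration of the VERSION extraction described above, here is a minimal sketch. It assumes container.py declares a plain top-level string assignment named VERSION; the helper name `extract_version` is hypothetical and this is not necessarily how the Dagger function is implemented.

```python
import ast


def extract_version(source: str) -> str:
    """Return the string assigned to a top-level VERSION name in Python source.

    Assumption: the file uses a plain assignment such as VERSION = "1.2.3".
    Annotated assignments (VERSION: str = ...) are not handled in this sketch.
    """
    tree = ast.parse(source)
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "VERSION":
                value = node.value
                if isinstance(value, ast.Constant) and isinstance(value.value, str):
                    return value.value
    raise ValueError("no top-level string VERSION assignment found")


# Hypothetical file contents, for illustration only.
example_source = 'IMAGE = "example"\nVERSION = "0.0.0-example"\n'
print(extract_version(example_source))
```

Parsing with `ast` avoids importing (and therefore executing) the container module just to read one constant.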
Add flyio-tailscale (v1.94.1), flyio-nginx (1.29.6-alpine), and
flyio-alloy (v1.14.1) entries with new `fly` service type so future
upgrades go through the service-review workflow.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Tailscale :stable tag pulled v1.96.5 during the last deploy. That
version returns SERVFAIL for tailnet DNS names (no upstream resolvers
set), which broke all public routing (forge/docs/cv.eblu.me) through the
Fly proxy.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
check_alert() used head -1 to display only the first firing instance,
silently swallowing additional alerts (e.g. frigate pod-not-ready was
hidden behind ollama).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
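The difference between showing one instance and showing all of them can be sketched as follows. This assumes alert data shaped like the `alerts` list returned by the Prometheus `/api/v1/alerts` endpoint; the function name, alert name, and label values are hypothetical, and the real check_alert() is a shell function that may read its data differently.

```python
def firing_instances(alerts, alert_name):
    """Return the label sets of every firing instance of one alert.

    The old behaviour was equivalent to returning only the first match.
    """
    return [
        alert["labels"]
        for alert in alerts
        if alert.get("state") == "firing"
        and alert.get("labels", {}).get("alertname") == alert_name
    ]


# Hypothetical sample data: two instances of the same alert are firing.
sample_alerts = [
    {"state": "firing", "labels": {"alertname": "PodNotReady", "pod": "ollama-0"}},
    {"state": "firing", "labels": {"alertname": "PodNotReady", "pod": "frigate-0"}},
    {"state": "pending", "labels": {"alertname": "PodNotReady", "pod": "other-0"}},
]

for labels in firing_instances(sample_alerts, "PodNotReady"):
    print(labels["pod"])
```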
The general rate limit zone used $binary_remote_addr (Fly's internal
proxy IP), causing all external clients to share one bucket. Switch to
$http_fly_client_ip to match forge_auth's correct behavior.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Pulumi code has had a forge.eblu.me CNAME since it was added, but the
doc's DNS table only listed docs and cv. Also fixed the __main__.py
description to mention CNAMEs alongside A records.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bug-fix release with web UI fixes, LDAP page size, and SAML SLO
redirect. Also bumps client-go to v3.2026.2.1.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previews are ~4MB/hour at default quality (CRF 1), served over NFS from
sifaka. Reducing to CRF 8 shrinks preview files to improve review page
load times when scrubbing older footage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Strip redundant "unifi-poller-" prefix from generated slugs, bringing
UIDs from 45-48 chars down to 32-35 chars.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
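A minimal sketch of the prefix stripping, assuming slugs are plain strings. The function name and the sample slug are hypothetical, and the real generator may build its UIDs differently.

```python
REDUNDANT_PREFIX = "unifi-poller-"


def shorten_slug(slug: str) -> str:
    """Drop the redundant prefix if present; leave other slugs untouched."""
    # str.removeprefix requires Python 3.9 or newer.
    return slug.removeprefix(REDUNDANT_PREFIX)


print(shorten_slug("unifi-poller-example-dashboard"))
```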
## Summary
- Build kube-state-metrics v2.18.0 locally from forge mirror, replacing upstream `registry.k8s.io` image
- Dockerfile (two-stage Go build) for indri/minikube
- default.nix (buildGoModule + buildLayeredImage) for ringtail/k3s
- Both kustomization files updated with `newName` pointing to local registry
## Verification
- [x] Nix build succeeded on ringtail (`nix-build` → 10-layer image)
- [x] Dockerfile build succeeded locally (`dagger call build` → ~2min)
- [x] `container-version-check --all-files` passes (2.18.0 consistent across Dockerfile, nix, service-versions.yaml)
- [ ] CI builds container images from this branch
- [ ] Update kustomization `newTag` with SHA-tagged version from CI
- [ ] ArgoCD sync on both clusters
## Test plan
- Trigger CI build: `mise run container-build-and-release kube-state-metrics`
- Verify tags: `mise run container-list kube-state-metrics`
- Update newTag in kustomization files with CI-produced tag
- Sync ArgoCD on indri: `argocd app sync kube-state-metrics`
- Sync ArgoCD on ringtail: `argocd app sync kube-state-metrics --context=k3s-ringtail` (note: argocd uses its own auth, not kubectl context)
- Verify metrics still flowing to Prometheus
Reviewed-on: #327
Patch upgrade with bug fixes (diff normalization, installation ID cache).
Pin the upstream manifest URL to a commit SHA for supply chain integrity.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New mise task fetches Prowler reports from sifaka, parses with proper
muted/unmuted distinction, shows week-over-week delta, and includes
a scaffold for Kingfisher once JSON/CSV output is available upstream.
Moved all legacy top-level reports on sifaka into date subdirectories
to match the current CronJob output structure. Updated
read-compliance-reports doc with task reference and links.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
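The muted/unmuted split and the week-over-week delta could look roughly like this. The field names `status` and `muted` are assumptions about the parsed report rows, and both helper names are hypothetical; the actual mise task may parse the reports differently.

```python
from collections import Counter


def summarize_failures(findings):
    """Count failed findings, split by whether they are muted."""
    counts = Counter()
    for finding in findings:
        if finding["status"] != "FAIL":
            continue
        counts["muted" if finding["muted"] else "unmuted"] += 1
    return {"muted": counts["muted"], "unmuted": counts["unmuted"]}


def week_over_week(current, previous):
    """Return current minus previous for each category."""
    return {key: current[key] - previous.get(key, 0) for key in current}


# Hypothetical rows standing in for two weekly reports.
last_week = [
    {"status": "FAIL", "muted": False},
    {"status": "FAIL", "muted": True},
    {"status": "PASS", "muted": False},
]
this_week = [
    {"status": "FAIL", "muted": True},
    {"status": "PASS", "muted": False},
    {"status": "PASS", "muted": False},
]

print(week_over_week(summarize_failures(this_week), summarize_failures(last_week)))
```

Keeping muted findings in a separate bucket means a drop in unmuted failures is not confused with findings that were merely silenced.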
The `--exclude` flag added in #321 never existed in nix — it was
introduced broken and never tested. Replace with dynamic input
discovery: query `nix flake metadata --json` for all input names,
filter out skip_inputs (default: nixpkgs-services), pass the rest
as positional args. Also bump NIX_IMAGE 2.33.3 → 2.34.4.
Updated inputs: nixpkgs, home-manager, disko.
nixpkgs-services stays pinned.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
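The input discovery described above can be sketched like this. It assumes the JSON from `nix flake metadata --json` exposes the lock graph under `locks`, with the root node's `inputs` keyed by input name; the function name is hypothetical and the real pipeline runs inside Dagger rather than as a standalone script.

```python
import json


def flake_update_command(metadata_json, skip_inputs=("nixpkgs-services",)):
    """Build a `nix flake update` argv that names every input except the skipped ones."""
    metadata = json.loads(metadata_json)
    locks = metadata["locks"]
    root_node = locks["nodes"][locks["root"]]
    names = sorted(name for name in root_node.get("inputs", {}) if name not in skip_inputs)
    if not names:
        # With no positional args, `nix flake update` would update everything,
        # including the inputs we meant to skip, so refuse instead.
        raise ValueError("no inputs left to update after filtering")
    return ["nix", "flake", "update", *names]


# Trimmed-down sample of the metadata shape, for illustration only.
sample_metadata = json.dumps({
    "locks": {
        "root": "root",
        "nodes": {
            "root": {
                "inputs": {
                    "nixpkgs": "nixpkgs",
                    "home-manager": "home-manager",
                    "disko": "disko",
                    "nixpkgs-services": "nixpkgs-services",
                }
            }
        },
    }
})

print(flake_update_command(sample_metadata))
```

The guard for an empty list matters because an argument-free `nix flake update` updates all inputs, which would silently defeat the pin.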
Replace the Helm chart deployment with plain kustomize manifests following
the Authentik pattern (separate deployments per component). Consolidate
the immich-storage ArgoCD app into the main immich app. Add no-helm-policy
doc establishing kustomize as the standard deployment mechanism.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PR #10470 merged 2026-03-30; initContainer workaround stays until a
Prowler release includes the fix (latest is 5.22.0).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary
- Add `containers/tempo/Dockerfile` — two-stage Go build from forge mirror, modeled on loki
- Switch kustomization from upstream `grafana/tempo` to `registry.ops.eblu.me/blumeops/tempo`
- Bump Tempo 2.10.1 → 2.10.3
## Test plan
- [ ] Kick off container build via `mise run container-build-and-release tempo`
- [ ] Update kustomization `newTag` with built image tag
- [ ] Deploy from branch: `argocd app set tempo --revision local-tempo-container && argocd app sync tempo`
- [ ] Verify Tempo health: `curl tempo.ops.eblu.me/ready`
- [ ] Verify traces flowing in Grafana Tempo datasource
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #323
## Summary
- Add `nixpkgs-services` flake input pinned to a specific nixpkgs commit, with an overlay that pulls `forgejo-runner`, `snowflake`, and `k3s` from it instead of the rolling `nixpkgs`
- Dagger `flake-update` pipeline now excludes `nixpkgs-services` via `--exclude`
- Fix stale nix-container-builder version in service-versions.yaml (was 12.6.4, actually running 12.7.2)
- Add k3s and minikube to service-versions.yaml tracking
- Document the pinning approach in review-services how-to and ringtail reference
## Motivation
During service review, discovered that flake updates had silently upgraded forgejo-runner from 12.6.4 → 12.7.2 without updating service-versions.yaml. This "sneak-in upgrade" bypasses the service review process. The overlay ensures these three services only change versions deliberately.
## Test plan
- [ ] Verify `nix flake update` from `nixos/ringtail/` does not change `nixpkgs-services` lock entry
- [ ] Verify `mise run provision-ringtail` builds successfully with the overlay
- [ ] Confirm running service versions unchanged after deploy
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #321
Patch upgrade picks up idempotent FetchTask API, offline registration
fix, cloudflare/circl security dep update, and custom gRPC user-agent.
No config defaults changed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Restrict backup to library/ and upload/ only (skip regenerable encoded-video/,
thumbs/, backups/). Add SSH ServerAliveInterval to prevent broken pipe on long
transfers, and checkpoint_interval so interrupted backups save progress.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Spork-create mise task sets up a floating-branch soft-fork of a
mirrored upstream project with daily mirror-sync via Forgejo Actions.
Includes explanation card, how-to guides for setup and branch
management, and the spork-create uv script.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary
- Deploys MongoDB Kingfisher as a weekly CronJob on minikube-indri
- Scans all Forgejo repos (eblume + all orgs) for leaked secrets with live validation
- Produces timestamped HTML and JSON reports on sifaka NFS (`/volume1/reports/kingfisher/`)
- Forgejo API token sourced from 1Password via ExternalSecret
- Uses official `ghcr.io/mongodb/kingfisher:1.91.0` container image
- Runs Sunday 4am (after Prowler's 3am k8s scan)
## Resources
- CronJob, PV/PVC (sifaka NFS), ExternalSecret
- ArgoCD Application with manual sync + CreateNamespace
## Test plan
- [x] Sync ArgoCD `apps` app to pick up new kingfisher Application
- [x] Set `--revision feature/kingfisher-cronjob` on kingfisher app
- [x] Verify ExternalSecret creates the `kingfisher-forgejo-token` Secret
- [x] Trigger manual job: `kubectl create job --from=cronjob/kingfisher kingfisher-manual -n kingfisher --context=minikube-indri`
- [ ] Verify reports appear on sifaka at `/volume1/reports/kingfisher/`
- [ ] After merge: set `--revision main` and re-sync
Reviewed-on: #317
Running alongside TruffleHog to compare coverage. Kingfisher uses
staged-only mode with validation disabled for fast, offline-safe
pre-commit checks. Validation will be enabled in the planned cron job.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary
- Adds a second borgmatic config (`photos.yaml`) that backs up `/Volumes/photos` (sifaka SMB mount, ~128 GB) to a dedicated BorgBase repo (`immich-photos`), running daily at 4 AM
- Separate launchd agent (`mcquack.eblume.borgmatic-photos`) so photo backups run independently from the main backup
- Refactors `borgmatic_metrics` script to support multiple repos with a `repo` Prometheus label
- Updates Grafana "Borg Backups" dashboard with a `repo` template variable so you can filter/compare repos
- Docs updated: `backups.md`, `borgmatic.md`
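To illustrate the per-repo labelling, here is a minimal sketch of rendering node_exporter textfile metrics with a `repo` label. The metric names, repo names, and the `render_metrics` helper are hypothetical; the actual `borgmatic_metrics` script may emit different series.

```python
def escape_label_value(value: str) -> str:
    """Escape a label value for the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(stats_by_repo):
    """Render one sample per (metric, repo) pair, distinguished by the repo label."""
    lines = []
    for repo, stats in sorted(stats_by_repo.items()):
        for name, value in sorted(stats.items()):
            lines.append(f'borgmatic_{name}{{repo="{escape_label_value(repo)}"}} {value}')
    return "\n".join(lines) + "\n"


# Hypothetical numbers for two repos.
sample_stats = {
    "main": {"last_archive_timestamp": 1700000000},
    "immich-photos": {"last_archive_timestamp": 1700003600},
}

print(render_metrics(sample_stats), end="")
```

With a shared metric name and a `repo` label, a single dashboard template variable can filter or compare repos without duplicating panels.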
## Prerequisites (manual)
- [x] Create `immich-photos` repo on BorgBase with same SSH key
- [ ] Upgrade BorgBase plan to Small ($24/yr) if currently on free tier (128 GB exceeds 10 GB limit)
- [ ] After deploy: `borg init` the new repo (borgmatic does this automatically on first run)
## Test plan
- [ ] Dry run: `mise run provision-indri -- --check --diff --tags borgmatic,borgmatic_metrics`
- [ ] Deploy borgmatic role and verify both configs deployed
- [ ] Run `borgmatic --config ~/.config/borgmatic/photos.yaml create --verbosity 1` manually for first backup (will take hours)
- [ ] Verify metrics script collects from both repos: `~/.local/bin/borgmatic-metrics && cat /opt/homebrew/var/node_exporter/textfile/borgmatic.prom`
- [ ] Sync grafana-config in ArgoCD and verify dashboard repo selector works
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #315
## Summary
- Add `authentik` database (blumeops-pg cluster) to borgmatic pg_dump backups
- Add `immich` database (immich-pg cluster) to borgmatic pg_dump backups
- For immich-pg: new borgmatic managed role with `pg_read_all_data`, ExternalSecret, Tailscale LoadBalancer service, and Caddy L4 TCP proxy on port 5433
- Update backup docs to reflect all four CNPG databases + mealie SQLite
## Deploy plan
Deploy order matters — k8s resources must exist before ansible can route to them:
1. **ArgoCD (databases app):** sync to pick up immich-pg borgmatic role, ExternalSecret, and Tailscale service
```
argocd app set blumeops-pg --revision feature/borgmatic-all-pg-backups
argocd app sync blumeops-pg
```
2. **Wait** for `immich-pg-tailscale` service to get a Tailscale IP and `immich-pg.tail8d86e.ts.net` to resolve
3. **Ansible (caddy):** deploy Caddy L4 route for port 5433
```
mise run provision-indri -- --tags caddy
```
4. **Ansible (borgmatic):** deploy updated config and .pgpass
```
mise run provision-indri -- --tags borgmatic
```
5. **Verify:** trigger a manual borgmatic run and check all four pg_dump streams succeed
```
borgmatic --verbosity 1 2>&1 | grep -E '(Dumping|ERROR)'
```
## Test plan
- [x] `kubectl kustomize` builds cleanly
- [x] `ansible --check --diff` for borgmatic and caddy show expected changes
- [ ] ArgoCD sync succeeds for databases app
- [ ] `immich-pg.tail8d86e.ts.net` resolves
- [ ] `pg.ops.eblu.me:5433` accepts connections
- [ ] `borgmatic --verbosity 1` dumps all four databases without errors
Reviewed-on: #314
Single-file Go tool implementing the QArt technique (Russ Cox, 2012)
using only the public rsc.io/qr API. Generates QR codes whose data
modules form a recognizable image by exploiting error correction
freedom via GF(2) Gaussian elimination.
Includes a web UI with live-updating sliders for version, mask,
rotation, dx/dy offset, and scale. Keyboard shortcuts for rapid
iteration. Also works as a CLI for batch generation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
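For readers unfamiliar with the elimination step, the following is a generic sketch of Gaussian elimination over GF(2), written in Python rather than Go. It is not the tool's code: the bitmask row representation and the `solve_gf2` name are assumptions made for illustration.

```python
def solve_gf2(rows, rhs, ncols):
    """Solve A x = b over GF(2).

    rows: list of ints used as bitmasks (bit j set means column j is 1).
    rhs:  list of 0/1 values, one per row.
    Returns one solution as a list of bits (free variables set to 0),
    or None if the system is inconsistent.
    """
    rows, rhs = list(rows), list(rhs)
    pivots = []  # (row index, column) pairs
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if (rows[i] >> col) & 1), None)
        if pivot is None:
            continue  # no pivot here, so this column is a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rhs[r], rhs[pivot] = rhs[pivot], rhs[r]
        # Addition over GF(2) is XOR; clear this column from every other row.
        for i in range(len(rows)):
            if i != r and (rows[i] >> col) & 1:
                rows[i] ^= rows[r]
                rhs[i] ^= rhs[r]
        pivots.append((r, col))
        r += 1
    # A zero row with a non-zero right-hand side means no solution exists.
    if any(row == 0 and bit for row, bit in zip(rows, rhs)):
        return None
    solution = [0] * ncols
    for row_index, col in pivots:
        solution[col] = rhs[row_index]
    return solution


# x0 + x1 = 1, x1 + x2 = 0, x0 + x2 = 1 (all mod 2)
print(solve_gf2([0b011, 0b110, 0b101], [1, 0, 1], 3))
```

The free variables left over after elimination are the kind of slack the commit message refers to as error correction freedom: bits that can be chosen to match the target image.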
Update manage-lockfile doc with post-deploy steps (kernel update detection,
reboot guidance, generation pruning). Add prune-ringtail-generations mise
task that keeps the 5 most recent generations plus the most recent one
matching the booted kernel for safe rollback.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
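The retention rule can be sketched as a small selection function. It assumes each generation is known as a (number, kernel version) pair; how the real task obtains that information on ringtail is not shown here, and the function name and sample values are hypothetical.

```python
def generations_to_keep(generations, booted_kernel, keep_recent=5):
    """Return the set of generation numbers to keep.

    Keeps the `keep_recent` highest-numbered generations, plus the newest
    generation whose kernel matches the currently booted kernel.
    """
    newest_first = sorted(generations, key=lambda gen: gen[0], reverse=True)
    keep = {number for number, _ in newest_first[:keep_recent]}
    for number, kernel in newest_first:
        if kernel == booted_kernel:
            keep.add(number)
            break
    return keep


# Hypothetical history: the machine is still booted into the older kernel.
history = [
    (1, "6.1.0"), (2, "6.1.0"), (3, "6.2.0"), (4, "6.2.0"),
    (5, "6.2.0"), (6, "6.2.0"), (7, "6.2.0"), (8, "6.2.0"),
]

print(sorted(generations_to_keep(history, booted_kernel="6.1.0")))
```

When the booted kernel already appears among the five newest generations, the extra rule adds nothing; it only matters when a kernel update has been deployed without a reboot.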
Fix stale CV service doc (URL, forge domain, container tag) and add
guidance for reviewing build-time dependencies in private forge repos
during service reviews.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 5-minute lookback window kept stale data from terminated pods
visible during rollouts, causing the alert to sit in Pending for
~5 minutes after every routine deployment. 60s still covers two
scrape cycles (30s interval) while clearing stale data much faster.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
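The arithmetic behind the window choice, as a small sketch. The function name is hypothetical; the 30s interval and the two window sizes come from the message above.

```python
def scrape_cycles_in_window(window_seconds: int, scrape_interval_seconds: int) -> int:
    """Number of whole scrape cycles that fit inside a lookback window."""
    return window_seconds // scrape_interval_seconds


SCRAPE_INTERVAL = 30  # seconds

for window in (300, 60):
    cycles = scrape_cycles_in_window(window, SCRAPE_INTERVAL)
    # A sample from a terminated pod can stay inside the window for up to
    # `window` seconds after its last scrape.
    print(f"{window}s window: {cycles} scrape cycles, stale data lingers up to {window}s")
```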