Commit graph

468 commits

Author SHA1 Message Date
bae1cc116d Fix Grafana dashboard folder provisioning for TeslaMate
foldersFromFilesStructure was set to false, which caused Grafana's
provisioning provider to ignore the subdirectory structure created by
the sidecar's folderAnnotation feature. All dashboards ended up in the
root "Dashboards" folder despite ConfigMaps having the correct
grafana_folder annotation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 18:38:05 -08:00
6871bf32a8 Remove unused gpu panel in frigate dashboard 2026-02-22 18:26:47 -08:00
2c6c6a244a Fix Frigate Prometheus metrics & rebuild Grafana dashboard (#252)
## Summary

- **Prometheus scrape target:** Changed from `frigate.frigate.svc.cluster.local:5000` (broken after ringtail migration) to `nvr.ops.eblu.me` via HTTPS through Caddy on indri
- **Grafana dashboard:** Rebuilt for Frigate 0.17 metrics — 12 panels total:
  - Row 1 (stats): Uptime, Inference Speed, Camera FPS, Detection FPS, GPU Usage, GPU Temp
  - Row 2 (timeseries): CPU Usage, Memory Usage
  - Row 3 (timeseries): Camera FPS + Skipped FPS, GPU Usage + Memory over time
  - Row 4 (timeseries): Storage Usage, Detection Events (rate by camera/label)

## Deployment and Testing

1. Sync prometheus app on branch:
   ```
   argocd app set prometheus --revision fix/frigate-metrics-dashboard && argocd app sync prometheus
   ```
2. Check `prometheus.ops.eblu.me/targets` — frigate job should show UP
3. Sync grafana-config:
   ```
   argocd app sync grafana-config
   ```
4. Check `grafana.ops.eblu.me` — Frigate NVR dashboard should show live data
5. After merge: reset both apps to `--revision main` and sync

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/252
2026-02-22 18:14:17 -08:00
Forgejo Actions
dda7d719b3 Update docs release to v1.11.2
- Built changelog from towncrier fragments

[skip ci]
2026-02-22 17:52:05 -08:00
e655f4556e Upgrade k8s forgejo-runner from v6.3.1 to v12.7.0 (#251) v1.11.2
## Summary

Completes the `upgrade-k8s-runner` mikado chain. Both prerequisites (workflow validation in Dagger, config review against v12 defaults) were resolved in #250.

- Bump runner image `code.forgejo.org/forgejo/runner:6.3.1` → `12.7.0`
- Update `service-versions.yaml` to track new version
- Mark goal card complete (remove `status: active`)

## Deployment and Testing

After merge:
1. `argocd app sync forgejo-runner`
2. Verify runner registers in Forgejo admin → runners
3. Trigger a test workflow (e.g. `branch-cleanup.yaml` manual dispatch)

Rollback: revert image tag to `6.3.1`, push, sync.
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/251
2026-02-22 17:43:39 -08:00
0f6a1898f0 Prepare forgejo-runner v12 upgrade (leaf nodes) (#250)
## Summary
- Review runner config against v12.7.0 defaults — added `shutdown_timeout: 3h`, no breaking changes found
- Add `validate_workflows` Dagger function using `forgejo-runner validate --directory .` inside upstream container
- All 6 workflows pass v12.7.0 schema validation
- Wire `mise run validate-workflows` task and pre-commit hook on `.forgejo/workflows/` changes
- Mark both leaf Mikado cards (`review-runner-config-v12`, `validate-workflows-against-v12`) complete

## Mikado State
After merge, `upgrade-k8s-runner` goal card has no unmet dependencies — ready to execute the actual image bump in a follow-up PR.

## Test Plan
- [x] `dagger call validate-workflows --src=.` passes (all 6 workflows OK)
- [x] Pre-commit hooks pass
- [ ] Reviewer: confirm `shutdown_timeout: 3h` addition to ConfigMap looks reasonable

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/250
2026-02-22 17:38:32 -08:00
00b0287bcc Upgrade k8s forgejo-runner from v6.3.1 to v12.x (#249)
## Summary
- C2 Mikado chain for upgrading the k8s forgejo-runner daemon (6 major versions behind)
- Root goal card with two leaf prerequisites: workflow validation and config review
- Ringtail runner is already at ~v12.6.4 via nixpkgs, no work needed there

## Mikado Chain

```
upgrade-k8s-runner (goal)
├── validate-workflows-against-v12 (leaf)
└── review-runner-config-v12 (leaf)
```

Both leaves are actionable now. The biggest risk is workflow schema validation
(introduced in v8/v9) rejecting our existing workflows.

## Next Steps
Work the leaf nodes in a follow-up session, then attempt the goal.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/249
2026-02-22 17:12:45 -08:00
198c810e86 Fix branch-cleanup: fall back to head.label for deleted branches (#248)
## Summary
- Forgejo rewrites `head.ref` to `refs/pull/N/head` once a PR's source branch is deleted from the remote
- The original branch name is preserved in `head.label`
- This was causing 188 out of 246 merged PRs to go undetected by the cleanup script
- Fix: fall back to `head.label` when `head.ref` starts with `refs/pull/`

## Test plan
- [x] Dry run correctly identifies 18 previously-missed local branches
- [x] Live run successfully deleted all 18

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/248
2026-02-22 16:15:51 -08:00
abb7e6fe88 Add branch-cleanup mise task (#247)
## Summary
- New `mise run branch-cleanup` task that finds branches merged into main and deletes them locally and on the Forgejo remote
- Configurable `--cutoff` (default 30 days) skips branches with recent HEAD commits
- Supports `--dry-run`, `--local-only`, `--remote-only` flags
- Interactive confirmation before any deletion

## Test plan
- [x] `mise run branch-cleanup -- --dry-run` shows correct table of candidates
- [ ] Run without `--dry-run` to confirm actual deletion works

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/247
2026-02-22 16:00:09 -08:00
d51c180fe6 Switch Frigate detection model from YOLO-NAS-S to YOLOv9-c (#246)
## Summary
- Replace abandoned YOLO-NAS-S (320x320, `yolonas`) with YOLOv9-c (640x640, `yolo-generic`)
- YOLOv9-c benefits from CUDA Graphs in Frigate 0.17 on the RTX 4080
- Add `export_yolov9` Dagger pipeline and `frigate-export-model` mise task for reproducible model exports
- Model already deployed to `sifaka:/volume1/frigate/models/yolov9-c-640.onnx`

## Config changes
- `model_type: yolonas` → `yolo-generic`
- `input_dtype: int` → `float`
- `width/height: 320` → `640`
- `path:` → `yolov9-c-640.onnx`

## Deployment and Testing
- [ ] Merge and sync Frigate ArgoCD app: `argocd app sync frigate`
- [ ] Verify Frigate starts and detects objects at https://nvr.ops.eblu.me
- [ ] Confirm GPU inference via Frigate system metrics

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/246
2026-02-22 15:14:45 -08:00
2c081eed28 Add Forgejo repository health metrics and Grafana dashboard (#245)
## Summary
- New `forgejo_metrics` Ansible role that queries the Forgejo REST API every 60s and writes Prometheus textfile metrics (open PRs, issues, languages, releases, commits, Actions runs/duration/success)
- Grafana dashboard "Forgejo Repository Health" with 12 panels across 4 rows: overview stats, CI/CD health, repository info, and staleness tracking
- Deletes superseded `forgejo-actions-dashboard` plan doc (this implementation covers a broader scope)

## Deployment and Testing
- [ ] `mise run provision-indri -- --tags forgejo_metrics` to deploy the collector
- [ ] `ssh indri 'cat /opt/homebrew/var/node_exporter/textfile/forgejo.prom'` to verify metrics
- [ ] `argocd app sync grafana-config` to deploy the dashboard
- [ ] Check Grafana dashboard "Forgejo Repository Health" loads with data
- [ ] `mise run services-check` passes

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/245
2026-02-22 11:16:03 -08:00
Forgejo Actions
c21cf54847 Update docs release to v1.11.1
- Built changelog from towncrier fragments

[skip ci]
2026-02-22 10:21:19 -08:00
e41c28ed90 Replace indri-runner-logs with general-purpose runner-logs Typer CLI (#244) v1.11.1
## Summary
- Replace bash `indri-runner-logs` with a Python Typer CLI `runner-logs` that supports filtering by runner host (`indri`, `ringtail`, or `all`) with rich table output
- Add missing `#USAGE` declarations to `docs-review`, `docs-review-stale`, and `service-review` so flags work without the `--` separator
- Update docs references in `review-documentation.md` and `review-services.md` to use the new flag syntax

## Test plan
- [x] `mise run runner-logs all` lists runs from both runners
- [x] `mise run runner-logs ringtail` filters to ringtail-only runs
- [x] `mise run docs-review-stale --threshold 90` works without `--`
- [x] `mise run docs-review --limit 5` works without `--`
- [x] `mise run service-review --limit 3` works without `--`
- [x] Pre-commit hooks pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/244
2026-02-22 10:20:11 -08:00
c897fc8e1f Use Zot registry icon on homepage dashboard
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 09:19:46 -08:00
Forgejo Actions
627caeb61f Update docs release to v1.11.0
- Built changelog from towncrier fragments

[skip ci]
2026-02-22 09:16:00 -08:00
c427f04ec4 Review 3 docs: agent-change-process, build-authentik-container, create-authentik-secrets (#243) v1.11.0
## Summary
- Stamped `last-reviewed: 2026-02-22` on three never-reviewed docs
- `agent-change-process.md`: accurate, no content changes
- `build-authentik-container.md`: accurate, container image verified in registry
- `create-authentik-secrets.md`: added note about additional OIDC client secret fields added since original card was written

## Changelog
- `docs/changelog.d/doc-review/agent-change-process.doc.md` (not added — stamp-only, no user-visible change)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/243
2026-02-22 09:12:31 -08:00
529ba10939 Fix frigate-notify: webapi polling, dedup, hi-res snapshots (#242)
## Summary
- Switch from MQTT to webapi polling (v0.5.4 requires only one method)
- Poll every 15s for responsive alerts
- **`notify_once: true`** — one notification per event instead of repeats as object changes zones
- **`nosnap: drop`** — skip events without snapshots (was causing all events to be dropped on v0.3.5)
- **`snap_hires: true`** — use recording stream for higher quality snapshot images

## Deployment and Testing
- [ ] Sync: `argocd app set frigate --revision fix/frigate-notify-config && argocd app sync frigate`
- [ ] Verify pod starts: `kubectl --context=k3s-ringtail -n frigate get pods -l app=frigate-notify`
- [ ] Check logs for successful startup and event processing (no "No snapshot" drops)
- [ ] Wait for a motion event and confirm single ntfy notification with hi-res snapshot
- [ ] After merge: `argocd app set frigate --revision main && argocd app sync frigate`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/242
2026-02-22 09:05:45 -08:00
7dcab826fa Upgrade frigate-notify from v0.3.5 to v0.5.4 (#241)
## Summary
- Service review: upgrade frigate-notify from v0.3.5 to v0.5.4
- No breaking changes for current MQTT + ntfy config
- Notable additions: high-res snapshots, MQTT topic parsing fixes, env var parsing fixes

## Deployment and Testing
- [ ] Sync frigate app on ringtail: `argocd app set frigate --revision review/frigate-notify-v0.5.4 && argocd app sync frigate`
- [ ] Verify pod starts cleanly: `kubectl --context=k3s-ringtail -n frigate get pods`
- [ ] Trigger a test alert (motion event) and confirm ntfy notification arrives
- [ ] After merge: `argocd app set frigate --revision main && argocd app sync frigate`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/241
2026-02-22 08:42:47 -08:00
a5429d5a34 Update ringtail flake inputs, add flake-update pipeline (#240)
## Summary
- Update all ringtail NixOS flake inputs (nixpkgs, disko, home-manager) to latest
- Add `flake_update` Dagger function (`nix flake update`) alongside existing `flake_lock` (`nix flake lock`)
- Add how-to guide for managing the ringtail lockfile
- Update dagger and ringtail reference cards

## Deployment and Testing
- [x] `mise run provision-ringtail` — deployed successfully, `changed=2` (repo + rebuild)
- [x] `mise run services-check` — all services healthy
- [x] Doc link and index checks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/240
2026-02-22 08:17:52 -08:00
b4015153c6 No navidrome authentikation 2026-02-21 20:33:48 -08:00
a82c705bf6 Add SSO login button to Jellyfin login page
Deploy branding.xml with a "Sign in with Authentik" button in the
login disclaimer. Local password login remains available.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 20:08:57 -08:00
07fb48626d Add Authentik SSO integration for Jellyfin (#239)
## Summary
- Add Authentik OIDC provider + application for Jellyfin via blueprint (all authenticated users allowed, no policy binding)
- Wire `jellyfin-client-secret` through ExternalSecret and Authentik worker deployment
- Install [jellyfin-plugin-sso](https://github.com/9p4/jellyfin-plugin-sso) v4.0.0.3 via Ansible, with OIDC config template
- Authentik `admins` group maps to Jellyfin administrator role
- Local login left enabled; SSO is additive

## Deployment and Testing
- [ ] Sync ArgoCD `authentik` app on branch — verify provider + application appear in Authentik admin
- [ ] `mise run provision-indri -- --tags jellyfin --check --diff` (dry run)
- [ ] `mise run provision-indri -- --tags jellyfin` (deploy plugin + config)
- [ ] Test SSO flow: `https://jellyfin.ops.eblu.me/sso/OID/start/authentik`
- [ ] Verify `eblume` account auto-links via `preferred_username` match
- [ ] Verify admins group → Jellyfin admin
- [ ] Reset ArgoCD app revision to main after merge

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/239
2026-02-21 20:05:44 -08:00
e1c2892878 Fix container tags deleted during old-tag cleanup
Five container manifests were removed when deleting old-style tags
(shared digests). Rebuild on a72a0d8 and update references.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 16:26:29 -08:00
a72a0d8e8e Update all container images to new upstream-version tagging scheme (#238)
## Summary
- Updates all 15 container image references across 14 ArgoCD manifest files
- Migrates from old internal `vX.Y.Z` tags to new `v<upstream-version>-<sha>` format
- Covers: authentik, cv, devpi, forgejo-runner, homepage, kiwix-serve, kubectl, miniflux, navidrome, ntfy, quartz, teslamate, transmission

## Deployment and Testing
- [ ] Sync all ArgoCD apps on branch revision
- [ ] Verify all services come up healthy
- [ ] Merge and re-sync on main
- [ ] Clean up old-style tags from zot registry

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/238
2026-02-21 15:58:11 -08:00
55d31c9c0b Docs pass: update zot Mikado chain for completion
- harden-zot-registry: fix Authentik hostname, check off all
  verified items, add metrics config to "what was done"
- enforce-tag-immutability: fix admins permissions (was missing
  update)
- agent-change-process: clarify that requires: is permanent and
  status: active is the only completion marker
- zot reference: update modified date
- wire-ci-registry-auth fragment: add metrics fix
- Remove stale harden-zot-mikado-cards.ai.md planning fragment

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 15:32:34 -08:00
8775c3841a Allow anonymous access to zot /metrics endpoint
Add accessControl.metrics.users with empty string to allow
unauthenticated Prometheus/Alloy scraping. Zot represents
anonymous users with an empty username internally.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 12:37:59 -08:00
ff63679efb Enable zot registry auth + wire CI credentials (#237)
## Summary

- Enable OIDC + API key authentication on zot registry with three-tier accessControl
  - `anonymousPolicy: ["read"]` — anyone can pull
  - `artifact-workloads` group: `["read", "create"]` — CI push, no overwrite/delete
  - `admins` group: `["read", "create", "update", "delete"]` — break-glass
- Wire both CI push paths (Dagger and Nix/skopeo) with `ZOT_CI_API_KEY` credentials
- Add `artifact-workloads` PolicyBinding in Authentik blueprint for zot app access
- Add `ZOT_CI_API_KEY` to Forgejo Actions secrets via existing ansible role

Completes the `wire-ci-registry-auth` and `harden-zot-registry` Mikado cards.

## Manual Deployment Steps (after merge)

1. Deploy Authentik blueprint: `argocd app sync authentik`
2. In Authentik admin UI: set a password for the `zot-ci` service account
3. Deploy zot config: `mise run provision-indri -- --tags zot`
4. Log in to `https://registry.ops.eblu.me` as `zot-ci` via OIDC → generate API key
5. Store API key in 1Password as `zot-ci-apikey` in blumeops vault
6. Sync Forgejo secrets: `mise run provision-indri -- --tags forgejo_actions_secrets`
7. Trigger a test container build to verify CI push
8. Verify anonymous pull: `curl -sf https://registry.ops.eblu.me/v2/_catalog`

## Uncertainties

- **Zot `accessControl` group matching with OIDC:** Groups from Authentik's `profile` scope claim should map to zot policy groups, but the exact claim-to-group matching needs runtime verification
- **`http.auth.apikey: true`:** This config key is documented but needs verification against the specific zot version built from source on indri
- **API key permissions:** Need to confirm zot API keys inherit the generating user's group for accessControl evaluation

## Test Plan

- [ ] `mise run provision-indri -- --check --diff --tags zot` shows expected config changes
- [ ] Anonymous pull works after deploy
- [ ] Unauthenticated push fails (401)
- [ ] OIDC browser login redirects to Authentik and back
- [ ] API key push works after key generation
- [ ] CI push succeeds with both Dagger and skopeo paths
- [ ] `mise run services-check` passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/237
2026-02-21 12:20:29 -08:00
30a7c4de9b Close register-zot-oidc-client Mikado card
Completed in PR #236. Updated card to reflect what was actually
implemented, including deviations (worker env var wiring, manual
service account setup).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 08:49:32 -08:00
21b6533aea Register Zot as OIDC client in Authentik (#236)
## Summary
- Add Authentik blueprint (`zot.yaml`) with OAuth2 provider, application, `artifact-workloads` group, and `zot-ci` service account
- Wire `zot-client-secret` through ExternalSecret → worker Deployment env var → blueprint `!Env`
- Add Ansible pre_task to fetch OIDC secret from 1Password (item ID `oor7os5kapczgpbwv7obkca4y4`)
- Add `oidc-credentials.json.j2` template and deploy task in zot role (with `when` guard)

## Manual Steps Required Before Deploy
1. Generate client secret: `openssl rand -hex 32`
2. Store in 1Password: add field `zot-client-secret` to "Authentik (blumeops)" item in vault `blumeops`

## What This Does NOT Do
- Does NOT modify `config.json.j2` (that's the root goal `harden-zot-registry`)
- Does NOT wire CI auth (that's `wire-ci-registry-auth`)
- Does NOT set service account password or API keys (manual post-deploy)

## Verification
After ArgoCD sync:
- [ ] Authentik admin UI shows "Zot Registry" application
- [ ] OIDC discovery at `https://authentik.ops.eblu.me/application/o/zot/.well-known/openid-configuration` returns valid JSON
- [ ] Blueprint status is `successful`
- [ ] `artifact-workloads` group exists with `zot-ci` service account

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/236
2026-02-21 08:45:06 -08:00
04e036c603 Fold enforce-tag-immutability into harden-zot-registry (#235)
## Summary

- Removed `status: active` from `enforce-tag-immutability` card — its requirements are folded into the parent `harden-zot-registry` goal's `accessControl` configuration
- Updated `harden-zot-registry` with three-tier access control spec (anonymous read, artifact-workloads read+create, admins full)
- Added `artifact-workloads` group creation step to `register-zot-oidc-client`
- Added service account context to `wire-ci-registry-auth`

## Rationale

Tag immutability requires authentication to be meaningful. Without auth, everyone is anonymous and gets the same policy. Rather than client-side push checks, the registry enforces immutability server-side: CI gets `["read", "create"]` (no update/delete), so pushing an existing tag is rejected by zot itself.

## Test plan

- [ ] `mise run docs-check-links` passes
- [ ] `mise run docs-mikado` shows enforce-tag-immutability as resolved

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/235
2026-02-21 08:05:16 -08:00
64691da4fb Complete adopt-commit-based-container-tags Mikado card
All 14 containers now have commit-SHA-based tags in the registry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:28:45 -08:00
96a2d420fb Update install-dagger-on-nix-runner card with actual resolution
Dagger can't run on the bare nix runner (needs container runtime).
Used nix eval directly instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:23:06 -08:00
b8bc0bffd8 Update ringtail flake.lock (remove dagger input)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:21:36 -08:00
02f2be5523 Use nix eval instead of dagger for nix runner version extraction
Dagger CLI needs a container runtime (Docker/containerd) to start its
engine, which the bare nix runner doesn't have. Use nix eval directly
instead — it's already available and more appropriate for a nix host.

Reverts the dagger flake input since it's not usable here.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:21:16 -08:00
de037ae51f Update ringtail flake.lock with dagger/nix input
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:15:19 -08:00
a2d2808270 Add dagger CLI to nix runner via github:dagger/nix flake
Dagger was removed from nixpkgs due to trademark concerns. Use the
official dagger/nix flake as a flake input instead, passing the package
through via specialArgs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:14:46 -08:00
e0d5f28147 Add dagger to nix-container-builder runner (#234)
## Summary
- Add `dagger` to `hostPackages` for the ringtail nix-container-builder runner
- Needed for `dagger call nix-version` fallback in the nix build workflow (authentik)
- `hostPackages` is scoped to the runner's systemd unit PATH, not system-wide
- Marks `install-dagger-on-nix-runner` Mikado card complete

## Deployment and Testing
- [ ] Merge, then `mise run provision-ringtail`
- [ ] `mise run container-build-and-release authentik` to verify nix build succeeds

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/234
2026-02-20 23:09:01 -08:00
a68a170a10 Add install-dagger-on-nix-runner Mikado card (#233)
## Summary
- New Mikado card: the ringtail nix-container-builder runner lacks dagger, which the nix workflow needs for `dagger call nix-version` (authentik version extraction fallback)
- Re-opens `adopt-commit-based-container-tags` with this new prerequisite
- All other containers (11 Dockerfile-only, nettest + ntfy with nix) build fine — only authentik's nix build is blocked

## Deployment and Testing
- Docs only, no deployment needed

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/233
2026-02-20 23:03:12 -08:00
ffa8727660 Adopt commit-based container tags (#232)
## Summary
- Replace git-tag-triggered container builds with path-based triggers on main and workflow_dispatch
- Image tags now encode upstream app version + commit SHA (`vX.Y.Z-<sha>`) for full traceability
- Replace `container-tag-and-release` task with `container-build-and-release` (dispatches workflows via Forgejo API)
- Update dagger `publish()` to accept `commit_sha` parameter
- Update all docs and references to the new workflow

## Deployment and Testing
- [ ] Merge to main
- [ ] `mise run container-build-and-release <name>` for each container to populate new-format tags
- [ ] Verify tags in registry via `mise run container-list`
- [ ] Existing images untouched — old tags remain available

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/232
2026-02-20 22:56:20 -08:00
0e2c10176d Harden zot registry, pt 1 (#231)
## Summary
- Enable OIDC + API key authentication on zot with anonymous pull preserved
- Enforce tag immutability for version tags
- Adopt commit-SHA-based container image tagging

Details in the [[harden-zot-registry]] Mikado chain (`mise run docs-mikado harden-zot-registry`).

## Test plan
- [ ] Anonymous pull still works
- [ ] Unauthenticated push fails (401)
- [ ] CI container builds pass with new auth and tagging
- [ ] `mise run services-check` passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/231
2026-02-20 22:50:01 -08:00
6d7071e5ec Add commit-based container tagging prereq to harden-zot-registry chain (#230)
## Summary

- New Mikado card: `adopt-commit-based-container-tags` — replaces git-tag-triggered container builds with path-based main-branch triggers and manual workflow dispatch
- Image tags become `vX.Y.Z-<sha>` (with `-main` suffix for main branch builds, `-nix` for Nix builds), tying versions to the actual bundled app version and exact source commit
- `container-tag-and-release` mise task to be renamed to `container-build-and-release`, triggering workflow dispatch with the current HEAD SHA
- Added as soft prereq to `harden-zot-registry` Mikado chain

## Test plan

- [x] Pre-commit hooks pass (docs-check-index, docs-check-links, etc.)
- [ ] Review card content for completeness

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/230
2026-02-20 18:26:27 -08:00
379bcb98af Create C2 Mikado cards for harden-zot-registry (#229)
## Summary
- Replace the old pre-Mikado plan doc (`docs/how-to/plans/harden-zot-registry.md`) with a proper C2 Mikado chain in `docs/how-to/zot/`
- Root goal: `harden-zot-registry` — enable OIDC + API key auth on zot with anonymous pull preserved
- Three leaf prereqs: `register-zot-oidc-client`, `wire-ci-registry-auth`, `enforce-tag-immutability`
- Add Zot section to `how-to.md` index, remove plan entry from plans index
- All doc checks pass (`docs-check-links`, `docs-check-index`, `docs-mikado`)

## Changes
- **New:** `docs/how-to/zot/harden-zot-registry.md` — C2 Mikado root goal
- **New:** `docs/how-to/zot/register-zot-oidc-client.md` — Register OIDC client in Authentik
- **New:** `docs/how-to/zot/wire-ci-registry-auth.md` — Wire CI push paths with registry auth
- **New:** `docs/how-to/zot/enforce-tag-immutability.md` — Prevent version tag overwrites
- **Deleted:** `docs/how-to/plans/harden-zot-registry.md` — Old plan doc (content absorbed into Mikado cards)
- **Updated:** `docs/how-to/how-to.md` — Add Zot section, remove plan entry
- **Updated:** `docs/how-to/plans/plans.md` — Remove plan entry

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/229
2026-02-20 17:56:25 -08:00
cd50c1454a Integrate Forgejo with Authentik OIDC (#228)
## Summary

- Refactor Authentik blueprints: extract shared `admins` group into `common.yaml`, add `groups` scope mapping to all providers for group-based admin propagation
- Add Forgejo OAuth2 provider and application blueprint (`forgejo.yaml`)
- Add `forgejo-client-secret` to ExternalSecret and worker deployment env
- Configure Forgejo `[oauth2_client]` with `ACCOUNT_LINKING=login` to safely link existing accounts
- Update documentation (forgejo.md, authentik.md, federated-login.md)

## Deployment and Testing

After merge, deployment requires these steps in order:

1. **Authentik (ArgoCD):**
   - `argocd app set authentik --revision feature/forgejo-authentik-oidc && argocd app sync authentik`
   - Verify: Forgejo app/provider visible in Authentik admin UI
   - Verify: Grafana SSO still works (blueprint refactor)

2. **Forgejo app.ini (Ansible):**
   - `mise run provision-indri -- --tags forgejo --check --diff` (dry run)
   - `mise run provision-indri -- --tags forgejo` (apply, restarts Forgejo)

3. **Create Forgejo auth source (CLI on indri):**
   ```
   ssh indri 'sudo -u forgejo /opt/homebrew/bin/forgejo admin auth add-oauth \
     --name authentik \
     --provider openidConnect \
     --key forgejo \
     --secret "$(op read "op://vg6xf6vvfmoh5hqjjhlhbeoaie/Authentik (blumeops)/forgejo-client-secret")" \
     --auto-discover-url https://authentik.ops.eblu.me/application/o/forgejo/.well-known/openid-configuration \
     --scopes "openid email profile groups" \
     --group-claim-name groups \
     --admin-group admins'
   ```

4. **Link eblume account:** Sign in with Authentik on Forgejo, confirm link with local password

5. **Verify:** `tea repo list`, Forgejo Actions, local password break-glass

After merge: `argocd app set authentik --revision main && argocd app sync authentik`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/228
2026-02-20 17:39:50 -08:00
e0c6b7df99 Add Authentik to homepage dashboard
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 13:03:48 -08:00
71cb256527 Deploy Authentik identity provider (C2 Mikado) (#227)
## Summary
C2 Mikado chain for deploying Authentik as the SSO identity provider, replacing Dex.

This PR will evolve over multiple sessions. Each iteration adds documentation (prerequisite cards) and eventually code as leaf nodes are resolved.

## Current Mikado State
- **Goal:** `deploy-authentik` (active)
- **Leaf prerequisites:**
  - `build-authentik-container` — Build Nix container image
  - `provision-authentik-database` — Create PostgreSQL database on CNPG cluster
  - `create-authentik-secrets` — Create 1Password item with credentials

## Process refinements
- Updated agent-change-process with lessons from first attempt: reset code before committing cards, open PRs early

## Test plan
- [ ] `mise run docs-mikado` shows correct dependency chain
- [ ] Leaf nodes can be worked independently
- [ ] Container builds on ringtail
- [ ] Authentik starts and reaches healthy state
- [ ] Forgejo OAuth2 connector works

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/227
2026-02-20 12:55:59 -08:00
174c6414ac Convert deploy-authentik plan to C2 Mikado chain (#226)
## Summary
- Strip detailed phase instructions from deploy-authentik plan (400→50 lines)
- Retain architecture decisions (ringtail, CNPG on indri, Nix containers, kustomize, Tailscale+Caddy) and open questions
- Add `status: active` frontmatter — now visible as a root goal in `mise run docs-mikado`
- Update plans index to reflect Active (C2) status

This is the first real use of the C2 Mikado chain system from #225. Future sessions will discover prerequisites, create sub-cards with `requires`, and work leaf nodes first.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/226
2026-02-20 08:22:19 -08:00
c1f7b2a9a3 Add agent change process (C0/C1/C2) and docs-mikado tool (#225)
## Summary
- Introduce C0/C1/C2 change classification based on the Mikado method, where documentation cards serve as persistent context for agents across sessions
- Add `docs-mikado` mise task to visualize active Mikado dependency chains from `status: active` and `requires` frontmatter fields
- Rename `zk-docs` task to `ai-docs`

## Changes
- **New:** `docs/how-to/agent-change-process.md` — methodology card
- **New:** `mise-tasks/docs-mikado` — Python uv script for dependency graph visualization
- **Renamed:** `mise-tasks/zk-docs` → `mise-tasks/ai-docs`
- **Updated:** `CLAUDE.md` — added Change Classification section, updated references
- **Updated:** `ai-assistance-guide.md`, `exploring-the-docs.md`, `how-to.md` — updated references and index

## Verification
- [x] `mise run ai-docs` works
- [x] `mise run docs-mikado` runs (no active chains yet, as expected)
- [x] `docs-check-links` — all valid
- [x] `docs-check-index` — all indexed
- [x] `docs-check-frontmatter` — all valid
- [x] All pre-commit hooks pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/225
2026-02-20 08:15:20 -08:00
c3748a0638 Add Authentik deployment plan (#224)
## Summary
- New plan document at `docs/how-to/plans/deploy-authentik.md` covering full Authentik deployment
- 6 phases: Helm analysis, prerequisites (CNPG/Redis/1Password), Nix containers, kustomize manifests, networking, monitoring
- Authentik replaces Dex as the identity provider for central user management and multi-protocol SSO
- Updated plans index and how-to index

## Deployment and Testing
- [x] Pre-commit hooks pass (docs-check-links, docs-check-index, docs-check-frontmatter)
- [ ] Review plan content for accuracy and completeness
- No deployment needed — documentation only

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/224
2026-02-20 07:06:56 -08:00
Forgejo Actions
18f1ac61fc Update docs release to v1.10.0
- Built changelog from towncrier fragments

[skip ci]
2026-02-19 20:45:43 -08:00
d21798b1f3 Document Dex OIDC and add services-check integration (#223) v1.10.0
## Summary
- Create Dex reference card (`docs/reference/services/dex.md`) with quick reference, architecture, identity source, storage, OIDC clients, secrets, and endpoints
- Write federated login explanation article (`docs/explanation/federated-login.md`) covering the Dex + Forgejo two-layer auth model, login flow, and break-glass access
- Add Dex to `services-check` (HTTP health endpoint + k3s pod check)
- Update Grafana docs with new Authentication section documenting SSO via Dex
- Update Forgejo docs with OAuth2 Provider section documenting its role as upstream identity source
- Add Dex to ringtail workloads table and reference service index
- Move `adopt-oidc-provider` plan to `completed/` with final design reflecting actual implementation

## Test plan
- [ ] `mise run services-check` passes (includes new Dex checks)
- [ ] `docs-check-links` passes (all wiki-links resolve)
- [ ] `docs-check-index` passes (new docs are indexed)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/223
2026-02-19 20:44:23 -08:00