Commit graph

101 commits

Author SHA1 Message Date
08b9570ac7 Review build-authentik-from-source Mikado chain docs
Fix go-server-derivation: wrong path target (webui not authentik-django)
and missing internal/web/static.go patch. Remove stale DRF fork content
from mirror-build-deps (no longer needed as of 2026.2.0). Add
last-reviewed to all 5 cards without it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 07:28:09 -08:00
2a2811d7a5 Review authentik-api-client-generation doc: fix stale content
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 17:21:46 -08:00
2d4098e480 Fix authentik 2026.2.0 migration ordering bug (#275)
## Summary

- Patch `authentik_rbac/0010` migration to depend on `authentik_core/0056`, fixing non-deterministic ordering that crashes startup with `FieldError: Cannot resolve keyword 'group_id'`
- Upstream bug: goauthentik/authentik#19616, #20634 — no fix released yet
- Document the issue in the lessons-learned table
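
The ordering problem can be illustrated without Django. The following is a minimal sketch, assuming migrations are modelled as a plain dependency mapping and using the standard library's `graphlib`. The shortened names `0056` and `0010` are placeholders for the real migration names, and this is not the actual patch. It shows why declaring the dependency makes the order deterministic.

```python
from graphlib import TopologicalSorter

# Placeholder identifiers; the real migration names are longer.
CORE_0056 = ("authentik_core", "0056")
RBAC_0010 = ("authentik_rbac", "0010")


def migration_order(deps):
    """Return one valid apply order for a {migration: {prerequisites}} mapping."""
    return list(TopologicalSorter(deps).static_order())


# Unpatched: no edge between the two migrations, so a planner is free
# to apply them in either order.
unpatched = {CORE_0056: set(), RBAC_0010: set()}

# Patched: rbac/0010 declares core/0056 as a prerequisite, so every
# valid order applies core/0056 first.
patched = {CORE_0056: set(), RBAC_0010: {CORE_0056}}

order = migration_order(patched)
```

In a Django migration the equivalent edge is an extra entry in the migration's `dependencies` list.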

## Deployment and Testing

- [ ] CI builds container image
- [ ] Deploy from branch: `argocd app set authentik --revision fix/authentik-migration-ordering && argocd app sync authentik`
- [ ] Pods reach Running/Ready without crash-looping
- [ ] `kubectl logs` show 0056 migrating before 0010
- [ ] authentik UI loads at authentik.ops.eblu.me
- [ ] `mise run services-check`
- [ ] After merge: `argocd app set authentik --revision main && argocd app sync authentik`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/275
2026-03-01 16:28:36 -08:00
efa9806bfa C2: Build authentik from source (Mikado chain) (#274)
## Mikado Chain: build-authentik-from-source

Replace `pkgs.authentik` from nixpkgs with a custom Nix derivation built from source.
This removes the dependency on the nixpkgs packaging timeline and gives full version control.

Target version: **2025.12.4** (nixpkgs reference, upgrading from deployed 2025.10.1).

### Dependency Graph

```
build-authentik-from-source (goal)
├── authentik-go-server-derivation
│   ├── authentik-api-client-generation  ← IN PROGRESS
│   └── authentik-python-backend-derivation
├── authentik-web-ui-derivation
│   └── authentik-api-client-generation  ← IN PROGRESS
└── authentik-python-backend-derivation
```

### Ready Leaves
- `authentik-api-client-generation` — Go + TypeScript client generation from OpenAPI schema
- `authentik-python-backend-derivation` — Django backend with 60+ deps, 4 in-tree packages

### Architecture
Ported from [nixpkgs `pkgs/by-name/au/authentik/package.nix`](https://github.com/NixOS/nixpkgs/tree/master/pkgs/by-name/au/authentik):
- `source.nix` — shared version/source fetch
- `client-go.nix` — Go API client generation
- `client-ts.nix` — TypeScript API client generation
- `api-go-vendor-hook.nix` — Go vendor directory injection hook
- (more components to follow as leaves are closed)

### Related Cards
- [[build-authentik-from-source]] — Goal card
- [[authentik-api-client-generation]]
- [[authentik-python-backend-derivation]]
- [[authentik-web-ui-derivation]]
- [[authentik-go-server-derivation]]

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/274
2026-03-01 13:45:00 -08:00
0aaf9bb8b2 Add Dagger local build step to authentik source build goal
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 08:39:25 -08:00
7094ea7d3e Start C2 Mikado chain: build authentik from source
Create goal card and 4 prerequisite cards for building authentik from a
custom Nix derivation instead of using pkgs.authentik from nixpkgs. This
removes the dependency on the nixpkgs packaging timeline and gives full
version control over authentik releases.

Chain: mikado/authentik-source-build
Leaf nodes: authentik-api-client-generation, authentik-python-backend-derivation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 08:20:17 -08:00
8d1e98617b Review build-grafana-container docs: stamp reviewed, fix cross-links
Also fix stale grafana.md reference card (Helm → Kustomize).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 07:28:06 -08:00
7cecaf0471 Review forgejo-runner docs: stamp reviewed, fix cross-links
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 15:10:20 -08:00
9a7acffa26 Review manage-forgejo-mirrors doc: clarify cron default, stamp reviewed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 07:17:18 -08:00
84338c32c2 Add authenticated GitHub PAT for Forgejo mirror sync (#269)
## Summary

- **mirror-create**: Auto-includes GitHub PAT from 1Password for authenticated upstream fetches at mirror creation time
- **mirror-update-pats**: New mise task that SSHes into indri and rewrites the git remote URL in every GitHub mirror's bare repo config to embed the PAT. Idempotent, supports `--dry-run`
- **app.ini.j2**: Explicit `[mirror]` section with `DEFAULT_INTERVAL = 8h` and `MIN_INTERVAL = 10m` (bakes in the defaults for visibility)
- **manage-forgejo-mirrors**: New how-to doc covering mirror creation, PAT storage, the `mirror-update-pats` task, and the full 20-day PAT rotation procedure
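
The idempotent URL rewrite performed by `mirror-update-pats` could look roughly like the sketch below. It assumes the PAT is embedded as the userinfo part of an `https://github.com/...` remote. The function name `embed_pat` and the token value are hypothetical, and the real task may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def embed_pat(url: str, pat: str) -> tuple[str, bool]:
    """Return (new_url, changed) for a mirror's remote URL.

    Only https GitHub remotes are rewritten. A URL that already carries
    the same PAT is returned unchanged, which keeps re-runs idempotent.
    """
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.hostname != "github.com":
        return url, False  # not a GitHub https remote; leave untouched
    host = parts.hostname + (f":{parts.port}" if parts.port else "")
    netloc = f"{pat}@{host}"
    if parts.netloc == netloc:
        return url, False  # already up to date
    return urlunsplit(parts._replace(netloc=netloc)), True


# Hypothetical example values.
first = embed_pat("https://github.com/goauthentik/authentik.git", "ghp_example")
second = embed_pat(first[0], "ghp_example")
```

A dry run would call the same function and report the `changed` flag without writing the repo config.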

## Context

GitHub tightened unauthenticated rate limits for git clone/fetch in May 2025. With 23 GitHub mirrors syncing every 8 hours, authenticated fetches avoid throttling. The PAT is stored in 1Password (`Forgejo Secrets` → `github-mirror-pat`) and has been applied to all existing mirrors.

## Deployment and Testing

- [x] `mirror-update-pats` dry-run verified (23 mirrors detected)
- [x] `mirror-update-pats` applied to all 23 GitHub mirrors on indri
- [x] Idempotency confirmed (re-run shows 0 updated, 23 skipped)
- [ ] Provision indri with `--tags forgejo` to apply `[mirror]` config
- [ ] Trigger a manual mirror sync and verify success in Forgejo UI

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/269
2026-02-25 20:20:23 -08:00
e273f399ea Review 3 how-to docs and fix update-tailscale-acls inaccuracies
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 07:02:49 -08:00
34a1314f8d Document AirPlay cross-VLAN firewall rules and fix rule ordering
AirPlay from Main to IoT VLAN (Samsung Frame TV) required adding
established/related, AirPlay port, and dynamic reverse port rules —
but the root cause was rule ordering (allows appended after blocks).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 20:49:31 -08:00
cd578144f7 Migrate upstream mirrors to mirrors/ Forgejo org (#265)
## Summary

- Created `mirrors` Forgejo organization for upstream mirror repos
- Transferred 22 mirror repos from `eblume/` to `mirrors/` (mirror sync config preserved)
- Deleted unused repos: hajimari, hister
- Updated all container build URLs (homepage, navidrome, ntfy Dockerfiles + nix)
- Updated documentation references (migrate-forgejo-from-brew, upstream-fork-strategy, fix-ntfy-nix-version)
- `dotfiles` intentionally kept under `eblume/` per user request
- `devpi` transferred to `mirrors/`

Repos remaining under `eblume/`: blumeops, cv, mcquack, dotfiles

## Cleanup TODO

- [ ] Delete temp Forgejo API token "claude-migration-temp" (Settings > Applications)

## Test Plan

- [x] Verified mirror config (mirror=true, original_url) survived transfer on test repo (tesla_auth)
- [x] All pre-commit hooks pass (including container-version-check, docs-check-links)
- [ ] Verify a mirror repo sync runs successfully after transfer (check mirrors/authentik or similar)
- [ ] Rebuild containers from branch to verify Dockerfile URLs resolve

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/265
2026-02-24 20:43:14 -08:00
1b9f706a30 Document container tag provenance and enhance container-list (#263)
## Summary

Investigating the deployed container images confirmed that squash-merging PRs orphans the commit SHAs embedded in container image tags. Two of our currently deployed images (prometheus, grafana) reference branch commits that are not on main.

This PR:

- Documents the squash-merge SHA orphan problem and the post-merge workflow in [[build-container-image]]
- Adds step 9 to the C1 process: after merging a PR that changes `containers/`, do a follow-up C0 to point manifests at the rebuilt `[main]` tag
- Rewrites `container-list` as a `uv run --script` (typer + rich + httpx)
- Adds optional container name filter (`mise run container-list prometheus` shows 10 tags instead of 4)
- Annotates every tag with `[main]` or `[branch]` based on git commit ancestry
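
The ancestry-based annotation can be sketched as follows. This is an illustration, not the actual `container-list` code. It assumes tags contain a hex commit SHA segment after a hyphen, and the helper names are hypothetical. The only external call is `git merge-base --is-ancestor`, which exits 0 when the first commit is an ancestor of the second.

```python
import re
import subprocess
from typing import Callable, Optional

# Assumed tag shape: "<version>-<sha>" with an optional trailing suffix.
SHA_RE = re.compile(r"-([0-9a-f]{7,40})(?:-|$)")


def sha_from_tag(tag: str) -> Optional[str]:
    """Extract the commit SHA segment from a tag, or None if there is none."""
    match = SHA_RE.search(tag)
    return match.group(1) if match else None


def git_is_ancestor(sha: str, branch: str = "main") -> bool:
    """True when `sha` is reachable from `branch` in the current repository."""
    result = subprocess.run(
        ["git", "merge-base", "--is-ancestor", sha, branch],
        capture_output=True,
    )
    # Exit code 1 (not an ancestor) and errors such as an unknown SHA
    # are both treated as "not on main".
    return result.returncode == 0


def annotate(tag: str, is_ancestor: Callable[[str], bool] = git_is_ancestor) -> str:
    """Append a [main] or [branch] hint to a tag that carries a SHA."""
    sha = sha_from_tag(tag)
    if sha is None:
        return tag
    return f"{tag} [main]" if is_ancestor(sha) else f"{tag} [branch]"
```

Passing the ancestry check as a parameter keeps the annotation logic testable without a git checkout.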

## Test plan

- [x] `mise run container-list` — all containers shown with `[main]`/`[branch]` hints
- [x] `mise run container-list prometheus` — filtered view, more tags, correctly shows `[main]` and `[branch]`
- [x] `mise run container-list nonexistent` — error message with exit code 1
- [x] Pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/263
2026-02-24 09:54:58 -08:00
b1ba96f6d6 Review migrate-grafana-to-authentik: fix file paths, add last-reviewed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 07:29:41 -08:00
9b4951bf94 Improve Mikado process: cycle discipline, reset rigor, --resume enhancements (#261)
## Summary
- **End-of-cycle prompting:** After closing a leaf node and pushing, the agent should prompt the user to review and suggest ending the session rather than rushing into the next leaf
- **Reset rigor:** Reinforced that errors during impl should trigger a branch reset + plan update (not fix-forward). Documented the `git log --oneline --not main` → `git reset --hard` → `git cherry-pick` pattern with clear threshold guidance
- **`--resume` shows PR number:** Queries the Forgejo API for open PRs matching the branch, displays number/title/URL and a hint to run `pr-comments`
- **`--resume` checks git stash:** Shows stash entries as a non-presumptive hint — informs without assuming they apply

## Test plan
- [ ] `mise run docs-mikado --resume` runs without errors (no active chains case)
- [ ] On a mikado branch with an open PR, verify PR info is shown
- [ ] With stashed work, verify stash entries are displayed
- [ ] Review agent-change-process.md for clarity

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/261
2026-02-23 21:03:27 -08:00
d05d2fbaff C2: Upgrade Grafana to 12.x with Nix container and Kustomize (#260)
## Summary

Mikado chain to upgrade Grafana from 11.4.0 (Helm chart) to 12.x with:
- Home-built Nix container image (`forge.ops.eblu.me/eblume/grafana`)
- Kustomize manifests replacing the Helm chart
- Single-source ArgoCD app

## Chain

Goal: `upgrade-grafana`
Leaves: `build-grafana-container`, `kustomize-grafana-deployment`

Track with: `mise run docs-mikado upgrade-grafana`

## Test plan
- [ ] Container builds successfully via Nix
- [ ] Container pushed to registry
- [ ] Kustomize manifests produce equivalent resources to current Helm
- [ ] Pod runs, UI loads, OIDC works, datasources healthy
- [ ] `mise run services-check` passes

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/260
2026-02-23 18:07:18 -08:00
4c5e0f0d16 Rename containers/forgejo-runner to runner-job-image
The forgejo-runner container is the CI job execution environment (Dagger,
ArgoCD CLI, etc.), not the runner daemon itself. Rename to runner-job-image
to fix the version-check false positive (Dagger 0.19.11 vs daemon 12.7.0)
and clarify the distinction.

RUNNER_LABELS still references the old image name — will update after
building the image under the new name.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 17:44:51 -08:00
66b5b32f1d Formalize C0/C1/C2 change classification (#259)
## Summary
- **C0 (Quick Fix):** Now explicitly allows direct-to-main commits with no PR required — for low-risk, fix-forward-safe changes
- **C1 (Human Review):** New docs-first workflow with branch deployment (ArgoCD `--revision`, Ansible from checkout). Includes upgrade criteria for escalation to C2
- **C2 (Mikado Chain):** Introduces the **Mikado Branch Invariant** — strict commit ordering where card-introducing commits come first, followed by code progress, followed by card closures. Branch resets required when new prerequisites are discovered
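
The ordering rule of the Mikado Branch Invariant can be expressed as a small check. This sketch assumes each commit on the branch has already been classified. The labels `card`, `code`, and `close` are hypothetical names for card-introducing commits, code progress, and card closures.

```python
# Hypothetical phase labels, in the order the invariant requires.
PHASES = {"card": 0, "code": 1, "close": 2}


def satisfies_invariant(kinds):
    """True when commit kinds (oldest first) never go back to an earlier phase."""
    ranks = [PHASES[kind] for kind in kinds]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


ok = satisfies_invariant(["card", "card", "code", "code", "close"])
needs_reset = not satisfies_invariant(["card", "code", "card"])
```

A branch that fails the check is the case where the process calls for a reset, because a new prerequisite card was discovered after code progress began.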

Updates CLAUDE.md rules (3, 4, 8, 9) to reflect that C0 bypasses branching/PR requirements. Also updates ai-assistance-guide, how-to index, and docs-mikado task description.

## Files changed
- `CLAUDE.md` — rules and classification table
- `docs/how-to/agent-change-process.md` — full process rewrite
- `docs/tutorials/ai-assistance-guide.md` — branching and pitfalls sections
- `docs/how-to/how-to.md` — index description
- `mise-tasks/docs-mikado` — task description
- `docs/changelog.d/formalize-change-classification.doc.md` — changelog fragment

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/259
2026-02-23 16:19:54 -08:00
f05e5cccdf Review Grafana: replace Helm upgrade plan with C2 Mikado chain (#258)
## Summary
- Delete the old 3-phase Helm chart upgrade plan (predates Mikado system)
- Create C2 Mikado chain with goal card `upgrade-grafana` and two leaf prereqs:
  - `kustomize-grafana-deployment` — convert Helm to kustomize manifests
  - `build-grafana-container` — home-built Grafana 12.x image (no upstream containers)
- Record first-ever Grafana review: currently at v11.4.0 on Helm chart 8.8.2
- Update service-versions.yaml, how-to index, and plans index

## Service Review Findings
- Grafana is healthy and synced in ArgoCD
- Running v11.4.0, latest upstream is 12.3.3
- Breaking changes for 12.x are low-risk (React panels only, UIDs compliant)
- PVC is disposable — dashboards and datasources are all config-provisioned

## Deployment and Testing
- [ ] No deployment needed — documentation-only change
- [ ] `docs-check-links` passes
- [ ] `docs-check-index` passes

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/258
2026-02-23 15:06:00 -08:00
2865bf5c27 Review deploy-authentik: rewrite as process guide (#257)
## Summary
- Rewrites deploy-authentik from a historical changelog into a reproducible process guide
- Removes stale version info (`v1.1.2-nix`) and future work section (Forgejo federation is done, rest belongs elsewhere)
- Marks deploy-authentik as completed in plans index and completed archive
- Removes hardcoded image tag from authentik reference card (use `service-versions.yaml`)
- Adds `last-reviewed: 2026-02-23` frontmatter

## Test plan
- [x] All pre-commit hooks pass (docs-check-links, docs-check-index, etc.)
- [x] ArgoCD app verified synced and healthy
- [x] All wiki-links validated

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/257
2026-02-23 14:35:39 -08:00
84d2cdcf14 Update tooling dependencies (Feb 2026 cycle)
Pre-commit: trufflehog v3.93.4, ruff v0.15.2, shellcheck v0.11.0.1,
prettier v3.8.1, actionlint v1.7.11

Fly.io: pin nginx 1.28.2-alpine, bump alloy v1.5.1 -> v1.13.1

Forgejo workflows: pin actions/checkout to SHA (v4.3.1)

Mise tasks: normalize httpx>=0.28.0, typer>=0.15.0 across all scripts

Add how-to doc for the monthly tooling dependency update cycle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 13:22:09 -08:00
cb9a06bb75 Update tooling dependencies (Feb 2026 cycle) (#254)
## Summary

Monthly tooling dependency update cycle:

- **Pre-commit hooks**: trufflehog v3.92.5→v3.93.4, ruff v0.14.13→v0.15.2, shellcheck v0.10.0.1→v0.11.0.1, prettier v3.8.0→v3.8.1, actionlint v1.7.10→v1.7.11
- **Fly.io Dockerfile**: pin nginx to 1.28.2-alpine (was unpinned), bump alloy v1.5.1→v1.13.1
- **Mise tasks**: normalize httpx lower bound to >=0.28.0 and typer to >=0.15.0 across all scripts
- **Forgejo workflows**: actions/checkout@v4 is current, no changes needed
- **New how-to doc**: [[update-tooling-dependencies]] documenting this monthly cycle

## No changes needed

- pre-commit-hooks v6.0.0, yamllint v1.38.0, shfmt v3.12.0-2, taplo v0.9.3, ansible-lint 26.1.1 — all already at latest

## Test plan

- [x] `uvx pre-commit run --all-files` — all 24 hooks pass
- [ ] Fly.io deploy (triggered automatically on merge to main via deploy-fly workflow)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/254
2026-02-23 13:08:41 -08:00
e655f4556e Upgrade k8s forgejo-runner from v6.3.1 to v12.7.0 (#251)
## Summary

Completes the `upgrade-k8s-runner` mikado chain. Both prerequisites (workflow validation in Dagger, config review against v12 defaults) were resolved in #250.

- Bump runner image `code.forgejo.org/forgejo/runner:6.3.1` → `12.7.0`
- Update `service-versions.yaml` to track new version
- Mark goal card complete (remove `status: active`)

## Deployment and Testing

After merge:
1. `argocd app sync forgejo-runner`
2. Verify runner registers in Forgejo admin → runners
3. Trigger a test workflow (e.g. `branch-cleanup.yaml` manual dispatch)

Rollback: revert image tag to `6.3.1`, push, sync.
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/251
2026-02-22 17:43:39 -08:00
0f6a1898f0 Prepare forgejo-runner v12 upgrade (leaf nodes) (#250)
## Summary
- Review runner config against v12.7.0 defaults — added `shutdown_timeout: 3h`, no breaking changes found
- Add `validate_workflows` Dagger function using `forgejo-runner validate --directory .` inside upstream container
- All 6 workflows pass v12.7.0 schema validation
- Wire `mise run validate-workflows` task and pre-commit hook on `.forgejo/workflows/` changes
- Mark both leaf Mikado cards (`review-runner-config-v12`, `validate-workflows-against-v12`) complete

## Mikado State
After merge, `upgrade-k8s-runner` goal card has no unmet dependencies — ready to execute the actual image bump in a follow-up PR.

## Test Plan
- [x] `dagger call validate-workflows --src=.` passes (all 6 workflows OK)
- [x] Pre-commit hooks pass
- [ ] Reviewer: confirm `shutdown_timeout: 3h` addition to ConfigMap looks reasonable

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/250
2026-02-22 17:38:32 -08:00
00b0287bcc Upgrade k8s forgejo-runner from v6.3.1 to v12.x (#249)
## Summary
- C2 Mikado chain for upgrading the k8s forgejo-runner daemon (6 major versions behind)
- Root goal card with two leaf prerequisites: workflow validation and config review
- Ringtail runner is already at ~v12.6.4 via nixpkgs, no work needed there

## Mikado Chain

```
upgrade-k8s-runner (goal)
├── validate-workflows-against-v12 (leaf)
└── review-runner-config-v12 (leaf)
```

Both leaves are actionable now. The biggest risk is workflow schema validation
(introduced in v8/v9) rejecting our existing workflows.

## Next Steps
Work the leaf nodes in a follow-up session, then attempt the goal.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/249
2026-02-22 17:12:45 -08:00
2c081eed28 Add Forgejo repository health metrics and Grafana dashboard (#245)
## Summary
- New `forgejo_metrics` Ansible role that queries the Forgejo REST API every 60s and writes Prometheus textfile metrics (open PRs, issues, languages, releases, commits, Actions runs/duration/success)
- Grafana dashboard "Forgejo Repository Health" with 12 panels across 4 rows: overview stats, CI/CD health, repository info, and staleness tracking
- Deletes superseded `forgejo-actions-dashboard` plan doc (this implementation covers a broader scope)
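
Writing Prometheus textfile metrics generally follows the pattern below. This is a minimal sketch rather than the role's actual collector. The metric name, help text, and label are hypothetical, and label values are not escaped. The file is written to a temporary path and renamed so that a scrape never sees a half-written file.

```python
import os
import tempfile


def render(metrics):
    """Render {name: (help_text, [(labels, value), ...])} as gauge samples."""
    lines = []
    for name, (help_text, samples) in metrics.items():
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        for labels, value in samples:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


def write_atomic(path, text):
    """Write text to path through a temp file in the same directory."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(tmp, 0o644)  # mkstemp creates 0600; make it readable by the exporter
    os.replace(tmp, path)  # atomic rename on the same filesystem


# Hypothetical metric for illustration.
text = render(
    {"forgejo_open_pull_requests": ("Open pull requests", [({"repo": "blumeops"}, 3)])}
)
```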

## Deployment and Testing
- [ ] `mise run provision-indri -- --tags forgejo_metrics` to deploy the collector
- [ ] `ssh indri 'cat /opt/homebrew/var/node_exporter/textfile/forgejo.prom'` to verify metrics
- [ ] `argocd app sync grafana-config` to deploy the dashboard
- [ ] Check Grafana dashboard "Forgejo Repository Health" loads with data
- [ ] `mise run services-check` passes

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/245
2026-02-22 11:16:03 -08:00
e41c28ed90 Replace indri-runner-logs with general-purpose runner-logs Typer CLI (#244)
## Summary
- Replace bash `indri-runner-logs` with a Python Typer CLI `runner-logs` that supports filtering by runner host (`indri`, `ringtail`, or `all`) with rich table output
- Add missing `#USAGE` declarations to `docs-review`, `docs-review-stale`, and `service-review` so flags work without the `--` separator
- Update docs references in `review-documentation.md` and `review-services.md` to use the new flag syntax

## Test plan
- [x] `mise run runner-logs all` lists runs from both runners
- [x] `mise run runner-logs ringtail` filters to ringtail-only runs
- [x] `mise run docs-review-stale --threshold 90` works without `--`
- [x] `mise run docs-review --limit 5` works without `--`
- [x] `mise run service-review --limit 3` works without `--`
- [x] Pre-commit hooks pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/244
2026-02-22 10:20:11 -08:00
c427f04ec4 Review 3 docs: agent-change-process, build-authentik-container, create-authentik-secrets (#243)
## Summary
- Stamped `last-reviewed: 2026-02-22` on three never-reviewed docs
- `agent-change-process.md`: accurate, no content changes
- `build-authentik-container.md`: accurate, container image verified in registry
- `create-authentik-secrets.md`: added note about additional OIDC client secret fields added since original card was written

## Changelog
- `docs/changelog.d/doc-review/agent-change-process.doc.md` (not added — stamp-only, no user-visible change)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/243
2026-02-22 09:12:31 -08:00
a5429d5a34 Update ringtail flake inputs, add flake-update pipeline (#240)
## Summary
- Update all ringtail NixOS flake inputs (nixpkgs, disko, home-manager) to latest
- Add `flake_update` Dagger function (`nix flake update`) alongside existing `flake_lock` (`nix flake lock`)
- Add how-to guide for managing the ringtail lockfile
- Update dagger and ringtail reference cards

## Deployment and Testing
- [x] `mise run provision-ringtail` — deployed successfully, `changed=2` (repo + rebuild)
- [x] `mise run services-check` — all services healthy
- [x] Doc link and index checks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/240
2026-02-22 08:17:52 -08:00
55d31c9c0b Docs pass: update zot Mikado chain for completion
- harden-zot-registry: fix Authentik hostname, check off all
  verified items, add metrics config to "what was done"
- enforce-tag-immutability: fix admins permissions (was missing
  update)
- agent-change-process: clarify that requires: is permanent and
  status: active is the only completion marker
- zot reference: update modified date
- wire-ci-registry-auth fragment: add metrics fix
- Remove stale harden-zot-mikado-cards.ai.md planning fragment

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 15:32:34 -08:00
ff63679efb Enable zot registry auth + wire CI credentials (#237)
## Summary

- Enable OIDC + API key authentication on zot registry with three-tier accessControl
  - `anonymousPolicy: ["read"]` — anyone can pull
  - `artifact-workloads` group: `["read", "create"]` — CI push, no overwrite/delete
  - `admins` group: `["read", "create", "update", "delete"]` — break-glass
- Wire both CI push paths (Dagger and Nix/skopeo) with `ZOT_CI_API_KEY` credentials
- Add `artifact-workloads` PolicyBinding in Authentik blueprint for zot app access
- Add `ZOT_CI_API_KEY` to Forgejo Actions secrets via existing ansible role

Completes the `wire-ci-registry-auth` and `harden-zot-registry` Mikado cards.

## Manual Deployment Steps (after merge)

1. Deploy Authentik blueprint: `argocd app sync authentik`
2. In Authentik admin UI: set a password for the `zot-ci` service account
3. Deploy zot config: `mise run provision-indri -- --tags zot`
4. Log in to `https://registry.ops.eblu.me` as `zot-ci` via OIDC → generate API key
5. Store API key in 1Password as `zot-ci-apikey` in blumeops vault
6. Sync Forgejo secrets: `mise run provision-indri -- --tags forgejo_actions_secrets`
7. Trigger a test container build to verify CI push
8. Verify anonymous pull: `curl -sf https://registry.ops.eblu.me/v2/_catalog`

## Uncertainties

- **Zot `accessControl` group matching with OIDC:** Groups from Authentik's `profile` scope claim should map to zot policy groups, but the exact claim-to-group matching needs runtime verification
- **`http.auth.apikey: true`:** This config key is documented but needs verification against the specific zot version built from source on indri
- **API key permissions:** Need to confirm zot API keys inherit the generating user's group for accessControl evaluation

## Test Plan

- [ ] `mise run provision-indri -- --check --diff --tags zot` shows expected config changes
- [ ] Anonymous pull works after deploy
- [ ] Unauthenticated push fails (401)
- [ ] OIDC browser login redirects to Authentik and back
- [ ] API key push works after key generation
- [ ] CI push succeeds with both Dagger and skopeo paths
- [ ] `mise run services-check` passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/237
2026-02-21 12:20:29 -08:00
30a7c4de9b Close register-zot-oidc-client Mikado card
Completed in PR #236. Updated card to reflect what was actually
implemented, including deviations (worker env var wiring, manual
service account setup).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 08:49:32 -08:00
04e036c603 Fold enforce-tag-immutability into harden-zot-registry (#235)
## Summary

- Removed `status: active` from `enforce-tag-immutability` card — its requirements are folded into the parent `harden-zot-registry` goal's `accessControl` configuration
- Updated `harden-zot-registry` with three-tier access control spec (anonymous read, artifact-workloads read+create, admins full)
- Added `artifact-workloads` group creation step to `register-zot-oidc-client`
- Added service account context to `wire-ci-registry-auth`

## Rationale

Tag immutability requires authentication to be meaningful. Without auth, everyone is anonymous and gets the same policy. Rather than client-side push checks, the registry enforces immutability server-side: CI gets `["read", "create"]` (no update/delete), so pushing an existing tag is rejected by zot itself.
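
The reasoning above can be modelled in a few lines. This is a toy model of the three-tier policy, not zot's implementation. It assumes that pushing a new tag needs `create`, that overwriting an existing tag needs `update`, and that a user's permissions are the union of the anonymous policy and their group policies.

```python
# Policy tiers taken from the access control spec; the evaluation
# rules below are simplifying assumptions.
POLICY = {
    "anonymous": {"read"},
    "artifact-workloads": {"read", "create"},
    "admins": {"read", "create", "update", "delete"},
}


def allowed_actions(groups):
    """Union of the anonymous policy and every matching group policy."""
    actions = set(POLICY["anonymous"])
    for group in groups:
        actions |= POLICY.get(group, set())
    return actions


def can_push(groups, tag_exists):
    """A new tag needs 'create'; overwriting an existing tag needs 'update'."""
    needed = "update" if tag_exists else "create"
    return needed in allowed_actions(groups)


ci_new_tag = can_push(["artifact-workloads"], tag_exists=False)
ci_overwrite = can_push(["artifact-workloads"], tag_exists=True)
```

Under these assumptions CI can publish a fresh tag but cannot replace one, which is the server-side immutability described above.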

## Test plan

- [ ] `mise run docs-check-links` passes
- [ ] `mise run docs-mikado` shows enforce-tag-immutability as resolved

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/235
2026-02-21 08:05:16 -08:00
64691da4fb Complete adopt-commit-based-container-tags Mikado card
All 14 containers now have commit-SHA-based tags in the registry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:28:45 -08:00
96a2d420fb Update install-dagger-on-nix-runner card with actual resolution
Dagger can't run on the bare nix runner (needs container runtime).
Used nix eval directly instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:23:06 -08:00
e0d5f28147 Add dagger to nix-container-builder runner (#234)
## Summary
- Add `dagger` to `hostPackages` for the ringtail nix-container-builder runner
- Needed for `dagger call nix-version` fallback in the nix build workflow (authentik)
- `hostPackages` is scoped to the runner's systemd unit PATH, not system-wide
- Marks `install-dagger-on-nix-runner` Mikado card complete

## Deployment and Testing
- [ ] Merge, then `mise run provision-ringtail`
- [ ] `mise run container-build-and-release authentik` to verify nix build succeeds

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/234
2026-02-20 23:09:01 -08:00
a68a170a10 Add install-dagger-on-nix-runner Mikado card (#233)
## Summary
- New Mikado card: the ringtail nix-container-builder runner lacks dagger, which the nix workflow needs for `dagger call nix-version` (authentik version extraction fallback)
- Re-opens `adopt-commit-based-container-tags` with this new prerequisite
- All other containers (11 Dockerfile-only, nettest + ntfy with nix) build fine — only authentik's nix build is blocked

## Deployment and Testing
- Docs only, no deployment needed

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/233
2026-02-20 23:03:12 -08:00
ffa8727660 Adopt commit-based container tags (#232)
## Summary
- Replace git-tag-triggered container builds with path-based triggers on main and workflow_dispatch
- Image tags now encode upstream app version + commit SHA (`vX.Y.Z-<sha>`) for full traceability
- Replace `container-tag-and-release` task with `container-build-and-release` (dispatches workflows via Forgejo API)
- Update dagger `publish()` to accept `commit_sha` parameter
- Update all docs and references to the new workflow
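
Composing a tag in the `vX.Y.Z-<sha>` shape could look like the sketch below. The function name `image_tag` and the 7-character short SHA length are assumptions for illustration. The workflow's real tag logic may differ.

```python
import re


def image_tag(app_version: str, commit_sha: str, short: int = 7) -> str:
    """Build a '<version>-<sha>' tag from an upstream version and a commit SHA."""
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit_sha):
        raise ValueError(f"not a hex commit SHA: {commit_sha!r}")
    # Normalise to a leading "v" so "12.3.3" and "v12.3.3" give the same tag.
    version = app_version if app_version.startswith("v") else f"v{app_version}"
    return f"{version}-{commit_sha[:short]}"


tag = image_tag("12.3.3", "abc1234def5678")
```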

## Deployment and Testing
- [ ] Merge to main
- [ ] `mise run container-build-and-release <name>` for each container to populate new-format tags
- [ ] Verify tags in registry via `mise run container-list`
- [ ] Existing images untouched — old tags remain available

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/232
2026-02-20 22:56:20 -08:00
0e2c10176d Harden zot registry, pt 1 (#231)
## Summary
- Enable OIDC + API key authentication on zot with anonymous pull preserved
- Enforce tag immutability for version tags
- Adopt commit-SHA-based container image tagging

Details in the [[harden-zot-registry]] Mikado chain (`mise run docs-mikado harden-zot-registry`).

## Test plan
- [ ] Anonymous pull still works
- [ ] Unauthenticated push fails (401)
- [ ] CI container builds pass with new auth and tagging
- [ ] `mise run services-check` passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/231
2026-02-20 22:50:01 -08:00
6d7071e5ec Add commit-based container tagging prereq to harden-zot-registry chain (#230)
## Summary

- New Mikado card: `adopt-commit-based-container-tags` — replaces git-tag-triggered container builds with path-based main-branch triggers and manual workflow dispatch
- Image tags become `vX.Y.Z-<sha>` (with `-main` suffix for main branch builds, `-nix` for Nix builds), tying versions to the actual bundled app version and exact source commit
- `container-tag-and-release` mise task to be renamed to `container-build-and-release`, triggering workflow dispatch with the current HEAD SHA
- Added as soft prereq to `harden-zot-registry` Mikado chain

## Test plan

- [x] Pre-commit hooks pass (docs-check-index, docs-check-links, etc.)
- [ ] Review card content for completeness

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/230
2026-02-20 18:26:27 -08:00
379bcb98af Create C2 Mikado cards for harden-zot-registry (#229)
## Summary
- Replace the old pre-Mikado plan doc (`docs/how-to/plans/harden-zot-registry.md`) with a proper C2 Mikado chain in `docs/how-to/zot/`
- Root goal: `harden-zot-registry` — enable OIDC + API key auth on zot with anonymous pull preserved
- Three leaf prereqs: `register-zot-oidc-client`, `wire-ci-registry-auth`, `enforce-tag-immutability`
- Add Zot section to `how-to.md` index, remove plan entry from plans index
- All doc checks pass (`docs-check-links`, `docs-check-index`, `docs-mikado`)

## Changes
- **New:** `docs/how-to/zot/harden-zot-registry.md` — C2 Mikado root goal
- **New:** `docs/how-to/zot/register-zot-oidc-client.md` — Register OIDC client in Authentik
- **New:** `docs/how-to/zot/wire-ci-registry-auth.md` — Wire CI push paths with registry auth
- **New:** `docs/how-to/zot/enforce-tag-immutability.md` — Prevent version tag overwrites
- **Deleted:** `docs/how-to/plans/harden-zot-registry.md` — Old plan doc (content absorbed into Mikado cards)
- **Updated:** `docs/how-to/how-to.md` — Add Zot section, remove plan entry
- **Updated:** `docs/how-to/plans/plans.md` — Remove plan entry

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/229
2026-02-20 17:56:25 -08:00
71cb256527 Deploy Authentik identity provider (C2 Mikado) (#227)
## Summary
C2 Mikado chain for deploying Authentik as the SSO identity provider, replacing Dex.

This PR will evolve over multiple sessions. Each iteration adds documentation (prerequisite cards) and eventually code as leaf nodes are resolved.

## Current Mikado State
- **Goal:** `deploy-authentik` (active)
- **Leaf prerequisites:**
  - `build-authentik-container` — Build Nix container image
  - `provision-authentik-database` — Create PostgreSQL database on CNPG cluster
  - `create-authentik-secrets` — Create 1Password item with credentials

## Process refinements
- Updated agent-change-process with lessons from first attempt: reset code before committing cards, open PRs early

## Test plan
- [ ] `mise run docs-mikado` shows correct dependency chain
- [ ] Leaf nodes can be worked independently
- [ ] Container builds on ringtail
- [ ] Authentik starts and reaches healthy state
- [ ] Forgejo OAuth2 connector works

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/227
2026-02-20 12:55:59 -08:00
174c6414ac Convert deploy-authentik plan to C2 Mikado chain (#226)
## Summary
- Strip detailed phase instructions from deploy-authentik plan (400→50 lines)
- Retain architecture decisions (ringtail, CNPG on indri, Nix containers, kustomize, Tailscale+Caddy) and open questions
- Add `status: active` frontmatter — now visible as a root goal in `mise run docs-mikado`
- Update plans index to reflect Active (C2) status

This is the first real use of the C2 Mikado chain system from #225. Future sessions will discover prerequisites, create sub-cards with `requires`, and work leaf nodes first.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/226
2026-02-20 08:22:19 -08:00
c1f7b2a9a3 Add agent change process (C0/C1/C2) and docs-mikado tool (#225)
## Summary
- Introduce C0/C1/C2 change classification based on the Mikado method, where documentation cards serve as persistent context for agents across sessions
- Add `docs-mikado` mise task to visualize active Mikado dependency chains from `status: active` and `requires` frontmatter fields
- Rename `zk-docs` task to `ai-docs`
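
A rough sketch of how a dependency chain could be derived from the `status: active` and `requires` frontmatter fields. This is not the actual `docs-mikado` implementation. It assumes frontmatter has already been parsed into dictionaries, that `requires` lists a card's prerequisites, and that a root goal is an active card no other card requires. The function names are hypothetical and the card names are illustrative.

```python
from typing import Dict, List, Tuple

# Illustrative, already-parsed frontmatter keyed by card name (assumed layout).
CARDS: Dict[str, dict] = {
    "harden-zot-registry": {
        "status": "active",
        "requires": [
            "register-zot-oidc-client",
            "wire-ci-registry-auth",
            "enforce-tag-immutability",
        ],
    },
    "register-zot-oidc-client": {},
    "wire-ci-registry-auth": {},
    "enforce-tag-immutability": {},
}


def root_goals(cards: Dict[str, dict]) -> List[str]:
    """Active cards that are not a prerequisite of any other card."""
    required = {dep for meta in cards.values() for dep in meta.get("requires", [])}
    return sorted(
        name
        for name, meta in cards.items()
        if meta.get("status") == "active" and name not in required
    )


def render_chain(cards: Dict[str, dict], name: str, depth: int = 0,
                 path: Tuple[str, ...] = ()) -> List[str]:
    """Indented tree of a goal and its prerequisites; stops on cycles."""
    line = "  " * depth + name
    if name in path:
        return [line + " (cycle)"]
    lines = [line]
    for dep in cards.get(name, {}).get("requires", []):
        lines.extend(render_chain(cards, dep, depth + 1, path + (name,)))
    return lines


def leaf_prerequisites(cards: Dict[str, dict], root: str) -> List[str]:
    """Cards reachable from root that have no prerequisites of their own."""
    leaves, seen, stack = [], set(), [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        requires = cards.get(name, {}).get("requires", [])
        if not requires:
            leaves.append(name)
        stack.extend(requires)
    return sorted(leaves)


if __name__ == "__main__":
    for goal in root_goals(CARDS):
        print("\n".join(render_chain(CARDS, goal)))
        print("work first:", ", ".join(leaf_prerequisites(CARDS, goal)))
```

Listing the leaves separately reflects the Mikado rule of resolving cards with no open prerequisites before touching the goal itself.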

## Changes
- **New:** `docs/how-to/agent-change-process.md` — methodology card
- **New:** `mise-tasks/docs-mikado` — Python uv script for dependency graph visualization
- **Renamed:** `mise-tasks/zk-docs` → `mise-tasks/ai-docs`
- **Updated:** `CLAUDE.md` — added Change Classification section, updated references
- **Updated:** `ai-assistance-guide.md`, `exploring-the-docs.md`, `how-to.md` — updated references and index

## Verification
- [x] `mise run ai-docs` works
- [x] `mise run docs-mikado` runs (no active chains yet, as expected)
- [x] `docs-check-links` — all valid
- [x] `docs-check-index` — all indexed
- [x] `docs-check-frontmatter` — all valid
- [x] All pre-commit hooks pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/225
2026-02-20 08:15:20 -08:00
c3748a0638 Add Authentik deployment plan (#224)
## Summary
- New plan document at `docs/how-to/plans/deploy-authentik.md` covering full Authentik deployment
- 6 phases: Helm analysis, prerequisites (CNPG/Redis/1Password), Nix containers, kustomize manifests, networking, monitoring
- Authentik replaces Dex as the identity provider for central user management and multi-protocol SSO
- Updated plans index and how-to index

## Deployment and Testing
- [x] Pre-commit hooks pass (docs-check-links, docs-check-index, docs-check-frontmatter)
- [ ] Review plan content for accuracy and completeness
- No deployment needed — documentation only

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/224
2026-02-20 07:06:56 -08:00
d21798b1f3 Document Dex OIDC and add services-check integration (#223)
## Summary
- Create Dex reference card (`docs/reference/services/dex.md`) with quick reference, architecture, identity source, storage, OIDC clients, secrets, and endpoints
- Write federated login explanation article (`docs/explanation/federated-login.md`) covering the Dex + Forgejo two-layer auth model, login flow, and break-glass access
- Add Dex to `services-check` (HTTP health endpoint + k3s pod check)
- Update Grafana docs with new Authentication section documenting SSO via Dex
- Update Forgejo docs with OAuth2 Provider section documenting its role as upstream identity source
- Add Dex to ringtail workloads table and reference service index
- Move `adopt-oidc-provider` plan to `completed/` with final design reflecting actual implementation

## Test plan
- [ ] `mise run services-check` passes (includes new Dex checks)
- [ ] `docs-check-links` passes (all wiki-links resolve)
- [ ] `docs-check-index` passes (new docs are indexed)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/223
2026-02-19 20:44:23 -08:00
b876e39981 Replace Homepage Helm chart with kustomize manifests and custom Dockerfile (#221)
## Summary
- Replace third-party Helm chart (jameswynn/homepage v2.1.0, pinned at app v1.2.0) with plain kustomize manifests and a custom Dockerfile building from forge mirror at v1.10.1
- Add a Dockerfile (`containers/homepage/`) with a multi-stage build (node:22-slim builder, node:22-alpine runtime)
- Create kustomize manifests: Deployment, Service, ConfigMap (6 config files), ServiceAccount, ClusterRole, ClusterRoleBinding
- Keep existing ingress-tailscale.yaml and all 6 ExternalSecret resources unchanged
- Update ArgoCD app definition from multi-source Helm to single directory source

## Prerequisite
- Homepage source mirrored at forge.ops.eblu.me/eblume/homepage.git
- Container must be built and pushed before syncing: `mise run container-release homepage v1.10.1`

## Deployment and Testing
- [ ] Build and push container image: `mise run container-release homepage v1.10.1`
- [ ] Branch-test via ArgoCD: `argocd app set homepage --revision feature/homepage-kustomize && argocd app sync homepage`
- [ ] Verify dashboard loads at go.ops.eblu.me / go.tail8d86e.ts.net
- [ ] Verify k8s autodiscovery works (services appear on dashboard)
- [ ] Verify widgets load (weather, Forgejo, Jellyfin, etc.)
- [ ] After merge: `argocd app set homepage --revision main && argocd app sync homepage`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/221
2026-02-19 18:29:19 -08:00
869f6bd20d Review: update-documentation doc (#220)
## Summary
- Add missing workflow step 8: Fly.io proxy cache purge after deploy
- Remove misleading "Directory" column from changelog fragment types table (all fragments use flat `<name>.<type>.md` pattern, not subdirectories)
- Stamp `last-reviewed: 2026-02-19`
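
As an illustration of the flat `<name>.<type>.md` fragment pattern, here is a small sketch that splits a fragment filename into its name and type. `parse_fragment` is a hypothetical helper, the directory in the usage example is made up, and the set of valid fragment types is deliberately not checked because it is not listed here.

```python
from pathlib import PurePosixPath


def parse_fragment(path: str) -> tuple:
    """Split '<name>.<type>.md' into (name, type); raise ValueError otherwise."""
    filename = PurePosixPath(path).name
    if not filename.endswith(".md"):
        raise ValueError(f"not a markdown fragment: {path!r}")
    stem = filename[: -len(".md")]
    # The type is the last dot-separated part, so names may contain dots.
    name, sep, fragment_type = stem.rpartition(".")
    if not sep or not name or not fragment_type:
        raise ValueError(f"expected <name>.<type>.md, got: {path!r}")
    return name, fragment_type


if __name__ == "__main__":
    # Hypothetical path, used only to show the call.
    print(parse_fragment("changelog.d/review-update-docs.doc.md"))
```

Because only the filename is inspected, the helper behaves the same wherever the fragments live, which matches the point that fragments are not sorted into per-type subdirectories.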

## Review notes
Verified all claims against actual workflow YAML, Dagger pipeline, ArgoCD manifests, towncrier config, and Quartz config files. Everything else checks out.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/220
2026-02-19 17:40:05 -08:00
291fff345c Fix services-check and update docs for Frigate migration to ringtail (#218)
## Summary
- Move mosquitto, ntfy, frigate, frigate-notify pod checks from `minikube-indri` to `k3s-ringtail` context in `services-check`
- Add `nvidia-device-plugin` pod check for ringtail k3s
- Rename "Kubernetes pods" section to "Indri minikube pods" for clarity
- Update 8 documentation files to reflect the migration completed in PRs #216/#217

## Files Changed
| File | Change |
|------|--------|
| `mise-tasks/services-check` | Move 4 pod checks to k3s-ringtail, add nvidia-device-plugin |
| `docs/reference/services/frigate.md` | Image→tensorrt, detector→ONNX/CUDA, shm→512Mi |
| `docs/reference/infrastructure/ringtail.md` | List actual k3s workloads |
| `docs/reference/infrastructure/indri.md` | Note frigate migration |
| `docs/explanation/architecture.md` | Add ringtail to diagram + compute layer |
| `docs/reference/kubernetes/cluster.md` | Note two clusters, add k3s section |
| `docs/reference/reference.md` | Update frigate/ntfy location |
| `docs/how-to/plans/completed/operationalize-reolink-camera.md` | Add post-completion migration note |
| `CLAUDE.md` | Add k3s-ringtail context guidance |

## Test plan
- [ ] `mise run services-check` — all checks pass
- [ ] Review each doc for accuracy against deployed state

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/218
2026-02-19 14:38:21 -08:00