The heph CLI requests scope "openid offline_access", but the Authentik
heph OAuth2 provider only mapped openid/email/profile. Without the
offline_access mapping the issued refresh token is bound to the login
session rather than the 30-day refresh-token window; once the session
lapses, hephd's refresh_token grant returns 400 Bad Request and spoke
sync silently degrades (heph sync --status -> auth_failure: true).
Add the built-in offline_access scope mapping to the provider's
property_mappings and document the requirement in the service reference.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds the heph-pwa redirect URIs to the Authentik `heph` OAuth2 provider so the new browser **Login with Authentik** flow (Authorization Code + PKCE, hephaestus PR #9) can redirect back and exchange the code:
- `https://heph.ops.eblu.me/` (the PWA origin)
- `http://localhost:8787/` (local dev: `hephd --web-root`)
Authentik also keys token-endpoint CORS off these origins, so they're required for the browser token exchange. Additive (the provider was `redirect_uris: []`); harmless until the PWA feature deploys.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #370
Makes indri the canonical **heph** hub for the hub-and-spoke task/context system, deployed as a self-updating LaunchAgent managed by Ansible. Other devices (gilbert) attach as offline-capable spokes.
## What's here
- **`ansible/roles/heph`** (tag `heph`) — bootstrap `cargo install hephd` (only if absent; `--self-update` keeps it current after), version-pinned `heph-pwa` checkout served via `--web-root`, launchagent `mcquack.eblume.heph`:
```
hephd --mode server --http-addr 0.0.0.0:8787 --db … --web-root …
--oidc-issuer …/o/heph/ --oidc-audience heph
--self-update --self-update-interval-secs 600
```
`~/.cargo/bin` is on the agent `PATH` so self-update's `cargo install` works.
- **Caddy** — `heph.ops.eblu.me → localhost:8787` (TLS for the PWA secure context).
- **Authentik** — new `heph` **public device-code** OIDC app + `default-device-code-flow` bound to the default brand's `flow_device_code` (verified live: brand `authentik-default`, field currently unset → additive).
- **Docs** — `services/hephaestus.md` (Path-A seeding runbook + spoke caveat), `indri.md`, changelog fragment.
## Three features requested
- **Autoupdate** — 10-min interval (`--self-update-interval-secs 600`).
- **PWA** — `--web-root` (confirmed shipped in v1.2.0).
- **Spoke** — gilbert reconfig documented (post-merge step).
## Deploy plan (not done yet — awaiting review)
1. Seed from gilbert (Path A): `heph daemon stop` → copy `heph.db` → `DELETE FROM meta WHERE key='origin'`.
2. Sync Authentik `apps`/blueprint; verify blueprint status via API (not just logs).
3. `provision-indri --tags heph,caddy` from this branch.
4. Point gilbert at the hub + `heph auth login`.
## Known follow-ups (heph-side, tracked in the Hephaestus project)
- `heph daemon` can't bake hub/spoke config or pass `--self-update-interval-secs` → worked around by the ansible plist.
- Path-A seeding lacks a clean `hephd --owner-id`/seed command → manual `meta.origin` reset for now.
- Self-update moves hephd ahead of the ansible-pinned PWA shell over time (drift; tolerated by the SW cache, revisit on next release).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #369
Now that argocd's Authentik OAuth2 client is public (PKCE-only), the
client_secret plumbing is dead code:
- delete argocd-oidc-authentik ExternalSecret and drop it from kustomization
- remove AUTHENTIK_ARGOCD_CLIENT_SECRET env from authentik-worker
- remove argocd-client-secret mapping from authentik-config ExternalSecret
The argocd-client-secret field in the 1Password "Authentik (blumeops)"
item is now unreferenced and can be deleted there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Changes argocd's Authentik OAuth2 client from confidential to public and
drops the clientSecret from argocd-cm. Public + PKCE works for both the
web UI (argocd-server backend) and the argocd CLI (`argocd login --sso`)
without a shared secret, matching OAuth 2.1 guidance.
Confidential → public was needed because the CLI can't hold a client
secret; Authentik's per-app issuer model made the alternative
("cliClientID" pattern with separate public client) awkward since it
requires a shared issuer across apps which Authentik doesn't serve.
Follow-up: deadcode AUTHENTIK_ARGOCD_CLIENT_SECRET env wiring and the
argocd-oidc-authentik ExternalSecret once verified.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds http://localhost:8085/auth/callback to the ArgoCD OAuth2 provider's
redirect_uris so `argocd login --sso` works. Loopback redirect is the
RFC 8252 pattern for native CLI apps; PKCE (already enabled) covers the
code-interception risk.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Worker forks 4 Dramatiq processes each loading the full Django app
(~250MB each), hitting the 1Gi limit on startup. Ringtail has ample
RAM headroom.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
C0 follow-up to #309: update kustomization newTag for all containers
rebuilt by the merge (authentik, authentik-redis, ntfy, alloy).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary
- Replace upstream `docker.io/library/redis:7-alpine` (Redis 7.4.8) with a nix-built container using Redis 8.2.3 from nixpkgs
- Introduce **attached service pattern**: `parent` field in service-versions.yaml, `<parent>-<component>` naming convention, and `assert pkgs.redis.version == version` in default.nix to prevent silent version drift on `flake.lock` updates
- Document the pattern in [[review-services]] so future attached services slot in cleanly
- Backfill `parent: grafana` on existing `grafana-sidecar` entry
## Version drift protection
1. `flake.lock` update bumps nixpkgs redis → `assert` in `default.nix` breaks `nix-build`
2. Developer updates `version` in `default.nix` → prek's `container-version-check` demands matching `service-versions.yaml` update
3. Both must agree before commit succeeds
## Test plan
- [ ] Build container from branch on ringtail (`mise run container-build-and-release authentik-redis`)
- [ ] Update kustomization `newTag` to branch-built image tag
- [ ] Sync authentik ArgoCD app from branch (`argocd app set authentik --revision localize-redis && argocd app sync authentik`)
- [ ] Verify Authentik login, session persistence, and task queue still work
- [ ] After merge: C0 follow-up to update `newTag` to the main-built image tag
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #309
Dramatiq defaults to one worker process per CPU core. On ringtail (16 cores)
this spawned 16 processes, each loading the full Django app, exceeding the
1Gi memory limit and causing a crash loop (228 restarts over 7 days).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bare image references in manifests were ambiguous — unclear whether the
tag was intentionally omitted or managed by kustomize. Add :kustomized
sentinel to all 37 image refs overridden by kustomize images transformer.
Add sync notes for tailscale-operator proxyclass (CRD fields not processed
by kustomize). Mark devpi reviewed (6.19.1 is current).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Add Authentik OAuth2 provider + application blueprint for ArgoCD (ringtail side)
- Add OIDC config to ArgoCD ConfigMap with Authentik as identity provider (indri side)
- Map Authentik `admins` group to ArgoCD `role:admin` via RBAC policy
- ExternalSecrets on both sides pull `argocd-client-secret` from 1Password
- Local admin password remains as break-glass — both login methods coexist
## Pre-deployment manual step
Add `argocd-client-secret` field to "Authentik (blumeops)" in 1Password with a random value (e.g., `openssl rand -hex 32`).
## Deployment order
1. Sync Authentik app on ringtail first (blueprint + secret + worker env var)
2. Sync ArgoCD app on indri second (cm, rbac, ExternalSecret)
## Verification
- [ ] `argocd-client-secret` field added to 1Password
- [ ] Authentik app synced on ringtail — blueprint applied, provider created
- [ ] ArgoCD app synced on indri — OIDC config applied
- [ ] SSO login works: visit `https://argocd.ops.eblu.me` → "Log in via Authentik" → admin access
- [ ] Break-glass: local admin/password login still works
Reviewed-on: #284
Update image tag to v2026.2.0-efa9806-nix — the first source-built
authentik container from the build-authentik-from-source chain.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Move hardcoded image tags to kustomization.yaml `images:` transformer across **22 services** — image names in manifests become version-agnostic templates, with tags centralized in one place per service
- Replace hand-written ConfigMap manifests with `configMapGenerator:` in **12 services** — config data extracted to standalone files, generated ConfigMaps include content hashes that trigger automatic pod rollouts on changes
- Create new `kustomization.yaml` for **forgejo-runner** and **nvidia-device-plugin** (switches ArgoCD from directory mode to kustomize mode, rendered output identical)
### Services modified
**Images only (8):** cv, devpi, docs, kube-state-metrics, miniflux, navidrome, teslamate, torrent
**Images + configMapGenerator (10):** alloy-k8s, forgejo-runner, frigate, grafana, homepage, kiwix, loki, mosquitto, ntfy, prometheus
**Images only, no configMapGenerator (4):** authentik (skip blueprints — special YAML tags), tailscale-operator-base (Deployment only, CRD image fields left as-is)
**Skipped entirely (6):** argocd (remote upstream), databases (no image fields), external-secrets, grafana-config (cross-kustomization dashboards), immich (Helm-managed), 1password-connect/cloudnative-pg (no kustomization.yaml)
### What changes at deploy time
- **images:** — no functional diff, `kustomize build` produces identical output with tags
- **configMapGenerator:** — ConfigMap names gain hash suffixes (e.g., `prometheus-config` → `prometheus-config-6f42fhctcb`) and all Deployment/StatefulSet/DaemonSet references are updated automatically. Pods will restart once per service on first sync due to the name change
## Test plan
- [x] `kubectl kustomize` builds all 30 service directories successfully
- [x] Image tags verified in rendered output for all modified services
- [x] ConfigMap hash suffixes verified in rendered output
- [x] ConfigMap references in Deployments/StatefulSets confirmed to use hashed names
- [x] All pre-commit hooks pass (yamllint, shellcheck, prettier, etc.)
- [ ] `argocd app diff` each service to confirm only expected ConfigMap name changes
- [ ] Deploy from branch starting with a low-risk service (e.g., mosquitto)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/264
Five container manifests were removed when deleting old-style tags
(shared digests). Rebuild on a72a0d8 and update references.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Updates all 15 container image references across 14 ArgoCD manifest files
- Migrates from old internal `vX.Y.Z` tags to new `v<upstream-version>-<sha>` format
- Covers: authentik, cv, devpi, forgejo-runner, homepage, kiwix-serve, kubectl, miniflux, navidrome, ntfy, quartz, teslamate, transmission
## Deployment and Testing
- [ ] Sync all ArgoCD apps on branch revision
- [ ] Verify all services come up healthy
- [ ] Merge and re-sync on main
- [ ] Clean up old-style tags from zot registry
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/238
## Summary
- Enable OIDC + API key authentication on zot registry with three-tier accessControl
- `anonymousPolicy: ["read"]` — anyone can pull
- `artifact-workloads` group: `["read", "create"]` — CI push, no overwrite/delete
- `admins` group: `["read", "create", "update", "delete"]` — break-glass
- Wire both CI push paths (Dagger and Nix/skopeo) with `ZOT_CI_API_KEY` credentials
- Add `artifact-workloads` PolicyBinding in Authentik blueprint for zot app access
- Add `ZOT_CI_API_KEY` to Forgejo Actions secrets via existing ansible role
Completes the `wire-ci-registry-auth` and `harden-zot-registry` Mikado cards.
## Manual Deployment Steps (after merge)
1. Deploy Authentik blueprint: `argocd app sync authentik`
2. In Authentik admin UI: set a password for the `zot-ci` service account
3. Deploy zot config: `mise run provision-indri -- --tags zot`
4. Log in to `https://registry.ops.eblu.me` as `zot-ci` via OIDC → generate API key
5. Store API key in 1Password as `zot-ci-apikey` in blumeops vault
6. Sync Forgejo secrets: `mise run provision-indri -- --tags forgejo_actions_secrets`
7. Trigger a test container build to verify CI push
8. Verify anonymous pull: `curl -sf https://registry.ops.eblu.me/v2/_catalog`
## Uncertainties
- **Zot `accessControl` group matching with OIDC:** Groups from Authentik's `profile` scope claim should map to zot policy groups, but the exact claim-to-group matching needs runtime verification
- **`http.auth.apikey: true`:** This config key is documented but needs verification against the specific zot version built from source on indri
- **API key permissions:** Need to confirm zot API keys inherit the generating user's group for accessControl evaluation
## Test Plan
- [ ] `mise run provision-indri -- --check --diff --tags zot` shows expected config changes
- [ ] Anonymous pull works after deploy
- [ ] Unauthenticated push fails (401)
- [ ] OIDC browser login redirects to Authentik and back
- [ ] API key push works after key generation
- [ ] CI push succeeds with both Dagger and skopeo paths
- [ ] `mise run services-check` passes
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/237
## Summary
- Add Authentik blueprint (`zot.yaml`) with OAuth2 provider, application, `artifact-workloads` group, and `zot-ci` service account
- Wire `zot-client-secret` through ExternalSecret → worker Deployment env var → blueprint `!Env`
- Add Ansible pre_task to fetch OIDC secret from 1Password (item ID `oor7os5kapczgpbwv7obkca4y4`)
- Add `oidc-credentials.json.j2` template and deploy task in zot role (with `when` guard)
## Manual Steps Required Before Deploy
1. Generate client secret: `openssl rand -hex 32`
2. Store in 1Password: add field `zot-client-secret` to "Authentik (blumeops)" item in vault `blumeops`
## What This Does NOT Do
- Does NOT modify `config.json.j2` (that's the root goal `harden-zot-registry`)
- Does NOT wire CI auth (that's `wire-ci-registry-auth`)
- Does NOT set service account password or API keys (manual post-deploy)
## Verification
After ArgoCD sync:
- [ ] Authentik admin UI shows "Zot Registry" application
- [ ] OIDC discovery at `https://authentik.ops.eblu.me/application/o/zot/.well-known/openid-configuration` returns valid JSON
- [ ] Blueprint status is `successful`
- [ ] `artifact-workloads` group exists with `zot-ci` service account
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/236
## Summary
C2 Mikado chain for deploying Authentik as the SSO identity provider, replacing Dex.
This PR will evolve over multiple sessions. Each iteration adds documentation (prerequisite cards) and eventually code as leaf nodes are resolved.
## Current Mikado State
- **Goal:** `deploy-authentik` (active)
- **Leaf prerequisites:**
- `build-authentik-container` — Build Nix container image
- `provision-authentik-database` — Create PostgreSQL database on CNPG cluster
- `create-authentik-secrets` — Create 1Password item with credentials
## Process refinements
- Updated agent-change-process with lessons from first attempt: reset code before committing cards, open PRs early
## Test plan
- [ ] `mise run docs-mikado` shows correct dependency chain
- [ ] Leaf nodes can be worked independently
- [ ] Container builds on ringtail
- [ ] Authentik starts and reaches healthy state
- [ ] Forgejo OAuth2 connector works
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/227