## Summary
- `foldersFromFilesStructure` was `false` in Grafana's sidecar provider config, causing Grafana to ignore the subdirectory structure the sidecar creates from `grafana_folder` annotations
- All 18 TeslaMate dashboards were appearing in the root "Dashboards" folder despite having `grafana_folder: "TeslaMate"` annotations on their ConfigMaps
- Flipping to `true` makes Grafana replicate the sidecar's directory structure as UI folders
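As a rough illustration of the behaviour described above, this shell sketch maps a provisioned dashboard file to the UI folder it would be expected to land in once `foldersFromFilesStructure` is `true`. The `folder_for` helper and the example paths are hypothetical, and the sketch assumes a single level of subdirectories under the provider path.

```sh
# Hypothetical helper, not part of Grafana or the sidecar.
# $1: provider root directory, $2: path to a dashboard JSON file.
folder_for() {
  root=$1
  file=$2
  dir=$(dirname "$file")
  if [ "$dir" = "$root" ]; then
    # Files directly under the provider root stay in the root "Dashboards" folder.
    printf '%s\n' "Dashboards"
  else
    # Otherwise the containing directory name becomes the folder title.
    basename "$dir"
  fi
}

folder_for /tmp/dashboards /tmp/dashboards/TeslaMate/overview.json
folder_for /tmp/dashboards /tmp/dashboards/node-exporter.json
```

The two calls print `TeslaMate` and `Dashboards` respectively, matching the expectation that only annotated dashboards move out of the root folder.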
## Deployment and Testing
- [ ] Sync `grafana` app: `argocd app sync grafana`
- [ ] Verify TeslaMate dashboards appear under a "TeslaMate" folder in Grafana's dashboard list
- [ ] Verify other dashboards remain in the root "Dashboards" folder
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/253
## Summary
Completes the `upgrade-k8s-runner` Mikado chain. Both prerequisites (workflow validation in Dagger, config review against v12 defaults) were resolved in #250.
- Bump runner image `code.forgejo.org/forgejo/runner:6.3.1` → `12.7.0`
- Update `service-versions.yaml` to track new version
- Mark goal card complete (remove `status: active`)
## Deployment and Testing
After merge:
1. `argocd app sync forgejo-runner`
2. Verify runner registers in Forgejo admin → runners
3. Trigger a test workflow (e.g. `branch-cleanup.yaml` manual dispatch)
Rollback: revert image tag to `6.3.1`, push, sync.
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/251
## Summary
- Review runner config against v12.7.0 defaults — added `shutdown_timeout: 3h`, no breaking changes found
- Add `validate_workflows` Dagger function using `forgejo-runner validate --directory .` inside upstream container
- All 6 workflows pass v12.7.0 schema validation
- Wire `mise run validate-workflows` task and pre-commit hook on `.forgejo/workflows/` changes
- Mark both leaf Mikado cards (`review-runner-config-v12`, `validate-workflows-against-v12`) complete
## Mikado State
After merge, `upgrade-k8s-runner` goal card has no unmet dependencies — ready to execute the actual image bump in a follow-up PR.
## Test Plan
- [x] `dagger call validate-workflows --src=.` passes (all 6 workflows OK)
- [x] Pre-commit hooks pass
- [ ] Reviewer: confirm `shutdown_timeout: 3h` addition to ConfigMap looks reasonable
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/250
## Summary
- C2 Mikado chain for upgrading the k8s forgejo-runner daemon (6 major versions behind)
- Root goal card with two leaf prerequisites: workflow validation and config review
- Ringtail runner is already at ~v12.6.4 via nixpkgs, so no work is needed there
## Mikado Chain
```
upgrade-k8s-runner (goal)
├── validate-workflows-against-v12 (leaf)
└── review-runner-config-v12 (leaf)
```
Both leaves are actionable now. The biggest risk is workflow schema validation
(introduced in v8/v9) rejecting our existing workflows.
## Next Steps
Work the leaf nodes in a follow-up session, then attempt the goal.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/249
## Summary
- New `mise run branch-cleanup` task that finds branches merged into main and deletes them locally and on the Forgejo remote
- Configurable `--cutoff` (default 30 days) skips branches with recent HEAD commits
- Supports `--dry-run`, `--local-only`, `--remote-only` flags
- Interactive confirmation before any deletion
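The cutoff filter described above could look roughly like the sketch below. It is not the task's actual implementation: the `stale_branches` function name and its input format are assumptions, and deletion and the interactive confirmation are left out.

```sh
# Hypothetical filter: print branches whose tip commit is older than the cutoff.
# stdin: lines of "<branch> <unix-commit-time>"
# $1: current time in epoch seconds, $2: cutoff in days (default 30).
stale_branches() {
  now=$1
  cutoff_days=${2:-30}
  limit=$(( now - cutoff_days * 86400 ))
  while read -r name ts; do
    # Never offer the main branch itself for deletion.
    [ "$name" = "main" ] && continue
    [ "$ts" -lt "$limit" ] && printf '%s\n' "$name"
  done
  return 0
}

# Possible use against a real repository (merged branches only):
#   git for-each-ref --merged=main \
#     --format='%(refname:short) %(committerdate:unix)' refs/heads/ \
#     | stale_branches "$(date +%s)" 30
printf 'main 100\nold-feature 1000\nfresh-fix 2600000\n' | stale_branches 2700000 30
```

Taking the current time as an argument keeps the filter deterministic, which makes a `--dry-run` style listing easy to check against fixed input.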
## Test plan
- [x] `mise run branch-cleanup -- --dry-run` shows correct table of candidates
- [ ] Run without `--dry-run` to confirm actual deletion works
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/247
## Summary
- Switch from MQTT to webapi polling (v0.5.4 allows only one event method to be enabled at a time)
- Poll every 15s for responsive alerts
- **`notify_once: true`** — one notification per event instead of repeats as object changes zones
- **`nosnap: drop`** — skip events without snapshots (was causing all events to be dropped on v0.3.5)
- **`snap_hires: true`** — use recording stream for higher quality snapshot images
## Deployment and Testing
- [ ] Sync: `argocd app set frigate --revision fix/frigate-notify-config && argocd app sync frigate`
- [ ] Verify pod starts: `kubectl --context=k3s-ringtail -n frigate get pods -l app=frigate-notify`
- [ ] Check logs for successful startup and event processing (no "No snapshot" drops)
- [ ] Wait for a motion event and confirm single ntfy notification with hi-res snapshot
- [ ] After merge: `argocd app set frigate --revision main && argocd app sync frigate`
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/242
- harden-zot-registry: fix Authentik hostname, check off all verified items, add metrics config to "what was done"
- enforce-tag-immutability: fix `admins` permissions (`update` was missing)
- agent-change-process: clarify that `requires:` is permanent and that removing `status: active` is the only completion marker
- zot reference: update modified date
- wire-ci-registry-auth fragment: add metrics fix
- Remove stale harden-zot-mikado-cards.ai.md planning fragment
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Enable OIDC + API key authentication on zot registry with three-tier accessControl
- `anonymousPolicy: ["read"]` — anyone can pull
- `artifact-workloads` group: `["read", "create"]` — CI push, no overwrite/delete
- `admins` group: `["read", "create", "update", "delete"]` — break-glass
- Wire both CI push paths (Dagger and Nix/skopeo) with `ZOT_CI_API_KEY` credentials
- Add `artifact-workloads` PolicyBinding in Authentik blueprint for zot app access
- Add `ZOT_CI_API_KEY` to Forgejo Actions secrets via existing ansible role
Completes the `wire-ci-registry-auth` and `harden-zot-registry` Mikado cards.
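To make the three tiers easier to reason about, here is a small shell model of the policy table listed above. It is only a sketch: `can` is a hypothetical helper, `anonymous` is used as a stand-in name for unauthenticated clients, and zot's real evaluation (including how OIDC groups are matched) is not represented.

```sh
# Hypothetical model of the three-tier policy; not zot's own logic.
# $1: group (or "anonymous"), $2: action. Returns 0 if allowed.
can() {
  case "$1:$2" in
    anonymous:read) return 0 ;;
    artifact-workloads:read|artifact-workloads:create) return 0 ;;
    admins:read|admins:create|admins:update|admins:delete) return 0 ;;
    *) return 1 ;;
  esac
}

for check in anonymous:read anonymous:create artifact-workloads:create \
             artifact-workloads:update admins:delete; do
  group=${check%%:*}
  action=${check#*:}
  if can "$group" "$action"; then
    echo "$group may $action"
  else
    echo "$group may NOT $action"
  fi
done
```

The point of the middle tier is visible in the output: CI can create new tags but cannot `update` or `delete`, so it cannot overwrite an existing image.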
## Manual Deployment Steps (after merge)
1. Deploy Authentik blueprint: `argocd app sync authentik`
2. In Authentik admin UI: set a password for the `zot-ci` service account
3. Deploy zot config: `mise run provision-indri -- --tags zot`
4. Log in to `https://registry.ops.eblu.me` as `zot-ci` via OIDC → generate API key
5. Store API key in 1Password as `zot-ci-apikey` in blumeops vault
6. Sync Forgejo secrets: `mise run provision-indri -- --tags forgejo_actions_secrets`
7. Trigger a test container build to verify CI push
8. Verify anonymous pull: `curl -sf https://registry.ops.eblu.me/v2/_catalog`
## Uncertainties
- **Zot `accessControl` group matching with OIDC:** Groups from Authentik's `profile` scope claim should map to zot policy groups, but the exact claim-to-group matching needs runtime verification
- **`http.auth.apikey: true`:** This config key is documented but needs verification against the specific zot version built from source on indri
- **API key permissions:** Need to confirm zot API keys inherit the generating user's group for accessControl evaluation
## Test Plan
- [ ] `mise run provision-indri -- --check --diff --tags zot` shows expected config changes
- [ ] Anonymous pull works after deploy
- [ ] Unauthenticated push fails (401)
- [ ] OIDC browser login redirects to Authentik and back
- [ ] API key push works after key generation
- [ ] CI push succeeds with both Dagger and skopeo paths
- [ ] `mise run services-check` passes
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/237
## Summary
- Add Authentik blueprint (`zot.yaml`) with OAuth2 provider, application, `artifact-workloads` group, and `zot-ci` service account
- Wire `zot-client-secret` through ExternalSecret → worker Deployment env var → blueprint `!Env`
- Add Ansible pre_task to fetch OIDC secret from 1Password (item ID `oor7os5kapczgpbwv7obkca4y4`)
- Add `oidc-credentials.json.j2` template and deploy task in zot role (with `when` guard)
## Manual Steps Required Before Deploy
1. Generate client secret: `openssl rand -hex 32`
2. Store in 1Password: add field `zot-client-secret` to "Authentik (blumeops)" item in vault `blumeops`
## What This Does NOT Do
- Does NOT modify `config.json.j2` (that's the root goal `harden-zot-registry`)
- Does NOT wire CI auth (that's `wire-ci-registry-auth`)
- Does NOT set service account password or API keys (manual post-deploy)
## Verification
After ArgoCD sync:
- [ ] Authentik admin UI shows "Zot Registry" application
- [ ] OIDC discovery at `https://authentik.ops.eblu.me/application/o/zot/.well-known/openid-configuration` returns valid JSON
- [ ] Blueprint status is `successful`
- [ ] `artifact-workloads` group exists with `zot-ci` service account
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/236
## Summary
- Replace git-tag-triggered container builds with path-based triggers on main and workflow_dispatch
- Image tags now encode upstream app version + commit SHA (`vX.Y.Z-<sha>`) for full traceability
- Replace `container-tag-and-release` task with `container-build-and-release` (dispatches workflows via Forgejo API)
- Update the Dagger `publish()` function to accept a `commit_sha` parameter
- Update all docs and references to the new workflow
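A minimal sketch of how a `vX.Y.Z-<sha>` tag might be assembled is shown below. The `image_tag` helper is hypothetical, and truncating the commit SHA to 7 characters is an assumption; the source does not state the SHA length used.

```sh
# Hypothetical helper: build an image tag from an upstream version and a commit SHA.
# $1: upstream app version (with or without a leading "v"), $2: full commit SHA.
image_tag() {
  version=$1
  sha=$2
  # Assumption: a 7-character short SHA.
  short=$(printf '%s' "$sha" | cut -c1-7)
  # Strip any leading "v" so the result always has exactly one.
  printf 'v%s-%s\n' "${version#v}" "$short"
}

image_tag 1.2.3 0123456789abcdef0123456789abcdef01234567
```

The call prints `v1.2.3-0123456`; passing `v1.2.3` as the version gives the same result.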
## Deployment and Testing
- [ ] Merge to main
- [ ] `mise run container-build-and-release <name>` for each container to populate new-format tags
- [ ] Verify tags in registry via `mise run container-list`
- [ ] Existing images untouched — old tags remain available
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/232
## Summary
- Enable OIDC + API key authentication on zot with anonymous pull preserved
- Enforce tag immutability for version tags
- Adopt commit-SHA-based container image tagging
Details in the [[harden-zot-registry]] Mikado chain (`mise run docs-mikado harden-zot-registry`).
## Test plan
- [ ] Anonymous pull still works
- [ ] Unauthenticated push fails (401)
- [ ] CI container builds pass with new auth and tagging
- [ ] `mise run services-check` passes
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/231
## Summary
- New Mikado card: `adopt-commit-based-container-tags` — replaces git-tag-triggered container builds with path-based main-branch triggers and manual workflow dispatch
- Image tags become `vX.Y.Z-<sha>` (with `-main` suffix for main branch builds, `-nix` for Nix builds), tying versions to the actual bundled app version and exact source commit
- `container-tag-and-release` mise task to be renamed to `container-build-and-release`, triggering workflow dispatch with the current HEAD SHA
- Added as soft prereq to `harden-zot-registry` Mikado chain
## Test plan
- [x] Pre-commit hooks pass (docs-check-index, docs-check-links, etc.)
- [ ] Review card content for completeness
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/230
## Summary
- Replace the old pre-Mikado plan doc (`docs/how-to/plans/harden-zot-registry.md`) with a proper C2 Mikado chain in `docs/how-to/zot/`
- Root goal: `harden-zot-registry` — enable OIDC + API key auth on zot with anonymous pull preserved
- Three leaf prereqs: `register-zot-oidc-client`, `wire-ci-registry-auth`, `enforce-tag-immutability`
- Add Zot section to `how-to.md` index, remove plan entry from plans index
- All doc checks pass (`docs-check-links`, `docs-check-index`, `docs-mikado`)
## Changes
- **New:** `docs/how-to/zot/harden-zot-registry.md` — C2 Mikado root goal
- **New:** `docs/how-to/zot/register-zot-oidc-client.md` — Register OIDC client in Authentik
- **New:** `docs/how-to/zot/wire-ci-registry-auth.md` — Wire CI push paths with registry auth
- **New:** `docs/how-to/zot/enforce-tag-immutability.md` — Prevent version tag overwrites
- **Deleted:** `docs/how-to/plans/harden-zot-registry.md` — Old plan doc (content absorbed into Mikado cards)
- **Updated:** `docs/how-to/how-to.md` — Add Zot section, remove plan entry
- **Updated:** `docs/how-to/plans/plans.md` — Remove plan entry
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/229
## Summary
C2 Mikado chain for deploying Authentik as the SSO identity provider, replacing Dex.
This PR will evolve over multiple sessions. Each iteration adds documentation (prerequisite cards) and eventually code as leaf nodes are resolved.
## Current Mikado State
- **Goal:** `deploy-authentik` (active)
- **Leaf prerequisites:**
- `build-authentik-container` — Build Nix container image
- `provision-authentik-database` — Create PostgreSQL database on CNPG cluster
- `create-authentik-secrets` — Create 1Password item with credentials
## Process refinements
- Updated agent-change-process with lessons from first attempt: reset code before committing cards, open PRs early
## Test plan
- [ ] `mise run docs-mikado` shows correct dependency chain
- [ ] Leaf nodes can be worked independently
- [ ] Container builds on ringtail
- [ ] Authentik starts and reaches healthy state
- [ ] Forgejo OAuth2 connector works
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/227
## Summary
- Strip detailed phase instructions from deploy-authentik plan (400→50 lines)
- Retain architecture decisions (ringtail, CNPG on indri, Nix containers, kustomize, Tailscale+Caddy) and open questions
- Add `status: active` frontmatter — now visible as a root goal in `mise run docs-mikado`
- Update plans index to reflect Active (C2) status
This is the first real use of the C2 Mikado chain system from #225. Future sessions will discover prerequisites, create sub-cards with `requires`, and work leaf nodes first.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/226
## Summary
- New plan document at `docs/how-to/plans/deploy-authentik.md` covering full Authentik deployment
- 6 phases: Helm analysis, prerequisites (CNPG/Redis/1Password), Nix containers, kustomize manifests, networking, monitoring
- Authentik replaces Dex as the identity provider for central user management and multi-protocol SSO
- Updated plans index and how-to index
## Deployment and Testing
- [x] Pre-commit hooks pass (docs-check-links, docs-check-index, docs-check-frontmatter)
- [ ] Review plan content for accuracy and completeness
- No deployment needed — documentation only
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/224
## Summary
- Create Dex reference card (`docs/reference/services/dex.md`) with quick reference, architecture, identity source, storage, OIDC clients, secrets, and endpoints
- Write federated login explanation article (`docs/explanation/federated-login.md`) covering the Dex + Forgejo two-layer auth model, login flow, and break-glass access
- Add Dex to `services-check` (HTTP health endpoint + k3s pod check)
- Update Grafana docs with new Authentication section documenting SSO via Dex
- Update Forgejo docs with OAuth2 Provider section documenting its role as upstream identity source
- Add Dex to ringtail workloads table and reference service index
- Move `adopt-oidc-provider` plan to `completed/` with final design reflecting actual implementation
## Test plan
- [ ] `mise run services-check` passes (includes new Dex checks)
- [ ] `docs-check-links` passes (all wiki-links resolve)
- [ ] `docs-check-index` passes (new docs are indexed)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/223
## Summary
- Synced driveway_entrance zone coordinates from live Frigate config (adjusted mask boundaries)
- Added `inertia: 3` and `loitering_time: 0` to driveway_entrance zone
- Expanded review alerts to require either `driveway_entrance` or `driveway` zone (was entrance only)
- Updated frigate-notify config to allow alerts from both `driveway_entrance` and `driveway` zones
## Deployment and Testing
- [ ] Merge and sync frigate ArgoCD app on ringtail
- [ ] Sync frigate-notify (restart pod to pick up ConfigMap change)
- [ ] Verify alerts fire for person/car in driveway zone
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/219
## Summary
- Delete `ansible/roles/frigate_detector/` and remove from indri playbook — the Apple Silicon Detector is retired
- Move Mosquitto (MQTT) ArgoCD app from indri minikube to ringtail k3s
- Move ntfy ArgoCD app from indri minikube to ringtail k3s
- Update Frigate docs to reflect detector removal and planned RTX 4080 migration
- Manifests are reused as-is (same `argocd/manifests/mosquitto/` and `argocd/manifests/ntfy/`), just pointed at ringtail
## Deployment
After merge:
1. Sync indri ArgoCD `apps` app with prune to remove old mosquitto/ntfy apps:
```
argocd app sync apps --prune
```
2. Sync new ringtail apps:
```
argocd app sync mosquitto-ringtail
argocd app sync ntfy-ringtail
```
3. Manually clean up the detector LaunchAgent on indri:
```
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.frigate-detector.plist'
ssh indri 'rm ~/Library/LaunchAgents/mcquack.eblume.frigate-detector.plist'
```
## Notes
- Frigate on indri will lose MQTT/ntfy connectivity — this is expected (user confirmed no downtime concerns)
- ntfy Tailscale Ingress hostname `ntfy` will transfer from indri ProxyGroup to ringtail ProxyGroup
- Caddy on indri proxies `ntfy.ops.eblu.me` → `ntfy.tail8d86e.ts.net`, so no Caddy changes needed
- Frigate + frigate-notify will be ported to ringtail in a follow-up PR
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/216
## Summary
- Add `containers/nettest/default.nix` using `dockerTools.buildLayeredImage` with curl, jq, dnsutils, cacert, and bash — equivalent to the existing Dockerfile
- Update `container-tag-and-release` to require `--nix` or `--dockerfile` flag when both build types exist for a container
- Update `container-list` to show `[dockerfile+nix]` label when both exist
## Deployment and Testing
- [ ] SSH to ringtail, run `nix build -f containers/nettest/default.nix -o result` to verify the nix expression builds
- [ ] Tag `nettest-nix-v1.0.0`, confirm `build-container-nix` workflow runs on `nix-container-builder` runner and pushes to registry
- [ ] Smoke test on ringtail k3s: `kubectl run nettest --image=registry.ops.eblu.me/blumeops/nettest:v1.0.0 --restart=Never && kubectl logs nettest`
- [ ] Verify `mise run container-list` shows `[dockerfile+nix]` for nettest
- [ ] Verify `mise run container-tag-and-release nettest v1.1.0` prompts for build type
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/214
## Summary
- Replace `changed_when: true` with `register` + output inspection on the two 1Password secret tasks in `ringtail.yml`
- Tasks now correctly report `ok` when the secret content hasn't changed, and `changed` only when `kubectl apply` outputs `configured` or `created`
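The output inspection amounts to a substring check on what `kubectl apply` prints. The shell sketch below shows the same decision outside Ansible; `apply_changed` is a hypothetical name, and the sample lines follow the usual `kubectl apply` wording of `created`, `configured`, or `unchanged`.

```sh
# Hypothetical helper mirroring the changed/ok decision.
# $1: captured stdout of `kubectl apply`. Returns 0 if something changed.
apply_changed() {
  case "$1" in
    *configured*|*created*) return 0 ;;
    *) return 1 ;;
  esac
}

for out in "secret/op-credentials created" \
           "secret/op-credentials configured" \
           "secret/op-credentials unchanged"; do
  if apply_changed "$out"; then
    echo "changed: $out"
  else
    echo "ok: $out"
  fi
done
```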
## Test plan
- [ ] Run `mise run provision-ringtail` twice — second run should show both tasks as `ok` not `changed`
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/213
## Summary
- Adds `inhibit_idle fullscreen` window commands to sway config on ringtail
- Covers both Wayland-native (`app_id`) and XWayland (`class`) windows
- Prevents swayidle from locking the screen during gamepad-only gaming sessions where controller input isn't detected by the Wayland idle tracker
## Notes
This is a blanket fullscreen inhibit. A more targeted approach (daemon monitoring `/dev/input` gamepad events) may be desired later to allow idle lock during long-running fullscreen apps like Factorio.
## Deployment and Testing
- [ ] `mise run provision-ringtail` to deploy
- [ ] Run a fullscreen app and verify swayidle doesn't lock after 15 minutes
- [ ] Verify lock still activates when no fullscreen window is present
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/212
## Summary
- Configure **swayidle** to lock screen (swaylock) after 15 minutes of inactivity
- Turn off display (DPMS) after 60 minutes, auto-restore on activity
- **swaylock** themed with Catppuccin Macchiato to match existing Sway config
- Add `Mod4+l` keybinding for manual screen lock
- Add PAM service for swaylock authentication
- Disable system suspend/hibernate entirely (workstation should never sleep)
## What changes
All changes in `nixos/ringtail/configuration.nix`:
- `security.pam.services.swaylock` — required for swaylock to authenticate on NixOS
- `systemd.sleep.extraConfig` — blocks all sleep/hibernate modes
- `programs.swaylock` (home-manager) — lock screen appearance config
- `services.swayidle` (home-manager) — idle timeout daemon with lock + DPMS events
- New keybinding `Mod4+l` for manual lock
## Deployment and Testing
- [ ] `mise run provision-ringtail`
- [ ] Verify swayidle is running: `systemctl --user status swayidle`
- [ ] Test manual lock with `Super+l`
- [ ] Verify display DPMS off after idle (can lower timeout temporarily to test)
- [ ] Confirm machine does not suspend: `systemctl status sleep.target`
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/211
## Summary
Extends ringtail from a desktop/gaming NixOS box into an infrastructure node with a k3s cluster, secrets management, and a Forgejo Actions runner for building containers with Nix.
### K3s cluster
- Single-node k3s with Traefik/ServiceLB/metrics-server disabled (minimal footprint)
- TLS SAN set to `ringtail.tail8d86e.ts.net` so ArgoCD on indri can manage it via Tailscale
- Containerd registry mirrors pull through Zot on indri (`k3s-registries.yaml`)
- Tailscale interface added to `trustedInterfaces` for cross-node ArgoCD access
- `kubectl` added to system packages
### 1Password Connect + External Secrets Operator
- Four new ArgoCD apps targeting `k3s-ringtail`: `1password-connect-ringtail`, `external-secrets-crds-ringtail`, `external-secrets-ringtail`, `external-secrets-config-ringtail`
- Reuses the same Helm charts/values as indri, just pointed at ringtail's k3s API server
- Bootstrap secrets (`op-credentials`, `onepassword-token`) provisioned by Ansible pre_tasks via `op read`, then applied to the `1password` namespace in post_tasks
### Systemd Forgejo Actions runner
- Native `services.gitea-actions-runner` with `forgejo-runner` package — no DinD, no k8s pod, runs directly on the NixOS host
- Label `nix-container-builder:host` — jobs execute on the host with `nix`, `skopeo`, `nodejs`, etc. in PATH
- Registration token fetched from 1Password (`Forgejo Secrets/runner_reg`) by Ansible and written to `/etc/forgejo-runner/token.env`
- Runner's dynamic user (`gitea-runner`) added to `nix.settings.trusted-users` for nix daemon access
### Nix container build workflow
- New `.forgejo/workflows/build-container-nix.yaml` triggers on `*-nix-v[0-9]*` tags (e.g. `nettest-nix-v1.0.0`)
- Builds with `nix build -f containers/<name>/default.nix`, pushes to Zot via `skopeo copy`
- Existing Dockerfile workflow guarded with `if: !contains(github.ref_name, '-nix-v')` to avoid double-triggering
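The routing between the two workflows can be pictured as a pattern match on the tag name. The sketch below is illustrative only: `build_type_for_tag` is a hypothetical helper, and the `*-v[0-9]*` pattern for Dockerfile builds is an assumption based on the `-v` tag format mentioned for those builds.

```sh
# Hypothetical helper: decide which workflow a tag would trigger.
# $1: git tag name.
build_type_for_tag() {
  case "$1" in
    # Must be tested first, since "-nix-v1" also contains "-v1".
    *-nix-v[0-9]*) echo nix ;;
    # Assumed Dockerfile tag pattern.
    *-v[0-9]*) echo dockerfile ;;
    *) echo none ;;
  esac
}

build_type_for_tag nettest-nix-v1.0.0
build_type_for_tag nettest-v1.1.0
```

The ordering of the patterns plays the same role as the `!contains(github.ref_name, '-nix-v')` guard: without it, a Nix tag would match both.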
### Mise task updates
- `container-tag-and-release` auto-detects `default.nix` vs `Dockerfile` and uses the appropriate tag format (`-nix-v` vs `-v`)
- `container-list` shows build type indicator (`[nix]` / `[dockerfile]`)
## Post-merge
1. `mise run provision-ringtail` — deploys k3s token, runner token, NixOS rebuild
2. Register k3s cluster in ArgoCD (first time only):
```fish
ssh ringtail 'sudo cat /etc/rancher/k3s/k3s.yaml' | \
sed 's|127.0.0.1|ringtail.tail8d86e.ts.net|' > /tmp/k3s-ringtail.yaml
set -x KUBECONFIG /tmp/k3s-ringtail.yaml
argocd cluster add default --name k3s-ringtail
```
3. Sync ArgoCD apps in order: `1password-connect-ringtail` -> `external-secrets-crds-ringtail` -> `external-secrets-ringtail` -> `external-secrets-config-ringtail`
4. Verify runner: `ssh ringtail 'systemctl status gitea-runner-nix-container-builder'`
5. Check the Forgejo admin panel for the `ringtail-nix-builder` runner being online
6. Test: create `containers/<name>/default.nix`, tag with `<name>-nix-v0.1.0`
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/209
## Summary
- Bump Frigate image from `0.16.4-standard-arm64` to `0.17.0-rc2-standard-arm64`
- Adapt `record` config to 0.17 schema: `retain.days`/`mode: all` → `continuous.days`
- Update service docs and version tracker
This is the first step toward the Apple Silicon ZMQ detector. The existing ONNX detector is kept so we can validate the upgrade independently.
## What is NOT changing
- Detector config (still `type: onnx` with YOLO-NAS-s)
- go2rtc streams, MQTT, cameras, zones, review rules
- frigate-notify, storage PVs, Grafana dashboard
## Deployment and Testing
- [ ] `argocd app set frigate --revision upgrade-frigate-0.17 && argocd app sync frigate`
- [ ] Pod starts, `/api/version` returns `0.17.0-rc2`
- [ ] No config errors in pod logs
- [ ] Frigate web UI loads at `https://nvr.ops.eblu.me`
- [ ] Live view works, detection running (`/api/stats` shows `detection_fps > 0`)
- [ ] Recordings being created (`/api/recordings/summary`)
- [ ] MQTT events flowing (check frigate-notify logs)
- [ ] After merge: `argocd app set frigate --revision main && argocd app sync frigate`
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/205
## Summary
- Cap detect FPS to 2 to prevent recording segment backlog from ONNX inference bottleneck (~750ms/frame on ARM64 CPU)
- Sync motion masks from live config (added second mask area)
- Update driveway_entrance zone coordinates from live config
- Add explicit alert labels `[person, car]` while keeping `required_zones: [driveway_entrance]`
## Context
The "No frames have been received" error on the gablecam live view was caused by the detect stream falling behind — ONNX YOLO-NAS-s takes ~750ms per inference on ARM64 CPU, but the sub-stream sends 5 FPS. This caused recording segments to pile up and the ffmpeg watchdog to repeatedly kill/restart the process, creating gaps in the live view.
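The backlog can be estimated with simple arithmetic. The sketch below assumes a worst case in which every frame on the detect stream needs one ~750 ms inference; `excess_per_min` is a hypothetical helper and the real load depends on how often detection actually runs.

```sh
# Frames per minute arriving beyond what the detector can process,
# assuming one inference per frame (a worst case).
# $1: detect FPS, $2: inference time in milliseconds.
excess_per_min() {
  fps=$1
  inference_ms=$2
  incoming=$(( fps * 60 ))
  capacity=$(( 60000 / inference_ms ))
  echo $(( incoming - capacity ))
}

excess_per_min 5 750   # sub-stream rate before the cap
excess_per_min 2 750   # capped rate
```

At 750 ms per inference the detector handles about 80 frames per minute, so in this worst case 5 FPS leaves 220 frames per minute unprocessed while 2 FPS leaves 40.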
## Test plan
- [ ] Sync ArgoCD `frigate` app to branch and verify pod restarts cleanly
- [ ] Check `/api/stats` — `skipped_fps` should drop significantly, `process_fps` should be close to 2
- [ ] Verify live view at https://nvr.ops.eblu.me/#gablecam no longer shows "No frames" error
- [ ] Verify detections and alerts still work in the driveway_entrance zone
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/204