Built from main in run #516 after #339 merged. Follows the navidrome
kustomization convention (deployment image = local ref + :kustomized,
kustomization override = newTag only).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
preview.quality was at the top level (invalid); moved under record
with a valid preset (very_low). Also fix services-check to catch
Grafana "Alerting (NoData)" state which was silently passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previews are ~4MB/hour at default quality (CRF 1), served over NFS from
sifaka. Reducing to CRF 8 shrinks preview files to improve review page
load times when scrubbing older footage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bump from RC to latest stable (security fixes for config endpoint and
cross-camera auth). Add new 0.17 motion retention tier at 365 days,
reduce continuous from 180 to 30 days.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move NVR, Jellyfin, and DJ to new Home group. Move Grafana from Content
to Infrastructure. Switch all layout groups from column to row style.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Increase retention: continuous 3→180d, detections 14→30d, alerts 30→730d.
Plenty of NFS headroom (~9.4 TiB free, ~6.6 GB/day for one camera).
Add frigate-recording check to services-check that verifies camera_fps > 0,
which would have caught the 6-day outage from the mqtt config removal.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Frigate's config schema requires an `mqtt` field even when MQTT isn't
used. Commit 40f1568 removed it along with Mosquitto, causing Frigate
to fail validation on startup. Add `mqtt.enabled: false` to satisfy
the schema without needing a broker.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mosquitto has been dormant since frigate-notify switched from MQTT to
webapi polling (529ba10). Tear down live infra (ArgoCD app, namespace)
and remove all manifests, service-versions entry, services-check, and
doc references.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Frigate 0.17 does not auto-create clips/previews/<camera>/, causing
review page previews to silently fail with 500 errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bare image references in manifests were ambiguous — unclear whether the
tag was intentionally omitted or managed by kustomize. Add :kustomized
sentinel to all 37 image refs overridden by kustomize images transformer.
Add sync notes for tailscale-operator proxyclass (CRD fields not processed
by kustomize). Mark devpi reviewed (6.19.1 is current).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The database was at /config/frigate.db (emptyDir, ephemeral) instead of
/db/frigate.db (PVC, persistent). Every pod restart wiped the database,
losing all recording history and leaving orphaned files on NFS.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ONNX detector + CUDA ffmpeg + workers consume ~1.9Gi at steady state,
causing intermittent OOMKills at the 2Gi limit.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Parked car was being re-detected every few minutes at night due to IR
illumination noise triggering motion detection. Restrict the driveway
zone to [person, dog, cat] so cars and birds no longer create events
there. Cars still alert via the driveway_entrance zone.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Move hardcoded image tags to kustomization.yaml `images:` transformer across **22 services** — image names in manifests become version-agnostic templates, with tags centralized in one place per service
- Replace hand-written ConfigMap manifests with `configMapGenerator:` in **12 services** — config data extracted to standalone files, generated ConfigMaps include content hashes that trigger automatic pod rollouts on changes
- Create new `kustomization.yaml` for **forgejo-runner** and **nvidia-device-plugin** (switches ArgoCD from directory mode to kustomize mode, rendered output identical)
### Services modified
**Images only (8):** cv, devpi, docs, kube-state-metrics, miniflux, navidrome, teslamate, torrent
**Images + configMapGenerator (10):** alloy-k8s, forgejo-runner, frigate, grafana, homepage, kiwix, loki, mosquitto, ntfy, prometheus
**Images only, no configMapGenerator (4):** authentik (skip blueprints — special YAML tags), tailscale-operator-base (Deployment only, CRD image fields left as-is)
**Skipped entirely (6):** argocd (remote upstream), databases (no image fields), external-secrets, grafana-config (cross-kustomization dashboards), immich (Helm-managed), 1password-connect/cloudnative-pg (no kustomization.yaml)
### What changes at deploy time
- **images:** — no functional diff, `kustomize build` produces identical output with tags
- **configMapGenerator:** — ConfigMap names gain hash suffixes (e.g., `prometheus-config` → `prometheus-config-6f42fhctcb`) and all Deployment/StatefulSet/DaemonSet references are updated automatically. Pods will restart once per service on first sync due to the name change
## Test plan
- [x] `kubectl kustomize` builds all 30 service directories successfully
- [x] Image tags verified in rendered output for all modified services
- [x] ConfigMap hash suffixes verified in rendered output
- [x] ConfigMap references in Deployments/StatefulSets confirmed to use hashed names
- [x] All pre-commit hooks pass (yamllint, shellcheck, prettier, etc.)
- [ ] `argocd app diff` each service to confirm only expected ConfigMap name changes
- [ ] Deploy from branch starting with a low-risk service (e.g., mosquitto)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/264
## Summary
- Switch from MQTT to webapi polling (v0.5.4 requires only one method)
- Poll every 15s for responsive alerts
- **`notify_once: true`** — one notification per event instead of repeats as object changes zones
- **`nosnap: drop`** — skip events without snapshots (was causing all events to be dropped on v0.3.5)
- **`snap_hires: true`** — use recording stream for higher quality snapshot images
## Deployment and Testing
- [ ] Sync: `argocd app set frigate --revision fix/frigate-notify-config && argocd app sync frigate`
- [ ] Verify pod starts: `kubectl --context=k3s-ringtail -n frigate get pods -l app=frigate-notify`
- [ ] Check logs for successful startup and event processing (no "No snapshot" drops)
- [ ] Wait for a motion event and confirm single ntfy notification with hi-res snapshot
- [ ] After merge: `argocd app set frigate --revision main && argocd app sync frigate`
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/242
## Summary
- Synced driveway_entrance zone coordinates from live Frigate config (adjusted mask boundaries)
- Added `inertia: 3` and `loitering_time: 0` to driveway_entrance zone
- Expanded review alerts to require either `driveway_entrance` or `driveway` zone (was entrance only)
- Updated frigate-notify config to allow alerts from both `driveway_entrance` and `driveway` zones
## Deployment and Testing
- [ ] Merge and sync frigate ArgoCD app on ringtail
- [ ] Sync frigate-notify (restart pod to pick up ConfigMap change)
- [ ] Verify alerts fire for person/car in driveway zone
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/219
## Summary
- Bump Frigate image from `0.16.4-standard-arm64` to `0.17.0-rc2-standard-arm64`
- Adapt `record` config to 0.17 schema: `retain.days`/`mode: all` → `continuous.days`
- Update service docs and version tracker
This is the first step toward the Apple Silicon ZMQ detector. The existing ONNX detector is kept so we can validate the upgrade independently.
## What is NOT changing
- Detector config (still `type: onnx` with YOLO-NAS-s)
- go2rtc streams, MQTT, cameras, zones, review rules
- frigate-notify, storage PVs, Grafana dashboard
## Deployment and Testing
- [ ] `argocd app set frigate --revision upgrade-frigate-0.17 && argocd app sync frigate`
- [ ] Pod starts, `/api/version` returns `0.17.0-rc2`
- [ ] No config errors in pod logs
- [ ] Frigate web UI loads at `https://nvr.ops.eblu.me`
- [ ] Live view works, detection running (`/api/stats` shows `detection_fps > 0`)
- [ ] Recordings being created (`/api/recordings/summary`)
- [ ] MQTT events flowing (check frigate-notify logs)
- [ ] After merge: `argocd app set frigate --revision main && argocd app sync frigate`
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/205
## Summary
- Cap detect FPS to 2 to prevent recording segment backlog from ONNX inference bottleneck (~750ms/frame on ARM64 CPU)
- Sync motion masks from live config (added second mask area)
- Update driveway_entrance zone coordinates from live config
- Add explicit alert labels `[person, car]` while keeping `required_zones: [driveway_entrance]`
## Context
The "No frames have been received" error on the gablecam live view was caused by the detect stream falling behind — ONNX YOLO-NAS-s takes ~750ms per inference on ARM64 CPU, but the sub-stream sends 5 FPS. This caused recording segments to pile up and the ffmpeg watchdog to repeatedly kill/restart the process, creating gaps in the live view.
## Test plan
- [ ] Sync ArgoCD `frigate` app to branch and verify pod restarts cleanly
- [ ] Check `/api/stats` — `skipped_fps` should drop significantly, `process_fps` should be close to 2
- [ ] Verify live view at https://nvr.ops.eblu.me/#gablecam no longer shows "No frames" error
- [ ] Verify detections and alerts still work in the driveway_entrance zone
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/204
## Summary
- Remove car-specific `max_frames: 150` which was causing a forget-and-re-detect loop on parked cars (every ~30 seconds at 5fps)
- Set `stationary.interval: 0` so Frigate never re-runs detection on stationary objects
- Replace read-only configmap subPath mount with initContainer + emptyDir, so Frigate UI changes (zones, masks) persist at runtime
## Context
Frigate was spamming notifications because `max_frames` for cars caused it to "forget" a parked car after 150 frames, then immediately re-detect it as a brand new object. The fix follows [Frigate's official parked cars guide](https://docs.frigate.video/guides/parked_cars/).
The writable config change also unblocks using `required_zones` for car alerts — zones can now be drawn in the Frigate UI and will survive until pod reschedule (at which point they should be baked into the configmap via IaC).
## Test plan
- [ ] Sync frigate app via ArgoCD and verify pod starts with initContainer
- [ ] Confirm parked cars no longer trigger repeated alerts
- [ ] Draw a zone/mask in Frigate UI, save, verify it persists after Frigate restart
- [ ] Set up `driveway_entrance` required zone for car alerts
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/193