The Go type for headers is []map[string]string, so the YAML entry
must be a map (- Key: "value") not a quoted string (- "Key: value").
The string format silently failed unmarshaling, causing the default
"View Clip" button to always appear instead of custom actions.
Also fix the camera URL path (add / before the # fragment).
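
A minimal sketch of the type mismatch, in Python rather than Go. It is an illustration only and not frigate-notify's code: `parse_headers` is a hypothetical helper that enforces the same shape as `[]map[string]string`. Unlike the silent failure described above, it raises so the mismatch is visible.

```python
from typing import Any


def parse_headers(entries: list[Any]) -> list[dict[str, str]]:
    """Accept only a list of string-to-string maps (hypothetical helper)."""
    headers: list[dict[str, str]] = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise TypeError(
                f"headers[{i}]: expected a map, got {type(entry).__name__}"
            )
        headers.append(entry)
    return headers


# What a YAML parser yields for `- Key: "value"`: a one-entry map.
good = [{"Key": "value"}]
# What it yields for `- "Key: value"`: a plain string.
bad = ["Key: value"]

print(parse_headers(good))
try:
    parse_headers(bad)
except TypeError as err:
    print("rejected:", err)
```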
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
View Clip linked to the raw H.265 MP4, which doesn't play in browsers.
Open Event links to Frigate's review page (built-in player handles
transcoding), Open Camera links to the live camera view.
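
A rough sketch of how the two buttons could be expressed in ntfy's short action format (`view, <label>, <url>`, with actions separated by `;`). The URLs below are placeholders, not the real Frigate paths, and `view_action` is a hypothetical helper.

```python
def view_action(label: str, url: str) -> str:
    """Build one ntfy 'view' action in the short comma-separated format."""
    # The short format splits on ',' and ';', so labels must not contain them.
    if "," in label or ";" in label:
        raise ValueError("label must not contain ',' or ';'")
    return f"view, {label}, {url}"


# Placeholder URLs (assumptions for illustration only).
actions = [
    view_action("Open Event", "https://nvr.example/event/123"),
    view_action("Open Camera", "https://nvr.example/#gablecam"),
]
x_actions_header = "; ".join(actions)
print(x_actions_header)
```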
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Uses frigate-notify's EventLink template variable with ntfy's
X-Actions header to link to the Frigate event page, which has
a built-in player that handles H.265 transcoding.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Clip/snapshot links in notifications were using the internal
cluster URL (frigate:5000). Set public_url to nvr.ops.eblu.me
so links work from phones.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
frigate-notify sends detection snapshots as attachments, which
requires ntfy to have attachment support configured.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use MQTT-only event collection (disable webapi), fix ntfy alert
config nesting to match frigate-notify's expected format.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ONNX detector was crashing due to missing model path config.
CPU/TFLite works out of the box on ARM64 and is sufficient for
single-camera detection of large objects.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Configures ntfy to forward poll requests through ntfy.sh for APNs
delivery. Without this, iOS delays notifications by 20-30+ minutes.
Free tier allows 250 messages/day (no account needed).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace Misc group with Infrastructure and Services in the homepage
layout configuration to match the reorganized ingress annotations.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add four new services for cloud-free camera recording and alerting:
- Mosquitto MQTT broker (shared service in mqtt namespace)
- Ntfy push notifications (tailnet-accessible)
- Frigate NVR with GableCam via HTTP-FLV, ONNX detection, NFS recordings
- frigate-notify bridging detection events to Ntfy
Also adds Prometheus scrape target, Grafana dashboard, and Caddy
reverse proxy entries for nvr.ops.eblu.me and ntfy.ops.eblu.me.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Chart 2.3.0 mounts credentials as a file with standard k8s base64
encoding. The old double-encoding workaround (credentials-base64 in
stringData) now produces invalid JSON. Use raw JSON (credentials-file)
instead.
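
A small sketch of why the pre-encoded value breaks once the chart mounts the Secret as a file. It assumes the mounted file is decoded exactly once (standard Kubernetes behavior); the credential contents and helper names are made up for illustration.

```python
import base64
import json

# Hypothetical credentials document.
creds = json.dumps({"user": "example", "token": "not-a-real-token"})


def k8s_store(string_data: str) -> bytes:
    """Kubernetes base64-encodes stringData values when storing the Secret."""
    return base64.b64encode(string_data.encode())


def mount_as_file(stored: bytes) -> str:
    """A Secret mounted as a file contains the value decoded once."""
    return base64.b64decode(stored).decode()


def is_json(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Raw JSON in stringData: the mounted file is the JSON itself.
raw_file = mount_as_file(k8s_store(creds))
# Pre-encoded value in stringData: the mounted file is still base64 text.
pre_encoded = base64.b64encode(creds.encode()).decode()
double_file = mount_as_file(k8s_store(pre_encoded))

print(is_json(raw_file), is_json(double_file))
```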
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Add `daemon.json` with `registry-mirrors` to the forgejo-runner ConfigMap, pointing DinD at `http://host.minikube.internal:5050`
- Mount `daemon.json` into the DinD sidecar at `/etc/docker/daemon.json` via `subPath`
- Docker Hub pulls during Dagger CI builds will now route through Zot's pull-through cache, reducing bandwidth and avoiding rate limits
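
For reference, a sketch that renders the `daemon.json` content described above. Only the `registry-mirrors` key and the mirror URL come from this PR; the variable names are illustrative.

```python
import json

# Mirror URL as stated in the summary above.
mirror_url = "http://host.minikube.internal:5050"

daemon_config = {"registry-mirrors": [mirror_url]}
daemon_json = json.dumps(daemon_config, indent=2)
print(daemon_json)
```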
## Deployment and Testing
- [ ] `argocd app sync forgejo-runner`
- [ ] Exec into DinD container: `docker info` should show the registry mirror
- [ ] Trigger a workflow build and check Zot logs for cache hits
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/183
The v3.2.0 build failed (GitHub download timeout). Roll back to the
working image while it rebuilds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Move FORGEJO_URL, RUNNER_NAME, and RUNNER_LABELS from ExternalSecret template to deployment env vars
- ExternalSecret now only contains the actual secret (RUNNER_TOKEN)
- Image version changes in RUNNER_LABELS now trigger automatic pod rollouts
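
A simplified sketch of why this works: a Deployment rolls its pods when the pod template changes, and an env var value lives inside the template while a Secret's contents do not. The hash below only mimics that idea and is not Kubernetes' actual pod-template-hash algorithm. The label values and Secret name are hypothetical.

```python
import hashlib
import json


def template_hash(pod_template: dict) -> str:
    """Stable digest of a pod template (stand-in for a rollout trigger)."""
    encoded = json.dumps(pod_template, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()[:10]


def pod_template(runner_labels: str) -> dict:
    return {
        "containers": [{
            "name": "runner",
            # Plain env var: its value is part of the template.
            "env": [{"name": "RUNNER_LABELS", "value": runner_labels}],
            # Secret reference: only the name is part of the template.
            "envFrom": [{"secretRef": {"name": "runner-secret"}}],
        }]
    }


old = template_hash(pod_template("docker:docker://runner:v1.0.0"))
new = template_hash(pod_template("docker:docker://runner:v2.0.0"))
# Editing the Secret's contents leaves the template untouched.
same = template_hash(pod_template("docker:docker://runner:v1.0.0"))

print(old != new, old == same)
```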
## Deployment
1. Merge this PR
2. `argocd app sync forgejo-runner` — the deployment spec change will auto-roll the pod
No manual restart needed — that's the whole point :)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/181
Replace old Apps/Observability/Infrastructure layout entries with
Content and Misc to match the recategorized ingress annotations.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ArgoCD's tailscale ingress was missed in the recategorization (filed as
service-tailscale.yaml instead of ingress-tailscale.yaml). Fix the group
annotation and rename the file to match the convention used by all other
services.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Replace the three homepage groups (Apps, Observability, Infrastructure) with two cleaner groups
- **Content**: Immich, Kiwix, Miniflux, DJ, Grafana
- **Misc**: CV, TeslaMate, Transmission, Docs, Prometheus, PyPI
## Deployment and Testing
- [ ] Sync affected ingresses via ArgoCD (all 11 services)
- [ ] Verify homepage shows the two new groups correctly
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/179
## Summary
- Remove `match_all = true` from `flyio_nginx_cache_requests_total` in Alloy so the metric only counts requests that go through the proxy cache (excludes health checks with empty `cache_status`)
- Change dashboard queries from `rate(...[5m])` to `increase(...[$__range])` — aggregates over the full dashboard time window instead of a 5-minute sliding window, giving meaningful ratios for low-traffic static sites
- Add null/NaN value mapping to show "No traffic" in neutral color instead of blank/red
## Root cause
Health check requests from Fly.io hit the default nginx server block (no `proxy_cache`), producing entries with empty `upstream_cache_status`. With `match_all = true`, these were counted in the cache metric, diluting the Fly.io dashboard ratio. For APM dashboards, `rate(...[5m])` on low-traffic sites with 24h cache validity almost always returns either all-HITs (100%) or no data (blank → red background).
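
A toy model of the dilution and the "No traffic" mapping, following the description above. The sample window is invented, and `cache_hit_ratio` is a hypothetical helper, not the Alloy or Grafana logic itself.

```python
import math


def cache_hit_ratio(statuses: list[str], match_all: bool) -> float:
    """Hit ratio over a window; empty statuses count only if match_all."""
    counted = [s for s in statuses if match_all or s != ""]
    if not counted:
        return math.nan  # no traffic in the window
    return sum(s == "HIT" for s in counted) / len(counted)


def display(ratio: float) -> str:
    """Map NaN to a neutral label instead of a blank panel."""
    return "No traffic" if math.isnan(ratio) else f"{ratio:.0%}"


# Invented window: one HIT, one MISS, four health checks (empty status).
window = ["HIT", "MISS", "", "", "", ""]

print(display(cache_hit_ratio(window, match_all=True)))   # diluted
print(display(cache_hit_ratio(window, match_all=False)))  # proxied only
print(display(cache_hit_ratio([], match_all=False)))      # idle site
```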
## Deployment
- Fly.io proxy redeploy needed for Alloy config change
- ArgoCD sync for dashboard ConfigMap changes
## Test plan
- [ ] Redeploy Fly.io proxy
- [ ] Sync grafana-config in ArgoCD
- [ ] Verify CV APM cache hit ratio shows a real percentage (not 100%)
- [ ] Verify Docs APM shows "No traffic" in neutral color when idle, real ratio when visited
- [ ] Verify Fly.io proxy dashboard cache ratio excludes health checks
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/177
## Summary
- nginx container (`containers/cv/`) downloads and serves a content tarball at startup (same pattern as quartz)
- ArgoCD app + k8s manifests (deployment, service, Tailscale ingress)
- Caddy route for `cv.ops.eblu.me`
- Deploy workflow: resolves "latest" or specific version from Forgejo packages, updates deployment, syncs ArgoCD
- Content is built and released from the separate [cv repo](https://forge.ops.eblu.me/eblume/cv)
## Deployment steps (after merge)
1. `mise run container-tag-and-release cv v1.0.0`
2. Run "Release CV" workflow in cv repo (SPECIFIC_VERSION `v0.1.0`)
3. Run "Deploy CV" workflow in blumeops (default: latest)
4. `mise run provision-indri -- --tags caddy`
5. Verify at `https://cv.ops.eblu.me/`
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/169
## Summary
With Phases 1 and 2 complete, the runner image no longer needs most of its bundled tools. This PR strips it down and adds what was missing.
**Removed** (now inside Dagger containers):
- Node.js 24.x
- Docker CLI + buildx plugin
- skopeo
- gnupg, lsb-release, xz-utils
**Added:**
- `tzdata` — fixes the TZ env var (#159, #160, #161) so `TZ=America/Los_Angeles` actually works
- `flyctl` — was being installed from scratch every release
**Workflow changes:**
- Remove "Ensure Dagger CLI" bootstrap steps from both workflows (Dagger is in the image)
- Remove "Install flyctl" step from build-blumeops (flyctl is in the image)
- Remove job-level `TZ` from build-blumeops (moved to runner configmap `runner.envs`)
- Set `TZ: America/Los_Angeles` in runner configmap so all job containers inherit it
## Deployment
After merge:
1. Build and release the new runner image: `mise run container-release forgejo-runner v2.0.0`
2. Sync the runner: `argocd app sync forgejo-runner`
3. Verify: `kubectl -n forgejo-runner exec deploy/forgejo-runner -c runner -- date` (but the real test is running a docs release and checking the changelog date)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/162
## Summary
The `TZ=America/Los_Angeles` env var from #159 has no effect because the `forgejo/runner` image doesn't ship tzdata. Mount the node's `/usr/share/zoneinfo` into the container so the timezone database is available.
## Deployment
After merge, sync forgejo-runner and verify:
```
argocd app sync forgejo-runner
kubectl -n forgejo-runner exec deploy/forgejo-runner -c runner -- date
# Should show PST/PDT, not UTC
```
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/160
## Summary
- Set `TZ=America/Los_Angeles` on the Forgejo runner container
The runner pod defaults to UTC. When releases are cut in the evening PST, towncrier stamps changelog entries with tomorrow's date (e.g., v1.6.2 shows 2026-02-12 despite being released on the evening of Feb 11 PST).
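
The date rollover can be sketched as follows. The 8 PM release time is an assumed example, and PST is modeled as a fixed UTC-8 offset so the snippet does not depend on a timezone database being installed.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for Pacific Standard Time (UTC-8).
PST = timezone(timedelta(hours=-8), "PST")

# Assumed release time: 8 PM on Feb 11, 2026, Pacific.
release = datetime(2026, 2, 11, 20, 0, tzinfo=PST)

local_date = release.date().isoformat()
utc_date = release.astimezone(timezone.utc).date().isoformat()

print("local:", local_date)
print("utc:  ", utc_date)
```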
## Deployment
After merge, sync the forgejo-runner ArgoCD app:
```
argocd app sync forgejo-runner
```
The runner pod will restart with the new timezone. Note: the v1.6.2 changelog entry will remain dated 2026-02-12. Future entries will use Pacific dates, so the next entry's date may appear out of order relative to v1.6.2.
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/159