Improve Frigate health checks to catch NFS and camera failures

Replace the single aggregate camera_fps check with per-camera FPS
validation and an NFS storage accessibility check. Motivated by an outage
where the Frigate API responded OK but the NFS mount was inaccessible,
causing "no frames" in the UI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Erich Blume 2026-03-22 09:55:53 -07:00
commit f1620abb17
2 changed files with 3 additions and 1 deletions

@@ -0,0 +1 @@
+Improve Frigate health checks in services-check: per-camera FPS validation and NFS storage accessibility check.

@@ -83,7 +83,8 @@ check_http "CV" "https://cv.ops.eblu.me/"
 check_http "Ntfy" "https://ntfy.ops.eblu.me/v1/health"
 check_http "Authentik" "https://authentik.ops.eblu.me/-/health/live/"
 check_http "Frigate" "https://nvr.ops.eblu.me/api/version"
-check_service "frigate-recording" "curl -sf --max-time 5 https://nvr.ops.eblu.me/api/stats | jq -e '.camera_fps > 0'"
+check_service "frigate-camera-fps" "curl -sf --max-time 5 https://nvr.ops.eblu.me/api/stats | jq -e '.cameras | to_entries | all(.value.camera_fps > 0)'"
+check_service "frigate-storage" "curl -sf --max-time 5 https://nvr.ops.eblu.me/api/stats | jq -e '.service.storage | to_entries | map(select(.key | startswith(\"/media\"))) | length > 0 and all(.[]; .value.free > 0)'"
 check_http "JobSync" "https://jobsync.ops.eblu.me/"
 echo ""
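
The two new jq filters can be tried offline against canned JSON, which may help when reviewing what each check does and does not catch. The following is a minimal sketch, not part of the commit: the payloads are hypothetical and contain only the fields the filters read (the camera names `front` and `back`, the paths, and the numbers are invented), and `run_check` is an illustrative helper rather than the script's own `check_service`. It assumes `jq` is installed.

```shell
#!/usr/bin/env bash
# Hypothetical /api/stats payloads, trimmed to the fields the filters read.
healthy='{"cameras":{"front":{"camera_fps":5.0},"back":{"camera_fps":5.1}},"service":{"storage":{"/media/frigate/recordings":{"free":1200.5},"/tmp/cache":{"free":900.0}}}}'
# One camera stalled at 0 fps, and no /media mount reported at all.
stalled='{"cameras":{"front":{"camera_fps":5.0},"back":{"camera_fps":0}},"service":{"storage":{"/tmp/cache":{"free":900.0}}}}'

# Filters copied from the diff (without the shell escaping needed there).
fps_filter='.cameras | to_entries | all(.value.camera_fps > 0)'
storage_filter='.service.storage | to_entries | map(select(.key | startswith("/media"))) | length > 0 and all(.[]; .value.free > 0)'

# Illustrative helper: prints "pass" when jq -e exits 0, "fail" otherwise.
# jq -e exits non-zero when the filter's last output is false or null.
run_check() {
  local payload="$1" filter="$2"
  if printf '%s' "$payload" | jq -e "$filter" >/dev/null 2>&1; then
    echo "pass"
  else
    echo "fail"
  fi
}

echo "healthy fps:     $(run_check "$healthy" "$fps_filter")"
echo "healthy storage: $(run_check "$healthy" "$storage_filter")"
echo "stalled fps:     $(run_check "$stalled" "$fps_filter")"
echo "stalled storage: $(run_check "$stalled" "$storage_filter")"
```

With these payloads both checks pass for `healthy` and both fail for `stalled`. The `length > 0` guard in the storage filter matters here: without it, `all` over an empty list would be true, so a payload with no `/media` entry would pass.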