blumeops/docs/how-to/alerts
Erich Blume 2e2a33d7ca C2(deploy-infra-alerting): close port-services-check-alerts
7 alert rules covering services-check probes:
- ServiceProbeFailure (11 HTTP probes via Alloy blackbox)
- PodNotReady (kube-state-metrics, both clusters)
- PostgresClusterUnhealthy (CNPG collector)
- TextfileStale (node_textfile_mtime_seconds)
- FrigateCameraDown (frigate_camera_fps)
- ArgoCDAppOutOfSync (argocd_app_info)

7 runbooks in docs/how-to/alerts/.

Not covered by a dedicated alert: local indri services (brew/launchctl),
ringtail SSH/tailscale, public Fly.io endpoints, k8s API health, Frigate
storage. Failures in these are expected to surface through the downstream
alerts above.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 14:23:42 -07:00

| File | Last commit | Date |
| --- | --- | --- |
| configure-grafana-alerting-pipeline.md | C2(deploy-infra-alerting): close configure-grafana-alerting-pipeline | 2026-03-22 10:51:02 -07:00 |
| deploy-infra-alerting.md | C2(deploy-infra-alerting): plan add alerting pipeline cards | 2026-03-22 10:28:31 -07:00 |
| first-alert-and-runbook.md | C2(deploy-infra-alerting): close first-alert-and-runbook | 2026-03-22 12:05:43 -07:00 |
| port-services-check-alerts.md | C2(deploy-infra-alerting): close port-services-check-alerts | 2026-03-22 14:23:42 -07:00 |
| refactor-services-check-to-query-alerts.md | C2(deploy-infra-alerting): plan add alerting pipeline cards | 2026-03-22 10:28:31 -07:00 |
| runbook-argocd-out-of-sync.md | C2(deploy-infra-alerting): impl add ArgoCD scrape and sync alert | 2026-03-22 13:45:34 -07:00 |
| runbook-frigate-camera-down.md | C2(deploy-infra-alerting): impl add textfile staleness and Frigate alerts | 2026-03-22 13:43:16 -07:00 |
| runbook-pod-not-ready.md | C2(deploy-infra-alerting): impl add probes and alert rules for services-check coverage | 2026-03-22 12:11:12 -07:00 |
| runbook-postgres-unhealthy.md | C2(deploy-infra-alerting): impl add probes and alert rules for services-check coverage | 2026-03-22 12:11:12 -07:00 |
| runbook-service-probe-failure.md | C2(deploy-infra-alerting): impl add first alert rule and runbook | 2026-03-22 10:57:23 -07:00 |
| runbook-textfile-stale.md | C2(deploy-infra-alerting): impl add textfile staleness and Frigate alerts | 2026-03-22 13:43:16 -07:00 |