C2: Deploy infrastructure alerting pipeline (#303)

## Summary

Mikado chain to replace `mise run services-check` with Grafana Unified Alerting backed by ntfy push notifications.

**Design:**
- Grafana Unified Alerting evaluates rules against Prometheus/Loki
- ntfy webhook contact point delivers iOS notifications
- Anti-noise policy: page once per 24h per alert group
- Every alert links to a runbook in `docs/how-to/alerts/`
- services-check eventually queries the alerting API instead of doing its own probes

**Chain (bottom-up):**
1. `configure-grafana-alerting-pipeline` — enable alerting, ntfy contact point, notification policy
2. `first-alert-and-runbook` — end-to-end proof of concept with blackbox probe failure
3. `port-services-check-alerts` — migrate all services-check probes to alert rules + runbooks
4. `refactor-services-check-to-query-alerts` — rewrite services-check to query Grafana API
5. `deploy-infra-alerting` — goal card

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #303
This commit is contained in:
Erich Blume 2026-03-22 14:52:56 -07:00
commit 6d65e6928c
20 changed files with 1259 additions and 46 deletions

View file

@ -169,6 +169,43 @@ prometheus.exporter.blackbox "services" {
address = "http://argocd-server.argocd.svc.cluster.local:80/healthz"
module = "http_2xx"
}
target {
name = "prometheus"
address = "http://prometheus.monitoring.svc.cluster.local:9090/-/healthy"
module = "http_2xx"
}
target {
name = "loki"
address = "http://loki.monitoring.svc.cluster.local:3100/ready"
module = "http_2xx"
}
target {
name = "grafana"
address = "http://grafana.monitoring.svc.cluster.local:80/api/health"
module = "http_2xx"
}
target {
name = "teslamate"
address = "http://teslamate.teslamate.svc.cluster.local:4000/"
module = "http_2xx"
}
target {
name = "immich"
address = "http://immich-server.immich.svc.cluster.local:2283/api/server/ping"
module = "http_2xx"
}
target {
name = "navidrome"
address = "http://navidrome.navidrome.svc.cluster.local:4533/"
module = "http_2xx"
}
}
// Scrape blackbox probe results