blumeops/argocd/apps
Erich Blume 03d71544ec Add multi-cluster observability with ringtail metrics and dashboards (#270)
## Summary
- Add `cluster` label (indri/ringtail) to all Prometheus scrape jobs, Alloy k8s metrics/logs, and Alloy host metrics/logs
- Deploy kube-state-metrics on ringtail's k3s cluster (ArgoCD app + manifests)
- Deploy Alloy on ringtail to collect pod metrics and logs, remote-writing to indri's Prometheus and Loki
- Replace single-cluster "Minikube Kubernetes" and "K8s Services Health" dashboards with:
  - **Kubernetes Clusters** dashboard — multi-cluster with `cluster` and `namespace` template variables
  - **Ringtail (k3s)** dashboard — dedicated ringtail view with GPU usage panels

## Deployment and Testing
1. Sync `apps` on indri ArgoCD to pick up new app definitions (`kube-state-metrics-ringtail`, `alloy-ringtail`)
2. Sync `prometheus` → verify `cluster` label on scraped metrics
3. Sync `alloy-k8s` → verify `cluster=indri` on remote-written metrics and logs
4. Run `mise run provision-indri -- --tags alloy` → verify `cluster=indri` on host Alloy metrics/logs
5. Sync `kube-state-metrics-ringtail` → verify pods running on ringtail
6. Sync `alloy-ringtail` → verify pods running, check Prometheus for `kube_pod_info{cluster="ringtail"}`
7. Sync `grafana-config` → verify dashboards appear, cluster variable populates both values
8. Check Loki for `{cluster="ringtail"}` logs from ringtail pods
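
The label checks in steps 2 to 7 could be scripted along the following lines. This is a minimal sketch, not part of the PR: it assumes Prometheus is reachable over its standard HTTP query API (`/api/v1/query`), and the function names and the base URL in the usage note are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Cluster names taken from the PR summary.
EXPECTED_CLUSTERS = {"indri", "ringtail"}


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def clusters_in_response(payload: dict) -> set:
    """Collect the distinct `cluster` label values from a query response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return {
        series["metric"]["cluster"]
        for series in payload["data"]["result"]
        if "cluster" in series.get("metric", {})
    }


def missing_clusters(payload: dict, expected=EXPECTED_CLUSTERS) -> set:
    """Return the expected clusters that do not appear in the response."""
    return set(expected) - clusters_in_response(payload)


def check_prometheus(base_url: str) -> set:
    """Query a live Prometheus (network access required) for missing clusters."""
    url = build_query_url(base_url, "count by (cluster) (kube_pod_info)")
    with urllib.request.urlopen(url, timeout=10) as resp:
        return missing_clusters(json.load(resp))
```

Calling `check_prometheus("https://prometheus.example")` (placeholder URL) would return an empty set once both clusters report `kube_pod_info`; a non-empty set names the clusters whose metrics have not arrived.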

## Notes
- Alloy on ringtail uses `insecure_skip_verify=true` for TLS to Prometheus/Loki because the Tailscale-managed certs are not in the container trust store; tighten later
- DNS resolution for `*.tail8d86e.ts.net` from ringtail pods depends on CoreDNS inheriting host's MagicDNS resolver; may need CoreDNS forwarding rules if pods can't resolve
- The old services dashboard (blackbox probes) is removed — those probes are still running in alloy-k8s and the data is still in Prometheus, just not in a dedicated dashboard
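
To test the DNS concern above, a small resolution check could be run from inside a ringtail pod. This is a sketch under assumptions, not something shipped in the PR: the helper name is hypothetical, and the hostname in the usage note is a made-up example under the `tail8d86e.ts.net` domain.

```python
import socket
from typing import Callable


def can_resolve(hostname: str, resolver: Callable[..., object] = socket.getaddrinfo) -> bool:
    """Return True if `hostname` resolves using the pod's configured DNS.

    `resolver` is injectable so the function can be exercised without network access.
    """
    try:
        resolver(hostname, 443)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True
```

For example, `can_resolve("prometheus.tail8d86e.ts.net")` (hypothetical hostname) returning `False` from a pod would suggest that CoreDNS is not forwarding to the host's MagicDNS resolver.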

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/270
2026-02-25 22:01:00 -08:00
1password-connect-ringtail.yaml Fix mirror org refs in ArgoCD apps and widen credential template (#266) 2026-02-25 06:55:53 -08:00
1password-connect.yaml Fix mirror org refs in ArgoCD apps and widen credential template (#266) 2026-02-25 06:55:53 -08:00
alloy-k8s.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
alloy-ringtail.yaml Add multi-cluster observability with ringtail metrics and dashboards (#270) 2026-02-25 22:01:00 -08:00
apps.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
argocd.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
authentik.yaml Deploy Authentik identity provider (C2 Mikado) (#227) 2026-02-20 12:55:59 -08:00
blumeops-pg.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
cloudnative-pg.yaml Port CloudNative-PG off Helm to direct release manifest (#268) 2026-02-25 17:37:53 -08:00
cv.yaml Add CV/resume web app at cv.ops.eblu.me (#169) 2026-02-12 11:09:41 -08:00
devpi.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
docs.yaml Phase 1b: Deploy docs hosting with Quartz (#85) 2026-02-03 10:52:20 -08:00
external-secrets-config-ringtail.yaml Add k3s, 1Password Connect, and systemd nix-container-builder to ringtail (#209) 2026-02-18 21:15:30 -08:00
external-secrets-config.yaml Add External Secrets Operator with 1Password Connect (#66) (#66) 2026-01-28 19:30:10 -08:00
external-secrets-crds-ringtail.yaml Fix mirror org refs in ArgoCD apps and widen credential template (#266) 2026-02-25 06:55:53 -08:00
external-secrets-crds.yaml Fix mirror org refs in ArgoCD apps and widen credential template (#266) 2026-02-25 06:55:53 -08:00
external-secrets-ringtail.yaml Fix mirror org refs in ArgoCD apps and widen credential template (#266) 2026-02-25 06:55:53 -08:00
external-secrets.yaml Fix mirror org refs in ArgoCD apps and widen credential template (#266) 2026-02-25 06:55:53 -08:00
forgejo-runner.yaml Migrate Forgejo runner to Kubernetes with DinD (#60) 2026-01-25 19:56:17 -08:00
frigate.yaml Port Frigate NVR to ringtail k3s with GPU acceleration (#217) 2026-02-19 14:27:04 -08:00
grafana-config.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
grafana.yaml C2: Upgrade Grafana to 12.x with Nix container and Kustomize (#260) 2026-02-23 18:07:18 -08:00
homepage.yaml Replace Homepage Helm chart with kustomize manifests and custom Dockerfile (#221) 2026-02-19 18:29:19 -08:00
immich-storage.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
immich.yaml Fix mirror org refs in ArgoCD apps and widen credential template (#266) 2026-02-25 06:55:53 -08:00
kiwix.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
kube-state-metrics-ringtail.yaml Add multi-cluster observability with ringtail metrics and dashboards (#270) 2026-02-25 22:01:00 -08:00
kube-state-metrics.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
loki.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
miniflux.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
mqtt.yaml Port Mosquitto and ntfy to ringtail k3s, retire Apple Silicon Detector (#216) 2026-02-19 11:22:44 -08:00
navidrome.yaml Add Navidrome music streaming server (#79) 2026-01-31 20:19:31 -08:00
ntfy.yaml Port Mosquitto and ntfy to ringtail k3s, retire Apple Silicon Detector (#216) 2026-02-19 11:22:44 -08:00
nvidia-device-plugin.yaml Port Frigate NVR to ringtail k3s with GPU acceleration (#217) 2026-02-19 14:27:04 -08:00
prometheus.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
tailscale-operator-ringtail.yaml Deploy Tailscale operator on ringtail k3s cluster (#215) 2026-02-19 09:33:05 -08:00
tailscale-operator.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00
teslamate.yaml Doc review: connect-to-postgres, create-release-artifact-workflow, deploy-k8s-service (#191) 2026-02-15 07:42:01 -08:00
torrent.yaml Add Immich photo management + migrate forge URLs (#62) 2026-01-26 11:20:11 -08:00