2026-01-19 14:40:25 -08:00
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: monitoring
resources:
- ingress-tailscale.yaml
2026-01-28 19:50:38 -08:00
- external-secret-admin.yaml
- external-secret-teslamate-datasource.yaml
2026-01-19 14:40:25 -08:00
# Dashboard ConfigMaps - discovered by Grafana sidecar via label grafana_dashboard=1
- dashboards/configmap-borgmatic.yaml
- dashboards/configmap-devpi.yaml
- dashboards/configmap-loki.yaml
- dashboards/configmap-macos.yaml
- dashboards/configmap-minikube.yaml
2026-01-30 16:57:26 -08:00
- dashboards/configmap-jellyfin.yaml
2026-01-19 14:40:25 -08:00
- dashboards/configmap-postgresql.yaml
Observability cleanup and k8s service monitoring (#43) (#43)

## Summary

- Remove stale `/opt/homebrew/var/loki` from borgmatic backup (Loki migrated to k8s)
- Add Alloy k8s DaemonSet for automatic pod log collection with auto-discovery
- Add blackbox probes for miniflux, kiwix, transmission, devpi, argocd
- Add transmission-exporter sidecar for full metrics (speed, torrent counts, ratios)
- Replace stale devpi dashboard with probe-based metrics (status, response time, uptime)
- Add unified "K8s Services Health" dashboard for service uptime/response monitoring

## Manual cleanup already performed

- Deleted stale textfile metrics on indri: `devpi.prom`, `transmission.prom`
- Deleted stale data directories on indri: `/opt/homebrew/var/loki/`, `/opt/homebrew/var/prometheus/`

## Deployment and Testing

- [x] Sync `apps` application to pick up new alloy-k8s app
- [x] Deploy alloy-k8s on feature branch: `argocd app set alloy-k8s --revision feature/observability-cleanup && argocd app sync alloy-k8s`
- [x] Deploy torrent on feature branch (for transmission exporter): `argocd app set torrent --revision feature/observability-cleanup && argocd app sync torrent`
- [x] Deploy prometheus on feature branch (for new scrape config): `argocd app set prometheus --revision feature/observability-cleanup && argocd app sync prometheus`
- [x] Deploy grafana-config on feature branch (for dashboards): `argocd app set grafana-config --revision feature/observability-cleanup && argocd app sync grafana-config`
- [x] Verify pod logs appear in Loki/Grafana
- [x] Verify transmission metrics appear in Prometheus
- [x] Verify service probe metrics appear in Prometheus
- [x] Run `mise run provision-indri -- --tags borgmatic` to update borgmatic config
- [ ] After merge, reset apps to main and resync

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/43
2026-01-22 13:51:01 -08:00
- dashboards/configmap-services.yaml
2026-01-19 14:40:25 -08:00
- dashboards/configmap-zot.yaml
2026-02-08 10:05:38 -08:00
- dashboards/configmap-docs-apm.yaml
- dashboards/configmap-flyio.yaml
2026-02-09 17:44:05 -08:00
- dashboards/configmap-sifaka-disks.yaml
2026-01-22 21:25:44 -08:00
# TeslaMate dashboards
- dashboards/configmap-teslamate-overview.yaml
- dashboards/configmap-teslamate-charges.yaml
- dashboards/configmap-teslamate-drives.yaml
- dashboards/configmap-teslamate-efficiency.yaml
- dashboards/configmap-teslamate-states.yaml
- dashboards/configmap-teslamate-vampire-drain.yaml
- dashboards/configmap-teslamate-battery-health.yaml
- dashboards/configmap-teslamate-statistics.yaml
- dashboards/configmap-teslamate-charge-level.yaml
- dashboards/configmap-teslamate-updates.yaml
- dashboards/configmap-teslamate-trip.yaml
- dashboards/configmap-teslamate-locations.yaml
- dashboards/configmap-teslamate-mileage.yaml
- dashboards/configmap-teslamate-drive-stats.yaml
- dashboards/configmap-teslamate-charging-stats.yaml
- dashboards/configmap-teslamate-projected-range.yaml
- dashboards/configmap-teslamate-timeline.yaml
- dashboards/configmap-teslamate-visited.yaml