blumeops/argocd/manifests/grafana-config
Erich Blume 85e36cd807 Operations and observability for sifaka NAS (#135)
## Summary
- Add `smartctl_exporter` Docker container to sifaka for SMART disk health monitoring
- Formalize existing `node_exporter` container under Ansible management
- Route both exporters through Caddy L4 TCP proxy (`nas.ops.eblu.me:9100`, `nas.ops.eblu.me:9633`), replacing the hardcoded LAN IP in Prometheus
- Create "Sifaka Disk Health" Grafana dashboard (health status, temperature, wear indicators, lifetime)
- Introduce `ansible/playbooks/sifaka.yml` and `mise run provision-sifaka`, the first Ansible playbook for the NAS
- Share exporter port variables in `group_vars/all.yml` to avoid duplication between the Caddy and sifaka roles
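
As a rough illustration of the Prometheus change described above, the scrape configuration might look something like the following once the hardcoded LAN IP is replaced by the proxied hostnames. The job names are hypothetical, and the mapping of port 9100 to `node_exporter` is an assumption based on that exporter's default port; only the hostnames and ports come from this summary.

```yaml
# Sketch only: job names are hypothetical, not taken from the repository.
scrape_configs:
  - job_name: sifaka-node            # assumed name for the node_exporter job
    static_configs:
      - targets:
          - nas.ops.eblu.me:9100     # routed through the Caddy L4 TCP proxy
  - job_name: sifaka-smartctl        # assumed name for the smartctl_exporter job
    static_configs:
      - targets:
          - nas.ops.eblu.me:9633     # routed through the Caddy L4 TCP proxy
```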

## Prerequisites before deploy
- [ ] Enable SSH on sifaka (DSM Control Panel > Terminal & SNMP)
- [ ] Verify `ssh eblume@sifaka 'docker ps'` works
- [ ] Run `mise run provision-sifaka` to deploy containers
- [ ] Run `mise run provision-indri -- --tags caddy` to add L4 routes
- [ ] Run `argocd app sync prometheus` and `argocd app sync grafana-config`

## Test plan
- [ ] Verify smartctl_exporter metrics: `curl http://nas.ops.eblu.me:9633/metrics`
- [ ] Verify Prometheus targets page shows both sifaka jobs as UP
- [ ] Verify Grafana "Sifaka Disk Health" dashboard loads with data

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/135
2026-02-09 17:44:05 -08:00
| Name | Last commit | Date |
| --- | --- | --- |
| dashboards | Operations and observability for sifaka NAS (#135) | 2026-02-09 17:44:05 -08:00 |
| external-secret-admin.yaml | Switch all ExternalSecrets to creationPolicy: Owner | 2026-01-28 20:27:16 -08:00 |
| external-secret-teslamate-datasource.yaml | Switch all ExternalSecrets to creationPolicy: Owner | 2026-01-28 20:27:16 -08:00 |
| ingress-tailscale.yaml | Restrict flyio-proxy ACLs to dedicated tag:flyio-target endpoints (#126) | 2026-02-08 21:54:18 -08:00 |
| kustomization.yaml | Operations and observability for sifaka NAS (#135) | 2026-02-09 17:44:05 -08:00 |
| README.md | K8s Migration Phase 2: Grafana to Kubernetes (#30) | 2026-01-19 14:40:25 -08:00 |

Grafana Configuration

This directory contains Kubernetes manifests for Grafana configuration:

  • Tailscale Ingress for external access
  • Dashboard ConfigMaps for provisioning

Secrets Management

Current approach: Secrets are injected manually using the 1Password CLI (op).

Before deploying Grafana, create the admin password secret:

kubectl create namespace monitoring
op inject -i secret-admin.yaml.tpl | kubectl apply -f -

The secret template (secret-admin.yaml.tpl) references the following 1Password item:

  • Vault: vg6xf6vvfmoh5hqjjhlhbeoaie (blumeops)
  • Item: oxkcr3xtxnewy7noep2izvyr6y
  • Field: password
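
For orientation, a template of this kind could look roughly like the sketch below. The Secret name and the key names are assumptions made for illustration and are not taken from the actual secret-admin.yaml.tpl; only the vault ID, item ID, field name, and namespace come from this README. The `{{ op://... }}` placeholder is the secret-reference form that `op inject` resolves when rendering a template.

```yaml
# Hypothetical sketch of secret-admin.yaml.tpl; names and keys are assumed.
apiVersion: v1
kind: Secret
metadata:
  name: grafana-admin          # assumed Secret name
  namespace: monitoring
type: Opaque
stringData:
  admin-user: admin            # assumed key and value
  # op inject replaces this reference with the item's password field
  admin-password: "{{ op://vg6xf6vvfmoh5hqjjhlhbeoaie/oxkcr3xtxnewy7noep2izvyr6y/password }}"
```

Because the rendered output is piped straight into kubectl, the resolved password is not written to disk.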

Future improvement: Migrate to the External Secrets Operator (or a similar tool) so that secrets are synchronized from 1Password to Kubernetes automatically.

Dashboards

Dashboard JSON files are stored as ConfigMaps in the dashboards/ directory. The Grafana sidecar automatically discovers ConfigMaps that carry the label grafana_dashboard: "1" and provisions them as dashboards.

To add a new dashboard:

  1. Export the dashboard JSON from the Grafana UI
  2. Create a ConfigMap with the JSON content
  3. Add the grafana_dashboard: "1" label
  4. Add the ConfigMap to kustomization.yaml
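
The steps above might produce a manifest along these lines. This is a minimal sketch: the ConfigMap name, the data key, and the dashboard JSON are placeholders, and the namespace is assumed to be the monitoring namespace created earlier. Only the grafana_dashboard: "1" label is taken from this README.

```yaml
# Sketch only: name, data key, and dashboard content are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-dashboard          # hypothetical name
  namespace: monitoring            # assumed namespace
  labels:
    grafana_dashboard: "1"         # label the Grafana sidecar looks for
data:
  example-dashboard.json: |
    {
      "title": "Example Dashboard",
      "uid": "example-dashboard",
      "panels": []
    }
```

The label value is quoted so that YAML treats it as the string "1" rather than an integer, since Kubernetes label values must be strings. The new file would then be referenced from kustomization.yaml, presumably under its resources list.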