Operations and observability for sifaka NAS (#135)
## Summary

- Add `smartctl_exporter` Docker container to sifaka for SMART disk health monitoring
- Formalize existing `node_exporter` container under Ansible management
- Route both exporters through Caddy L4 TCP proxy (`nas.ops.eblu.me:9100`, `nas.ops.eblu.me:9633`), replacing the hardcoded LAN IP in Prometheus
- Create "Sifaka Disk Health" Grafana dashboard (health status, temperature, wear indicators, lifetime)
- Introduce `ansible/playbooks/sifaka.yml` and `mise run provision-sifaka`, the first Ansible playbook for the NAS
- Shared exporter port variables in `group_vars/all.yml` to avoid duplication between Caddy and sifaka roles

## Prerequisites before deploy

- [ ] Enable SSH on sifaka (DSM Control Panel > Terminal & SNMP)
- [ ] Verify `ssh eblume@sifaka 'docker ps'` works
- [ ] Run `mise run provision-sifaka` to deploy containers
- [ ] Run `mise run provision-indri -- --tags caddy` to add L4 routes
- [ ] `argocd app sync prometheus` + `argocd app sync grafana-config`

## Test plan

- [ ] Verify smartctl_exporter metrics: `curl http://nas.ops.eblu.me:9633/metrics`
- [ ] Verify Prometheus targets page shows both sifaka jobs as UP
- [ ] Verify Grafana "Sifaka Disk Health" dashboard loads with data

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/135
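
As a rough illustration of "replacing the hardcoded LAN IP in Prometheus", the scrape targets might look something like the sketch below. The two hostnames and ports come from the summary; the job names and the use of plain `static_configs` are assumptions, since the actual Prometheus configuration is not shown on this page.

```yaml
# Hypothetical sketch of the Prometheus scrape targets after this change.
# Job names are illustrative only; the real ones are not shown in this PR page.
scrape_configs:
  - job_name: sifaka-node            # assumed name for the node_exporter job
    static_configs:
      - targets: ["nas.ops.eblu.me:9100"]   # node_exporter via Caddy L4 proxy
  - job_name: sifaka-smartctl        # assumed name for the smartctl_exporter job
    static_configs:
      - targets: ["nas.ops.eblu.me:9633"]   # smartctl_exporter via Caddy L4 proxy
```

Pointing at the proxy hostname rather than a LAN IP means Prometheus keeps working if the NAS address changes, as long as the Caddy route is updated.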
parent 4ee643a81d
commit 85e36cd807
15 changed files with 538 additions and 9 deletions
@@ -84,3 +84,7 @@ caddy_tcp_services:
     backend: "localhost:2200"  # Forgejo SSH
   - port: 5432
     backend: "pg.tail8d86e.ts.net:5432"  # PostgreSQL
+  - port: "{{ sifaka_node_exporter_port }}"
+    backend: "sifaka:{{ sifaka_node_exporter_port }}"  # Sifaka node_exporter
+  - port: "{{ sifaka_smartctl_exporter_port }}"
+    backend: "sifaka:{{ sifaka_smartctl_exporter_port }}"  # Sifaka smartctl_exporter
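
The hunk above references two shared variables that the summary says live in `group_vars/all.yml`. That file is not shown here, so the following is only a minimal sketch of what those definitions could look like, assuming the values match the ports named in the summary (9100 and 9633). The comments and layout are illustrative.

```yaml
# Sketch of the shared exporter port variables (assumed contents of
# group_vars/all.yml; the actual file is not shown in this diff).
sifaka_node_exporter_port: 9100       # port used for nas.ops.eblu.me:9100
sifaka_smartctl_exporter_port: 9633   # port used for nas.ops.eblu.me:9633
```

Defining each port once lets the Caddy role (listen port and backend) and the sifaka role (container port) stay in sync without repeating the literal numbers.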