Operations and observability for sifaka NAS #135

Merged
eblume merged 6 commits from feature/sifaka-ops-observability into main 2026-02-09 17:44:06 -08:00

6 commits

Author SHA1 Message Date
c3d9a2478d Fix SMART Health Status panel layout
Switch to auto orientation and increase height so the 4 device
status blocks display as horizontal squares instead of vertical strips.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 17:42:37 -08:00
663c2d92b0 Fix dashboard: correct metric name and stat panel layout
- smartctl_device_smart_healthy → smartctl_device_smart_status
- Make all stat panels full-width with auto orientation so 4 device
  values display side-by-side instead of stacked vertically

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 17:39:19 -08:00
97420f302b Move group_vars into inventory dir and document sifaka setup
Ansible searches for group_vars/ relative to the inventory directory,
not the project root. Also adds first-time setup docs and hardware
details to the sifaka reference card.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 17:28:34 -08:00
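
For illustration, a minimal sketch of where the shared variables file ends up after this move. The `inventory/` directory name and both variable names are assumptions, not taken from this PR. The port values are the upstream defaults for each exporter and may differ from what the repository actually sets.

```yaml
# inventory/group_vars/all.yml  (path and variable names are assumptions)
# Ansible loads a group_vars/ directory found next to the inventory source
# (or next to the playbook), so this file applies to every inventoried host.
node_exporter_port: 9100      # upstream default port for node_exporter
smartctl_exporter_port: 9633  # upstream default port for smartctl_exporter
```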
316c213aae Document sifaka first-time setup and hardware details
Adds one-time setup steps (SSH, sudoers, Docker path, device naming)
to the sifaka reference card for reproducibility if the NAS is replaced.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 17:22:45 -08:00
4ed2e3bb5e Fix sifaka_exporters role for Synology environment
- Use full docker path (/volume1/@appstore/ContainerManager/usr/bin/docker)
- Match existing container name (prom-node-exporter-1)
- Remove unnecessary node_exporter flags (--pid=host, volume mounts)
- Add become: true for all docker tasks (requires sudo on Synology)
- Run smartctl_exporter as --user=root (image drops to nobody internally)
- Explicitly specify /dev/sata* devices (Synology uses non-standard paths)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 17:12:32 -08:00
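
The fixes listed in this commit might combine into a task along the following lines. This is a hedged sketch, not the role's actual contents: the task name, container name, image reference, published port, `--privileged`, and the specific device numbers are all assumptions. Only the docker path, `become: true`, `--user=root`, and the use of explicit `/dev/sata*` devices come from the commit message.

```yaml
# Sketch only; see the assumptions listed above.
- name: Run smartctl_exporter on the Synology NAS
  become: true  # docker requires sudo on Synology
  ansible.builtin.command:
    argv:
      # Full path, since docker is not on the default PATH on Synology
      - /volume1/@appstore/ContainerManager/usr/bin/docker
      - run
      - --detach
      - --name=smartctl-exporter        # hypothetical container name
      - --privileged                    # assumption: raw disk access for smartctl
      - --user=root                     # image otherwise drops to nobody
      - --publish=9633:9633             # assumption: upstream default port
      - prometheuscommunity/smartctl-exporter
      # Synology uses non-standard device paths, so list them explicitly
      - --smartctl.device=/dev/sata1
      - --smartctl.device=/dev/sata2
```

As written, this task is not idempotent: a second run fails because the container name is already in use, so a real role would need to check for or replace the existing container first.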
483db74a3c Add SMART disk health monitoring and Ansible provisioning for sifaka NAS
Adds smartctl_exporter alongside the existing node_exporter on sifaka,
routed through Caddy L4 TCP proxy at nas.ops.eblu.me, with a Grafana
dashboard for disk health visibility. Introduces the first Ansible
playbook for sifaka (mise run provision-sifaka) and shared exporter
port variables in group_vars/all.yml.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 16:03:05 -08:00