Add Grafana Alloy and Loki for unified observability #11

Merged
eblume merged 4 commits from feature/alloy-loki-logging into main 2026-01-15 12:24:14 -08:00

Author SHA1 Message Date
47640511c4 Add Loki metrics scraping and Grafana dashboard
- Add Loki scrape job to Prometheus config
- Add Loki dashboard with storage/ingestion metrics:
  - Chunks in memory
  - Active chunks/streams
  - Heap usage
  - Ingestion rate (bytes/sec)
  - Log lines rate
  - Memory usage over time
  - Storage growth (24h rolling)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 12:14:32 -08:00
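
For illustration, a Loki scrape job like the one added in 47640511c4 might look like the fragment below. This is a minimal sketch, not the PR's actual config: the job name and target are assumptions. Port 3100 is Loki's default HTTP port, where it also serves `/metrics`; the real host and port in this repository may differ.

```yaml
# Hypothetical fragment of prometheus.yml; job name and target are assumed.
scrape_configs:
  - job_name: loki
    # Prometheus scrapes /metrics by default, so metrics_path is omitted.
    static_configs:
      - targets:
          - "localhost:3100"  # assumed: Loki on the same host, default port
```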
8291a8cb22 Remove ordering comments from playbook (handled by meta dependencies) 2026-01-15 12:02:53 -08:00
7fd601d864 Add role dependencies via meta/main.yml
- alloy depends on prometheus and loki
- grafana depends on prometheus and loki
- loki has no dependencies

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 12:01:18 -08:00
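
The dependencies listed in 7fd601d864 could be declared roughly as follows. This is a sketch assuming the alloy role lives at ansible/roles/alloy/ and that the other roles are named `prometheus` and `loki`; the exact file contents are not shown in this PR page.

```yaml
# ansible/roles/alloy/meta/main.yml (path and role names assumed)
# Ansible runs each listed role before the role that declares it,
# so playbook ordering no longer needs to be managed by hand.
dependencies:
  - role: prometheus
  - role: loki
```

Under the same assumptions, the grafana role's meta/main.yml would carry the same two entries, and the loki role's would declare `dependencies: []`.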
09c432e0c1 Add Grafana Alloy and Loki for unified observability
- Add ansible/roles/alloy/ - replaces node_exporter for metrics collection
  - Uses prometheus.exporter.unix with textfile collector
  - Pushes metrics to Prometheus via remote_write
  - Collects logs from brew services and mcquack LaunchAgents
  - Forwards logs to Loki

- Add ansible/roles/loki/ - log storage and query engine
  - Single-node filesystem-based deployment
  - TSDB storage with 31-day retention
  - Integrated with Grafana as datasource

- Update Prometheus to enable remote_write receiver
  - Remove node-exporter-indri scrape job (Alloy pushes instead)
  - Keep sifaka scraping via traditional node_exporter

- Update Grafana datasources to include Loki

- Update indri-services-check to verify Loki and Alloy

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 11:51:50 -08:00
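
As a rough picture of the Loki setup described in 09c432e0c1 (single node, filesystem storage, TSDB index, 31-day retention), a config along these lines would fit. This is a sketch under stated assumptions, not the role's actual template: every path, the schema start date, and the schema version are hypothetical. The 31 days are written as 744h (31 × 24).

```yaml
# Hypothetical Loki config sketch; paths, dates, and schema version are assumed.
auth_enabled: false

server:
  http_listen_port: 3100  # Loki's default HTTP port

common:
  path_prefix: /var/lib/loki            # assumed data directory
  replication_factor: 1                 # single-node deployment
  ring:
    kvstore:
      store: inmemory
  storage:
    filesystem:
      chunks_directory: /var/lib/loki/chunks
      rules_directory: /var/lib/loki/rules

schema_config:
  configs:
    - from: "2026-01-01"                # assumed schema start date
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

limits_config:
  retention_period: 744h                # 31 days

compactor:
  working_directory: /var/lib/loki/compactor
  retention_enabled: true               # retention is enforced by the compactor
  delete_request_store: filesystem
```

Note that `retention_period` only takes effect when the compactor has `retention_enabled: true`; without it, old chunks are kept indefinitely.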