Log filtering cleanup and observability improvements #45

Merged
eblume merged 7 commits from feature/log-filtering-cleanup into main 2026-01-22 17:30:08 -08:00

7 commits

Author SHA1 Message Date
9de2e961cc Fix JSON escaping in devpi dashboard
The dashboard was using \\" (double backslash), but YAML literal blocks
need only \" to produce a backslash-quote in the output.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 17:27:44 -08:00
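
The escaping issue in 9de2e961cc can be illustrated with a small sketch. It assumes the dashboard JSON embeds a query with quoted label matchers; the `expr` key and the `app="devpi"` matcher are hypothetical stand-ins, not taken from the actual dashboard. A YAML literal block passes its content through verbatim, so the raw strings below stand for what the JSON parser would receive in each case.

```python
import json

# Hypothetical dashboard fragments, written exactly as they would appear
# inside a YAML literal block (|). Literal blocks do no escape processing,
# so these raw strings are what the JSON parser sees.
correct = r'{"expr": "{app=\"devpi\"}"}'    # \" is an escaped quote in JSON
broken = r'{"expr": "{app=\\"devpi\\"}"}'   # \\ is a backslash; the " then ends the string


def parses(text: str) -> bool:
    """Return True if text is valid JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


expr = json.loads(correct)["expr"]
print(parses(correct), parses(broken), expr)
```

With the double backslash, the JSON string terminates early and the remainder of the line is no longer valid JSON, which is consistent with the fix described in the commit.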
848fea9e1f Silence logfmt parse errors for non-logfmt logs
The logfmt stage tries to parse all logs, including JSON. When it fails,
it adds __error__ labels. Drop these labels since the logs are still
processed correctly (JSON parser already extracted fields).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 17:23:44 -08:00
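
A rough model of the behaviour described in 848fea9e1f, written as plain Python rather than actual pipeline configuration. The toy `parse_logfmt` function, the `process` function, and the `LogfmtParserErr` value are illustrative assumptions; only the `__error__` label name comes from the commit message.

```python
import json
import shlex

# __error__ is named in the commit; __error_details__ is included here as
# an assumption about what else a failed parse might attach.
ERROR_LABELS = ("__error__", "__error_details__")


def parse_logfmt(line: str) -> dict:
    """Toy logfmt parser: every token must look like key=value."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"not a key=value pair: {token!r}")
        fields[key] = value
    return fields


def process(line: str, drop_errors: bool = True) -> dict:
    """Run a JSON stage, then a logfmt stage, then optionally drop error labels."""
    labels = {}
    try:
        parsed = json.loads(line)
        if isinstance(parsed, dict):
            labels.update({k: str(v) for k, v in parsed.items()})
    except json.JSONDecodeError:
        pass  # not JSON; the logfmt stage may still succeed
    try:
        labels.update(parse_logfmt(line))
    except ValueError as exc:
        labels["__error__"] = "LogfmtParserErr"
        labels["__error_details__"] = str(exc)
    if drop_errors:
        for name in ERROR_LABELS:
            labels.pop(name, None)
    return labels
```

For a JSON line, the fields extracted by the JSON stage survive and only the error labels from the failed logfmt attempt are removed, so nothing useful is lost by dropping them.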
d887d4cdcb Extract more fields from JSON logs (zot compatibility)
Zot uses "message" instead of "msg" for the message field. Also extract
caller and repository from JSON logs for better filtering.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 17:10:30 -08:00
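
The field fallback in d887d4cdcb might look like the following sketch. The `extract_fields` function and the sample log line are hypothetical; only the field names `msg`, `message`, `caller`, and `repository` come from the commit message.

```python
import json


def extract_fields(line: str) -> dict:
    """Extract filterable fields from one JSON log line.

    Prefers "msg" and falls back to "message", so services that use
    either key end up with the same output field.
    """
    record = json.loads(line)
    return {
        "level": record.get("level", ""),
        "message": record.get("msg") or record.get("message") or "",
        "caller": record.get("caller", ""),
        "repository": record.get("repository", ""),
    }


# Made-up example line in the shape the commit describes for zot.
sample = '{"level":"info","message":"blob uploaded","caller":"routes.go:10","repository":"alpine"}'
print(extract_fields(sample))
```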
aa2f562120 Add logfmt parser for k8s log level extraction
The JSON parser only works for JSON-formatted logs. Many Go services
(Loki, Prometheus, etc.) use logfmt format. Add logfmt parser to extract
level, caller, and component labels for better log filtering in Grafana.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 17:08:26 -08:00
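
To make the logfmt extraction in aa2f562120 concrete, here is a minimal sketch of pulling `level`, `caller`, and `component` out of a logfmt line. The regular expression, the `extract_labels` function, and the sample line are assumptions for illustration and do not reflect how the real parser is implemented.

```python
import re

# key=value pairs where the value is either a double-quoted string or a
# run of non-space characters.
PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S*)')
WANTED = ("level", "caller", "component")


def extract_labels(line: str) -> dict:
    """Return only the fields worth promoting to labels."""
    fields = {}
    for key, raw in PAIR.findall(line):
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            raw = raw[1:-1]  # strip surrounding quotes
        fields[key] = raw
    return {name: fields[name] for name in WANTED if name in fields}


# Made-up line in the style many Go services emit.
sample = 'ts=2026-01-22T17:08:26Z caller=main.go:42 level=info component=ingester msg="starting up"'
print(extract_labels(sample))
```

Restricting the result to a short allowlist keeps high-cardinality fields such as `ts` and `msg` out of the label set.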
358bbcdffb Add macOS power/thermal metrics collection and dashboard
- Add powermetrics collector to Alloy role (via LaunchDaemon, requires root)
- Collect CPU, GPU, ANE power (watts) and thermal pressure level
- Add "Power & Thermal" section to macOS Grafana dashboard with:
  - Total power stat
  - Thermal pressure indicator (Nominal/Moderate/Heavy/Critical)
  - Stacked power consumption graph (CPU/GPU/ANE)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 16:59:07 -08:00
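
As a loose sketch of what a collector for 358bbcdffb has to do, the code below turns a text sample into watts and a numeric thermal level. The line formats in `SAMPLE` are an assumption about what `powermetrics` prints and should be checked against real output; the metric names and the 0 to 3 encoding of the pressure levels are hypothetical.

```python
import re

# Pressure levels as listed in the commit; the numeric encoding is assumed.
THERMAL_LEVELS = {"Nominal": 0, "Moderate": 1, "Heavy": 2, "Critical": 3}

POWER_LINE = re.compile(r"^(CPU|GPU|ANE) Power:\s*(\d+(?:\.\d+)?)\s*mW", re.MULTILINE)
PRESSURE_LINE = re.compile(r"^Current pressure level:\s*(\w+)", re.MULTILINE)

# Assumed sample text, not captured from a real machine.
SAMPLE = """CPU Power: 1500 mW
GPU Power: 250 mW
ANE Power: 0 mW
Current pressure level: Nominal
"""


def parse_sample(text: str) -> dict:
    """Convert milliwatt readings to watts and the pressure level to a number."""
    metrics = {}
    for component, milliwatts in POWER_LINE.findall(text):
        metrics[f"{component.lower()}_power_watts"] = float(milliwatts) / 1000.0
    match = PRESSURE_LINE.search(text)
    if match and match.group(1) in THERMAL_LEVELS:
        metrics["thermal_pressure_level"] = THERMAL_LEVELS[match.group(1)]
    return metrics


def total_power_watts(metrics: dict) -> float:
    """Sum the per-component readings, as a total power stat would."""
    return sum(v for k, v in metrics.items() if k.endswith("_power_watts"))
```

Keeping CPU, GPU, and ANE as separate series is what allows both a stacked graph and a derived total.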
3886c43de8 Disable thermal collector on indri Alloy
The thermal collector fails on macOS M1 with "no CPU power status has
been recorded". This hardware doesn't expose power metrics in a way
the collector understands.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 16:48:32 -08:00
f0201c06d9 Suppress storage-provisioner Endpoints deprecation warning
Minikube's storage-provisioner uses the deprecated v1 Endpoints API, which
logs a warning every 2 seconds on K8s 1.33+. This is an upstream issue
(kubernetes/minikube#21009) with no fix yet.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 16:46:07 -08:00
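
The suppression in f0201c06d9 amounts to a narrowly scoped drop rule. A minimal sketch follows, assuming the warning text contains "v1 Endpoints is deprecated" and that each log line carries a container name; the `keep` function and the sample lines are hypothetical, and the real change is presumably expressed as pipeline configuration rather than Python.

```python
import re

# Assumed wording of the warning; verify against the actual log output.
DEPRECATION = re.compile(r"v1 Endpoints is deprecated")
NOISY_CONTAINER = "storage-provisioner"


def keep(line: str, container: str) -> bool:
    """Drop the deprecation warning only when it comes from the noisy container."""
    return not (container == NOISY_CONTAINER and DEPRECATION.search(line))


warning = "W0122 16:46:07.000000 1 warnings.go:70] v1 Endpoints is deprecated in v1.33+"
print(keep(warning, "storage-provisioner"), keep(warning, "some-other-app"))
```

Matching on both the container and the message keeps the same warning visible if another workload starts emitting it.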