Add Tempo reference doc and changelog fragment (docs-first)

Tempo is the new distributed tracing backend for BlumeOps, completing the third observability pillar alongside Prometheus (metrics) and Loki (logs).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent d15071aaf9
commit 3be9fa948c
4 changed files with 65 additions and 2 deletions
1
docs/changelog.d/feature-otel-tracing.feature.md
Normal file
@@ -0,0 +1 @@
Add distributed tracing via Grafana Tempo and Beyla eBPF auto-instrumentation. Tempo runs on minikube-indri for trace storage, while a privileged Alloy DaemonSet on ringtail uses Beyla to instrument HTTP services (Frigate, ntfy, Ollama, Immich) without code changes. Grafana gets trace-to-log and trace-to-metrics correlation.

@@ -7,11 +7,12 @@ tags:
 
 # Observability
 
-Metrics, logs, and dashboards for BlumeOps infrastructure.
+Metrics, logs, traces, and dashboards for BlumeOps infrastructure.
 
 ## Components
 
 - [[prometheus]] - Metrics storage and querying
 - [[loki]] - Log aggregation
-- [[alloy|Alloy]] - Metrics and log collection
+- [[tempo]] - Distributed tracing
+- [[alloy|Alloy]] - Metrics, log, and trace collection
 - [[grafana]] - Dashboards and visualization

@@ -27,6 +27,7 @@ Individual service reference cards with URLs and configuration details.
 | [[jellyfin]] | Media server | indri |
 | [[kiwix]] | Offline Wikipedia & ZIM archives | k8s |
 | [[loki]] | Log aggregation | k8s |
+| [[tempo]] | Distributed tracing | k8s |
 | [[miniflux]] | RSS feed reader | k8s |
 | [[navidrome]] | Music streaming | k8s |
 | [[ntfy]] | Push notifications | k8s (ringtail) |

60
docs/reference/services/tempo.md
Normal file
@@ -0,0 +1,60 @@
---
title: Tempo
modified: 2026-03-05
tags:
- service
- observability
---

# Grafana Tempo

Distributed tracing backend for BlumeOps infrastructure. Receives traces via OTLP, stores them locally, and generates RED metrics (rate, errors, duration) for [[prometheus]].

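To make the RED terminology concrete, here is a minimal sketch of how rate, error ratio, and duration can be derived from a batch of spans. It is an illustration only, not how Tempo computes them: the `Span` type and `red_summary` function are hypothetical, and Tempo's generator emits counters and histograms rather than a plain mean.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span holding only what RED needs."""
    service: str
    duration_ms: float
    is_error: bool


def red_summary(spans: Iterable[Span], window_seconds: float) -> dict:
    """Return rate, error ratio, and mean duration for spans seen in a window."""
    spans = list(spans)
    if not spans or window_seconds <= 0:
        return {"rate_per_s": 0.0, "error_ratio": 0.0, "mean_duration_ms": 0.0}
    errors = sum(1 for s in spans if s.is_error)
    total_ms = sum(s.duration_ms for s in spans)
    return {
        "rate_per_s": len(spans) / window_seconds,   # R: requests per second
        "error_ratio": errors / len(spans),          # E: share of failed requests
        "mean_duration_ms": total_ms / len(spans),   # D: simplified to a mean
    }


# Made-up sample data: four spans observed over a 60 second window.
sample = [
    Span("ntfy", 12.0, False),
    Span("ntfy", 30.0, False),
    Span("ntfy", 18.0, True),
    Span("ntfy", 20.0, False),
]
summary = red_summary(sample, window_seconds=60.0)
print(summary)
```

In practice the same three signals would be read from Prometheus rather than computed by hand; the sketch only shows what each letter measures.
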
## Quick Reference

| Property | Value |
|----------|-------|
| **URL** | https://tempo.ops.eblu.me (once the Caddy route is added) |
| **Tailscale URL** | https://tempo.tail8d86e.ts.net |
| **OTLP Endpoint** | https://tempo-otlp.tail8d86e.ts.net |
| **Namespace** | `monitoring` |
| **Image** | `grafana/tempo:2.7.2` |
| **Storage** | 10Gi PVC (local filesystem) |
| **Retention** | 7 days |

## Architecture

- Single-node deployment with local filesystem storage
- OTLP receivers: gRPC (4317) and HTTP (4318)
- `metrics_generator` produces span-metrics and service-graphs, remote-written to [[prometheus]]
- Queried via [[grafana]] Tempo datasource
- Two Tailscale Ingresses: one for query API (3200), one for OTLP HTTP receiver (4318)

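The two ingresses listed above map to different HTTP paths. The sketch below builds the URLs a client would likely use, assuming each ingress serves its backend port at the root path over HTTPS (an assumption, not confirmed by this page). The paths `/api/traces/<id>`, `/api/search`, and `/v1/traces` are the standard Tempo query and OTLP/HTTP routes; the helper names are hypothetical.

```python
from urllib.parse import quote, urlencode

# Hostnames taken from the Quick Reference table; root-path routing is assumed.
QUERY_BASE = "https://tempo.tail8d86e.ts.net"      # query API (port 3200)
OTLP_BASE = "https://tempo-otlp.tail8d86e.ts.net"  # OTLP HTTP receiver (port 4318)


def trace_url(trace_id: str, base: str = QUERY_BASE) -> str:
    """URL for fetching a single trace by its hex trace ID."""
    return f"{base}/api/traces/{quote(trace_id, safe='')}"


def search_url(traceql: str, limit: int = 20, base: str = QUERY_BASE) -> str:
    """URL for a TraceQL search; the query string is URL-encoded."""
    return f"{base}/api/search?{urlencode({'q': traceql, 'limit': limit})}"


def otlp_traces_url(base: str = OTLP_BASE) -> str:
    """URL that OTLP/HTTP exporters POST trace batches to."""
    return f"{base}/v1/traces"


print(trace_url("4bf92f3577b34da6a3ce929d0e0e4736"))
print(search_url('{ resource.service.name = "ntfy" }'))
print(otlp_traces_url())
```

Nothing here makes a network request, so the helpers are safe to run anywhere; whether the endpoints respond depends on Tailscale access to the cluster.
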
## Trace Sources

**From ringtail (via Beyla eBPF in Alloy):**

| Service | Protocol | Coverage |
|---------|----------|----------|
| [[frigate]] | HTTP REST | Request rate, error rate, latency, trace spans |
| [[ntfy]] | HTTP | Same as Frigate |
| [[ollama]] | HTTP REST | Same as Frigate (includes model inference latency) |
| [[immich]] | HTTP REST | Same as Frigate |

Beyla auto-instruments HTTP services via eBPF kernel hooks, so no code changes are needed. MQTT (Mosquitto) is not instrumented (no eBPF parser for MQTT).

**Future: SDK instrumentation**
Services with OTel SDK support (e.g., Hermes) can send traces directly to the OTLP endpoint for deeper internal spans (DB queries, business logic) alongside eBPF envelope traces.

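As a rough idea of what "sending traces directly to the OTLP endpoint" involves, the sketch below builds a single-span OTLP/HTTP JSON payload by hand and shows how it could be POSTed. A real service would normally use an OpenTelemetry SDK exporter instead. The function names, the `manual-sketch` scope name, and the use of `hermes` as the service name are illustrative assumptions; the endpoint URL assumes the Tailscale ingress serves `/v1/traces` at its root.

```python
import json
import secrets
import time
import urllib.request

# Assumed URL: OTLP hostname from the Quick Reference table plus the standard path.
OTLP_TRACES_URL = "https://tempo-otlp.tail8d86e.ts.net/v1/traces"


def build_otlp_payload(service_name: str, span_name: str, duration_ms: float) -> dict:
    """Build an OTLP JSON body containing one span that ends now."""
    end_ns = time.time_ns()
    start_ns = end_ns - int(duration_ms * 1_000_000)
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service_name}},
                ],
            },
            "scopeSpans": [{
                "scope": {"name": "manual-sketch"},
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 32 hex chars
                    "spanId": secrets.token_hex(8),    # 16 hex chars
                    "name": span_name,
                    "kind": 1,                         # SPAN_KIND_INTERNAL
                    # 64-bit integers are encoded as strings in OTLP JSON.
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }],
    }


def send_payload(payload: dict, url: str = OTLP_TRACES_URL, timeout: float = 5.0) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


payload = build_otlp_payload("hermes", "demo-span", duration_ms=25)
print(json.dumps(payload, indent=2))
```

`send_payload` is defined but deliberately not called, since it needs network access to the tailnet. Spans sent this way would sit alongside the Beyla-generated envelope spans only if trace context is propagated between them, which this sketch does not attempt.
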
## Grafana Integration

- **Tempo datasource** with trace-to-log and trace-to-metrics correlation
- **Service map** and **node graph** visualization
- **Loki derived fields** link trace IDs in logs back to Tempo

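A Loki derived field works by applying a regular expression with one capture group to each log line and turning the captured value into a link. The sketch below shows that matching step in isolation. The pattern and the log line formats are assumptions for illustration; the actual regex configured in Grafana for BlumeOps is not shown on this page.

```python
import re
from typing import Optional

# Hypothetical pattern: matches "trace_id=<hex>" in logfmt lines and
# "traceId": "<hex>" in JSON lines, capturing a 32-character hex trace ID.
TRACE_ID_PATTERN = re.compile(r'trace_?id["=:\s]+([0-9a-f]{32})', re.IGNORECASE)


def extract_trace_id(log_line: str) -> Optional[str]:
    """Return the first trace ID found in a log line, or None if absent."""
    match = TRACE_ID_PATTERN.search(log_line)
    return match.group(1).lower() if match else None


logfmt_line = 'level=info msg="request done" trace_id=4bf92f3577b34da6a3ce929d0e0e4736'
json_line = '{"level": "info", "traceId": "4BF92F3577B34DA6A3CE929D0E0E4736"}'
plain_line = "level=info msg=\"no trace here\""

print(extract_trace_id(logfmt_line))
print(extract_trace_id(json_line))
print(extract_trace_id(plain_line))
```

Testing a candidate pattern like this against real log samples before putting it in the datasource configuration helps catch formats the regex misses.
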
## Related

- [[alloy|Alloy]] - Trace collector (Beyla eBPF on ringtail)
- [[prometheus]] - Receives span-metrics from Tempo
- [[loki]] - Log correlation via trace IDs
- [[grafana]] - Trace visualization