blumeops/docs/reference/reference.md
Erich Blume c281fb5403 Add OpenTelemetry distributed tracing (Tempo + Beyla eBPF) (#286)
## Summary

Adds the third observability pillar — **distributed tracing** — alongside existing metrics (Prometheus) and logs (Loki).

- **Grafana Tempo 2.10.1** on minikube-indri for trace storage with 7d retention, OTLP receivers, and `metrics_generator` that remote-writes span-metrics (RED) to Prometheus
- **Beyla eBPF auto-instrumentation** via a privileged Alloy DaemonSet on ringtail — instruments HTTP services (Frigate, ntfy, Ollama, Immich) without code changes
- **Grafana integration** — Tempo datasource with trace↔log and trace↔metrics correlation, plus Loki derivedFields for trace ID linking
- **Prometheus** scrapes Tempo operational metrics

### Architecture

```
ringtail (k3s)                                indri (minikube)
┌──────────────────────┐                      ┌─────────────────────┐
│ Alloy+Beyla (eBPF)   │──OTLP HTTP────────→ │ Tempo               │
│  ↳ Frigate, ntfy,    │  via tailnet         │  ↳ trace storage    │
│    Ollama, Immich    │                      │  ↳ RED → Prometheus │
└──────────────────────┘                      │                     │
                                              │ Grafana             │
                                              │  ↳ Tempo datasource │
                                              └─────────────────────┘
```
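
To make the OTLP HTTP hop in the diagram concrete, here is a minimal sketch of hand-sending a single span to an OTLP/HTTP receiver with `curl`. It assumes the receiver is reachable on the conventional OTLP/HTTP port 4318 at the standard `/v1/traces` path. The hostname `tempo.example.ts.net`, the service name `manual-test`, and the helper function names are hypothetical placeholders, not values taken from this change.

```shell
#!/bin/sh
# Sketch only: this endpoint is a placeholder, not the real Tempo ingress.
OTLP_ENDPOINT="${OTLP_ENDPOINT:-http://tempo.example.ts.net:4318}"

# Print N random bytes as lowercase hex
# (OTLP uses 16 bytes for trace IDs and 8 bytes for span IDs).
rand_hex() {
  od -An -N"$1" -tx1 /dev/urandom | tr -d ' \n'
}

# Build a one-span OTLP/JSON payload.
# Args: trace_id span_id start_unix_nano end_unix_nano
otlp_payload() {
  cat <<EOF
{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"manual-test"}}]},"scopeSpans":[{"scope":{"name":"manual"},"spans":[{"traceId":"$1","spanId":"$2","name":"manual-test-span","kind":1,"startTimeUnixNano":"$3","endTimeUnixNano":"$4"}]}]}]}
EOF
}

# POST one span lasting one second and print its trace ID on success.
send_test_span() {
  trace_id=$(rand_hex 16)
  span_id=$(rand_hex 8)
  now=$(date +%s)
  otlp_payload "$trace_id" "$span_id" "${now}000000000" "$((now + 1))000000000" |
    curl -fsS -X POST -H 'Content-Type: application/json' \
      --data-binary @- "$OTLP_ENDPOINT/v1/traces" >/dev/null &&
    echo "$trace_id"
}
```

Calling `send_test_span` with `OTLP_ENDPOINT` pointed at a reachable receiver prints a trace ID that can be pasted into Grafana Explore → Tempo. This bypasses Beyla entirely, so it only exercises the receiver and storage side of the path.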

### New files (12)
- `docs/reference/services/tempo.md` — reference doc
- `docs/changelog.d/feature-otel-tracing.feature.md`
- `argocd/apps/tempo.yaml` + `argocd/manifests/tempo/` (6 files)
- `argocd/apps/alloy-tracing-ringtail.yaml` + `argocd/manifests/alloy-tracing-ringtail/` (4 files)

### Modified files (6)
- `argocd/manifests/grafana/datasources.yaml` — Tempo datasource + Loki derivedFields
- `argocd/manifests/prometheus/prometheus.yml` — Tempo scrape target
- `service-versions.yaml` — tempo + alloy-tracing-ringtail entries
- `docs/reference/services/grafana.md` — Tempo in datasources table
- `docs/reference/reference.md` — Tempo in services index
- `docs/reference/operations/observability.md` — Tempo in components list

## Deployment and Testing

- [ ] Sync `apps` app to pick up new Application definitions
- [ ] `argocd app set tempo --revision feature/otel-tracing && argocd app sync tempo`
- [ ] Verify Tempo pod: `kubectl --context=minikube-indri get pods -n monitoring -l app=tempo`
- [ ] Verify Tempo ready: port-forward 3200 and `curl localhost:3200/ready`
- [ ] Verify Tailscale ingresses: `kubectl --context=minikube-indri get ingress -n monitoring`
- [ ] `argocd app set alloy-tracing-ringtail --revision feature/otel-tracing && argocd app sync alloy-tracing-ringtail`
- [ ] Check Beyla discovery in alloy-tracing logs on ringtail
- [ ] Sync grafana-config for updated datasources
- [ ] Sync prometheus for updated scrape config
- [ ] Test Grafana Tempo datasource connection
- [ ] Generate test traffic and search traces in Grafana Explore → Tempo
- [ ] After merge: reset all ArgoCD app revisions back to main
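
The readiness check and the post-merge revision reset in the checklist above could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the `CURL` and `ARGOCD` override variables exist only to allow a dry run, and the trailing `argocd app sync` after resetting the revision is an assumption about the desired workflow rather than something the checklist states.

```shell
#!/bin/sh
# Poll a readiness URL until it answers with a 2xx status or attempts run out.
# Args: url [attempts] [delay_seconds]
wait_for_ready() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # CURL is overridable so the loop can be exercised without a network.
    if ${CURL:-curl} -fsS "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Point each named app back at main, then sync it.
# ARGOCD is overridable (for example ARGOCD=echo) to preview the commands.
reset_revisions() {
  for app in "$@"; do
    ${ARGOCD:-argocd} app set "$app" --revision main || return 1
    ${ARGOCD:-argocd} app sync "$app" || return 1
  done
}

# Example usage, with port 3200 already forwarded to the Tempo pod:
#   wait_for_ready http://localhost:3200/ready && echo "Tempo is ready"
#   reset_revisions tempo alloy-tracing-ringtail
```

Running `ARGOCD=echo reset_revisions tempo alloy-tracing-ringtail` first prints the commands without touching ArgoCD, which is a cheap way to confirm the app list before the real reset.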

Reviewed-on: #286
2026-03-05 10:51:07 -08:00

| title | modified | tags |
| --- | --- | --- |
| Reference | 2026-03-04 | reference |

# Reference

Technical specifications, inventories, and configuration details for BlumeOps infrastructure.

## Services

Individual service reference cards with URLs and configuration details.

| Service | Description | Location |
| --- | --- | --- |
| alloy | Observability collector (metrics & logs) | |
| argocd | GitOps continuous delivery | k8s |
| borgmatic | Backup system | indri |
| caddy | Reverse proxy & TLS termination | indri |
| 1password | Secrets management | cloud + k8s |
| forgejo | Git forge & CI/CD | indri |
| frigate | Network video recorder | k8s (ringtail) |
| grafana | Dashboards & visualization | k8s |
| immich | Photo management | k8s |
| jellyfin | Media server | indri |
| kiwix | Offline Wikipedia & ZIM archives | k8s |
| loki | Log aggregation | k8s |
| tempo | Distributed tracing | k8s |
| miniflux | RSS feed reader | k8s |
| navidrome | Music streaming | k8s |
| ntfy | Push notifications | k8s (ringtail) |
| postgresql | Database cluster | k8s |
| prometheus | Metrics collection | k8s |
| teslamate | Tesla data logger | k8s |
| transmission | BitTorrent daemon | k8s |
| zot | Container registry | indri |
| devpi | PyPI caching proxy | k8s |
| cv | Resume / CV site | k8s |
| authentik | OIDC identity provider | k8s (ringtail) |
| docs | Documentation site (Quartz) | k8s |
| flyio-proxy | Public reverse proxy (Fly.io + Tailscale) | Fly.io |
| ollama | LLM inference server | k8s (ringtail) |
| automounter | SMB share automounter | indri |

## Infrastructure

Host inventory and network configuration.

- hosts - Device inventory
- indri - Primary server
- ringtail - Service host & gaming PC
- gilbert - Development workstation
- tailscale - ACLs, groups, tags
- gandi - DNS hosting for eblu.me
- unifi - Home WiFi router (UniFi Express 7)
- routing - DNS domains, port mappings
- power - Battery-backed power chain

## Tools

Build, deployment, and IaC tool reference.

- mise-tasks - Operational task runner (all `mise run` tasks)
- dagger - CI/CD build engine (Python SDK)
- argocd-cli - ArgoCD CLI workflows
- ansible - Configuration management for indri
- pulumi - Infrastructure-as-Code (DNS, Tailscale ACLs)

## Kubernetes

Cluster configuration and application registry.

## Storage

Network storage and backup configuration.

## Operations

Operational concerns and their components.