## Summary

- Embed Grafana Alloy in the Fly.io proxy container to collect nginx JSON access logs (→ Loki) and derive request rate, latency histogram, cache status, and bandwidth metrics (→ Prometheus)
- Add nginx `stub_status` endpoint for connection-level metrics (active/reading/writing/waiting)
- Create two Grafana dashboards: **Docs APM** (per-service view filtered by `host="docs.eblu.me"`) and **Fly.io Proxy Health** (aggregate proxy health across all upstream services)

## Changed Files

| File | Change |
|------|--------|
| `fly/nginx.conf` | Add JSON `log_format` + `access_log`, add `stub_status` endpoint |
| `fly/Dockerfile` | COPY Alloy binary from `grafana/alloy:v1.5.1`, COPY `alloy.river` config |
| `fly/alloy.river` | **New**: Alloy config for log tailing, metric extraction, remote_write |
| `fly/start.sh` | Start Alloy after Tailscale, before nginx |
| `argocd/manifests/grafana-config/dashboards/configmap-docs-apm.yaml` | **New**: Docs APM dashboard |
| `argocd/manifests/grafana-config/dashboards/configmap-flyio.yaml` | **New**: Fly.io Proxy Health dashboard |
| `argocd/manifests/grafana-config/kustomization.yaml` | Register new dashboard configmaps |
| `docs/reference/services/flyio-proxy.md` | Document observability setup |

## Deployment and Testing

- [ ] `mise run fly-deploy`: rebuild container with Alloy
- [ ] `curl https://docs.eblu.me/`: generate traffic
- [ ] `fly logs -a blumeops-proxy`: verify Alloy startup
- [ ] Query Prometheus: `flyio_nginx_http_requests_total{instance="flyio-proxy"}`
- [ ] Query Loki: `{instance="flyio-proxy", job="flyio-nginx"}`
- [ ] `argocd app sync grafana-config`: deploy dashboards
- [ ] Verify dashboards show data in Grafana
- [ ] `mise run services-check`: no regressions

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/123
| title | tags |
|---|---|
| Caddy | |
Caddy
Reverse proxy for *.ops.eblu.me services with automatic TLS via ACME DNS-01.
Quick Reference
| Property | Value |
|---|---|
| Domain | *.ops.eblu.me |
| HTTPS Port | 443 |
| Config | ansible/roles/caddy/templates/Caddyfile.j2 |
| Binary | Custom build with Gandi DNS plugin |
Why Caddy?
Caddy provides a single TLS termination point for all BlumeOps services:
- Wildcard certificate for *.ops.eblu.me via Let's Encrypt
- DNS-01 challenge using Gandi API (no port 80 needed)
- Unified access from k8s pods, containers, and tailnet clients
See routing for when to use *.ops.eblu.me vs *.tail8d86e.ts.net.
Proxied Services
Indri-Local Services
| Subdomain | Backend | Service |
|---|---|---|
| forge.ops.eblu.me | localhost:3001 | forgejo |
| registry.ops.eblu.me | localhost:5050 | zot |
| jellyfin.ops.eblu.me | localhost:8096 | jellyfin |
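The mapping in the table above can be illustrated with a small shell sketch that renders host-matched `reverse_proxy` handlers inside one wildcard site block. This is not the contents of Caddyfile.j2, which is a Jinja2 template not shown here. The function names, the `name subdomain backend` input format, and the exact form of the `dns gandi` line are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: turn "name subdomain backend" triples into Caddyfile
# handlers. The real config is generated by Ansible from Caddyfile.j2.

# Emit one host matcher plus handler.
# $1 = matcher name, $2 = subdomain, $3 = backend address
render_handle() {
    printf '\t@%s host %s\n' "$1" "$2"
    printf '\thandle @%s {\n' "$1"
    printf '\t\treverse_proxy %s\n' "$3"
    printf '\t}\n'
}

# Read triples from stdin and wrap them in a wildcard site block.
render_caddyfile() {
    printf '*.ops.eblu.me {\n'
    # Assumed form of the Gandi DNS-01 directive; check the plugin docs.
    printf '\ttls {\n\t\tdns gandi {env.GANDI_BEARER_TOKEN}\n\t}\n'
    while read -r name subdomain backend; do
        # Skip blank lines in the input.
        [ -n "$name" ] || continue
        render_handle "$name" "$subdomain" "$backend"
    done
    printf '}\n'
}
```

For example, `printf 'forge forge.ops.eblu.me localhost:3001\n' | render_caddyfile` prints a site block with a single handler for forgejo. Grouping every subdomain under one wildcard block is what lets a single certificate cover all services.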
Kubernetes Services
K8s services are proxied via their Tailscale Ingress endpoints:
| Subdomain | Backend | Service |
|---|---|---|
| grafana.ops.eblu.me | grafana.tail8d86e.ts.net | grafana |
| argocd.ops.eblu.me | argocd.tail8d86e.ts.net | argocd |
| docs.ops.eblu.me | docs.tail8d86e.ts.net | docs (now publicly available at docs.eblu.me via flyio-proxy) |
| feed.ops.eblu.me | feed.tail8d86e.ts.net | miniflux |
| ... | ... | (see defaults/main.yml for full list) |
TCP Services (Layer 4)
| Port | Backend | Service |
|---|---|---|
| 2222 | localhost:2200 | Forgejo SSH |
| 5432 | pg.tail8d86e.ts.net:5432 | postgresql |
Configuration
Caddy is managed via the caddy Ansible role:
# Deploy caddy changes
mise run provision-indri -- --tags caddy
Key files:
- ansible/roles/caddy/defaults/main.yml - Service definitions
- ansible/roles/caddy/templates/Caddyfile.j2 - Caddy config template
Secrets
| Secret | Source | Description |
|---|---|---|
| GANDI_BEARER_TOKEN | 1Password | API token for DNS-01 challenges |
The token is written to ~/.config/caddy/gandi-token (chmod 0600) and sourced by the Caddy wrapper script.
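As a rough illustration of the token handling described above, the sketch below writes a token file with owner-only permissions and then sources it. The real wrapper script and the token file format are not shown in this document. The function names and the `export VAR='value'` file layout are assumptions, and the sketch assumes the token contains no single quotes.

```shell
#!/bin/sh
# Hypothetical sketch of the token file handling; not the actual wrapper.

# Write the token as a sourceable shell file with mode 0600.
# $1 = destination file, $2 = token value
write_token_file() {
    mkdir -p "$(dirname "$1")"
    (
        # Create the file without group/other permissions from the start.
        umask 077
        printf "export GANDI_BEARER_TOKEN='%s'\n" "$2" > "$1"
    )
    # Enforce 0600 even if the file already existed with looser permissions.
    chmod 600 "$1"
}

# Source the token file so GANDI_BEARER_TOKEN is exported to child processes.
# $1 = token file
load_token() {
    . "$1"
}
```

A wrapper along these lines would call `load_token "$HOME/.config/caddy/gandi-token"` and then `exec` the Caddy binary, so the token reaches Caddy through the environment rather than the command line.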
Security Considerations
Caddy has no authentication layer; it is a plain reverse proxy. Access control relies entirely on Tailscale ACLs restricting which devices can reach indri on port 443. Currently tag:homelab, autogroup:admin, and tag:flyio-proxy can reach Caddy. The flyio-proxy grant exists so Alloy can push metrics and logs to Prometheus and Loki, but it also means the Fly.io container can reach every Caddy-proxied service. See flyio-proxy#Security Considerations for the threat model.
Custom Build
Caddy is built from source with the Gandi DNS plugin:
# Build location
~/code/3rd/caddy/bin/caddy
The build includes the github.com/caddy-dns/gandi plugin for ACME DNS-01 challenges.
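One way to produce and verify such a binary is sketched below. This document does not say which build tool is used, so using `xcaddy` here is an assumption, as are the function names. `caddy list-modules` is used to confirm that the `dns.providers.gandi` module was compiled in.

```shell
#!/bin/sh
# Hypothetical sketch; the actual build procedure may differ.

# Build Caddy with the Gandi DNS plugin using xcaddy (assumed tool).
# $1 = output path for the binary
build_caddy_with_gandi() {
    xcaddy build --with github.com/caddy-dns/gandi --output "$1"
}

# Return success if the given binary reports the Gandi DNS provider module.
# $1 = path to a caddy binary
has_gandi_module() {
    "$1" list-modules 2>/dev/null | grep -q '^dns\.providers\.gandi'
}
```

Running `has_gandi_module ~/code/3rd/caddy/bin/caddy` after a rebuild is a quick way to catch a binary built without the plugin before certificate renewal fails.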
Related
- gandi - DNS hosting and ACME DNS-01 provider
- routing - Service routing architecture
- forgejo - Git forge (proxied by Caddy)
- zot - Container registry (proxied by Caddy)
- tailscale-operator - K8s services use Tailscale Ingress, then Caddy