blumeops/pulumi/gandi
Erich Blume 292d354902
C1: deploy adelaide-baby-shower-app to ringtail k3s (#349)
## Summary

Brings up the Adelaide / Heidi / Addie baby shower app on ringtail k3s with the public/private split that the app's hosting contract calls for: `shower.eblu.me` (public, via Fly proxy) and `shower.ops.eblu.me` (tailnet). App is consumed as a wheel from the Forgejo PyPI index — source lives at [`adelaide-baby-shower-app`](https://forge.eblu.me/eblume/adelaide-baby-shower-app).

### What's included

- **ArgoCD app + manifests** under `argocd/manifests/shower/` (deployment, service, ProxyGroup ingress, ConfigMap for `DJANGO_DEBUG`/`DJANGO_ADMIN_URL`, ExternalSecret for `DJANGO_SECRET_KEY` from 1Password item `Shower (blumeops)`, NFS PV on sifaka, RWX media PVC, RWO local-path data PVC for SQLite). The deployment uses the `Recreate` rollout strategy because SQLite is single-writer.
- **Public surface** (`fly/`): new `shower.eblu.me` server block proxying to `shower.ops.eblu.me`. `/admin/` returns 403 at the edge except `/admin/login/` and `/admin/logout/`, which are rate-limited via a new `shower_auth` zone. `X-Clacks-Overhead` on. GNU Terry Pratchett.
- **fail2ban** filter (`shower-admin-login.conf`) matching 401/403/429 on `/admin/login/` and jail (`shower.conf`) with `maxretry=5/findtime=600/bantime=3600`. The `nginx-deny` action was generalized to take a per-jail `nginx_deny_file` so the shower has its own deny list (forge keeps using the legacy default).
- **Caddy** route on indri (`shower.ops.eblu.me` → `https://shower.tail8d86e.ts.net`).
- **Pulumi** Gandi CNAME `shower.eblu.me → blumeops-proxy.fly.dev.`.
- **Grafana** APM dashboard `configmap-shower-apm.yaml` (request rate, error rate, failed admin login count, latency percentiles, bandwidth, access logs) mirroring `docs-apm.json` with a `host="shower.eblu.me"` filter.
- **Container** `containers/shower/default.nix` — `dockerTools.buildLayeredImage` with a nixpkgs Python and a startup wrapper that creates `/app/data/.venv`, pip-installs `adelaide-baby-shower-app==1.0.0` from the forge PyPI index on first boot, runs migrations + collectstatic, and execs gunicorn. A `local_settings.py` shim pins `DATABASES.NAME`/`MEDIA_ROOT`/`STATIC_ROOT` to absolute paths so they don't end up in site-packages.
- **Docs** runbook at `docs/how-to/operations/shower-app.md` linked from the apps registry, plus changelog fragments.
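
The jail parameters in the fail2ban bullet above (`maxretry=5/findtime=600/bantime=3600`) can be read as a sliding-window rule. The following is a minimal Python sketch of that rule, not fail2ban's implementation: the `Jail` class, its method names, and the choice to ban on the fifth failure inside the window are illustrative assumptions.

```python
from collections import defaultdict, deque

MAXRETRY = 5     # failures inside the window that trigger a ban
FINDTIME = 600   # sliding window length, in seconds
BANTIME = 3600   # ban duration, in seconds


class Jail:
    """Toy model of a fail2ban-style jail (hypothetical, for illustration)."""

    def __init__(self, maxretry=MAXRETRY, findtime=FINDTIME, bantime=BANTIME):
        self.maxretry = maxretry
        self.findtime = findtime
        self.bantime = bantime
        self._failures = defaultdict(deque)  # ip -> timestamps of recent failures
        self._banned_until = {}              # ip -> time at which the ban expires

    def record_failure(self, ip, now):
        """Record one matched log line; return True if the IP is banned afterwards."""
        window = self._failures[ip]
        window.append(now)
        # Forget failures that are older than the findtime window.
        while window and now - window[0] > self.findtime:
            window.popleft()
        if len(window) >= self.maxretry:
            self._banned_until[ip] = now + self.bantime
            window.clear()
        return self.is_banned(ip, now)

    def is_banned(self, ip, now):
        """True while the ban for this IP has not yet expired."""
        return self._banned_until.get(ip, 0) > now
```

With these defaults, five failed admin logins within ten minutes ban the client for an hour, while the same five failures spread more than ten minutes apart never trigger a ban.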

### Defense layers on the public surface

1. fly nginx geo+fail2ban `$shower_banned` (per-service deny list)
2. fly nginx `limit_req zone=shower_auth` (3 r/s per Fly-Client-IP)
3. django-axes (5 fails / 1h, keyed on username+ip_address)
4. edge `/admin/` block (returns 403 for anything that isn't login/logout)
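
Layer 4 above amounts to a small path decision at the edge. The sketch below restates it in Python for clarity; the real rule lives in the nginx config under `fly/`, which is not shown here, so `edge_status`, the exact-match handling of the two allowed paths, and the `None` return convention are assumptions.

```python
# Paths under /admin/ that the edge lets through (and rate-limits).
ALLOWED_ADMIN_PATHS = ("/admin/login/", "/admin/logout/")


def edge_status(path):
    """Return 403 if the edge should block the path, else None (pass upstream)."""
    if path in ALLOWED_ADMIN_PATHS:
        return None
    if path.startswith("/admin/"):
        return 403
    return None
```

This matches the test plan's expectation that `/admin/users/` returns 403 at the edge while `/admin/login/` reaches Django.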

## Prerequisites for the user to do (NOT in this PR)

Halted on these per request — they touch shared/manual systems:

- [x] **NFS share** on sifaka: `/volume1/shower`, NFS rule for ringtail RW, `chown 1000:1000`
- [ ] **1Password item** `Shower (blumeops)` in the blumeops vault with a freshly minted `secret-key` field (`openssl rand -base64 48`) — do NOT reuse anything that has lived in git
- [ ] **Container build**: `mise run container-build-and-release shower`, then update `images[].newTag` in `argocd/manifests/shower/kustomization.yaml` to the resulting `v1.0.0-<sha>-nix`
- [x] **DNS**: `mise run dns-up` after merge
- [x] **Fly cert**: `fly certs add shower.eblu.me -a blumeops-proxy`
- [ ] **Caddy push**: `mise run provision-indri -- --tags caddy`
- [ ] **Fly redeploy** to pick up the new nginx block + fail2ban jail: `mise run fly-deploy`
- [ ] **ArgoCD sync**: `argocd app set shower --revision shower-app-deploy && argocd app sync shower` to test from this branch before merging

## Test plan

- [ ] Container builds successfully on nix-container-builder runner
- [ ] Pod starts, migrations run, gunicorn answers on :8000
- [ ] `kubectl --context=k3s-ringtail -n shower logs deploy/shower` clean
- [ ] `curl -sf https://shower.ops.eblu.me/` returns the splash page (tailnet)
- [ ] `curl -I -H "Host: shower.eblu.me" https://blumeops-proxy.fly.dev/` returns 200 (pre-DNS verification)
- [ ] `curl -I -H "Host: shower.eblu.me" https://blumeops-proxy.fly.dev/admin/users/` returns 403 (edge block)
- [ ] `curl -I -H "Host: shower.eblu.me" https://blumeops-proxy.fly.dev/admin/login/` returns a Django login response
- [ ] After DNS is up: `curl -I https://shower.eblu.me/` returns 200 with `X-Clacks-Overhead`
- [ ] Grafana dashboard "Shower APM" appears and starts showing traffic
- [ ] `mise run services-check` passes

Reviewed-on: #349
2026-05-11 13:47:18 -07:00
| File | Last commit | Date |
| --- | --- | --- |
| `.gitignore` | Add Gandi DNS management via Pulumi (#54) | 2026-01-25 08:15:46 -08:00 |
| `__main__.py` | C1: deploy adelaide-baby-shower-app to ringtail k3s (#349) | 2026-05-11 13:47:18 -07:00 |
| `Pulumi.eblu-me.yaml` | Add Gandi DNS management via Pulumi (#54) | 2026-01-25 08:15:46 -08:00 |
| `Pulumi.yaml` | Add Gandi DNS management via Pulumi (#54) | 2026-01-25 08:15:46 -08:00 |
| `pyproject.toml` | Add Gandi DNS management via Pulumi (#54) | 2026-01-25 08:15:46 -08:00 |
| `README.md` | C0: split gandi-operations docs; add dns-acme-cleanup mise task | 2026-04-27 09:48:46 -07:00 |
| `uv.lock` | Add Fly.io public reverse proxy for docs.eblu.me (#120) | 2026-02-08 02:36:19 -08:00 |

Gandi DNS Management

This Pulumi project manages DNS records for eblu.me via Gandi LiveDNS.

What It Does

Creates DNS records that point ops.eblu.me and *.ops.eblu.me to indri's Tailscale IP.

Why indri? indri hosts Caddy, the reverse proxy for all blumeops services. All *.ops.eblu.me requests route through Caddy, which proxies to the appropriate backend service (either on indri itself or in the k8s cluster).

Since Tailscale IPs (100.x.x.x) are not routable on the public internet, these DNS records effectively make services accessible only from within the tailnet, while still using real, resolvable DNS names.
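
To make the "not routable" point concrete: Tailscale assigns node addresses from the shared address space 100.64.0.0/10 (RFC 6598), which is a narrower range than "100.x.x.x" suggests. The following sketch checks whether an address falls in that range; `is_tailnet_ip` is a hypothetical helper and not part of this project.

```python
import ipaddress

# Shared address space (RFC 6598) used by Tailscale for node IPs.
# It spans 100.64.0.0 through 100.127.255.255 and is not publicly routed.
TAILNET_RANGE = ipaddress.ip_network("100.64.0.0/10")


def is_tailnet_ip(address):
    """Return True if the IPv4 address lies inside 100.64.0.0/10."""
    return ipaddress.ip_address(address) in TAILNET_RANGE
```

A check like this could guard a deployment against publishing a public address by mistake, though nothing in this README says the project does so.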

The target IP is resolved dynamically from indri.tail8d86e.ts.net at deploy time, so if indri's Tailscale IP changes, just re-run the deployment.
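
A rough sketch of deploy-time resolution, assuming a plain DNS lookup on the machine running the deployment. The contents of `__main__.py` are not shown here, so `build_records`, the injectable `resolve` parameter, and the shape of the record dictionaries are hypothetical; the real project passes its records to the Pulumi Gandi provider instead.

```python
import socket

INDRI_HOSTNAME = "indri.tail8d86e.ts.net"


def build_records(resolve=socket.gethostbyname):
    """Resolve indri once and describe the two A records relative to eblu.me."""
    ip = resolve(INDRI_HOSTNAME)
    return [
        {"name": "*.ops", "type": "A", "values": [ip]},  # wildcard for all services
        {"name": "ops", "type": "A", "values": [ip]},    # base subdomain
    ]
```

Because the lookup happens on every run, re-running the deployment after indri's Tailscale IP changes yields records with the new address. The `resolve` parameter exists only so the function can be exercised without a tailnet connection.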

Setup

cd pulumi/gandi
uv sync
pulumi stack select eblu-me  # or: pulumi stack init eblu-me

Authentication

This project uses a Gandi Personal Access Token (PAT) shared with Caddy. See the Gandi reference card and "Rotate the Gandi PAT".

The mise tasks handle fetching the PAT from 1Password:

mise run dns-preview   # Preview only
mise run dns-up        # Preview and apply

Or manually:

export GANDI_PERSONAL_ACCESS_TOKEN=$(op read "op://blumeops/gandi - blumeops/pat")
pulumi up

DNS Records Created

| Record | Type | Value | Purpose |
| --- | --- | --- | --- |
| *.ops.eblu.me | A | (indri's Tailscale IP) | Wildcard for all services |
| ops.eblu.me | A | (indri's Tailscale IP) | Base subdomain |

Service Hostnames

Once Caddy is configured on indri, services will be accessible at:

  • forge.ops.eblu.me - Forgejo git server
  • registry.ops.eblu.me - Zot container registry
  • grafana.ops.eblu.me - Grafana dashboards
  • argocd.ops.eblu.me - ArgoCD
  • feed.ops.eblu.me - Miniflux RSS reader
  • pypi.ops.eblu.me - DevPI Python index
  • kiwix.ops.eblu.me - Kiwix offline content
  • tesla.ops.eblu.me - TeslaMate
  • torrent.ops.eblu.me - Transmission
  • prometheus.ops.eblu.me - Prometheus metrics
  • loki.ops.eblu.me - Loki logs