| title | modified | last-reviewed | tags |
|---|---|---|---|
| Manage Fly.io Proxy | 2026-04-18 | 2026-04-18 | |
Manage Fly.io Proxy
Operational tasks for the flyio-proxy public reverse proxy.
Deploy Changes
After modifying files in fly/:
mise run fly-deploy
Pushes to main that modify files under fly/ also trigger automatic deployment via the Forgejo CI workflow.
Add a New Public Service
See expose-service-publicly#Per-service setup for the full walkthrough. In short:
- Add a `server` block to `fly/nginx.conf`
- Add a Fly.io certificate: `fly certs add <domain> -a blumeops-proxy`
- Deploy: `mise run fly-deploy`
- Verify against `blumeops-proxy.fly.dev` with a `Host` header
- Add DNS CNAME via Pulumi: `mise run dns-preview` then `mise run dns-up`
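The verification step can be sketched as a small shell function. This is an illustration only: `verify_public_host` is a hypothetical helper name, and it assumes the new service returns a successful response on `/`.

```shell
# Hypothetical helper: request the proxy's fly.dev address while presenting
# the new domain in the Host header, so nginx selects the new server block
# before DNS points at the proxy.
verify_public_host() {
    domain="$1"
    # -s: quiet, -f: non-zero exit on HTTP errors (4xx/5xx)
    # Assumption: the service answers successfully on "/".
    curl -sf -o /dev/null -H "Host: ${domain}" https://blumeops-proxy.fly.dev/
}

# Usage: verify_public_host <domain> && echo "server block OK"
```

Running this check before the DNS step means a misconfigured `server` block shows up while the domain still points elsewhere.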
Emergency Shutoff
If the proxy is causing issues (DDoS, unexpected traffic, bandwidth consumption on the home network):
Level 1 — Stop the container (seconds, reversible):
mise run fly-shutoff
# or: fly scale count 0 -a blumeops-proxy --yes
All public services go offline immediately. The Tailscale tunnel drops. Zero traffic reaches indri. Restore with `fly scale count 1 -a blumeops-proxy`.
Level 2 — Revoke Tailscale access (seconds):
Remove the flyio-proxy node in the Tailscale admin console. Even if the container is running, it cannot reach the tailnet. Use this if the container itself may be compromised.
Level 3 — Remove DNS (minutes to hours): Delete the CNAME records at Gandi. Takes time for DNS propagation but is the permanent shutoff.
Level 1 is the primary response. It is a single command, takes effect in seconds, and is trivially reversible. Keep `mise run fly-shutoff` somewhere easily accessible (e.g., pinned in a notes app) so it can be run quickly under stress.
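As a rough sketch, the Level 1 shutoff can be paired with a check that the proxy has actually stopped answering. `emergency_shutoff` is a hypothetical wrapper, not an existing task; it only combines the `fly scale count` and health-check commands from this card, and the 10-second timeout is an assumption.

```shell
# Sketch: run the Level 1 shutoff, then confirm the proxy no longer answers.
emergency_shutoff() {
    fly scale count 0 -a blumeops-proxy --yes || return 1
    # After scaling to zero, the health check is expected to fail.
    # Assumption: a 10 second timeout is enough to tell the difference.
    if curl -sf --max-time 10 https://blumeops-proxy.fly.dev/healthz >/dev/null; then
        echo "WARNING: proxy still responding" >&2
        return 1
    fi
    echo "proxy offline; restore with: fly scale count 1 -a blumeops-proxy"
}
```

If the warning appears, re-run the check after a few seconds before escalating to Level 2, since the machine may still be stopping.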
Check Status
# App and machine status
fly status -a blumeops-proxy
# Live logs
fly logs -a blumeops-proxy
# Health check
curl -sf https://blumeops-proxy.fly.dev/healthz
# Certificate status
fly certs list -a blumeops-proxy
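After a deploy or a restore from shutoff, the health check can be polled until it passes. The following is a minimal sketch; `wait_healthy`, the default of 10 attempts, and the 3-second delay are all assumptions chosen for illustration.

```shell
# Sketch: poll the public health endpoint until it succeeds or we give up.
# Arguments: [attempts] [delay-seconds]
wait_healthy() {
    attempts="${1:-10}"
    delay="${2:-3}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        if curl -sf https://blumeops-proxy.fly.dev/healthz >/dev/null; then
            echo "healthy after ${i} attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "still unhealthy after ${attempts} attempts" >&2
    return 1
}

# Usage: mise run fly-deploy && wait_healthy
```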
Rotate Tailscale Auth Key
The auth key expires every 90 days. To rotate:
- Re-apply Pulumi to generate a new key: `mise run tailnet-up`
- Re-run setup to stage the new secret: `mise run fly-setup`
- Deploy to pick up the new secret: `mise run fly-deploy`
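The three steps depend on each other, so a wrapper should stop at the first failure. A minimal sketch, assuming the three mise tasks behave as described above; `rotate_tailscale_key` is a hypothetical name, not an existing task.

```shell
# Sketch: run the rotation steps in order, stopping at the first failure.
rotate_tailscale_key() {
    # Generate a new auth key via Pulumi.
    mise run tailnet-up || return 1
    # Stage the new key as a Fly.io secret.
    mise run fly-setup || return 1
    # Redeploy so the container picks up the new secret.
    mise run fly-deploy || return 1
}
```

Stopping early avoids redeploying with a stale secret if the Pulumi or setup step fails.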
Rotate Fly.io API Token
See rotate-fly-deploy-token for the full rotation procedure (75-day cadence, org-scoped).
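The 75-day cadence reduces to simple date arithmetic. The sketch below assumes the token's creation time has been recorded somewhere as Unix epoch seconds, which this card does not specify; `token_rotation_due` is a hypothetical helper.

```shell
# Sketch: decide whether a token is due for rotation.
# Arguments: creation time and (optionally) "now", both as Unix epoch seconds.
# The 75-day threshold comes from this card; recording the creation time
# is an assumption.
token_rotation_due() {
    created="$1"
    now="${2:-$(date +%s)}"
    age_days=$(( (now - created) / 86400 ))
    echo "token age: ${age_days} days"
    # Exit status 0 means rotation is due.
    [ "$age_days" -ge 75 ]
}

# Usage: token_rotation_due <created-epoch-seconds> && echo "rotate now"
```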
Troubleshooting
502 Bad Gateway on fresh deploy: MagicDNS may not be ready when nginx starts. The `start.sh` script polls `nslookup` before launching nginx, but if it still fails, check that `tailscale status` is healthy inside the container.
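The polling pattern looks roughly like the sketch below. This is not the real `start.sh`: the function name, retry count, delay, and the hostname in the usage line are assumptions, and it relies on `nslookup` exiting non-zero when a name does not resolve.

```shell
# Sketch of a DNS readiness poll (not the actual start.sh).
# Arguments: hostname [attempts] [delay-seconds]
wait_for_dns() {
    host="$1"
    attempts="${2:-30}"
    delay="${3:-1}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Assumption: nslookup exits non-zero when the name cannot be resolved.
        if nslookup "$host" >/dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "DNS for ${host} not ready after ${attempts} attempts" >&2
    return 1
}

# Usage (hypothetical): wait_for_dns indri && nginx -g 'daemon off;'
```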
Health check failing: Open a shell with `fly ssh console -a blumeops-proxy`, then run `curl localhost:8080/healthz` to test locally.
TLS errors on custom domain: Check cert status with `fly certs show <domain> -a blumeops-proxy`. Certs auto-provision via Let's Encrypt and may take a few minutes.
High latency (>1s p50): Check if direct WireGuard peering is established: `fly ssh console -a blumeops-proxy -C "tailscale ping indri"`. If it shows `via DERP`, the tunnel is relayed and latency will be 10-30s. See tailscale#Direct Peering vs DERP Relay for diagnosis.
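The DERP check can be scripted by classifying the last `pong` line of the ping output. A minimal sketch: `classify_tailscale_path` is a hypothetical helper, and it assumes each reply line begins with `pong` and names either `via DERP(...)` or a direct endpoint.

```shell
# Sketch: read `tailscale ping` output on stdin and report the path type.
# Only the last pong counts, since early replies may be relayed before a
# direct path is established.
classify_tailscale_path() {
    last_pong=$(awk '/^pong/ { last = $0 } END { print last }')
    case "$last_pong" in
        "")           echo "no-reply" ;;
        *"via DERP"*) echo "relayed" ;;
        *)            echo "direct" ;;
    esac
}

# Usage:
#   fly ssh console -a blumeops-proxy -C "tailscale ping indri" | classify_tailscale_path
```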
Related
- flyio-proxy - Service reference card
- expose-service-publicly - Full setup guide and architecture