blumeops/docs/how-to/operations/manage-flyio-proxy.md

---
title: Manage Fly.io Proxy
modified: 2026-04-18
last-reviewed: 2026-04-18
tags:
  - how-to
  - fly-io
  - networking
  - operations
---

# Manage Fly.io Proxy

Operational tasks for the [[flyio-proxy]] public reverse proxy.

## Deploy Changes

After modifying files in `fly/`:

```bash
mise run fly-deploy
```

Pushes to `fly/` on main also trigger automatic deployment via the Forgejo CI workflow.

## Add a New Public Service

See [[expose-service-publicly#Per-service setup]] for the full walkthrough. In short:

1. Add a `server` block to `fly/nginx.conf`
2. Add a Fly.io certificate: `fly certs add <domain> -a blumeops-proxy`
3. Deploy: `mise run fly-deploy`
4. Verify against `blumeops-proxy.fly.dev` with a `Host` header
5. Add DNS CNAME via Pulumi: `mise run dns-preview` then `mise run dns-up`

## Emergency Shutoff

If the proxy is causing issues (DDoS, unexpected traffic, bandwidth consumption on the home network):

**Level 1 — Stop the container (seconds, reversible):**
```bash
mise run fly-shutoff
# or: fly scale count 0 -a blumeops-proxy --yes
```
All public services go offline immediately. Tailscale tunnel drops. Zero traffic reaches indri. Restore with `fly scale count 1 -a blumeops-proxy`.

**Level 2 — Revoke Tailscale access (seconds):**
Remove the `flyio-proxy` node in the Tailscale admin console. Even if the container is running, it cannot reach the tailnet. Use this if the container itself may be compromised.

**Level 3 — Remove DNS (minutes to hours):**
Delete the CNAME records at Gandi. Takes time for DNS propagation but is the permanent shutoff.

**Level 1 is the primary response.** It is a single command, takes effect in seconds, and is trivially reversible. Keep `mise run fly-shutoff` somewhere easily accessible (e.g., pinned in a notes app) so it can be run quickly under stress.

## Check Status

```bash
# App and machine status
fly status -a blumeops-proxy

# Live logs
fly logs -a blumeops-proxy

# Health check
curl -sf https://blumeops-proxy.fly.dev/healthz

# Certificate status
fly certs list -a blumeops-proxy
```

## Rotate Tailscale Auth Key

The auth key expires every 90 days. To rotate:

1. Re-apply Pulumi to generate a new key: `mise run tailnet-up`
2. Re-run setup to stage the new secret: `mise run fly-setup`
3. Deploy to pick up the new secret: `mise run fly-deploy`

## Rotate Fly.io API Token

See [[rotate-fly-deploy-token]] for the full rotation procedure (75-day cadence, `org`-scoped).

## Troubleshooting

**502 Bad Gateway on fresh deploy**: MagicDNS may not be ready when nginx starts. The `start.sh` script polls `nslookup` before launching nginx, but if it still fails, check that `tailscale status` is healthy inside the container.

**Health check failing**: `fly ssh console -a blumeops-proxy` then `curl localhost:8080/healthz` to test locally.

**TLS errors on custom domain**: Check cert status with `fly certs show <domain> -a blumeops-proxy`. Certs auto-provision via Let's Encrypt and may take a few minutes.

**High latency (>1s p50)**: Check if direct WireGuard peering is established: `fly ssh console -a blumeops-proxy -C "tailscale ping indri"`. If it shows `via DERP`, the tunnel is relayed and latency will be 10-30s. See [[tailscale#Direct Peering vs DERP Relay]] for diagnosis.

## Related

- [[flyio-proxy]] - Service reference card
- [[expose-service-publicly]] - Full setup guide and architecture
Add Fly.io public reverse proxy for docs.eblu.me (#120) ## Summary - Adds a Fly.io reverse proxy (`blumeops-proxy`) that tunnels public traffic to homelab services over Tailscale - First service exposed: `docs.eblu.me` — the Quartz static docs site - Includes Pulumi IaC for Tailscale auth key/ACLs and Gandi DNS CNAME - Adds mise tasks (`fly-deploy`, `fly-setup`, `fly-shutoff`) and Forgejo CI workflow ## Key details - Fly.io Firecracker VMs support TUN devices natively — no userspace networking needed - Tailscale auth key is `preauthorized=True` to avoid device approval hangs on container restarts - nginx caches aggressively for the static site; health check is on the default_server block - ACLs restrict `tag:flyio-proxy` to `tag:k8s` on port 443 only - DNS CNAME deployed and verified: `docs.eblu.me` → `blumeops-proxy.fly.dev` ## Test plan - [x] `curl -sf https://blumeops-proxy.fly.dev/healthz` returns `ok` - [x] `curl -I -H "Host: docs.eblu.me" https://blumeops-proxy.fly.dev/` returns 200 with `X-Cache-Status` - [x] `curl -I https://docs.eblu.me/` returns 200 with valid Let's Encrypt cert - [x] `dig forge.ops.eblu.me` still resolves to 100.98.163.89 (private services unaffected) - [x] Set `FLY_DEPLOY_TOKEN` Forgejo Actions secret for CI auto-deploy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/120 2026-02-08 02:36:19 -08:00			`---`
			`title: Manage Fly.io Proxy`
Update docs for Caddy routing and direct WireGuard peering Comprehensive docs pass reflecting the new Fly proxy architecture: - Fly proxy routes through Caddy on indri (not per-service TS Ingress) - Direct WireGuard peering via --port=41641 pinning - DERP relay performance lesson in Tailscale docs - Caddy now in public traffic path - indri tagged as flyio-target - Removed fly-reload references - Updated architecture diagrams and per-service setup guide - Added changelog fragment Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> 2026-04-18 09:57:30 -07:00			`modified: 2026-04-18`
			`last-reviewed: 2026-04-18`
Add Fly.io public reverse proxy for docs.eblu.me (#120) ## Summary - Adds a Fly.io reverse proxy (`blumeops-proxy`) that tunnels public traffic to homelab services over Tailscale - First service exposed: `docs.eblu.me` — the Quartz static docs site - Includes Pulumi IaC for Tailscale auth key/ACLs and Gandi DNS CNAME - Adds mise tasks (`fly-deploy`, `fly-setup`, `fly-shutoff`) and Forgejo CI workflow ## Key details - Fly.io Firecracker VMs support TUN devices natively — no userspace networking needed - Tailscale auth key is `preauthorized=True` to avoid device approval hangs on container restarts - nginx caches aggressively for the static site; health check is on the default_server block - ACLs restrict `tag:flyio-proxy` to `tag:k8s` on port 443 only - DNS CNAME deployed and verified: `docs.eblu.me` → `blumeops-proxy.fly.dev` ## Test plan - [x] `curl -sf https://blumeops-proxy.fly.dev/healthz` returns `ok` - [x] `curl -I -H "Host: docs.eblu.me" https://blumeops-proxy.fly.dev/` returns 200 with `X-Cache-Status` - [x] `curl -I https://docs.eblu.me/` returns 200 with valid Let's Encrypt cert - [x] `dig forge.ops.eblu.me` still resolves to 100.98.163.89 (private services unaffected) - [x] Set `FLY_DEPLOY_TOKEN` Forgejo Actions secret for CI auto-deploy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/120 2026-02-08 02:36:19 -08:00			`tags:`
			`- how-to`
			`- fly-io`
			`- networking`
			`- operations`
			`---`

			`# Manage Fly.io Proxy`

			`Operational tasks for the [[flyio-proxy]] public reverse proxy.`

			`## Deploy Changes`

			After modifying files in `fly/`:

			```bash
			`mise run fly-deploy`
			```

			Pushes to `fly/` on main also trigger automatic deployment via the Forgejo CI workflow.

			`## Add a New Public Service`

			`See [[expose-service-publicly#Per-service setup]] for the full walkthrough. In short:`

			1. Add a `server` block to `fly/nginx.conf`
			2. Add a Fly.io certificate: `fly certs add <domain> -a blumeops-proxy`
			3. Deploy: `mise run fly-deploy`
			4. Verify against `blumeops-proxy.fly.dev` with a `Host` header
			5. Add DNS CNAME via Pulumi: `mise run dns-preview` then `mise run dns-up`

			`## Emergency Shutoff`

			`If the proxy is causing issues (DDoS, unexpected traffic, bandwidth consumption on the home network):`

			`Level 1 — Stop the container (seconds, reversible):`
			```bash
			`mise run fly-shutoff`
			`# or: fly scale count 0 -a blumeops-proxy --yes`
			```
			All public services go offline immediately. Tailscale tunnel drops. Zero traffic reaches indri. Restore with `fly scale count 1 -a blumeops-proxy`.

			`Level 2 — Revoke Tailscale access (seconds):`
			Remove the `flyio-proxy` node in the Tailscale admin console. Even if the container is running, it cannot reach the tailnet. Use this if the container itself may be compromised.

			`Level 3 — Remove DNS (minutes to hours):`
			`Delete the CNAME records at Gandi. Takes time for DNS propagation but is the permanent shutoff.`

			Level 1 is the primary response. It is a single command, takes effect in seconds, and is trivially reversible. Keep `mise run fly-shutoff` somewhere easily accessible (e.g., pinned in a notes app) so it can be run quickly under stress.

			`## Check Status`

			```bash
			`# App and machine status`
			`fly status -a blumeops-proxy`

			`# Live logs`
			`fly logs -a blumeops-proxy`

			`# Health check`
			`curl -sf https://blumeops-proxy.fly.dev/healthz`

			`# Certificate status`
			`fly certs list -a blumeops-proxy`
			```

			`## Rotate Tailscale Auth Key`

			`The auth key expires every 90 days. To rotate:`

			1. Re-apply Pulumi to generate a new key: `mise run tailnet-up`
			2. Re-run setup to stage the new secret: `mise run fly-setup`
			3. Deploy to pick up the new secret: `mise run fly-deploy`

C1: SHA-pin tooling dependencies (2026-04 cycle) (#344) ## Summary Monthly tooling dependency refresh, with a one-time conversion from version-tag pins (`rev = "vX.Y.Z"`, `image:tag`, `>=`) to SHA / digest pins everywhere. ## Changes - prek hooks: all `rev = "vX.Y.Z"` → commit SHA + `# vX.Y.Z` comment. Bumped trufflehog (3.94.0→3.95.2), kingfisher (1.91.0→1.97.0), ruff (0.15.7→0.15.12), shfmt (3.13.0→3.13.1), prettier (3.8.1→3.8.3), actionlint (1.7.11→1.7.12). - fly/Dockerfile: tag pins → `image@sha256:...` digest pins. Bumped nginx (1.29.6→1.30.0-alpine), tailscale (v1.94.1→v1.94.2 — still inside the safe pre-1.96.5 range), alloy (v1.14.1→v1.16.0). - mise-tasks: PEP 723 inline deps converted from `>=` to `==` (PEP 508 doesn't support hashes inline). All scripts pinned to current latest: rich 15.0.0, typer 0.25.0, pyyaml 6.0.3, httpx 0.28.1. - prek `additional_dependencies`: ansible-lint==26.4.0, ansible-core==2.20.5. - taplo-lint: pass `--no-schema`. Upstream's `--default-schema-catalogs` returns a format taplo v0.9.3 can't parse — we don't validate against TOML schemas anyway, so this turns off the broken catalog fetch. - docs/update-tooling-dependencies: documents the SHA-pin convention, `docker buildx imagetools inspect` for digest lookup, and `prek clean` before re-verifying (cache grows to several GiB). Forgejo workflow `actions/checkout@v6.0.2` was already at the latest SHA — no change. ## Test plan - [x] `prek run --all-files` passes after `prek clean` - [x] `deploy-fly` workflow builds and deploys the new fly image on merge - [x] `fly status -a blumeops-proxy` healthy after deploy - [x] Spot-check a few mise tasks (`mise run blumeops-tasks`, `mise run docs-check-links`) to confirm pinned deps resolve cleanly Reviewed-on: https://forge.eblu.me/eblume/blumeops/pulls/344 2026-04-30 16:51:43 -07:00			`## Rotate Fly.io API Token`

			See [[rotate-fly-deploy-token]] for the full rotation procedure (75-day cadence, `org`-scoped).

Add Fly.io public reverse proxy for docs.eblu.me (#120) ## Summary - Adds a Fly.io reverse proxy (`blumeops-proxy`) that tunnels public traffic to homelab services over Tailscale - First service exposed: `docs.eblu.me` — the Quartz static docs site - Includes Pulumi IaC for Tailscale auth key/ACLs and Gandi DNS CNAME - Adds mise tasks (`fly-deploy`, `fly-setup`, `fly-shutoff`) and Forgejo CI workflow ## Key details - Fly.io Firecracker VMs support TUN devices natively — no userspace networking needed - Tailscale auth key is `preauthorized=True` to avoid device approval hangs on container restarts - nginx caches aggressively for the static site; health check is on the default_server block - ACLs restrict `tag:flyio-proxy` to `tag:k8s` on port 443 only - DNS CNAME deployed and verified: `docs.eblu.me` → `blumeops-proxy.fly.dev` ## Test plan - [x] `curl -sf https://blumeops-proxy.fly.dev/healthz` returns `ok` - [x] `curl -I -H "Host: docs.eblu.me" https://blumeops-proxy.fly.dev/` returns 200 with `X-Cache-Status` - [x] `curl -I https://docs.eblu.me/` returns 200 with valid Let's Encrypt cert - [x] `dig forge.ops.eblu.me` still resolves to 100.98.163.89 (private services unaffected) - [x] Set `FLY_DEPLOY_TOKEN` Forgejo Actions secret for CI auto-deploy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/120 2026-02-08 02:36:19 -08:00			`## Troubleshooting`

Switch Fly proxy to upstream keepalive pools (#337) ## Summary - Replace per-request DNS resolution (variable-based `proxy_pass`) with static `upstream` blocks and `keepalive` connection pools - Reuses TLS connections through the Tailscale tunnel instead of handshaking per request - Add `mise run fly-reload` for nginx config reload without full redeploy (re-resolves upstream DNS) ## Trade-off DNS is resolved at config load, not per-request. If Tailscale Ingress pods get new IPs (restart, reschedule), `mise run fly-reload` is needed. A Grafana alert will be added to detect this. ## Still TODO on this branch - [ ] Grafana alert for upstream unreachable (triggers fly-reload reminder) - [ ] Docs pass - [ ] Deploy from branch and verify latency improvement - [ ] Changelog fragment 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.eblu.me/eblume/blumeops/pulls/337 2026-04-17 16:39:52 -07:00			502 Bad Gateway on fresh deploy: MagicDNS may not be ready when nginx starts. The `start.sh` script polls `nslookup` before launching nginx, but if it still fails, check that `tailscale status` is healthy inside the container.
Add Fly.io public reverse proxy for docs.eblu.me (#120) ## Summary - Adds a Fly.io reverse proxy (`blumeops-proxy`) that tunnels public traffic to homelab services over Tailscale - First service exposed: `docs.eblu.me` — the Quartz static docs site - Includes Pulumi IaC for Tailscale auth key/ACLs and Gandi DNS CNAME - Adds mise tasks (`fly-deploy`, `fly-setup`, `fly-shutoff`) and Forgejo CI workflow ## Key details - Fly.io Firecracker VMs support TUN devices natively — no userspace networking needed - Tailscale auth key is `preauthorized=True` to avoid device approval hangs on container restarts - nginx caches aggressively for the static site; health check is on the default_server block - ACLs restrict `tag:flyio-proxy` to `tag:k8s` on port 443 only - DNS CNAME deployed and verified: `docs.eblu.me` → `blumeops-proxy.fly.dev` ## Test plan - [x] `curl -sf https://blumeops-proxy.fly.dev/healthz` returns `ok` - [x] `curl -I -H "Host: docs.eblu.me" https://blumeops-proxy.fly.dev/` returns 200 with `X-Cache-Status` - [x] `curl -I https://docs.eblu.me/` returns 200 with valid Let's Encrypt cert - [x] `dig forge.ops.eblu.me` still resolves to 100.98.163.89 (private services unaffected) - [x] Set `FLY_DEPLOY_TOKEN` Forgejo Actions secret for CI auto-deploy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/120 2026-02-08 02:36:19 -08:00
			Health check failing: `fly ssh console -a blumeops-proxy` then `curl localhost:8080/healthz` to test locally.

			TLS errors on custom domain: Check cert status with `fly certs show <domain> -a blumeops-proxy`. Certs auto-provision via Let's Encrypt and may take a few minutes.

Update docs for Caddy routing and direct WireGuard peering Comprehensive docs pass reflecting the new Fly proxy architecture: - Fly proxy routes through Caddy on indri (not per-service TS Ingress) - Direct WireGuard peering via --port=41641 pinning - DERP relay performance lesson in Tailscale docs - Caddy now in public traffic path - indri tagged as flyio-target - Removed fly-reload references - Updated architecture diagrams and per-service setup guide - Added changelog fragment Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> 2026-04-18 09:57:30 -07:00			High latency (>1s p50): Check if direct WireGuard peering is established: `fly ssh console -a blumeops-proxy -C "tailscale ping indri"`. If it shows `via DERP`, the tunnel is relayed and latency will be 10-30s. See [[tailscale#Direct Peering vs DERP Relay]] for diagnosis.
Switch Fly proxy to upstream keepalive pools (#337) ## Summary - Replace per-request DNS resolution (variable-based `proxy_pass`) with static `upstream` blocks and `keepalive` connection pools - Reuses TLS connections through the Tailscale tunnel instead of handshaking per request - Add `mise run fly-reload` for nginx config reload without full redeploy (re-resolves upstream DNS) ## Trade-off DNS is resolved at config load, not per-request. If Tailscale Ingress pods get new IPs (restart, reschedule), `mise run fly-reload` is needed. A Grafana alert will be added to detect this. ## Still TODO on this branch - [ ] Grafana alert for upstream unreachable (triggers fly-reload reminder) - [ ] Docs pass - [ ] Deploy from branch and verify latency improvement - [ ] Changelog fragment 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.eblu.me/eblume/blumeops/pulls/337 2026-04-17 16:39:52 -07:00
Add Fly.io public reverse proxy for docs.eblu.me (#120) ## Summary - Adds a Fly.io reverse proxy (`blumeops-proxy`) that tunnels public traffic to homelab services over Tailscale - First service exposed: `docs.eblu.me` — the Quartz static docs site - Includes Pulumi IaC for Tailscale auth key/ACLs and Gandi DNS CNAME - Adds mise tasks (`fly-deploy`, `fly-setup`, `fly-shutoff`) and Forgejo CI workflow ## Key details - Fly.io Firecracker VMs support TUN devices natively — no userspace networking needed - Tailscale auth key is `preauthorized=True` to avoid device approval hangs on container restarts - nginx caches aggressively for the static site; health check is on the default_server block - ACLs restrict `tag:flyio-proxy` to `tag:k8s` on port 443 only - DNS CNAME deployed and verified: `docs.eblu.me` → `blumeops-proxy.fly.dev` ## Test plan - [x] `curl -sf https://blumeops-proxy.fly.dev/healthz` returns `ok` - [x] `curl -I -H "Host: docs.eblu.me" https://blumeops-proxy.fly.dev/` returns 200 with `X-Cache-Status` - [x] `curl -I https://docs.eblu.me/` returns 200 with valid Let's Encrypt cert - [x] `dig forge.ops.eblu.me` still resolves to 100.98.163.89 (private services unaffected) - [x] Set `FLY_DEPLOY_TOKEN` Forgejo Actions secret for CI auto-deploy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/120 2026-02-08 02:36:19 -08:00			`## Related`

			`- [[flyio-proxy]] - Service reference card`
			`- [[expose-service-publicly]] - Full setup guide and architecture`