Zero-downtime Fly.io deploys: bluegreen + startup reorder
Three changes to eliminate 502s during proxy deploys:

1. Start nginx after Tailscale connects (not before) so MagicDNS is always available when the first request arrives. This is the community-recommended pattern for Tailscale sidecars on Fly.io.
2. Switch deploy strategy to bluegreen: the old machine keeps serving traffic until the new one passes health checks, then Fly.io cuts over. Rolling deploys with a single machine always cause downtime.
3. Replace top-level [checks] with [[http_service.checks]]. Top-level checks only monitor; they don't gate traffic routing. Service-level checks tell the Fly Proxy to hold traffic until the app is ready.

The sentinel file (/tmp/tailscale-ready) and nginx if-check are removed since nginx no longer starts before Tailscale.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent bd61da4f85
commit 4bbe4e7c20
4 changed files with 20 additions and 24 deletions
1
docs/changelog.d/fix-zero-downtime-deploy.infra.md
Normal file
@@ -0,0 +1 @@
Eliminate 502 errors during Fly.io proxy deploys by starting nginx after Tailscale, switching to bluegreen deploys, and using service-level health checks for traffic gating.
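
For orientation, the deploy-strategy and health-check changes described in this commit might look roughly like the following fly.toml fragment. This is a sketch, not the committed file: the internal port, the check path, and all timing values are assumptions chosen for illustration.

```toml
# Hypothetical fly.toml fragment; port, path, and timings are assumed values.

[deploy]
  # Boot the new machine alongside the old one and cut traffic over only
  # after the new machine passes its health checks.
  strategy = "bluegreen"

[http_service]
  internal_port = 8080      # assumed nginx listen port
  force_https = true

  # Service-level check (replaces a top-level [checks] block): the Fly Proxy
  # consults it before routing requests to the machine.
  [[http_service.checks]]
    method = "GET"
    path = "/healthz"       # assumed health endpoint served by nginx
    interval = "10s"
    timeout = "2s"
    grace_period = "15s"    # assumed allowance for Tailscale to connect first
```

Because nginx now starts only after Tailscale is up, a passing check on the nginx port also implies that MagicDNS is reachable. The grace period would need to cover the Tailscale connection time observed in practice.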