Fix 502 errors during Fly.io proxy deploys
The health check returned 200 immediately on nginx start, before Tailscale connected. Fly.io routed traffic to the new machine with a cold proxy cache and no MagicDNS, causing upstream DNS timeouts.

Defer the health check by returning 503 until a sentinel file (/tmp/tailscale-ready) is created after Tailscale connects. This keeps the old machine serving traffic during the startup window.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent 3415cad38c
commit b667f21e10
3 changed files with 9 additions and 4 deletions
docs/changelog.d/fix-deploy-healthcheck-race.bugfix.md (new file, 1 line added)
@@ -0,0 +1 @@
+Fix 502 errors during Fly.io proxy deploys by deferring health check until Tailscale is connected.
@@ -76,6 +76,9 @@ http {
 listen 8080 default_server;

 location /healthz {
+    if (!-f /tmp/tailscale-ready) {
+        return 503 "starting\n";
+    }
     return 200 "ok\n";
 }

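The gate in the hunk above can be reasoned about outside nginx. The following is a minimal sketch of the same decision in shell, assuming the sentinel path from the commit; `healthz_response` is a hypothetical helper introduced only for illustration and is not part of the change.

```shell
#!/bin/sh
# Sketch only: mirrors the /healthz decision from the nginx hunk.
# healthz_response is a hypothetical name; the sentinel path defaults
# to the one used in the commit and can be overridden for testing.
healthz_response() {
    sentinel="${1:-/tmp/tailscale-ready}"
    if [ ! -f "$sentinel" ]; then
        # Sentinel missing: Tailscale has not connected yet.
        echo "503 starting"
    else
        # Sentinel present: the startup script has signalled readiness.
        echo "200 ok"
    fi
}
```

Calling `healthz_response` before and after `touch /tmp/tailscale-ready` should show the switch from the 503 branch to the 200 branch, which is the transition the deploy health check waits for.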
@@ -1,9 +1,9 @@
 #!/bin/sh
 set -e

-# Start nginx immediately so port 8080 is bound before Fly's deploy checks.
-# Upstream DNS resolution is deferred via resolver + variable in nginx.conf,
-# so nginx starts cleanly even before Tailscale connects.
+# Start nginx immediately so port 8080 is bound (avoids connection refused).
+# Health check returns 503 until /tmp/tailscale-ready exists, so Fly.io
+# keeps the old machine serving traffic until Tailscale connects.
 nginx -g "daemon off;" &
 NGINX_PID=$!
 echo "Nginx started (waiting for Tailscale before proxying)"
@@ -16,8 +16,9 @@ sleep 2
 # Authenticate and join tailnet
 tailscale up --authkey="${TS_AUTHKEY}" --hostname=flyio-proxy

-# Wait for tailscale to be ready
+# Wait for tailscale to be ready, then signal nginx health check
 until tailscale status > /dev/null 2>&1; do sleep 1; done
+touch /tmp/tailscale-ready
 echo "Tailscale connected"

 # Start Alloy for observability (logs → Loki, metrics → Prometheus)
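The wait loop in this hunk retries `tailscale status` once per second with no upper bound. As a possible variation, not something this commit does, the wait could be capped so a machine that never joins the tailnet fails loudly instead of answering 503 indefinitely. `wait_until` and the limit of 60 attempts are assumptions made for this sketch.

```shell
#!/bin/sh
# Sketch only: wait_until is a hypothetical helper, not part of the commit.
# It retries a command once per second and gives up after max_tries attempts.
wait_until() {
    max_tries="$1"
    shift
    tries=0
    until "$@" > /dev/null 2>&1; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# Assumed usage in the startup script (60 is an arbitrary limit):
#   wait_until 60 tailscale status || { echo "Tailscale did not connect" >&2; exit 1; }
#   touch /tmp/tailscale-ready
```

Because the sentinel is only created after `wait_until` succeeds, the health check behaviour described in the commit message is unchanged; only the failure mode differs.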