blumeops/fly/start.sh
Erich Blume b667f21e10 Fix 502 errors during Fly.io proxy deploys
The health check returned 200 immediately on nginx start, before
Tailscale connected. Fly.io routed traffic to the new machine with
a cold proxy cache and no MagicDNS, causing upstream DNS timeouts.

Defer the health check by returning 503 until a sentinel file
(/tmp/tailscale-ready) is created after Tailscale connects. This
keeps the old machine serving traffic during the startup window.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 11:06:41 -08:00


#!/bin/sh
set -e
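# Defensive sketch (an addition, not part of the original commit): the health
# check described below keys off /tmp/tailscale-ready, so a sentinel left over
# from an earlier run of this script would make it report healthy too early.
# Assumption: /tmp might survive a restart of this process. If it is always
# fresh, this line is a harmless no-op.
rm -f /tmp/tailscale-ready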
# Start nginx immediately so port 8080 is bound (avoids connection refused).
# Health check returns 503 until /tmp/tailscale-ready exists, so Fly.io
# keeps the old machine serving traffic until Tailscale connects.
nginx -g "daemon off;" &
NGINX_PID=$!
echo "Nginx started (waiting for Tailscale before proxying)"
# Start tailscale daemon. Fly.io runs Firecracker microVMs which support
# TUN devices natively — no need for --tun=userspace-networking.
tailscaled --statedir=/var/lib/tailscale &
# Give tailscaled a moment to come up before running "tailscale up".
sleep 2
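# Possible hardening (a sketch, not part of the original commit): the fixed
# sleep above only guesses at how long tailscaled needs to start. The helper
# below polls for the daemon's control socket and gives up after a bounded
# number of one-second tries. Assumptions: tailscaled uses its default Linux
# socket path, and wait_for_path, TS_SOCKET, and TS_SOCKET_TRIES are
# illustrative names introduced here, not existing settings.
wait_for_path() {
  # $1 = path to wait for, $2 = maximum number of one-second tries
  _tries=0
  while [ ! -e "$1" ]; do
    if [ "$_tries" -ge "$2" ]; then
      return 1
    fi
    _tries=$((_tries + 1))
    sleep 1
  done
  return 0
}
TS_SOCKET="${TS_SOCKET:-/var/run/tailscale/tailscaled.sock}"
# On timeout, warn and continue: "tailscale up" will fail loudly (and set -e
# will stop the script) if the daemon really is not reachable.
wait_for_path "$TS_SOCKET" "${TS_SOCKET_TRIES:-5}" \
  || echo "Warning: $TS_SOCKET not found, trying tailscale up anyway" >&2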
# Authenticate and join tailnet
tailscale up --authkey="${TS_AUTHKEY}" --hostname=flyio-proxy
# Wait for tailscale to be ready, then signal nginx health check
until tailscale status > /dev/null 2>&1; do sleep 1; done
touch /tmp/tailscale-ready
echo "Tailscale connected"
# Start Alloy for observability (logs → Loki, metrics → Prometheus)
alloy run /etc/alloy/config.alloy \
  --server.http.listen-addr=127.0.0.1:12345 \
  --storage.path=/tmp/alloy-data &
echo "Alloy started"
# Block on nginx — container exits if nginx stops
wait "$NGINX_PID"