Rewrite public exposure guide to Fly.io + Tailscale approach
Replace the Cloudflare Tunnel plan with a Fly.io reverse proxy architecture that tunnels back to indri over Tailscale.

Covers:
- Full architecture with nginx proxy cache + rate limiting
- One-time setup vs per-service steps
- Fly.io container (Dockerfile, fly.toml, nginx.conf, start.sh)
- Pulumi IaC for Tailscale auth key + DNS CNAMEs
- Forgejo CI workflow for automated deploys
- Security model, DDoS considerations, break-glass shutoff
- Mise tasks: fly-deploy, fly-setup, fly-shutoff

Also fix docs-check-links to handle in-page anchor links ([[#Heading]]) and cross-file anchors ([[file#Heading]]).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent bc263f3ee8
commit 1de5492d6c
4 changed files with 471 additions and 135 deletions
|
|
@ -1 +1 @@
|
||||||
Add how-to guide for exposing services publicly via Cloudflare Tunnel.
|
Add how-to guide for exposing services publicly via Fly.io reverse proxy + Tailscale tunnel.
|
||||||
|
|
|
||||||
|
|
@ -2,197 +2,522 @@
|
||||||
title: Expose a Service Publicly
|
title: Expose a Service Publicly
|
||||||
tags:
|
tags:
|
||||||
- how-to
|
- how-to
|
||||||
- cloudflare
|
- fly-io
|
||||||
|
- tailscale
|
||||||
- networking
|
- networking
|
||||||
---
|
---
|
||||||
|
|
||||||
# Expose a Service Publicly via Cloudflare Tunnel
|
# Expose a Service Publicly via Fly.io + Tailscale
|
||||||
|
|
||||||
> **Status:** Plan — not yet implemented. Execute phases in order when ready.
|
> **Status:** Plan — not yet implemented. First target: `docs.eblu.me`.
|
||||||
|
|
||||||
This guide describes how to expose a BlumeOps service to the public internet securely using Cloudflare as a CDN and DDoS shield, with a Cloudflare Tunnel creating an outbound-only connection that never exposes the home IP.
|
This guide describes how to expose a BlumeOps service to the public internet using a reverse proxy container on [Fly.io](https://fly.io) that tunnels back to [[indri]] over [[tailscale]]. The approach keeps the home IP hidden, requires no changes to existing infrastructure (`*.ops.eblu.me`, [[caddy]], DNS), and is reusable for multiple services.
|
||||||
|
|
||||||
The first service to expose is `docs.eblu.me`. The pattern is reusable for future services.
|
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
||||||
```
|
```
|
||||||
Internet → docs.eblu.me (Cloudflare proxied CNAME)
|
Internet → <service>.eblu.me
|
||||||
│
|
│
|
||||||
Cloudflare Edge (CDN, WAF, DDoS protection)
|
Fly.io edge (Anycast, TLS via Let's Encrypt)
|
||||||
│
|
│
|
||||||
Cloudflare Tunnel (outbound from k8s)
|
Fly.io VM (nginx reverse proxy + Tailscale)
|
||||||
|
│ (WireGuard tunnel)
|
||||||
|
tailnet (tail8d86e.ts.net)
|
||||||
│
|
│
|
||||||
cloudflared pod in minikube
|
<service>.tail8d86e.ts.net (Tailscale ingress)
|
||||||
│
|
│
|
||||||
docs k8s Service (ClusterIP, port 80)
|
k8s Service → pod
|
||||||
│
|
|
||||||
docs pod (nginx + Quartz static site)
|
|
||||||
|
|
||||||
Tailnet → *.ops.eblu.me (unchanged, DNS-only to Tailscale IP)
|
|
||||||
```
|
```
|
||||||
|
|
||||||
All existing `*.ops.eblu.me` services remain private behind Tailscale. Only explicitly configured subdomains (like `docs.eblu.me`) are exposed publicly through Cloudflare.
|
A single Fly.io container serves as the public-facing proxy for all exposed services. Each service gets a `server` block in the nginx config and a DNS CNAME. The container joins the tailnet via an ephemeral auth key and reaches backend services through Tailscale ingress endpoints.
|
||||||
|
|
||||||
## Key Decisions
|
Existing `*.ops.eblu.me` services remain private behind Tailscale — this approach does not touch [[caddy]], [[gandi]] DNS-01, or any other existing infrastructure.
|
||||||
|
|
||||||
|
## Key decisions
|
||||||
|
|
||||||
| Decision | Choice | Rationale |
|
| Decision | Choice | Rationale |
|
||||||
|----------|--------|-----------|
|
|----------|--------|-----------|
|
||||||
| DNS hosting | Move from [[gandi]] to Cloudflare (free) | CNAME/partial setup needs Business plan @ $200/mo |
|
| Proxy host | Fly.io (free tier) | Managed container, no server to maintain via Ansible |
|
||||||
| Gandi role | Registrar only | Domain renewal, WHOIS. No more DNS hosting. |
|
| Tunnel | Tailscale (existing) | Already in use, WireGuard encryption, ACL control |
|
||||||
| Tunnel host | Kubernetes | ArgoCD managed, direct ClusterIP access, no Tailscale hop |
|
| DNS | CNAME at [[gandi]] | No DNS migration needed, no Cloudflare dependency |
|
||||||
| [[caddy]] TLS | Migrate to Cloudflare DNS-01 plugin | Gandi DNS-01 won't work after nameserver change |
|
| TLS (public) | Fly.io auto-provisions Let's Encrypt | No cert management, `$0.10/mo` per hostname |
|
||||||
| Cloudflare account | Recover existing, instrument with IaC | |
|
| TLS (origin) | Tailscale handles encryption | WireGuard tunnel encrypts all traffic |
|
||||||
|
| CDN/cache | nginx `proxy_cache` in container | Aggressive caching for static content, sufficient for personal sites |
|
||||||
|
| DDoS | Fly.io Anycast + nginx rate limiting | Not enterprise-grade; see [[#Break-glass shutoff]] |
|
||||||
|
| IaC | `fly/` directory in repo, Pulumi for DNS + TS key | No well-maintained Fly.io Pulumi provider; `fly.toml` is the app's IaC |
|
||||||
|
|
||||||
## Prerequisites
|
## TLS in this architecture
|
||||||
|
|
||||||
- Cloudflare account with `eblu.me` zone added (free plan)
|
There are three independent TLS segments — none involve Caddy:
|
||||||
- Cloudflare API token stored in 1Password with scopes: Zone:DNS:Edit, Zone:Zone:Read, Account:Cloudflare Tunnel:Edit, Account:Account Settings:Read
|
|
||||||
- Cloudflare account ID and zone ID noted
|
|
||||||
|
|
||||||
## Phase 0: Preparation (manual)
|
1. **Browser → Fly.io edge**: Fly.io auto-provisions a Let's Encrypt certificate for each custom domain (e.g., `docs.eblu.me`). Validated via TLS-ALPN challenge — no DNS API needed.
|
||||||
|
2. **nginx → Tailscale ingress**: nginx proxies to `https://<service>.tail8d86e.ts.net`. The Tailscale ingress serves a Tailscale-issued cert. nginx uses `proxy_ssl_verify off` since the underlying tunnel is already encrypted.
|
||||||
|
3. **WireGuard tunnel**: All Tailscale traffic is encrypted at the network layer regardless of application-level TLS.
|
||||||
|
|
||||||
1. Recover Cloudflare account access
|
Caddy continues to serve `*.ops.eblu.me` with its existing Gandi DNS-01 certificates. The two TLS domains are completely independent.
|
||||||
2. Add `eblu.me` zone (free plan) — Cloudflare scans existing records from Gandi
|
|
||||||
3. **Do not change nameservers yet** — wait until Phase 3
|
|
||||||
4. Create API token with the scopes listed above
|
|
||||||
5. Store API token and account ID in 1Password (blumeops vault)
|
|
||||||
|
|
||||||
## Phase 1: Caddy TLS migration
|
## External references
|
||||||
|
|
||||||
**Why first**: Blocking dependency for the nameserver change. Once nameservers move to Cloudflare, Gandi LiveDNS can't serve DNS-01 ACME challenges.
|
- [Tailscale on Fly.io](https://tailscale.com/kb/1132/flydotio) — official guide for running Tailscale in a Fly.io container
|
||||||
|
- [Fly.io Custom Domains](https://fly.io/docs/networking/custom-domain/) — how Fly handles TLS for custom domains
|
||||||
|
- [Home Assistant + Fly.io + Tailscale](https://community.home-assistant.io/t/expose-ha-to-the-internet-via-a-cloud-reverse-proxy-fly-io-and-a-vpn-tailscale-for-free-for-now-without-opening-ports/352118) — community guide describing this exact pattern
|
||||||
|
|
||||||
### Caddy binary rebuild
|
---
|
||||||
|
|
||||||
Rebuild Caddy with `github.com/caddy-dns/cloudflare` instead of `github.com/caddy-dns/gandi` using `xcaddy` in `~/code/3rd/caddy/`.
|
## One-time setup (first service)
|
||||||
|
|
||||||
### Files to modify
|
These steps establish the Fly.io proxy infrastructure. They only need to be done once.
|
||||||
|
|
||||||
- `ansible/roles/caddy/templates/Caddyfile.j2` — change `dns gandi {env.GANDI_BEARER_TOKEN}` to `dns cloudflare {env.CF_API_TOKEN}`
|
### Step 1: Fly.io account and app
|
||||||
- `ansible/roles/caddy/templates/caddy-wrapper.sh.j2` — source Cloudflare API token instead of Gandi PAT
|
|
||||||
- `ansible/roles/caddy/defaults/main.yml` — update token variable name
|
|
||||||
- `ansible/playbooks/indri.yml` — add pre_task to fetch Cloudflare API token from 1Password, replace Gandi PAT fetch
|
|
||||||
|
|
||||||
### Deployment sequence
|
1. Create or recover a Fly.io account at https://fly.io (requires credit card for free tier)
|
||||||
|
2. Install `flyctl`: `brew install flyctl`
|
||||||
|
3. Authenticate: `fly auth login`
|
||||||
|
4. Create the app: `fly apps create blumeops-proxy`
|
||||||
|
5. Store the Fly.io deploy token in 1Password (blumeops vault):
|
||||||
|
- Generate: `fly tokens create deploy -a blumeops-proxy`
|
||||||
|
- Store as `fly-deploy-token` field
|
||||||
|
|
||||||
1. Set up Cloudflare zone with all records (Phase 2)
|
### Step 2: Repository structure
|
||||||
2. Prepare Caddy migration on a branch (this phase)
|
|
||||||
3. Change nameservers at Gandi (Phase 3)
|
|
||||||
4. Immediately deploy Caddy update: `mise run provision-indri -- --tags caddy`
|
|
||||||
5. Caddy's next TLS renewal uses Cloudflare DNS-01
|
|
||||||
|
|
||||||
Existing certificates are valid for ~90 days, providing a grace window.
|
Create the `fly/` directory at the repository root. This is separate from `containers/` because the image is built and deployed directly to Fly.io by `fly deploy` — it never goes through `registry.ops.eblu.me`.
|
||||||
|
|
||||||
## Phase 2: Pulumi — Cloudflare IaC
|
```
|
||||||
|
fly/
|
||||||
|
├── README.md # Setup notes and context
|
||||||
|
├── fly.toml # Fly.io app configuration
|
||||||
|
├── Dockerfile # nginx + tailscale
|
||||||
|
├── nginx.conf # Reverse proxy + cache config
|
||||||
|
└── start.sh # Entrypoint: start tailscale, then nginx
|
||||||
|
```
|
||||||
|
|
||||||
Create a new Pulumi project at `pulumi/cloudflare/`.
|
**`fly/fly.toml`** — app configuration:
|
||||||
|
|
||||||
### Files to create
|
```toml
|
||||||
|
app = "blumeops-proxy"
|
||||||
|
primary_region = "sjc"
|
||||||
|
|
||||||
- `pulumi/cloudflare/Pulumi.yaml` — project definition (`blumeops-cloudflare`, python/uv)
|
[build]
|
||||||
- `pulumi/cloudflare/Pulumi.eblu-me.yaml` — stack config (domain, account-id)
|
|
||||||
- `pulumi/cloudflare/pyproject.toml` — deps: `pulumi>=3.0.0`, `pulumi-cloudflare>=5.0.0`
|
|
||||||
- `pulumi/cloudflare/__main__.py`
|
|
||||||
|
|
||||||
### Pulumi program manages
|
[http_service]
|
||||||
|
internal_port = 8080
|
||||||
|
force_https = true
|
||||||
|
auto_stop_machines = false
|
||||||
|
auto_start_machines = true
|
||||||
|
min_machines_running = 1
|
||||||
|
|
||||||
- Zone lookup for `eblu.me`
|
[checks]
|
||||||
- DNS records:
|
[checks.health]
|
||||||
- `*.ops.eblu.me` A record → Tailscale IP, **proxied=False** (grey cloud, private)
|
port = 8080
|
||||||
- `ops.eblu.me` A record → Tailscale IP, **proxied=False**
|
type = "http"
|
||||||
- `docs.eblu.me` CNAME → `<tunnel-id>.cfargotunnel.com`, **proxied=True** (orange cloud, CDN)
|
interval = "30s"
|
||||||
- Cloudflare Tunnel resource
|
timeout = "5s"
|
||||||
- Tunnel config (ingress: `docs.eblu.me` → `http://docs.docs.svc.cluster.local:80`)
|
path = "/healthz"
|
||||||
- Cache rules for static docs site (edge TTL: 1 day, browser TTL: 1 hour)
|
```
|
||||||
- Zone security settings (SSL: full, min TLS 1.2, always HTTPS)
|
|
||||||
|
|
||||||
### New mise tasks
|
**`fly/Dockerfile`** — nginx + tailscale:
|
||||||
|
|
||||||
Following the `dns-preview`/`dns-up` pattern:
|
```dockerfile
|
||||||
|
FROM nginx:alpine
|
||||||
|
|
||||||
- `mise-tasks/cloudflare-preview` — `pulumi preview` with 1Password token injection
|
# Copy tailscale binaries from official image
|
||||||
- `mise-tasks/cloudflare-up` — `pulumi up` with 1Password token injection
|
COPY --from=docker.io/tailscale/tailscale:stable \
|
||||||
|
/usr/local/bin/tailscaled /usr/local/bin/tailscaled
|
||||||
|
COPY --from=docker.io/tailscale/tailscale:stable \
|
||||||
|
/usr/local/bin/tailscale /usr/local/bin/tailscale
|
||||||
|
|
||||||
Keep `pulumi/gandi/` until migration is confirmed working. Then `pulumi destroy` the Gandi stack and archive the code.
|
RUN mkdir -p /var/run/tailscale /var/lib/tailscale
|
||||||
|
|
||||||
## Phase 3: DNS migration
|
COPY nginx.conf /etc/nginx/nginx.conf
|
||||||
|
COPY start.sh /start.sh
|
||||||
|
RUN chmod +x /start.sh
|
||||||
|
|
||||||
### Pre-migration checklist
|
EXPOSE 8080
|
||||||
|
|
||||||
- [ ] Cloudflare zone active with all records (Phase 2)
|
CMD ["/start.sh"]
|
||||||
- [ ] Caddy migration branch ready (Phase 1)
|
```
|
||||||
- [ ] Cloudflare Tunnel created and configured (Phase 2)
|
|
||||||
- [ ] cloudflared running in k8s (Phase 4)
|
|
||||||
|
|
||||||
### Steps
|
**`fly/start.sh`** — entrypoint:
|
||||||
|
|
||||||
1. At Gandi registrar dashboard: change nameservers to Cloudflare's assigned NS
|
```bash
|
||||||
2. Deploy Caddy update immediately: `mise run provision-indri -- --tags caddy`
|
#!/bin/sh
|
||||||
3. Monitor propagation: `dig +trace docs.eblu.me`, `dig +trace forge.ops.eblu.me`
|
set -e
|
||||||
4. Verify tailnet services still work from tailnet clients
|
|
||||||
5. Verify `docs.eblu.me` resolves publicly
|
|
||||||
|
|
||||||
### Rollback
|
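# Start tailscale in userspace networking mode (no TUN device needed).
# Caveat: in this mode outbound tailnet connections only work through
# tailscaled's SOCKS5/HTTP proxy, which nginx proxy_pass cannot use.
# Verify connectivity before relying on this, or run tailscaled with a TUN device.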
|
||||||
|
tailscaled --tun=userspace-networking --statedir=/var/lib/tailscale &
|
||||||
|
sleep 2
|
||||||
|
|
||||||
Change nameservers back to Gandi's at registrar. Everything reverts.
|
# Authenticate and join tailnet
|
||||||
|
tailscale up --authkey="${TS_AUTHKEY}" --hostname=flyio-proxy
|
||||||
|
|
||||||
## Phase 4: cloudflared in Kubernetes
|
# Wait for tailscale to be ready
|
||||||
|
until tailscale status > /dev/null 2>&1; do sleep 1; done
|
||||||
|
echo "Tailscale connected"
|
||||||
|
|
||||||
### Files to create
|
# Start nginx
|
||||||
|
nginx -g "daemon off;"
|
||||||
|
```
|
||||||
|
|
||||||
- `argocd/apps/cloudflare-tunnel.yaml` — ArgoCD Application
|
**`fly/nginx.conf`** — reverse proxy with caching and rate limiting:
|
||||||
- `argocd/manifests/cloudflare-tunnel/deployment.yaml` — cloudflared Deployment
|
|
||||||
- Image: `cloudflare/cloudflared:latest` (or pinned version)
|
|
||||||
- Args: `tunnel --no-autoupdate run --token <tunnel-token>`
|
|
||||||
- Single replica, tunnel token injected from a Secret
|
|
||||||
- `argocd/manifests/cloudflare-tunnel/external-secret.yaml` — ExternalSecret to pull tunnel token from 1Password
|
|
||||||
- `argocd/manifests/cloudflare-tunnel/kustomization.yaml`
|
|
||||||
|
|
||||||
### Tunnel routing (managed by Pulumi)
|
```nginx
|
||||||
|
worker_processes auto;
|
||||||
|
|
||||||
- `docs.eblu.me` → `http://docs.docs.svc.cluster.local:80` (direct k8s service access)
|
events {
|
||||||
- Catch-all → `http_status:404`
|
worker_connections 1024;
|
||||||
|
}
|
||||||
|
|
||||||
Namespace: `cloudflare-tunnel` (dedicated, reusable for future public services)
|
http {
|
||||||
|
include /etc/nginx/mime.types;
|
||||||
|
default_type application/octet-stream;
|
||||||
|
|
||||||
## Phase 5: Documentation and cleanup
|
# Rate limiting: 10 requests/sec per IP, burst of 20
|
||||||
|
limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
|
||||||
|
|
||||||
### Files to create
|
# Proxy cache: 200MB, evict after 24h of no access
|
||||||
|
proxy_cache_path /tmp/cache levels=1:2 keys_zone=services:10m
|
||||||
|
max_size=200m inactive=24h;
|
||||||
|
|
||||||
- `docs/reference/infrastructure/cloudflare.md` — reference card
|
# --- docs.eblu.me ---
|
||||||
- `docs/changelog.d/<branch>.feature.md` — changelog fragment
|
server {
|
||||||
|
listen 8080;
|
||||||
|
server_name docs.eblu.me;
|
||||||
|
|
||||||
### Files to modify
|
limit_req zone=general burst=20 nodelay;
|
||||||
|
|
||||||
- `docs/reference/infrastructure/routing.md` — add public services section
|
location / {
|
||||||
- `docs/reference/infrastructure/gandi.md` — update to registrar-only role
|
proxy_pass https://docs.tail8d86e.ts.net;
|
||||||
- `docs/reference/services/docs.md` — add public URL `https://docs.eblu.me`
|
# Send SNI so the Tailscale ingress can select its certificate
proxy_ssl_server_name on;
proxy_ssl_verify off;
|
||||||
- `docs/reference/reference.md` — add Cloudflare to infrastructure section
|
|
||||||
- `CLAUDE.md` — update routing table, add cloudflare tasks
|
# Cache aggressively — static site
|
||||||
|
proxy_cache services;
|
||||||
|
proxy_cache_valid 200 1d;
|
||||||
|
proxy_cache_valid 404 1m;
|
||||||
|
proxy_cache_use_stale error timeout updating;
|
||||||
|
proxy_cache_lock on;
|
||||||
|
|
||||||
|
# Prevent cache-busting: ignore query strings and
|
||||||
|
# upstream Cache-Control/Set-Cookie response headers
|
||||||
|
proxy_cache_key $host$uri;
|
||||||
|
proxy_ignore_headers Cache-Control Set-Cookie;
|
||||||
|
|
||||||
|
add_header X-Cache-Status $upstream_cache_status;
|
||||||
|
}
|
||||||
|
|
||||||
|
location /healthz {
|
||||||
|
return 200 "ok\n";
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
# Catch-all: reject unknown hosts
|
||||||
|
server {
|
||||||
|
listen 8080 default_server;
|
||||||
|
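# Answer health checks that arrive without a configured server_name
# (for example via blumeops-proxy.fly.dev); drop everything else
location = /healthz { return 200 "ok\n"; }
location / { return 444; }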
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
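
The `limit_req` settings above (`rate=10r/s`, `burst=20 nodelay`) follow a leaky-bucket scheme. The sketch below is a simplified model of that behaviour for a single client IP, not nginx's actual implementation; the `LeakyBucket` class and its method names are hypothetical and exist only to illustrate how the two numbers interact.

```python
class LeakyBucket:
    """Simplified per-client model of `limit_req ... burst=N nodelay`."""

    def __init__(self, rate: float, burst: int) -> None:
        self.rate = rate      # requests drained from the bucket per second
        self.burst = burst    # excess requests tolerated above the rate
        self.excess = 0.0     # current fill level of the bucket
        self.last = None      # timestamp of the last accepted request

    def allow(self, now: float) -> bool:
        # The first request from a client starts tracking with an empty bucket.
        if self.last is None:
            self.last = now
            return True
        elapsed = max(0.0, now - self.last)
        # Drain the bucket for the elapsed time, then add this request.
        excess = max(0.0, self.excess - self.rate * elapsed) + 1.0
        if excess > self.burst:
            return False      # rejected requests do not change the state
        self.excess = excess
        self.last = now
        return True


bucket = LeakyBucket(rate=10, burst=20)
accepted_at_once = sum(bucket.allow(0.0) for _ in range(30))
accepted_one_second_later = sum(bucket.allow(1.0) for _ in range(15))
print(accepted_at_once, accepted_one_second_later)
```

In this model a client that sends 30 requests in the same instant gets the initial request plus the burst allowance through, and one second later only the drained capacity (10 requests) is available again.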
|
||||||
|
|
||||||
|
### Step 3: Tailscale auth key and ACLs (Pulumi)
|
||||||
|
|
||||||
|
Extend the existing `pulumi/tailscale/` project.
|
||||||
|
|
||||||
|
**Add to `pulumi/tailscale/__main__.py`:**
|
||||||
|
|
||||||
|
```python
|
||||||
|
# Auth key for Fly.io proxy container
|
||||||
|
flyio_key = tailscale.TailnetKey(
|
||||||
|
"flyio-proxy-key",
|
||||||
|
reusable=True,
|
||||||
|
ephemeral=True,
|
||||||
|
tags=["tag:flyio-proxy"],
|
||||||
|
expiry=7776000, # 90 days
|
||||||
|
)
|
||||||
|
pulumi.export("flyio_authkey", flyio_key.key)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Add to `pulumi/tailscale/policy.hujson`:**
|
||||||
|
|
||||||
|
Tag owner:
|
||||||
|
```
|
||||||
|
"tag:flyio-proxy": ["autogroup:admin", "tag:blumeops"],
|
||||||
|
```
|
||||||
|
|
||||||
|
Access grant (Fly.io proxy → k8s services on HTTPS only):
|
||||||
|
```
|
||||||
|
{
|
||||||
|
"src": ["tag:flyio-proxy"],
|
||||||
|
"dst": ["tag:k8s"],
|
||||||
|
"ip": ["tcp:443"],
|
||||||
|
},
|
||||||
|
```
|
||||||
|
|
||||||
|
ACL test:
|
||||||
|
```
|
||||||
|
{
|
||||||
|
"src": "tag:flyio-proxy",
|
||||||
|
"accept": ["tag:k8s:443"],
|
||||||
|
"deny": ["tag:homelab:22", "tag:nas:445", "tag:registry:443"],
|
||||||
|
},
|
||||||
|
```
|
||||||
|
|
||||||
|
Deploy: `mise run tailnet-preview` then `mise run tailnet-up`.
|
||||||
|
|
||||||
|
After deploying, extract the auth key and set it as a Fly.io secret:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Get the key from Pulumi state
|
||||||
|
cd pulumi/tailscale && pulumi stack output flyio_authkey --show-secrets
|
||||||
|
|
||||||
|
# Set it in Fly.io
|
||||||
|
fly secrets set TS_AUTHKEY="tskey-auth-..." -a blumeops-proxy
|
||||||
|
```
|
||||||
|
|
||||||
|
Store the auth key in 1Password as well for the `fly-setup` mise task.
|
||||||
|
|
||||||
|
### Step 4: Mise tasks
|
||||||
|
|
||||||
|
**`mise-tasks/fly-deploy`:**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
#!/usr/bin/env bash
|
||||||
|
#MISE description="Deploy the Fly.io public proxy"
|
||||||
|
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
cd "$(dirname "$0")/../fly"
|
||||||
|
fly deploy "$@"
|
||||||
|
```
|
||||||
|
|
||||||
|
**`mise-tasks/fly-setup`:**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
#!/usr/bin/env bash
|
||||||
|
#MISE description="One-time setup: configure Fly.io secrets and certs (idempotent)"
|
||||||
|
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
APP="blumeops-proxy"
|
||||||
|
|
||||||
|
# Fetch Tailscale auth key from 1Password
|
||||||
|
TS_AUTHKEY=$(op --vault vg6xf6vvfmoh5hqjjhlhbeoaie item get <FLY_ITEM_ID> --fields ts-authkey --reveal)
|
||||||
|
fly secrets set TS_AUTHKEY="$TS_AUTHKEY" -a "$APP"
|
||||||
|
echo "Tailscale auth key set"
|
||||||
|
|
||||||
|
# Add certs for all public domains (idempotent — fly ignores duplicates)
|
||||||
|
fly certs add docs.eblu.me -a "$APP" 2>/dev/null || true
|
||||||
|
# fly certs add wiki.eblu.me -a "$APP" 2>/dev/null || true # future services
|
||||||
|
echo "Certificates configured"
|
||||||
|
|
||||||
|
echo "Done. Run 'mise run fly-deploy' to deploy."
|
||||||
|
```
|
||||||
|
|
||||||
|
**`mise-tasks/fly-shutoff`:**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
#!/usr/bin/env bash
|
||||||
|
#MISE description="Emergency shutoff: stop all Fly.io proxy machines"
|
||||||
|
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
APP="blumeops-proxy"
|
||||||
|
|
||||||
|
echo "EMERGENCY SHUTOFF: Stopping all machines for $APP"
|
||||||
|
fly scale count 0 -a "$APP" --yes
|
||||||
|
echo "All machines stopped. Public services are offline."
|
||||||
|
echo "To restore: fly scale count 1 -a $APP"
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 5: Forgejo CI workflow
|
||||||
|
|
||||||
|
**`.forgejo/workflows/deploy-fly.yaml`:**
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
name: Deploy Fly.io Proxy
|
||||||
|
|
||||||
|
on:
|
||||||
|
workflow_dispatch:
|
||||||
|
push:
|
||||||
|
branches: [main]
|
||||||
|
paths:
|
||||||
|
- 'fly/**'
|
||||||
|
|
||||||
|
jobs:
|
||||||
|
deploy:
|
||||||
|
runs-on: k8s
|
||||||
|
steps:
|
||||||
|
- name: Checkout
|
||||||
|
uses: actions/checkout@v4
|
||||||
|
|
||||||
|
- name: Install flyctl
|
||||||
|
run: |
|
||||||
|
curl -L https://fly.io/install.sh | sh
|
||||||
|
echo "/root/.fly/bin" >> "$GITHUB_PATH"
|
||||||
|
|
||||||
|
- name: Deploy to Fly.io
|
||||||
|
env:
|
||||||
|
FLY_API_TOKEN: ${{ secrets.FLY_DEPLOY_TOKEN }}
|
||||||
|
run: |
|
||||||
|
cd fly
|
||||||
|
fly deploy
|
||||||
|
|
||||||
|
- name: Verify health
|
||||||
|
env:
|
||||||
|
FLY_API_TOKEN: ${{ secrets.FLY_DEPLOY_TOKEN }}
|
||||||
|
run: |
|
||||||
|
fly status -a blumeops-proxy
|
||||||
|
echo ""
|
||||||
|
echo "Health check:"
|
||||||
|
sleep 10
|
||||||
|
curl -sf https://blumeops-proxy.fly.dev/healthz || echo "Warning: health check failed (may need DNS propagation)"
|
||||||
|
```
|
||||||
|
|
||||||
|
The `FLY_DEPLOY_TOKEN` Forgejo Actions secret must be set via the [[forgejo]] API or UI, following the pattern in the `forgejo_actions_secrets` Ansible role.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Per-service setup
|
||||||
|
|
||||||
|
To expose an additional service (example: `wiki.eblu.me`):
|
||||||
|
|
||||||
|
### 1. Add nginx server block
|
||||||
|
|
||||||
|
Edit `fly/nginx.conf` — add a new `server` block:
|
||||||
|
|
||||||
|
```nginx
|
||||||
|
# --- wiki.eblu.me ---
|
||||||
|
server {
|
||||||
|
listen 8080;
|
||||||
|
server_name wiki.eblu.me;
|
||||||
|
|
||||||
|
limit_req zone=general burst=20 nodelay;
|
||||||
|
|
||||||
|
location / {
|
||||||
|
proxy_pass https://wiki.tail8d86e.ts.net;
|
||||||
|
# Send SNI so the Tailscale ingress can select its certificate
proxy_ssl_server_name on;
proxy_ssl_verify off;
|
||||||
|
|
||||||
|
proxy_cache services;
|
||||||
|
proxy_cache_valid 200 1d;
|
||||||
|
proxy_cache_valid 404 1m;
|
||||||
|
proxy_cache_use_stale error timeout updating;
|
||||||
|
proxy_cache_lock on;
|
||||||
|
proxy_cache_key $host$uri;
|
||||||
|
proxy_ignore_headers Cache-Control Set-Cookie;
|
||||||
|
|
||||||
|
add_header X-Cache-Status $upstream_cache_status;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Adjust `proxy_cache_valid` and `proxy_cache_key` based on the service. For dynamic services with user sessions, you'll want shorter cache TTLs and may need to include query strings or cookies in the cache key.
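
Since per-service blocks differ mainly in hostname, upstream, TTL, and cache key, they could in principle be rendered from a template. The following is only a sketch under that assumption: `render_server_block`, `SERVER_BLOCK`, and the parameter names are hypothetical helpers, not part of the repository, and the block is abbreviated compared with the full example above.

```python
from string import Template

# Abbreviated server block; $name placeholders are filled in by substitute().
SERVER_BLOCK = Template("""\
# --- $public_host ---
server {
    listen 8080;
    server_name $public_host;

    limit_req zone=general burst=20 nodelay;

    location / {
        proxy_pass https://$tailnet_host;
        proxy_ssl_verify off;

        proxy_cache services;
        proxy_cache_valid 200 $ttl;
        proxy_cache_key $cache_key;
    }
}
""")


def render_server_block(
    name: str,
    domain: str = "eblu.me",
    tailnet: str = "tail8d86e.ts.net",
    ttl: str = "1d",
    cache_key: str = "$host$uri",
) -> str:
    """Render one nginx server block for a public service."""
    return SERVER_BLOCK.substitute(
        public_host=f"{name}.{domain}",
        tailnet_host=f"{name}.{tailnet}",
        ttl=ttl,
        cache_key=cache_key,
    )


# A dynamic service: shorter TTL, and the query string is part of the key
# (nginx's $request_uri includes the query string, $uri does not).
block = render_server_block("wiki", ttl="5m", cache_key="$host$request_uri")
print(block)
```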
|
||||||
|
|
||||||
|
### 2. Add DNS CNAME (Pulumi)
|
||||||
|
|
||||||
|
Add to `pulumi/gandi/__main__.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
wiki_public = gandi.livedns.Record(
|
||||||
|
"wiki-public",
|
||||||
|
zone=domain,
|
||||||
|
name="wiki",
|
||||||
|
type="CNAME",
|
||||||
|
ttl=300,
|
||||||
|
values=["blumeops-proxy.fly.dev."],
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
Deploy: `mise run dns-preview` then `mise run dns-up`.
|
||||||
|
|
||||||
|
### 3. Add Fly.io certificate
|
||||||
|
|
||||||
|
```bash
|
||||||
|
fly certs add wiki.eblu.me -a blumeops-proxy
|
||||||
|
```
|
||||||
|
|
||||||
|
Or add it to `mise-tasks/fly-setup` so it's captured for future runs.
|
||||||
|
|
||||||
|
### 4. Deploy
|
||||||
|
|
||||||
|
```bash
|
||||||
|
mise run fly-deploy
|
||||||
|
```
|
||||||
|
|
||||||
|
Or push the `fly/nginx.conf` change to main — the Forgejo workflow deploys automatically.
|
||||||
|
|
||||||
|
### 5. Verify
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -I https://wiki.eblu.me
|
||||||
|
# Should return 200 with X-Cache-Status header
|
||||||
|
```
|
||||||
|
|
||||||
|
### 6. Update Tailscale ACLs if needed
|
||||||
|
|
||||||
|
If the new service uses a Tailscale tag not already in the `tag:flyio-proxy` grant, add it to `policy.hujson`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Security
|
||||||
|
|
||||||
|
### DDoS and rate limiting
|
||||||
|
|
||||||
|
This approach provides basic protection, not enterprise-grade:
|
||||||
|
|
||||||
|
- **Fly.io Anycast** absorbs volumetric L3/L4 attacks
|
||||||
|
- **nginx `limit_req`** caps per-IP request rates at the container level
|
||||||
|
- **nginx `proxy_cache`** serves most requests from cache — only cache misses traverse the Tailscale tunnel to indri
|
||||||
|
- **`proxy_cache_key $host$uri`** ignores query strings, preventing trivial cache-busting
|
||||||
|
- **`proxy_ignore_headers Cache-Control`** makes nginx cache responses even when the upstream sends `Cache-Control` headers that would otherwise prevent caching
|
||||||
|
|
||||||
|
This is sufficient for a personal documentation site. It is **not** sufficient for a service that might attract targeted attacks. For enterprise-grade DDoS protection, Cloudflare Tunnel is the better approach (requires DNS migration, see plan history in git).
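
To illustrate why keying the cache on `$host$uri` blunts trivial cache-busting, the sketch below approximates the key nginx would compute. It is an approximation only: nginx normalizes `$uri` in ways this does not model, and the `cache_key` function is a hypothetical helper.

```python
from urllib.parse import urlsplit


def cache_key(url: str, include_query: bool = False) -> str:
    """Approximate `$host$uri` (or `$host$request_uri` when include_query is set)."""
    parts = urlsplit(url)
    key = f"{parts.hostname}{parts.path or '/'}"
    if include_query and parts.query:
        key += f"?{parts.query}"
    return key


# Two requests that differ only in a throwaway query parameter.
a = cache_key("https://docs.eblu.me/how-to/?cachebust=1")
b = cache_key("https://docs.eblu.me/how-to/?cachebust=2")
print(a == b)  # same key, so the second request can be served from cache

# With the query string in the key, each variant is a separate cache entry.
c = cache_key("https://docs.eblu.me/how-to/?cachebust=1", include_query=True)
d = cache_key("https://docs.eblu.me/how-to/?cachebust=2", include_query=True)
print(c == d)
```

The trade-off is that any service whose responses legitimately vary by query string needs the query included in its key.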
|
||||||
|
|
||||||
|
### What fail2ban is (and why it doesn't apply)
|
||||||
|
|
||||||
|
fail2ban monitors logs for repeated failed authentication attempts (SSH brute force, bad login passwords) and bans IPs via firewall rules. A static site with no authentication has no login surface for fail2ban to monitor. It is a tool for services with user sessions, not for CDN/proxy protection.
|
||||||
|
|
||||||
|
### Break-glass shutoff
|
||||||
|
|
||||||
|
If the proxy is causing issues (DDoS, unexpected traffic, bandwidth consumption on the home network):
|
||||||
|
|
||||||
|
**Level 1 — Stop the container (seconds, reversible):**
|
||||||
|
```bash
|
||||||
|
mise run fly-shutoff
|
||||||
|
# or: fly scale count 0 -a blumeops-proxy --yes
|
||||||
|
```
|
||||||
|
All public services go offline immediately. Tailscale tunnel drops. Zero traffic reaches indri. Restore with `fly scale count 1 -a blumeops-proxy`.
|
||||||
|
|
||||||
|
**Level 2 — Revoke Tailscale access (seconds):**
|
||||||
|
Remove the `flyio-proxy` node in the Tailscale admin console and revoke its auth key (the key is reusable, so a running container could otherwise rejoin). Even if the container is running, it then cannot reach the tailnet. Use this if the container itself may be compromised.
|
||||||
|
|
||||||
|
**Level 3 — Remove DNS (minutes to hours):**
|
||||||
|
Delete the CNAME records at Gandi. Takes time for DNS propagation but is the permanent shutoff.
|
||||||
|
|
||||||
|
**Level 1 is the primary response.** It is a single command, takes effect in seconds, and is trivially reversible. Document the `mise run fly-shutoff` command somewhere easily accessible (e.g., pinned in a notes app) so it can be run quickly under stress.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## IaC summary
|
||||||
|
|
||||||
|
| Component | Managed by | Declarative? |
|
||||||
|
|-----------|------------|:---:|
|
||||||
|
| Tailscale auth key | Pulumi (`pulumi/tailscale/`) | yes |
|
||||||
|
| Tailscale ACLs | Pulumi (`pulumi/tailscale/policy.hujson`) | yes |
|
||||||
|
| DNS CNAMEs | Pulumi (`pulumi/gandi/`) | yes |
|
||||||
|
| Container + app config | `fly/Dockerfile` + `fly/fly.toml` in repo | yes |
|
||||||
|
| Deployment | Forgejo CI on push to `fly/`, or `mise run fly-deploy` | yes |
|
||||||
|
| Fly.io secrets + certs | `mise run fly-setup` (one-time, idempotent) | semi |
|
||||||
|
|
||||||
|
Fly.io secrets and certs are marked "semi" because they are set imperatively, although through a repeatable, idempotent mise task. Fly.io does not have a mature Pulumi or Terraform provider, so `fly.toml` + `flyctl` is the standard IaC model for Fly.io apps.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Verification
|
## Verification
|
||||||
|
|
||||||
1. `curl -I https://docs.eblu.me` from public internet — returns 200 with `cf-ray` header
|
After initial deployment of a service (using `docs.eblu.me` as example):
|
||||||
2. `dig docs.eblu.me` — shows Cloudflare IPs (not Tailscale IP)
|
|
||||||
3. `dig forge.ops.eblu.me` — still shows `100.98.163.89` (Tailscale IP)
|
1. `curl -I https://docs.eblu.me` — returns 200 with `X-Cache-Status` header
|
||||||
4. All `*.ops.eblu.me` services accessible from tailnet
|
2. `dig docs.eblu.me` — resolves to Fly.io IPs (not Tailscale IP)
|
||||||
|
3. `dig forge.ops.eblu.me` — still resolves to `100.98.163.89` (unchanged)
|
||||||
|
4. All `*.ops.eblu.me` services work from tailnet
|
||||||
5. `mise run services-check` passes
|
5. `mise run services-check` passes
|
||||||
6. Caddy TLS renewal works (force test with `caddy reload` if needed)
|
6. `fly status -a blumeops-proxy` shows healthy machine
|
||||||
7. Cloudflare dashboard shows tunnel healthy and cache hits
|
7. Second request to same URL shows `X-Cache-Status: HIT`
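
Checks 1 and 7 can be scripted. This is a minimal sketch using only the standard library; the function names are hypothetical, and it assumes the proxy sets `X-Cache-Status` as configured in `fly/nginx.conf`.

```python
import urllib.request


def cache_status(headers) -> str:
    """Return the X-Cache-Status value from a header mapping, or "ABSENT"."""
    for name, value in headers.items():
        if name.lower() == "x-cache-status":
            return value.strip().upper()
    return "ABSENT"


def fetch_cache_status(url: str, timeout: float = 10.0) -> str:
    """Issue a HEAD request (like `curl -I`) and report the cache status."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return cache_status(dict(response.headers.items()))


def verify_cached(url: str) -> bool:
    """Request the URL twice; the second response should come from cache."""
    first = fetch_cache_status(url)
    second = fetch_cache_status(url)
    print(f"first={first} second={second}")
    return second == "HIT"
```

Once the service is deployed, calling `verify_cached("https://docs.eblu.me/")` should return `True` if caching behaves as described; the first status may be `MISS` or `HIT` depending on whether the page was already cached.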
|
||||||
|
|
||||||
## Risks
|
|
||||||
|
|
||||||
| Risk | Mitigation |
|
|
||||||
|------|------------|
|
|
||||||
| Caddy TLS renewal fails after NS change | Deploy Caddy update immediately; existing certs valid ~90 days |
|
|
||||||
| DNS propagation delay (24-48h) | Set low TTLs before migration; monitor with `dig +trace` |
|
|
||||||
| cloudflared crashes | K8s restarts it; Cloudflare serves cached content |
|
|
||||||
| Tunnel credentials leak | 1Password + ExternalSecret; tunnel only routes to docs |
|
|
||||||
|
|
||||||
## Adding more public services
|
|
||||||
|
|
||||||
To expose another service publicly (e.g., `wiki.eblu.me`):
|
|
||||||
|
|
||||||
1. Add DNS record + tunnel ingress rule in `pulumi/cloudflare/__main__.py`
|
|
||||||
2. Run `mise run cloudflare-up`
|
|
||||||
3. No changes to cloudflared deployment (remotely-managed tunnel config)
|
|
||||||
|
|
|
||||||
|
|
@ -22,7 +22,7 @@ Task-oriented instructions for common BlumeOps operations. These guides assume y
|
||||||
| [[update-tailscale-acls]] | Update Tailscale access control policies |
|
| [[update-tailscale-acls]] | Update Tailscale access control policies |
|
||||||
| [[gandi-operations]] | Manage DNS records and cycle the Gandi API token |
|
| [[gandi-operations]] | Manage DNS records and cycle the Gandi API token |
|
||||||
| [[use-pypi-proxy]] | Configure pip and publish packages to devpi |
|
| [[use-pypi-proxy]] | Configure pip and publish packages to devpi |
|
||||||
| [[expose-service-publicly]] | Expose a service to the public internet via Cloudflare Tunnel |
|
| [[expose-service-publicly]] | Expose a service to the public internet via Fly.io + Tailscale |
|
||||||
|
|
||||||
## Documentation
|
## Documentation
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -125,17 +125,28 @@ def main() -> int:
|
||||||
if has_spaces:
|
if has_spaces:
|
||||||
# Links with spaces in target or around pipe are not allowed
|
# Links with spaces in target or around pipe are not allowed
|
||||||
spaced_links.append((rel_path, line_num, target))
|
spaced_links.append((rel_path, line_num, target))
|
||||||
elif "/" in target:
|
continue
|
||||||
|
|
||||||
|
# Handle anchor links: [[#Heading]] or [[file#Heading]]
|
||||||
|
# Strip the #fragment for validation; pure anchors (#Heading) skip file check
|
||||||
|
file_target = target
|
||||||
|
if "#" in target:
|
||||||
|
file_target = target.split("#", 1)[0]
|
||||||
|
if not file_target:
|
||||||
|
# Pure in-page anchor like [[#Break-glass shutoff]] — always valid
|
||||||
|
continue
|
||||||
|
|
||||||
|
if "/" in file_target:
|
||||||
# Path-based links are not allowed - use simple filenames only
|
# Path-based links are not allowed - use simple filenames only
|
||||||
path_links.append((rel_path, line_num, target))
|
path_links.append((rel_path, line_num, target))
|
||||||
elif target in ambiguous_filenames:
|
elif file_target in ambiguous_filenames:
|
||||||
# Link uses an ambiguous filename - needs to be renamed
|
# Link uses an ambiguous filename - needs to be renamed
|
||||||
ambiguous_links.append((rel_path, line_num, target, filename_counts[target]))
|
ambiguous_links.append((rel_path, line_num, target, filename_counts[file_target]))
|
||||||
elif target not in valid_targets:
|
elif file_target not in valid_targets:
|
||||||
broken_links.append((rel_path, line_num, target))
|
broken_links.append((rel_path, line_num, target))
|
||||||
elif target != source_stem:
|
elif file_target != source_stem:
|
||||||
# Valid link to a different doc — record it for orphan detection
|
# Valid link to a different doc — record it for orphan detection
|
||||||
linked_stems.add(target)
|
linked_stems.add(file_target)
|
||||||
|
|
||||||
# Print results
|
# Print results
|
||||||
console.print("[bold]Wiki-Link Validation[/bold]")
|
console.print("[bold]Wiki-Link Validation[/bold]")
|
||||||
|
|
|
||||||
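
The anchor handling added to docs-check-links above can be summarized in isolation. The sketch below is not the script's actual code; `split_wiki_target` and `needs_file_check` are hypothetical names that restate the same rule: strip the `#fragment`, and skip the file check entirely for pure in-page anchors.

```python
from typing import Optional, Tuple


def split_wiki_target(target: str) -> Tuple[str, Optional[str]]:
    """Split a wiki-link target into (file part, heading fragment or None)."""
    file_part, sep, fragment = target.partition("#")
    return file_part, (fragment if sep else None)


def needs_file_check(target: str) -> bool:
    """Pure in-page anchors such as [[#Heading]] have no file to validate."""
    file_part, _ = split_wiki_target(target)
    return bool(file_part)


for target in ["#Break-glass shutoff", "caddy#TLS", "indri"]:
    print(target, split_wiki_target(target), needs_file_check(target))
```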