Rewrite public exposure guide to Fly.io + Tailscale approach

Replace the Cloudflare Tunnel plan with a Fly.io reverse proxy architecture that tunnels back to indri over Tailscale. Covers:

- Full architecture with nginx proxy cache + rate limiting
- One-time setup vs per-service steps
- Fly.io container (Dockerfile, fly.toml, nginx.conf, start.sh)
- Pulumi IaC for Tailscale auth key + DNS CNAMEs
- Forgejo CI workflow for automated deploys
- Security model, DDoS considerations, break-glass shutoff
- Mise tasks: fly-deploy, fly-setup, fly-shutoff

Also fix docs-check-links to handle in-page anchor links ([[#Heading]]) and cross-file anchors ([[file#Heading]]).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 00:08:23 -08:00 · 1de5492d6c
commit 1de5492d6c
parent bc263f3ee8
4 changed files with 471 additions and 135 deletions
--- a/docs/changelog.d/docs-expose-service-publicly.doc.md
+++ b/docs/changelog.d/docs-expose-service-publicly.doc.md
@@ -1 +1 @@
-Add how-to guide for exposing services publicly via Cloudflare Tunnel.
+Add how-to guide for exposing services publicly via Fly.io reverse proxy + Tailscale tunnel.
--- a/docs/how-to/expose-service-publicly.md
+++ b/docs/how-to/expose-service-publicly.md
@@ -2,197 +2,522 @@
 title: Expose a Service Publicly
 tags:
  - how-to
-  - cloudflare
+  - fly-io
  - tailscale
  - networking
 ---
-# Expose a Service Publicly via Cloudflare Tunnel
+# Expose a Service Publicly via Fly.io + Tailscale
-> **Status:** Plan — not yet implemented. Execute phases in order when ready.
+> **Status:** Plan — not yet implemented. First target: `docs.eblu.me`.
-This guide describes how to expose a BlumeOps service to the public internet securely using Cloudflare as a CDN and DDoS shield, with a Cloudflare Tunnel creating an outbound-only connection that never exposes the home IP.
+This guide describes how to expose a BlumeOps service to the public internet using a reverse proxy container on [Fly.io](https://fly.io) that tunnels back to [[indri]] over [[tailscale]]. The approach keeps the home IP hidden, requires no changes to existing infrastructure (`*.ops.eblu.me`, [[caddy]], DNS), and is reusable for multiple services.
 The first service to expose is `docs.eblu.me`. The pattern is reusable for future services.
 ## Architecture
 ```
-Internet → docs.eblu.me (Cloudflare proxied CNAME)
+Internet → <service>.eblu.me
               │
-         Cloudflare Edge (CDN, WAF, DDoS protection)
+         Fly.io edge (Anycast, TLS via Let's Encrypt)
               │
-         Cloudflare Tunnel (outbound from k8s)
+         Fly.io VM (nginx reverse proxy + Tailscale)
               │  (WireGuard tunnel)
         tailnet (tail8d86e.ts.net)
               │
-         cloudflared pod in minikube
+         <service>.tail8d86e.ts.net (Tailscale ingress)
               │
-         docs k8s Service (ClusterIP, port 80)
+         k8s Service → pod
               │
         docs pod (nginx + Quartz static site)
 Tailnet → *.ops.eblu.me (unchanged, DNS-only to Tailscale IP)
 ```
-All existing `*.ops.eblu.me` services remain private behind Tailscale. Only explicitly configured subdomains (like `docs.eblu.me`) are exposed publicly through Cloudflare.
+A single Fly.io container serves as the public-facing proxy for all exposed services. Each service gets a `server` block in the nginx config and a DNS CNAME. The container joins the tailnet via an ephemeral auth key and reaches backend services through Tailscale ingress endpoints.
-## Key Decisions
+Existing `*.ops.eblu.me` services remain private behind Tailscale — this approach does not touch [[caddy]], [[gandi]] DNS-01, or any other existing infrastructure.
 ## Key decisions
 | Decision | Choice | Rationale |
 |----------|--------|-----------|
-| DNS hosting | Move from [[gandi]] to Cloudflare (free) | CNAME/partial setup needs Business plan @ $200/mo |
+| Proxy host | Fly.io (free tier) | Managed container, no server to maintain via Ansible |
-| Gandi role | Registrar only | Domain renewal, WHOIS. No more DNS hosting. |
+| Tunnel | Tailscale (existing) | Already in use, WireGuard encryption, ACL control |
-| Tunnel host | Kubernetes | ArgoCD managed, direct ClusterIP access, no Tailscale hop |
+| DNS | CNAME at [[gandi]] | No DNS migration needed, no Cloudflare dependency |
-| [[caddy]] TLS | Migrate to Cloudflare DNS-01 plugin | Gandi DNS-01 won't work after nameserver change |
+| TLS (public) | Fly.io auto-provisions Let's Encrypt | No cert management, `$0.10/mo` per hostname |
-| Cloudflare account | Recover existing, instrument with IaC | |
+| TLS (origin) | Tailscale handles encryption | WireGuard tunnel encrypts all traffic |
 | CDN/cache | nginx `proxy_cache` in container | Aggressive caching for static content, sufficient for personal sites |
 | DDoS | Fly.io Anycast + nginx rate limiting | Not enterprise-grade; see [[#Break-glass shutoff]] |
 | IaC | `fly/` directory in repo, Pulumi for DNS + TS key | No well-maintained Fly.io Pulumi provider; `fly.toml` is the app's IaC |
-## Prerequisites
+## TLS in this architecture
- Cloudflare account with `eblu.me` zone added (free plan)
+There are three independent TLS segments — none involve Caddy:
 - Cloudflare API token stored in 1Password with scopes: Zone:DNS:Edit, Zone:Zone:Read, Account:Cloudflare Tunnel:Edit, Account:Account Settings:Read
 - Cloudflare account ID and zone ID noted
-## Phase 0: Preparation (manual)
+1. **Browser → Fly.io edge**: Fly.io auto-provisions a Let's Encrypt certificate for each custom domain (e.g., `docs.eblu.me`). Validated via TLS-ALPN challenge — no DNS API needed.
 2. **nginx → Tailscale ingress**: nginx proxies to `https://<service>.tail8d86e.ts.net`. The Tailscale ingress serves a Tailscale-issued cert. nginx uses `proxy_ssl_verify off` since the underlying tunnel is already encrypted.
 3. **WireGuard tunnel**: All Tailscale traffic is encrypted at the network layer regardless of application-level TLS.
-1. Recover Cloudflare account access
+Caddy continues to serve `*.ops.eblu.me` with its existing Gandi DNS-01 certificates. The two TLS domains are completely independent.
 2. Add `eblu.me` zone (free plan) — Cloudflare scans existing records from Gandi
 3. **Do not change nameservers yet** — wait until Phase 3
 4. Create API token with the scopes listed above
 5. Store API token and account ID in 1Password (blumeops vault)
-## Phase 1: Caddy TLS migration
+## External references
-**Why first**: Blocking dependency for the nameserver change. Once nameservers move to Cloudflare, Gandi LiveDNS can't serve DNS-01 ACME challenges.
+- [Tailscale on Fly.io](https://tailscale.com/kb/1132/flydotio) — official guide for running Tailscale in a Fly.io container
 - [Fly.io Custom Domains](https://fly.io/docs/networking/custom-domain/) — how Fly handles TLS for custom domains
 - [Home Assistant + Fly.io + Tailscale](https://community.home-assistant.io/t/expose-ha-to-the-internet-via-a-cloud-reverse-proxy-fly-io-and-a-vpn-tailscale-for-free-for-now-without-opening-ports/352118) — community guide describing this exact pattern
-### Caddy binary rebuild
+---
-Rebuild Caddy with `github.com/caddy-dns/cloudflare` instead of `github.com/caddy-dns/gandi` using `xcaddy` in `~/code/3rd/caddy/`.
+## One-time setup (first service)
-### Files to modify
+These steps establish the Fly.io proxy infrastructure. They only need to be done once.
- `ansible/roles/caddy/templates/Caddyfile.j2` — change `dns gandi {env.GANDI_BEARER_TOKEN}` to `dns cloudflare {env.CF_API_TOKEN}`
+### Step 1: Fly.io account and app
 - `ansible/roles/caddy/templates/caddy-wrapper.sh.j2` — source Cloudflare API token instead of Gandi PAT
 - `ansible/roles/caddy/defaults/main.yml` — update token variable name
 - `ansible/playbooks/indri.yml` — add pre_task to fetch Cloudflare API token from 1Password, replace Gandi PAT fetch
-### Deployment sequence
+1. Create or recover a Fly.io account at https://fly.io (requires credit card for free tier)
 2. Install `flyctl`: `brew install flyctl`
 3. Authenticate: `fly auth login`
 4. Create the app: `fly apps create blumeops-proxy`
 5. Store the Fly.io deploy token in 1Password (blumeops vault):
   - Generate: `fly tokens create deploy -a blumeops-proxy`
   - Store as `fly-deploy-token` field
-1. Set up Cloudflare zone with all records (Phase 2)
+### Step 2: Repository structure
 2. Prepare Caddy migration on a branch (this phase)
 3. Change nameservers at Gandi (Phase 3)
 4. Immediately deploy Caddy update: `mise run provision-indri -- --tags caddy`
 5. Caddy's next TLS renewal uses Cloudflare DNS-01
-Existing certificates are valid for ~90 days, providing a grace window.
+Create the `fly/` directory at the repository root. This is separate from `containers/` because the image is built and deployed directly to Fly.io by `fly deploy` — it never goes through `registry.ops.eblu.me`.
-## Phase 2: Pulumi — Cloudflare IaC
+```
 fly/
 ├── README.md           # Setup notes and context
 ├── fly.toml            # Fly.io app configuration
 ├── Dockerfile          # nginx + tailscale
 ├── nginx.conf          # Reverse proxy + cache config
 └── start.sh            # Entrypoint: start tailscale, then nginx
 ```
-Create a new Pulumi project at `pulumi/cloudflare/`.
+**`fly/fly.toml`** — app configuration:
-### Files to create
+```toml
 app = "blumeops-proxy"
 primary_region = "sjc"
- `pulumi/cloudflare/Pulumi.yaml` — project definition (`blumeops-cloudflare`, python/uv)
+[build]
 - `pulumi/cloudflare/Pulumi.eblu-me.yaml` — stack config (domain, account-id)
 - `pulumi/cloudflare/pyproject.toml` — deps: `pulumi>=3.0.0`, `pulumi-cloudflare>=5.0.0`
 - `pulumi/cloudflare/__main__.py`
-### Pulumi program manages
+[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = false
  auto_start_machines = true
  min_machines_running = 1
- Zone lookup for `eblu.me`
+[checks]
- DNS records:
+  [checks.health]
-  - `*.ops.eblu.me` A record → Tailscale IP, **proxied=False** (grey cloud, private)
+    port = 8080
-  - `ops.eblu.me` A record → Tailscale IP, **proxied=False**
+    type = "http"
-  - `docs.eblu.me` CNAME → `<tunnel-id>.cfargotunnel.com`, **proxied=True** (orange cloud, CDN)
+    interval = "30s"
- Cloudflare Tunnel resource
+    timeout = "5s"
- Tunnel config (ingress: `docs.eblu.me` → `http://docs.docs.svc.cluster.local:80`)
+    path = "/healthz"
- Cache rules for static docs site (edge TTL: 1 day, browser TTL: 1 hour)
+```
 - Zone security settings (SSL: full, min TLS 1.2, always HTTPS)
-### New mise tasks
+**`fly/Dockerfile`** — nginx + tailscale:
-Following the `dns-preview`/`dns-up` pattern:
+```dockerfile
 FROM nginx:alpine
- `mise-tasks/cloudflare-preview` — `pulumi preview` with 1Password token injection
+# Copy tailscale binaries from official image
- `mise-tasks/cloudflare-up` — `pulumi up` with 1Password token injection
+COPY --from=docker.io/tailscale/tailscale:stable \
    /usr/local/bin/tailscaled /usr/local/bin/tailscaled
 COPY --from=docker.io/tailscale/tailscale:stable \
    /usr/local/bin/tailscale /usr/local/bin/tailscale
-Keep `pulumi/gandi/` until migration is confirmed working. Then `pulumi destroy` the Gandi stack and archive the code.
+RUN mkdir -p /var/run/tailscale /var/lib/tailscale
-## Phase 3: DNS migration
+COPY nginx.conf /etc/nginx/nginx.conf
 COPY start.sh /start.sh
 RUN chmod +x /start.sh
-### Pre-migration checklist
+EXPOSE 8080
- [ ] Cloudflare zone active with all records (Phase 2)
+CMD ["/start.sh"]
- [ ] Caddy migration branch ready (Phase 1)
+```
 - [ ] Cloudflare Tunnel created and configured (Phase 2)
 - [ ] cloudflared running in k8s (Phase 4)
-### Steps
+**`fly/start.sh`** — entrypoint:
-1. At Gandi registrar dashboard: change nameservers to Cloudflare's assigned NS
+```bash
-2. Deploy Caddy update immediately: `mise run provision-indri -- --tags caddy`
+#!/bin/sh
-3. Monitor propagation: `dig +trace docs.eblu.me`, `dig +trace forge.ops.eblu.me`
+set -e
 4. Verify tailnet services still work from tailnet clients
 5. Verify `docs.eblu.me` resolves publicly
-### Rollback
+# Start tailscale in userspace networking mode (no TUN device needed)
 tailscaled --tun=userspace-networking --statedir=/var/lib/tailscale &
 sleep 2
-Change nameservers back to Gandi's at registrar. Everything reverts.
+# Authenticate and join tailnet
 tailscale up --authkey="${TS_AUTHKEY}" --hostname=flyio-proxy
-## Phase 4: cloudflared in Kubernetes
+# Wait for tailscale to be ready
 until tailscale status > /dev/null 2>&1; do sleep 1; done
 echo "Tailscale connected"
-### Files to create
+# Start nginx
 nginx -g "daemon off;"
 ```
- `argocd/apps/cloudflare-tunnel.yaml` — ArgoCD Application
+**`fly/nginx.conf`** — reverse proxy with caching and rate limiting:
 - `argocd/manifests/cloudflare-tunnel/deployment.yaml` — cloudflared Deployment
  - Image: `cloudflare/cloudflared:latest` (or pinned version)
  - Args: `tunnel --no-autoupdate run --token <tunnel-token>`
  - Single replica, tunnel token injected from a Secret
 - `argocd/manifests/cloudflare-tunnel/external-secret.yaml` — ExternalSecret to pull tunnel token from 1Password
 - `argocd/manifests/cloudflare-tunnel/kustomization.yaml`
-### Tunnel routing (managed by Pulumi)
+```nginx
 worker_processes auto;
- `docs.eblu.me` → `http://docs.docs.svc.cluster.local:80` (direct k8s service access)
+events {
- Catch-all → `http_status:404`
+    worker_connections 1024;
 }
-Namespace: `cloudflare-tunnel` (dedicated, reusable for future public services)
+http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
-## Phase 5: Documentation and cleanup
+    # Rate limiting: 10 requests/sec per IP, burst of 20
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
-### Files to create
+    # Proxy cache: 200MB, evict after 24h of no access
    proxy_cache_path /tmp/cache levels=1:2 keys_zone=services:10m
                     max_size=200m inactive=24h;
- `docs/reference/infrastructure/cloudflare.md` — reference card
+    # --- docs.eblu.me ---
- `docs/changelog.d/<branch>.feature.md` — changelog fragment
+    server {
        listen 8080;
        server_name docs.eblu.me;
-### Files to modify
+        limit_req zone=general burst=20 nodelay;
- `docs/reference/infrastructure/routing.md` — add public services section
+        location / {
- `docs/reference/infrastructure/gandi.md` — update to registrar-only role
+            proxy_pass https://docs.tail8d86e.ts.net;
- `docs/reference/services/docs.md` — add public URL `https://docs.eblu.me`
+            proxy_ssl_verify off;
- `docs/reference/reference.md` — add Cloudflare to infrastructure section
+
- `CLAUDE.md` — update routing table, add cloudflare tasks
+            # Cache aggressively — static site
            proxy_cache services;
            proxy_cache_valid 200 1d;
            proxy_cache_valid 404 1m;
            proxy_cache_use_stale error timeout updating;
            proxy_cache_lock on;
            # Prevent cache-busting: ignore query strings and
            # client cache-control headers
            proxy_cache_key $host$uri;
            proxy_ignore_headers Cache-Control Set-Cookie;
            add_header X-Cache-Status $upstream_cache_status;
        }
        location /healthz {
            return 200 "ok\n";
        }
    }
    # Catch-all: reject unknown hosts
    server {
        listen 8080 default_server;
        return 444;
    }
 }
 ```
 ### Step 3: Tailscale auth key and ACLs (Pulumi)
 Extend the existing `pulumi/tailscale/` project.
 **Add to `pulumi/tailscale/__main__.py`:**
 ```python
 # Auth key for Fly.io proxy container
 flyio_key = tailscale.TailscaleKey(
    "flyio-proxy-key",
    reusable=True,
    ephemeral=True,
    tags=["tag:flyio-proxy"],
    expiry=7776000,  # 90 days
 )
 pulumi.export("flyio_authkey", flyio_key.key)
 ```
 **Add to `pulumi/tailscale/policy.hujson`:**
 Tag owner:
 ```
 "tag:flyio-proxy": ["autogroup:admin", "tag:blumeops"],
 ```
 Access grant (Fly.io proxy → k8s services on HTTPS only):
 ```
 {
    "src": ["tag:flyio-proxy"],
    "dst": ["tag:k8s"],
    "ip":  ["tcp:443"],
 },
 ```
 ACL test:
 ```
 {
    "src":  "tag:flyio-proxy",
    "accept": ["tag:k8s:443"],
    "deny":   ["tag:homelab:22", "tag:nas:445", "tag:registry:443"],
 },
 ```
 Deploy: `mise run tailnet-preview` then `mise run tailnet-up`.
 After deploying, extract the auth key and set it as a Fly.io secret:
 ```bash
 # Get the key from Pulumi state
 cd pulumi/tailscale && pulumi stack output flyio_authkey --show-secrets
 # Set it in Fly.io
 fly secrets set TS_AUTHKEY="tskey-auth-..." -a blumeops-proxy
 ```
 Store the auth key in 1Password as well for the `fly-setup` mise task.
 ### Step 4: Mise tasks
 **`mise-tasks/fly-deploy`:**
 ```bash
 #!/usr/bin/env bash
 #MISE description="Deploy the Fly.io public proxy"
 set -euo pipefail
 cd "$(dirname "$0")/../fly"
 fly deploy "$@"
 ```
 **`mise-tasks/fly-setup`:**
 ```bash
 #!/usr/bin/env bash
 #MISE description="One-time setup: configure Fly.io secrets and certs (idempotent)"
 set -euo pipefail
 APP="blumeops-proxy"
 # Fetch Tailscale auth key from 1Password
 TS_AUTHKEY=$(op --vault vg6xf6vvfmoh5hqjjhlhbeoaie item get <FLY_ITEM_ID> --fields ts-authkey --reveal)
 fly secrets set TS_AUTHKEY="$TS_AUTHKEY" -a "$APP"
 echo "Tailscale auth key set"
 # Add certs for all public domains (idempotent — fly ignores duplicates)
 fly certs add docs.eblu.me -a "$APP" 2>/dev/null || true
 # fly certs add wiki.eblu.me -a "$APP" 2>/dev/null || true  # future services
 echo "Certificates configured"
 echo "Done. Run 'mise run fly-deploy' to deploy."
 ```
 **`mise-tasks/fly-shutoff`:**
 ```bash
 #!/usr/bin/env bash
 #MISE description="Emergency shutoff: stop all Fly.io proxy machines"
 set -euo pipefail
 APP="blumeops-proxy"
 echo "EMERGENCY SHUTOFF: Stopping all machines for $APP"
 fly scale count 0 -a "$APP" --yes
 echo "All machines stopped. Public services are offline."
 echo "To restore: fly scale count 1 -a $APP"
 ```
 ### Step 5: Forgejo CI workflow
 **`.forgejo/workflows/deploy-fly.yaml`:**
 ```yaml
 name: Deploy Fly.io Proxy
 on:
  workflow_dispatch:
  push:
    branches: [main]
    paths:
      - 'fly/**'
 jobs:
  deploy:
    runs-on: k8s
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Install flyctl
        run: |
          curl -L https://fly.io/install.sh | sh
          echo "/root/.fly/bin" >> "$GITHUB_PATH"
      - name: Deploy to Fly.io
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_DEPLOY_TOKEN }}
        run: |
          cd fly
          fly deploy
      - name: Verify health
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_DEPLOY_TOKEN }}
        run: |
          fly status -a blumeops-proxy
          echo ""
          echo "Health check:"
          sleep 10
          curl -sf https://blumeops-proxy.fly.dev/healthz || echo "Warning: health check failed (may need DNS propagation)"
 ```
 The `FLY_DEPLOY_TOKEN` Forgejo Actions secret must be set via the [[forgejo]] API or UI, following the pattern in the `forgejo_actions_secrets` Ansible role.
 ---
 ## Per-service setup
 To expose an additional service (example: `wiki.eblu.me`):
 ### 1. Add nginx server block
 Edit `fly/nginx.conf` — add a new `server` block:
 ```nginx
 # --- wiki.eblu.me ---
 server {
    listen 8080;
    server_name wiki.eblu.me;
    limit_req zone=general burst=20 nodelay;
    location / {
        proxy_pass https://wiki.tail8d86e.ts.net;
        proxy_ssl_verify off;
        proxy_cache services;
        proxy_cache_valid 200 1d;
        proxy_cache_valid 404 1m;
        proxy_cache_use_stale error timeout updating;
        proxy_cache_lock on;
        proxy_cache_key $host$uri;
        proxy_ignore_headers Cache-Control Set-Cookie;
        add_header X-Cache-Status $upstream_cache_status;
    }
 }
 ```
 Adjust `proxy_cache_valid` and `proxy_cache_key` based on the service. For dynamic services with user sessions, you'll want shorter cache TTLs and may need to include query strings or cookies in the cache key.
 ### 2. Add DNS CNAME (Pulumi)
 Add to `pulumi/gandi/__main__.py`:
 ```python
 wiki_public = gandi.livedns.Record(
    "wiki-public",
    zone=domain,
    name="wiki",
    type="CNAME",
    ttl=300,
    values=["blumeops-proxy.fly.dev."],
 )
 ```
 Deploy: `mise run dns-preview` then `mise run dns-up`.
 ### 3. Add Fly.io certificate
 ```bash
 fly certs add wiki.eblu.me -a blumeops-proxy
 ```
 Or add it to `mise-tasks/fly-setup` so it's captured for future runs.
 ### 4. Deploy
 ```bash
 mise run fly-deploy
 ```
 Or push the `fly/nginx.conf` change to main — the Forgejo workflow deploys automatically.
 ### 5. Verify
 ```bash
 curl -I https://wiki.eblu.me
 # Should return 200 with X-Cache-Status header
 ```
 ### 6. Update Tailscale ACLs if needed
 If the new service uses a Tailscale tag not already in the `tag:flyio-proxy` grant, add it to `policy.hujson`.
 ---
 ## Security
 ### DDoS and rate limiting
 This approach provides basic protection, not enterprise-grade:
 - **Fly.io Anycast** absorbs volumetric L3/L4 attacks
 - **nginx `limit_req`** caps per-IP request rates at the container level
 - **nginx `proxy_cache`** serves most requests from cache — only cache misses traverse the Tailscale tunnel to indri
 - **`proxy_cache_key $host$uri`** ignores query strings, preventing trivial cache-busting
 - **`proxy_ignore_headers Cache-Control`** prevents clients from forcing cache misses
 This is sufficient for a personal documentation site. It is **not** sufficient for a service that might attract targeted attacks. For enterprise-grade DDoS protection, Cloudflare Tunnel is the better approach (requires DNS migration, see plan history in git).
 ### What fail2ban is (and why it doesn't apply)
 fail2ban monitors logs for repeated failed authentication attempts (SSH brute force, bad login passwords) and bans IPs via firewall rules. A static site with no authentication has no login surface for fail2ban to monitor. It is a tool for services with user sessions, not for CDN/proxy protection.
 ### Break-glass shutoff
 If the proxy is causing issues (DDoS, unexpected traffic, bandwidth consumption on the home network):
 **Level 1 — Stop the container (seconds, reversible):**
 ```bash
 mise run fly-shutoff
 # or: fly scale count 0 -a blumeops-proxy --yes
 ```
 All public services go offline immediately. Tailscale tunnel drops. Zero traffic reaches indri. Restore with `fly scale count 1 -a blumeops-proxy`.
 **Level 2 — Revoke Tailscale access (seconds):**
 Remove the `flyio-proxy` node in the Tailscale admin console. Even if the container is running, it cannot reach the tailnet. Use this if the container itself may be compromised.
 **Level 3 — Remove DNS (minutes to hours):**
 Delete the CNAME records at Gandi. Takes time for DNS propagation but is the permanent shutoff.
 **Level 1 is the primary response.** It is a single command, takes effect in seconds, and is trivially reversible. Document the `mise run fly-shutoff` command somewhere easily accessible (e.g., pinned in a notes app) so it can be run quickly under stress.
 ---
 ## IaC summary
 | Component | Managed by | Declarative? |
 |-----------|------------|:---:|
 | Tailscale auth key | Pulumi (`pulumi/tailscale/`) | yes |
 | Tailscale ACLs | Pulumi (`pulumi/tailscale/policy.hujson`) | yes |
 | DNS CNAMEs | Pulumi (`pulumi/gandi/`) | yes |
 | Container + app config | `fly/Dockerfile` + `fly/fly.toml` in repo | yes |
 | Deployment | Forgejo CI on push to `fly/`, or `mise run fly-deploy` | yes |
 | Fly.io secrets + certs | `mise run fly-setup` (one-time, idempotent) | semi |
 The "semi" for Fly.io secrets is a one-time operation backed by a repeatable mise task. Fly.io does not have a mature Pulumi or Terraform provider, so `fly.toml` + `flyctl` is the standard IaC model for Fly.io apps.
 ---
 ## Verification
-1. `curl -I https://docs.eblu.me` from public internet — returns 200 with `cf-ray` header
+After initial deployment of a service (using `docs.eblu.me` as example):
-2. `dig docs.eblu.me` — shows Cloudflare IPs (not Tailscale IP)
+
-3. `dig forge.ops.eblu.me` — still shows `100.98.163.89` (Tailscale IP)
+1. `curl -I https://docs.eblu.me` — returns 200 with `X-Cache-Status` header
-4. All `*.ops.eblu.me` services accessible from tailnet
+2. `dig docs.eblu.me` — resolves to Fly.io IPs (not Tailscale IP)
 3. `dig forge.ops.eblu.me` — still resolves to `100.98.163.89` (unchanged)
 4. All `*.ops.eblu.me` services work from tailnet
 5. `mise run services-check` passes
-6. Caddy TLS renewal works (force test with `caddy reload` if needed)
+6. `fly status -a blumeops-proxy` shows healthy machine
-7. Cloudflare dashboard shows tunnel healthy and cache hits
+7. Second request to same URL shows `X-Cache-Status: HIT`
 ## Risks
 | Risk | Mitigation |
 |------|------------|
 | Caddy TLS renewal fails after NS change | Deploy Caddy update immediately; existing certs valid ~90 days |
 | DNS propagation delay (24-48h) | Set low TTLs before migration; monitor with `dig +trace` |
 | cloudflared crashes | K8s restarts it; Cloudflare serves cached content |
 | Tunnel credentials leak | 1Password + ExternalSecret; tunnel only routes to docs |
 ## Adding more public services
 To expose another service publicly (e.g., `wiki.eblu.me`):
 1. Add DNS record + tunnel ingress rule in `pulumi/cloudflare/__main__.py`
 2. Run `mise run cloudflare-up`
 3. No changes to cloudflared deployment (remotely-managed tunnel config)
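
The guide's security section says `proxy_cache_key $host$uri` ignores query strings, so appending `?x=1` does not force a cache miss. A minimal Python sketch of that idea follows. It is only an approximation: `cache_key` is a hypothetical helper, not part of the repo, and real nginx also normalizes `$uri` (percent-decoding, merging slashes), which this sketch does not do.

```python
from urllib.parse import urlsplit


def cache_key(url: str) -> str:
    """Approximate nginx's `$host$uri` cache key (hypothetical helper).

    Keeps only the hostname and the path. The query string is dropped,
    so URLs that differ only in their query share one cache entry.
    """
    parts = urlsplit(url)
    # An empty path is treated as "/" (assumption made for this sketch).
    return f"{parts.hostname}{parts.path or '/'}"


# Two cache-busting attempts against the same page map to the same key.
first = cache_key("https://docs.eblu.me/how-to/?bust=1")
second = cache_key("https://docs.eblu.me/how-to/?bust=2")
print(first == second)
```

The trade-off is the one the guide already notes for dynamic services: any backend whose response depends on the query string would need those parameters added back into the key.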
--- a/docs/how-to/how-to.md
+++ b/docs/how-to/how-to.md
@@ -22,7 +22,7 @@ Task-oriented instructions for common BlumeOps operations. These guides assume y
 | [[update-tailscale-acls]] | Update Tailscale access control policies |
 | [[gandi-operations]] | Manage DNS records and cycle the Gandi API token |
 | [[use-pypi-proxy]] | Configure pip and publish packages to devpi |
-| [[expose-service-publicly]] | Expose a service to the public internet via Cloudflare Tunnel |
+| [[expose-service-publicly]] | Expose a service to the public internet via Fly.io + Tailscale |
 ## Documentation
--- a/mise-tasks/docs-check-links
+++ b/mise-tasks/docs-check-links
@@ -125,17 +125,28 @@ def main() -> int:
            if has_spaces:
                # Links with spaces in target or around pipe are not allowed
                spaced_links.append((rel_path, line_num, target))
-            elif "/" in target:
+                continue
            # Handle anchor links: [[#Heading]] or [[file#Heading]]
            # Strip the #fragment for validation; pure anchors (#Heading) skip file check
            file_target = target
            if "#" in target:
                file_target = target.split("#", 1)[0]
                if not file_target:
                    # Pure in-page anchor like [[#Break-glass shutoff]] — always valid
                    continue
            if "/" in file_target:
                # Path-based links are not allowed - use simple filenames only
                path_links.append((rel_path, line_num, target))
-            elif target in ambiguous_filenames:
+            elif file_target in ambiguous_filenames:
                # Link uses an ambiguous filename - needs to be renamed
-                ambiguous_links.append((rel_path, line_num, target, filename_counts[target]))
+                ambiguous_links.append((rel_path, line_num, target, filename_counts[file_target]))
-            elif target not in valid_targets:
+            elif file_target not in valid_targets:
                broken_links.append((rel_path, line_num, target))
-            elif target != source_stem:
+            elif file_target != source_stem:
                # Valid link to a different doc — record it for orphan detection
-                linked_stems.add(target)
+                linked_stems.add(file_target)
    # Print results
    console.print("[bold]Wiki-Link Validation[/bold]")
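
The anchor handling added to `docs-check-links` above can be read as a small classification step. The sketch below restates that branch order as a standalone function, under some assumptions: `classify_link` and its string labels are hypothetical and do not appear in the script, and the earlier `has_spaces` check and the orphan-detection bookkeeping are left out.

```python
def classify_link(target, valid_targets, ambiguous_filenames):
    """Classify a wiki-link target in the same order as the patched branch.

    Hypothetical helper for illustration; the real script appends to
    result lists instead of returning a label.
    """
    # Strip the #fragment; only the file part is validated.
    file_target = target.split("#", 1)[0] if "#" in target else target
    if "#" in target and not file_target:
        # Pure in-page anchor like [[#Heading]]: always treated as valid.
        return "anchor"
    if "/" in file_target:
        # Path-based links are not allowed; simple filenames only.
        return "path"
    if file_target in ambiguous_filenames:
        return "ambiguous"
    if file_target not in valid_targets:
        return "broken"
    return "ok"


docs = {"caddy", "tailscale"}
print(classify_link("#Break-glass shutoff", docs, set()))
print(classify_link("caddy#TLS", docs, set()))
print(classify_link("missing#Heading", docs, set()))
```

Note that, as in the diff, the heading part of the link is never checked against the target file: `[[caddy#TLS]]` passes as long as `caddy` is a valid target.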