Add dynamic service guidance to public exposure guide

The guide was static-site-specific. Update to cover dynamic,
authenticated services (e.g., Forgejo):

- Add dynamic service nginx example with no blanket cache, proxy
  headers, WebSocket support, selective static asset caching
- Expand DDoS section: explain why dynamic services are more vulnerable
  (no cache absorbing traffic) and what mitigations exist
- Rewrite fail2ban section: irrelevant for static, essential for dynamic
  services; runs on indri watching service logs, needs forwarded IP
  headers
- Add comparison table: static vs dynamic across caching, sessions,
  rate limits, proxy headers, fail2ban, DDoS exposure
- Add pre-exposure checklist for dynamic services
- Note Tailscale ACL differences for non-k8s services (e.g., Forgejo on
  indri needs tag:homelab grant, not tag:k8s)
- Add inline comments in nginx.conf marking static-only directives

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 00:26:21 -08:00
commit 35b43083a8
parent 9874c85d1c
1 changed file with 174 additions and 18 deletions
--- a/docs/how-to/expose-service-publicly.md
+++ b/docs/how-to/expose-service-publicly.md
@@ -55,7 +55,7 @@ infrastructure. They can continue to operate in parallel for private access.
 | DNS | CNAME at [[gandi]] | No DNS migration needed, no Cloudflare dependency |
 | TLS (public) | Fly.io auto-provisions Let's Encrypt | No cert management, `$0.10/mo` per hostname |
 | TLS (origin) | Tailscale handles encryption | WireGuard tunnel encrypts all traffic |
-| CDN/cache | nginx `proxy_cache` in container | Aggressive caching for static content, sufficient for personal sites |
+| CDN/cache | nginx `proxy_cache` in container | Per-service: aggressive for static sites, selective or disabled for dynamic services |
 | DDoS | Fly.io Anycast + nginx rate limiting | Not enterprise-grade; see [[#Break-glass shutoff]] |
 | IaC | `fly/` directory in repo, Pulumi for DNS + TS key | No well-maintained Fly.io Pulumi provider; `fly.toml` is the app's IaC |

@@ -173,6 +173,9 @@ nginx -g "daemon off;"

 **`fly/nginx.conf`** — reverse proxy with caching and rate limiting:

+> The example below shows a **static site** configuration (docs.eblu.me).
+> For dynamic services, see [[#Considerations for dynamic services]].
+
 ```nginx
 worker_processes auto;

@@ -184,14 +187,14 @@ http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

-    # Rate limiting: 10 requests/sec per IP, burst of 20
+    # Rate limiting zones — define per-service zones as needed
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;

    # Proxy cache: 200MB, evict after 24h of no access
    proxy_cache_path /tmp/cache levels=1:2 keys_zone=services:10m
                     max_size=200m inactive=24h;

-    # --- docs.eblu.me ---
+    # --- docs.eblu.me (static site) ---
    server {
        listen 8080;
        server_name docs.eblu.me;
@@ -202,7 +205,8 @@ http {
            proxy_pass https://docs.tail8d86e.ts.net;
            proxy_ssl_verify off;

-            # Cache aggressively — static site
+            # Cache aggressively — static site only.
+            # Do NOT use these settings for dynamic services.
            proxy_cache services;
            proxy_cache_valid 200 1d;
            proxy_cache_valid 404 1m;
@@ -210,7 +214,8 @@ http {
            proxy_cache_lock on;

            # Prevent cache-busting: ignore query strings and
-            # client cache-control headers
+            # client cache-control headers.
+            # Safe for static sites; breaks dynamic services.
            proxy_cache_key $host$uri;
            proxy_ignore_headers Cache-Control Set-Cookie;

@@ -394,10 +399,13 @@ To expose an additional service (example: `wiki.eblu.me`):

 ### 1. Add nginx server block

-Edit `fly/nginx.conf` — add a new `server` block:
+Edit `fly/nginx.conf` — add a new `server` block. The configuration
+differs significantly between static and dynamic services.
+
+**Static site example** (same pattern as docs):

 ```nginx
-# --- wiki.eblu.me ---
+# --- wiki.eblu.me (static) ---
 server {
    listen 8080;
    server_name wiki.eblu.me;
@@ -421,7 +429,66 @@ server {
 }
 ```

-Adjust `proxy_cache_valid` and `proxy_cache_key` based on the service. For dynamic services with user sessions, you'll want shorter cache TTLs and may need to include query strings or cookies in the cache key.
+**Dynamic service example** (e.g., Forgejo):
+
+```nginx
+# --- forge.eblu.me (dynamic, authenticated) ---
+server {
+    listen 8080;
+    server_name forge.eblu.me;
+
+    # Higher rate limit — git operations, CI webhooks, and API calls
+    # can legitimately burst. Forgejo also has its own rate limiting,
+    # so this is a safety net, not the primary control.
+    limit_req zone=general burst=50 nodelay;
+
+    # Git LFS and repo uploads can be large
+    client_max_body_size 512m;
+
+    location / {
+        proxy_pass https://forge.tail8d86e.ts.net;
+        proxy_ssl_verify off;
+
+        # NO proxy_cache — dynamic content with sessions.
+        # Caching would serve stale pages and break authentication.
+
+        # Pass through headers needed for proper proxying
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+        proxy_set_header X-Forwarded-Proto $scheme;
+
+        # WebSocket support (Forgejo uses it for live updates)
+        proxy_http_version 1.1;
+        proxy_set_header Upgrade $http_upgrade;
+        proxy_set_header Connection "upgrade";
+    }
+
+    # Selectively cache static assets only
+    location ~* \.(css|js|png|jpg|svg|woff2?)$ {
+        proxy_pass https://forge.tail8d86e.ts.net;
+        proxy_ssl_verify off;
+
+        proxy_cache services;
+        proxy_cache_valid 200 7d;
+        proxy_cache_key $host$uri;
+
+        add_header X-Cache-Status $upstream_cache_status;
+    }
+}
+```
+
+Key differences for dynamic services:
+- **No blanket caching** — only static assets (CSS, JS, images) are cached
+- **Respect `Set-Cookie`** — do not ignore session headers
+- **Preserve query strings** on uncached requests (nginx forwards the
+  full request URI by default; `proxy_cache_key` only matters in cached
+  locations)
+- **Higher rate limits** — legitimate usage patterns are burstier
+- **Proxy headers** — pass `X-Real-IP`, `X-Forwarded-For`, `X-Forwarded-Proto`
+  so the backend sees the real client IP (important for Forgejo's audit logs
+  and its own rate limiting)
+- **WebSocket support** — many modern web apps use WebSockets
+- **Larger body size** — git pushes and file uploads need more than the default 1MB
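+
+The WebSocket headers in the example above send `Connection: upgrade`
+on every proxied request. A common nginx idiom, sketched below, derives
+the value from the client's `Upgrade` header instead, so ordinary HTTP
+requests are proxied with `Connection: close`. The variable name
+`$connection_upgrade` is a convention chosen for this sketch, not
+something already defined in `fly/nginx.conf`.
+
+```nginx
+# Goes inside the existing http block.
+# Pick the upstream Connection header per request.
+map $http_upgrade $connection_upgrade {
+    default upgrade;   # client sent an Upgrade header
+    ''      close;     # ordinary HTTP request
+}
+
+server {
+    listen 8080;
+    server_name forge.eblu.me;
+
+    location / {
+        proxy_pass https://forge.tail8d86e.ts.net;
+        proxy_ssl_verify off;
+
+        # Keep the Host / X-Real-IP / X-Forwarded-* lines from the
+        # example above; they are omitted here for brevity.
+        proxy_http_version 1.1;
+        proxy_set_header Upgrade $http_upgrade;
+        proxy_set_header Connection $connection_upgrade;
+    }
+}
+```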

 ### 2. Add DNS CNAME (Pulumi)

@@ -465,7 +532,18 @@ curl -I https://wiki.eblu.me

 ### 6. Update Tailscale ACLs if needed

-If the new service uses a Tailscale tag not already in the `tag:flyio-proxy` grant, add it to `policy.hujson`.
+The one-time setup grants `tag:flyio-proxy` access to `tag:k8s` on port
+443. If the new service needs a different grant, add it to
+`policy.hujson`. Examples:
+
+- **Another k8s service** (e.g., Kiwix): No ACL change needed — already
+  covered by `tag:k8s:443`.
+- **Forgejo on indri**: Needs a new grant for `tag:homelab` on the
+  relevant ports (e.g., `tcp:3001` for HTTP, `tcp:2200` for SSH). Add
+  this as a separate, narrow grant — do not widen the existing one.
+- **Non-Tailscale-ingress service**: If the backend uses `tailscale
+  serve` instead of the k8s Tailscale operator, the Tailscale node will
+  have its own tag. Grant `tag:flyio-proxy` access to that specific tag.

 ---

@@ -477,15 +555,61 @@ This approach provides basic protection, not enterprise-grade:

 - **Fly.io Anycast** absorbs volumetric L3/L4 attacks
 - **nginx `limit_req`** caps per-IP request rates at the container level
-- **nginx `proxy_cache`** serves most requests from cache — only cache misses traverse the Tailscale tunnel to indri
-- **`proxy_cache_key $host$uri`** ignores query strings, preventing trivial cache-busting
-- **`proxy_ignore_headers Cache-Control`** prevents clients from forcing cache misses
+- **nginx `proxy_cache`** serves most requests from cache — only cache
+  misses traverse the Tailscale tunnel to indri

-This is sufficient for a personal documentation site. It is **not** sufficient for a service that might attract targeted attacks. For enterprise-grade DDoS protection, Cloudflare Tunnel is the better approach (requires DNS migration, see plan history in git).
+For **static sites**, the cache is the primary defense. Most requests
+never reach the origin. Cache-busting is mitigated by ignoring query
+strings (`proxy_cache_key $host$uri`) and client cache-control headers.

-### What fail2ban is (and why it doesn't apply)
+For **dynamic services**, the cache covers only static assets. Most
+requests flow through the Tailscale tunnel to indri on every hit. This
+makes dynamic services significantly more vulnerable to L7 DDoS — an
+attacker sending high volumes of legitimate-looking requests (login
+pages, API endpoints, search queries) bypasses the cache entirely.
+Mitigations for dynamic services:

-fail2ban monitors logs for repeated failed authentication attempts (SSH brute force, bad login passwords) and bans IPs via firewall rules. A static site with no authentication has no login surface for fail2ban to monitor. It is a tool for services with user sessions, not for CDN/proxy protection.
+- nginx `limit_req` is the primary defense at the proxy layer — tune
+  the rate and burst per service
+- The backend service's own rate limiting (e.g., Forgejo's built-in
+  rate limiter) provides a second layer
+- fail2ban on indri (see below) can block IPs showing abuse patterns
+- The break-glass shutoff remains the last resort
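+
+Per-service tuning of `limit_req` could look like the sketch below. The
+zone names (`forge`, `forge_login`), the rates, and the choice of
+`/user/login` as the endpoint to protect are illustrative assumptions,
+not values taken from the deployed config.
+
+```nginx
+# Goes inside the existing http block.
+# Separate zones so Forgejo limits do not share state with the docs site.
+limit_req_zone $binary_remote_addr zone=forge:10m rate=20r/s;
+limit_req_zone $binary_remote_addr zone=forge_login:10m rate=1r/s;
+
+server {
+    listen 8080;
+    server_name forge.eblu.me;
+
+    # Reply 429 instead of the default 503 when a limit trips.
+    limit_req_status 429;
+
+    # General traffic: allow bursts from git clients and webhooks.
+    limit_req zone=forge burst=50 nodelay;
+
+    # Login form: much tighter, since it is a brute-force target.
+    # A limit_req set here replaces the server-level one for this
+    # location rather than adding to it.
+    location = /user/login {
+        limit_req zone=forge_login burst=5 nodelay;
+        proxy_pass https://forge.tail8d86e.ts.net;
+        proxy_ssl_verify off;
+    }
+
+    # Everything else inherits the server-level limit.
+    # Proxy headers are omitted here for brevity.
+    location / {
+        proxy_pass https://forge.tail8d86e.ts.net;
+        proxy_ssl_verify off;
+    }
+}
+```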
+
+If a publicly exposed dynamic service attracts targeted attacks or the
+home network bandwidth is impacted, consider migrating to Cloudflare
+Tunnel for enterprise-grade DDoS protection (requires DNS migration;
+see plan history in git).
+
+### fail2ban
+
+fail2ban monitors log files for repeated failed authentication attempts
+(SSH brute force, bad login passwords, API abuse) and bans IPs via
+firewall rules.
+
+**Static sites**: fail2ban does not apply. There is no login surface,
+no sessions, no credentials to brute force.
+
+**Dynamic services with authentication** (e.g., Forgejo): fail2ban is
+relevant and should be configured on **indri**, not on Fly.io. The
+nginx proxy is transparent — it forwards requests but does not see
+authentication outcomes. fail2ban watches the service's own logs on
+indri for patterns like repeated failed logins.
+
+Setup considerations for Forgejo specifically:
+
+- Forgejo logs failed auth attempts to its log file
+- fail2ban needs a filter matching Forgejo's log format
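+- Banned IPs are blocked at indri's firewall. Note that every proxied
+  request arrives from the Tailscale address of the `flyio-proxy` node,
+  not the end user's IP, so a ban keyed on the packet source address
+  would not single out the offending client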
+- **Important**: for fail2ban to see real client IPs, the nginx proxy
+  must pass `X-Real-IP` / `X-Forwarded-For` headers (included in the
+  dynamic service nginx config above), and Forgejo must be configured
+  to trust the proxy and log the forwarded IP rather than the proxy's
+  Tailscale IP
+- Disable open user registration before exposing Forgejo publicly —
+  require explicit invites

 ### Break-glass shutoff

@@ -508,6 +632,38 @@ Delete the CNAME records at Gandi. Takes time for DNS propagation but is the per

 ---

+## Considerations for dynamic services
+
+The architecture described in this guide works for both static and dynamic
+services, but the nginx configuration and security posture differ
+significantly. This section summarizes what changes when exposing a
+dynamic, authenticated service like [[forgejo]].
+
+| Concern | Static site | Dynamic service |
+|---------|-------------|-----------------|
+| Caching | Aggressive (cache everything, 1d TTL) | Static assets only, or disabled |
+| Session cookies | Ignored (`proxy_ignore_headers Set-Cookie`) | Must be passed through |
+| Query strings | Ignored in cache key | Included (default behavior) |
+| Rate limiting | 10r/s is plenty | Higher burst needed; coordinate with backend rate limiter |
+| Request body size | Default 1MB is fine | Increase for uploads (`client_max_body_size`) |
+| WebSocket | Not needed | Often needed (`proxy_http_version 1.1`, `Upgrade` headers) |
+| Proxy headers | Optional | Required (`X-Real-IP`, `X-Forwarded-For`, `X-Forwarded-Proto`) |
+| fail2ban | Not applicable | Configure on indri, watching service logs |
+| DDoS exposure | Low — cache absorbs most traffic | Higher — most requests hit origin |
+| Pre-exposure checklist | Deploy and go | Disable open registration, audit access controls, configure fail2ban |
+
+### Checklist before exposing a dynamic service
+
+- [ ] Disable open user registration (require invites or admin approval)
+- [ ] Audit access controls and permissions
+- [ ] Configure the service to log the forwarded client IP (not the proxy IP)
+- [ ] Set up fail2ban on indri with a filter for the service's log format
+- [ ] Add narrow Tailscale ACL grant for `tag:flyio-proxy` to the service
+- [ ] Test the nginx config locally or in staging before deploying
+- [ ] Rehearse the break-glass shutoff (`mise run fly-shutoff`)
+
+---
+
 ## IaC summary

 | Component | Managed by | Declarative? |