Update docs for public proxy: docs.eblu.me is canonical URL

- Replace docs.ops.eblu.me with docs.eblu.me across all references
- Add Fly.io proxy reference card and operations how-to
- Move shutoff escalation levels to manage-flyio-proxy how-to
- Update index, Caddy, and docs reference cards with Fly.io context
- Update homepage link in docs ingress annotation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Erich Blume 2026-02-08 02:32:42 -08:00
commit 9a5444851c
13 changed files with 169 additions and 19 deletions


@ -80,6 +80,6 @@ This repo uses [Forgejo Actions](https://forgejo.org/docs/latest/user/actions/)
## Documentation
Documentation lives in `docs/` and follows the [Diataxis](https://diataxis.fr/) framework. Published at https://docs.ops.eblu.me.
Documentation lives in `docs/` and follows the [Diataxis](https://diataxis.fr/) framework. Published at https://docs.eblu.me.
Docs use [Obsidian](https://obsidian.md) wiki-link syntax (`[[link]]`) for cross-references. Edit with any markdown editor, or use [obsidian.nvim](https://github.com/obsidian-nvim/obsidian.nvim) for enhanced navigation.


@ -11,7 +11,7 @@ metadata:
gethomepage.dev/group: "Apps"
gethomepage.dev/icon: "mdi-book-open-page-variant"
gethomepage.dev/description: "BlumeOps Documentation"
gethomepage.dev/href: "https://docs.ops.eblu.me"
gethomepage.dev/href: "https://docs.eblu.me"
gethomepage.dev/pod-selector: "app=docs"
spec:
ingressClassName: tailscale


@ -0,0 +1 @@
Update docs for public proxy: canonical URL is now docs.eblu.me, add Fly.io proxy reference card and operations how-to


@ -655,22 +655,13 @@ Setup considerations for Forgejo specifically:
### Break-glass shutoff
If the proxy is causing issues (DDoS, unexpected traffic, bandwidth consumption on the home network):
If the proxy is causing issues, stop it immediately:
**Level 1 — Stop the container (seconds, reversible):**
```bash
mise run fly-shutoff
# or: fly scale count 0 -a blumeops-proxy --yes
```
All public services go offline immediately. Tailscale tunnel drops. Zero traffic reaches indri. Restore with `fly scale count 1 -a blumeops-proxy`.
**Level 2 — Revoke Tailscale access (seconds):**
Remove the `flyio-proxy` node in the Tailscale admin console. Even if the container is running, it cannot reach the tailnet. Use this if the container itself may be compromised.
**Level 3 — Remove DNS (minutes to hours):**
Delete the CNAME records at Gandi. Takes time for DNS propagation but is the permanent shutoff.
**Level 1 is the primary response.** It is a single command, takes effect in seconds, and is trivially reversible. Document the `mise run fly-shutoff` command somewhere easily accessible (e.g., pinned in a notes app) so it can be run quickly under stress.
This stops all machines in seconds — zero traffic reaches indri. See [[manage-flyio-proxy#Emergency Shutoff]] for the full escalation ladder (container stop → Tailscale revoke → DNS removal).
---


@ -41,4 +41,5 @@ Task-oriented instructions for common BlumeOps operations. These guides assume y
| Guide | Description |
|-------|-------------|
| [[restart-indri]] | Safely shut down and restart indri |
| [[manage-flyio-proxy]] | Deploy, shut off, and troubleshoot the public proxy |
| [[troubleshooting]] | Diagnose and fix common issues |


@ -0,0 +1,88 @@
---
title: Manage Fly.io Proxy
tags:
- how-to
- fly-io
- networking
- operations
---
# Manage Fly.io Proxy
Operational tasks for the [[flyio-proxy]] public reverse proxy.
## Deploy Changes
After modifying files in `fly/`:
```bash
mise run fly-deploy
```
Pushes to main that change files under `fly/` also trigger automatic deployment via the Forgejo CI workflow.
## Add a New Public Service
See [[expose-service-publicly#Per-service setup]] for the full walkthrough. In short:
1. Add a `server` block to `fly/nginx.conf`
2. Add a Fly.io certificate: `fly certs add <domain> -a blumeops-proxy`
3. Deploy: `mise run fly-deploy`
4. Verify against `blumeops-proxy.fly.dev` with a `Host` header
5. Add DNS CNAME via Pulumi: `mise run dns-preview` then `mise run dns-up`
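The check in step 4 could look like the sketch below. It assumes the new `server` block is selected by the `Host` header. The function name `check_public_host` and the `CURL` override (present only so the command can be dry-run) are illustrative and not part of the repo.

```bash
# Sketch: exercise a new server block through the Fly.io hostname before DNS changes.
# PROXY_HOST is taken from the reference card; check_public_host is a hypothetical helper.
PROXY_HOST="blumeops-proxy.fly.dev"

# Print the HTTP status code returned for <domain> when the request is sent to
# the Fly.io hostname with the Host header overridden.
check_public_host() {
  local domain="$1"
  "${CURL:-curl}" -sS -o /dev/null -w '%{http_code}\n' \
    -H "Host: ${domain}" "https://${PROXY_HOST}/"
}
```

For example, `check_public_host docs.eblu.me` prints the status code nginx returned for that virtual host, which makes it easy to confirm the block works before the CNAME exists.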
## Emergency Shutoff
If the proxy is causing issues (DDoS, unexpected traffic, bandwidth consumption on the home network):
**Level 1 — Stop the container (seconds, reversible):**
```bash
mise run fly-shutoff
# or: fly scale count 0 -a blumeops-proxy --yes
```
All public services go offline immediately. Tailscale tunnel drops. Zero traffic reaches indri. Restore with `fly scale count 1 -a blumeops-proxy`.
**Level 2 — Revoke Tailscale access (seconds):**
Remove the `flyio-proxy` node in the Tailscale admin console. Even if the container is running, it cannot reach the tailnet. Use this if the container itself may be compromised.
**Level 3 — Remove DNS (minutes to hours):**
Delete the CNAME records at Gandi. The change takes time to propagate through DNS, but it is the permanent shutoff.
**Level 1 is the primary response.** It is a single command, takes effect in seconds, and is trivially reversible. Keep `mise run fly-shutoff` somewhere easily accessible (e.g., pinned in a notes app) so it can be run quickly under stress.
## Check Status
```bash
# App and machine status
fly status -a blumeops-proxy
# Live logs
fly logs -a blumeops-proxy
# Health check
curl -sf https://blumeops-proxy.fly.dev/healthz
# Certificate status
fly certs list -a blumeops-proxy
```
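After a deploy or a restore, the health check above can be polled until it succeeds. This is a minimal sketch: the function name `wait_healthy`, the default retry count and delay, and the `CURL` override are assumptions chosen for illustration, not repo conventions.

```bash
# Poll the health endpoint until it answers or the attempts run out.
# Arguments (all optional): URL, number of attempts, delay in seconds.
wait_healthy() {
  local url="${1:-https://blumeops-proxy.fly.dev/healthz}"
  local attempts="${2:-10}"
  local delay="${3:-3}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors, so failures fall through to the retry.
    if "${CURL:-curl}" -sf -o /dev/null "$url"; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still unhealthy after ${attempts} attempts" >&2
  return 1
}
```

Calling `wait_healthy` with no arguments checks the Fly.io hostname; pass a different URL as the first argument to check a custom domain's health endpoint instead.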
## Rotate Tailscale Auth Key
The auth key expires every 90 days. To rotate:
1. Re-apply Pulumi to generate a new key: `mise run tailnet-up`
2. Re-run setup to stage the new secret: `mise run fly-setup`
3. Deploy to pick up the new secret: `mise run fly-deploy`
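The three steps can be chained so that a failure stops the rotation early. The sketch below assumes the `mise` tasks behave as described above; the function name `rotate_ts_authkey` and the `DRY_RUN` switch are hypothetical additions for illustration.

```bash
# Run the three rotation tasks in order, stopping at the first failure.
# Set DRY_RUN=1 to print the commands instead of running them.
rotate_ts_authkey() {
  local task
  for task in tailnet-up fly-setup fly-deploy; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "mise run ${task}"
    else
      mise run "${task}" || return 1
    fi
  done
}
```

Running `DRY_RUN=1 rotate_ts_authkey` first prints the exact sequence, which is a cheap way to confirm the order before touching the live key.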
## Troubleshooting
**502 Bad Gateway**: Check `fly logs` for nginx upstream errors. Verify the backend Tailscale service is running (`tailscale status` from inside the container via `fly ssh console`).
**Health check failing**: Open a shell with `fly ssh console -a blumeops-proxy`, then run `curl localhost:8080/healthz` to test nginx from inside the container.
**TLS errors on custom domain**: Check cert status with `fly certs show <domain> -a blumeops-proxy`. Certs auto-provision via Let's Encrypt and may take a few minutes.
## Related
- [[flyio-proxy]] - Service reference card
- [[expose-service-publicly]] - Full setup guide and architecture


@ -8,7 +8,7 @@ tags:
# Update Documentation
How to publish documentation changes to https://docs.ops.eblu.me.
How to publish documentation changes to https://docs.eblu.me.
## Quick Release


@ -22,8 +22,10 @@ editor of choice. (I recommend vim.)
These services run on my home [[hosts|infrastructure]], primarily an m1 mac
mini named [[indri]] and a Synology NAS called [[sifaka]]. The infrastructure
is networked via [[tailscale]], with the domain `eblu.me` hosted via [[gandi]]
with [[caddy]] providing a reverse proxy to resolve tailnet devices.
is networked via [[tailscale]], with the domain `eblu.me` hosted via [[gandi]],
[[caddy]] providing a private reverse proxy for tailnet devices, and
[[flyio-proxy|Fly.io]] serving public-facing services like
[this documentation site](https://docs.eblu.me).
The goal of BlumeOps is threefold:


@ -34,6 +34,7 @@ Individual service reference cards with URLs and configuration details.
| [[zot]] | Container registry | indri |
| [[devpi]] | PyPI caching proxy | k8s |
| [[docs]] | Documentation site (Quartz) | k8s |
| [[flyio-proxy]] | Public reverse proxy (Fly.io + Tailscale) | Fly.io |
| [[automounter]] | SMB share automounter | indri |
## Infrastructure


@ -47,7 +47,7 @@ K8s services are proxied via their Tailscale Ingress endpoints:
|-----------|---------|---------|
| `grafana.ops.eblu.me` | `grafana.tail8d86e.ts.net` | [[grafana]] |
| `argocd.ops.eblu.me` | `argocd.tail8d86e.ts.net` | [[argocd]] |
| `docs.ops.eblu.me` | `docs.tail8d86e.ts.net` | [[docs]] |
| `docs.ops.eblu.me` | `docs.tail8d86e.ts.net` | [[docs]] (now publicly available at `docs.eblu.me` via [[flyio-proxy]]) |
| `feed.ops.eblu.me` | `feed.tail8d86e.ts.net` | [[miniflux]] |
| ... | ... | (see defaults/main.yml for full list) |


@ -13,11 +13,13 @@ Documentation site built with [Quartz](https://quartz.jzhao.xyz/) and served via
| Property | Value |
|----------|-------|
| **URL** | https://docs.ops.eblu.me |
| **Public URL** | https://docs.eblu.me |
| **Private URL** | `docs.ops.eblu.me` (tailnet only, via [[caddy]]) |
| **Namespace** | `docs` |
| **Container** | `registry.ops.eblu.me/blumeops/quartz:v1.0.0` |
| **Source** | `docs/` directory in blumeops repo |
| **Build** | Forgejo workflow `build-blumeops.yaml` |
| **Public proxy** | [[flyio-proxy]] (Fly.io → Tailscale tunnel) |
## Architecture


@ -0,0 +1,64 @@
---
title: Fly.io Proxy
tags:
- service
- networking
- fly-io
---
# Fly.io Proxy
Public reverse proxy on [Fly.io](https://fly.io) that exposes selected BlumeOps services to the internet via a Tailscale tunnel back to the homelab.
## Quick Reference
| Property | Value |
|----------|-------|
| **App** | `blumeops-proxy` |
| **Region** | `sjc` (San Jose) |
| **Fly.io URL** | `blumeops-proxy.fly.dev` |
| **Config** | `fly/` directory in repo |
| **IaC** | `fly/fly.toml` (app), Pulumi (DNS + auth key) |
## Exposed Services
| Public domain | Backend | Service |
|---------------|---------|---------|
| `docs.eblu.me` | `docs.tail8d86e.ts.net` | [[docs]] |
## Architecture
Internet traffic hits Fly.io's Anycast edge, terminates TLS with a Let's Encrypt certificate, and is proxied by nginx to the backend service over a Tailscale WireGuard tunnel. See [[expose-service-publicly]] for the full architecture diagram.
## Key Files
| File | Purpose |
|------|---------|
| `fly/fly.toml` | App configuration |
| `fly/Dockerfile` | nginx + Tailscale container |
| `fly/nginx.conf` | Reverse proxy, caching, rate limiting |
| `fly/start.sh` | Entrypoint: start Tailscale, then nginx |
| `pulumi/tailscale/__main__.py` | Auth key (`tag:flyio-proxy`) |
| `pulumi/tailscale/policy.hujson` | ACL grants for proxy |
| `pulumi/gandi/__main__.py` | DNS CNAMEs |
## Networking
Fly.io runs Firecracker microVMs which support TUN devices natively. Tailscale runs with a real TUN interface (not userspace networking), so MagicDNS and direct Tailscale IP routing work normally.
The Tailscale auth key is created with `preauthorized=True` so the node joins the tailnet without manual device approval, which would otherwise hang container restarts.
## Secrets
| Secret | Source | Description |
|--------|--------|-------------|
| `TS_AUTHKEY` | Pulumi state → `fly secrets` | Tailscale auth key for joining tailnet |
| `FLY_DEPLOY_TOKEN` | Fly.io → 1Password | Deploy token for CI |
## Related
- [[expose-service-publicly]] - Setup guide for adding new public services
- [[manage-flyio-proxy]] - Operational tasks (deploy, shut off, troubleshoot)
- [[caddy]] - Private reverse proxy for `*.ops.eblu.me` (separate system)
- [[tailscale]] - WireGuard mesh network
- [[gandi]] - DNS hosting


@ -67,7 +67,7 @@ Documentation uses `[[wiki-links]]` for cross-references:
- `[[service-name]]` links to a reference page
- `[[page|Display Text]]` customizes the link text
When reading on the web (docs.ops.eblu.me), these render as clickable links. The backlinks panel shows what references each page.
When reading on the web (docs.eblu.me), these render as clickable links. The backlinks panel shows what references each page.
Pre-commit hooks automatically validate that all wiki-links point to existing files and that link targets are unambiguous.