Restrict flyio-proxy ACLs to dedicated tag:flyio-target endpoints #126

Merged
eblume merged 7 commits from restrict-flyio-proxy-acl into main 2026-02-08 21:54:19 -08:00
10 changed files with 56 additions and 21 deletions
Showing only changes of commit 54db8643a1 - Show all commits

Update docs to reflect public service routing via Fly.io

- security-model: Replace "no public access" with Fly.io proxy description
- routing: Add *.eblu.me as third DNS domain for public services
- architecture: Add Fly.io to network layer and service routing table
- CLAUDE.md: Add public routing domain to routing table
- gandi: Add public CNAME records section
- tailscale-operator: Document ProxyGroup, VIP routing, per-Ingress tags
- flyio-proxy: Clarify why Alloy uses direct Tailscale endpoints (ACL)
- Remove hardcoded Tailscale IP (100.98.163.89) from docs, use DNS names

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Erich Blume 2026-02-08 21:53:07 -08:00

View file

@ -69,7 +69,8 @@ mise run provision-indri -- --check --diff # dry run
| Domain | Mechanism | Reachable from |
|--------|-----------|----------------|
| `*.ops.eblu.me` | Caddy on indri (100.98.163.89) | everywhere incl. k8s pods |
| `*.eblu.me` | Fly.io proxy (Tailscale tunnel) | public internet |
| `*.ops.eblu.me` | Caddy on indri | k8s pods, containers, tailnet |
| `*.tail8d86e.ts.net` | Tailscale MagicDNS | tailnet clients only |
Check tailscale serve: `ssh indri 'tailscale serve status --json'`

View file

@ -42,15 +42,17 @@ Two always-on devices form the infrastructure backbone:
- All devices on tailnet `tail8d86e.ts.net`
- ACLs control access between devices and services
- MagicDNS provides `*.tail8d86e.ts.net` hostnames
- No port forwarding or public IPs needed
- No port forwarding or public IPs on homelab devices
- Selected services exposed publicly via [[flyio-proxy]] (Fly.io → Tailscale tunnel)
## Service Routing
Two DNS domains route to services:
Three DNS domains route to services:
| Domain | Mechanism | Reachable from |
|--------|-----------|----------------|
| `*.ops.eblu.me` | Caddy reverse proxy on indri | Everywhere (k8s pods, containers, tailnet) |
| `*.eblu.me` | [[flyio-proxy]] (Fly.io → Tailscale tunnel) | Public internet |
| `*.ops.eblu.me` | Caddy reverse proxy on indri | k8s pods, containers, tailnet clients |
| `*.tail8d86e.ts.net` | Tailscale MagicDNS | Tailnet clients only |
See [[routing]] for details on when to use which.

View file

@ -17,18 +17,22 @@ The foundational security decision is using [[tailscale]] as the network layer.
### Zero Trust Networking
BlumeOps has no public IP addresses or port forwarding. All services are only accessible via Tailscale:
BlumeOps infrastructure has no public IP addresses or port forwarding. Most services are only accessible via Tailscale:
- **No attack surface** from the public internet
- **Encrypted by default** - WireGuard encryption for all traffic
- **Identity-based access** - ACLs based on user/device identity, not IP addresses
- **Minimal public surface** - only selected services are exposed via [[flyio-proxy]]
### Public Access via Fly.io
A small number of services are exposed to the internet through a reverse proxy on Fly.io that tunnels back to the homelab over Tailscale. The proxy uses restricted ACLs (`tag:flyio-target`) so it can only reach explicitly tagged endpoints — a compromised proxy cannot route to arbitrary services on the tailnet. See [[flyio-proxy]] for details and [[expose-service-publicly]] for the security considerations.
### Defense in Depth
Even within the tailnet, access is restricted:
```
Internet ──X──▶ Services (no public access)
Internet ──▶ Fly.io proxy ──▶ tag:flyio-target only (docs, observability)
Tailnet:
Admin ────────▶ All services

View file

@ -732,5 +732,5 @@ After deploying DNS (`mise run dns-up`):
1. `curl -I https://docs.eblu.me` — returns 200 with `X-Cache-Status` header
2. `dig docs.eblu.me` — resolves to Fly.io IPs (not Tailscale IP)
3. `dig forge.ops.eblu.me` — still resolves to `100.98.163.89` (unchanged)
3. `dig forge.ops.eblu.me` — still resolves to indri's Tailscale IP (unchanged)
4. Second request to same URL shows `X-Cache-Status: HIT`

View file

@ -74,10 +74,10 @@ A successful preview confirms the new PAT is working.
## Break-Glass Override
If MagicDNS is unavailable and Pulumi can't resolve indri's IP, set the target IP manually:
If MagicDNS is unavailable and Pulumi can't resolve indri's IP, set the target IP manually. Find indri's current Tailscale IP via `tailscale status` or the admin console:
```bash
export BLUMEOPS_REVERSE_PROXY_IP=100.98.163.89
export BLUMEOPS_REVERSE_PROXY_IP=<indri-tailscale-ip>
mise run dns-up
```

View file

@ -21,18 +21,30 @@ DNS hosting provider for the `eblu.me` domain, managed via Pulumi IaC.
## What It Does
Gandi hosts the DNS records that make `*.ops.eblu.me` resolve to [[indri]]'s Tailscale IP (100.98.163.89). Since Tailscale IPs are not publicly routable, this gives services real DNS names while keeping them private to the tailnet.
Gandi hosts the DNS records that make `*.ops.eblu.me` resolve to [[indri]]'s Tailscale IP (`indri.tail8d86e.ts.net`). Since Tailscale IPs are not publicly routable, this gives services real DNS names while keeping them private to the tailnet.
The target IP is resolved dynamically from `indri.tail8d86e.ts.net` at deploy time, so if indri's Tailscale IP changes, re-running the deployment is sufficient.
## DNS Records
### Private services (Caddy on indri)
| Record | Type | Value | TTL |
|--------|------|-------|-----|
| `*.ops.eblu.me` | A | indri's Tailscale IP | 300s |
| `ops.eblu.me` | A | indri's Tailscale IP | 300s |
Both records point to [[indri]], which runs [[caddy]] as the reverse proxy for all services. See [[routing]] for the full service URL map.
Both records point to [[indri]], which runs [[caddy]] as the reverse proxy for all private services.
### Public services (Fly.io proxy)
| Record | Type | Value | TTL |
|--------|------|-------|-----|
| `docs.eblu.me` | CNAME | `blumeops-proxy.fly.dev` | 300s |
Public CNAMEs point to [[flyio-proxy]] on Fly.io. See [[expose-service-publicly]] for adding new public services.
See [[routing]] for the full service URL map.
## Pulumi Configuration

View file

@ -16,7 +16,7 @@ Primary BlumeOps server. Mac Mini M1 (2020).
| **Model** | Mac mini M1, 2020 (Macmini9,1) |
| **Storage** | 2TB internal SSD |
| **macOS** | 15.7.3 (Sequoia) |
| **Tailscale IP** | 100.98.163.89 |
| **Tailscale hostname** | `indri.tail8d86e.ts.net` |
| **Tailscale Tag** | `tag:homelab` |
| **UPS** | Anker SOLIX F2000 GaNPrime |

View file

@ -7,20 +7,21 @@ tags:
# Service Routing
Services are accessible via two DNS domains with different reachability.
Services are accessible via three DNS domains with different reachability.
## DNS Domains
| Domain | Proxy | Reachable From |
|--------|-------|----------------|
| `*.eblu.me` | [[flyio-proxy]] (Fly.io → Tailscale tunnel) | Public internet |
| `*.ops.eblu.me` | Caddy on indri | k8s pods, docker containers, tailnet clients |
| `*.tail8d86e.ts.net` | Tailscale MagicDNS | Tailnet clients only |
**Use `*.ops.eblu.me`** for services that need pod-to-service communication.
**Use `*.ops.eblu.me`** for services that need pod-to-service communication. Use `*.eblu.me` for services exposed publicly via Fly.io.
## Caddy Services (`*.ops.eblu.me`)
DNS points to indri's Tailscale IP (100.98.163.89). TLS via Let's Encrypt (ACME DNS-01 with Gandi).
DNS points to [[indri]]'s Tailscale IP. TLS via Let's Encrypt (ACME DNS-01 with Gandi).
| Service | URL | Description |
|---------|-----|-------------|
@ -40,6 +41,14 @@ DNS points to indri's Tailscale IP (100.98.163.89). TLS via Let's Encrypt (ACME
| [[postgresql]] | pg.ops.eblu.me:5432 | Database |
| [[sifaka|Sifaka]] | https://nas.ops.eblu.me | NAS dashboard |
## Public Services (`*.eblu.me`)
DNS CNAMEs point to `blumeops-proxy.fly.dev`. TLS via Fly.io-managed Let's Encrypt. Traffic tunnels back to the homelab over Tailscale. Only services tagged `tag:flyio-target` are reachable by the proxy — see [[flyio-proxy]] for details.
| Service | URL | Description |
|---------|-----|-------------|
| [[docs]] | https://docs.eblu.me | Documentation site |
## Tailscale-Only Services
| Service | URL | Description |
@ -64,3 +73,5 @@ DNS points to indri's Tailscale IP (100.98.163.89). TLS via Let's Encrypt (ACME
- [[gandi]] - DNS hosting for `eblu.me`
- [[tailscale]] - ACL configuration
- [[indri]] - Where services run
- [[flyio-proxy]] - Public reverse proxy for `*.eblu.me`
- [[expose-service-publicly]] - How to add a new public service

View file

@ -19,11 +19,16 @@ The Tailscale operator enables Kubernetes services to be exposed directly on the
## How It Works
When you create an Ingress with `ingressClassName: tailscale`:
Ingresses use a shared ProxyGroup (`ingress`) rather than per-service Tailscale nodes. When you create an Ingress with `ingressClassName: tailscale`:
1. Operator provisions a Tailscale node for the service
2. Service becomes accessible at `<hostname>.tail8d86e.ts.net`
3. TLS is handled automatically via Tailscale
1. Operator configures the shared ProxyGroup pods to serve the new Ingress
2. Service gets a VIP (Virtual IP) address on the tailnet
3. Service becomes accessible at `<hostname>.tail8d86e.ts.net`
4. TLS is handled automatically via Tailscale
Tailnet clients must have `--accept-routes` enabled to route to VIP addresses.
Services can be individually tagged (e.g., `tag:flyio-target`) via Ingress annotations to control which ACL grants apply. See [[expose-service-publicly]] for the tagging workflow.
## Limitations

View file

@ -73,7 +73,7 @@ Alloy listens on `127.0.0.1:12345` for self-scraping its `/metrics` endpoint. Al
The `tag:flyio-proxy` ACL grants access only to `tag:flyio-target:443`. Services must explicitly opt in by adding a `tailscale.com/tags: "tag:k8s,tag:flyio-target"` annotation to their Tailscale Ingress. This means the proxy can only reach endpoints that have been individually tagged — a compromised nginx config cannot route to arbitrary services on the tailnet.
Currently tagged as `tag:flyio-target`: [[docs]], [[loki]], [[prometheus]]. Loki and Prometheus are reachable so that [[alloy|Alloy]] (running inside the container) can push logs and metrics directly via their Tailscale Ingress endpoints, bypassing [[caddy]] entirely.
Currently tagged as `tag:flyio-target`: [[docs]], [[loki]], [[prometheus]]. Loki and Prometheus are tagged so that [[alloy|Alloy]] (running inside the container) can push logs and metrics directly via their Tailscale Ingress endpoints — the restricted ACL means Caddy on indri (`tag:homelab`) is not reachable from the proxy.
To expose an additional service through the proxy, add the `tag:flyio-target` annotation to its Tailscale Ingress. See [[expose-service-publicly]] for the full workflow.