Restrict flyio-proxy ACLs to dedicated tag:flyio-target endpoints #126

Merged
eblume merged 7 commits from restrict-flyio-proxy-acl into main 2026-02-08 21:54:19 -08:00

Summary

  • Introduce tag:flyio-target so services must explicitly opt in to be reachable by the fly.io proxy
  • Replace broad tag:k8s and tag:homelab grants with the new tag in the ACL rule and test
  • Add tailscale.com/tags: "tag:k8s,tag:flyio-target" annotation to docs, loki, and prometheus Ingresses
  • Switch Alloy push endpoints from *.ops.eblu.me (Caddy) to *.tail8d86e.ts.net (Tailscale Ingress)
  • Update docs: flyio-proxy, caddy, tailscale, forgejo (future public access + security checklist), expose-service-publicly
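The opt-in behaviour described in the first two bullets can be sketched as a toy model. This is not how Tailscale evaluates policy (ports, users, and grants are ignored), and the source tag name `tag:flyio-proxy` is an assumption made for illustration; the PR does not state which tag the proxy node carries.

```python
from dataclasses import dataclass

# Hypothetical source tag for the Fly.io proxy node (not named in the PR).
PROXY_TAG = "tag:flyio-proxy"


@dataclass(frozen=True)
class Rule:
    """A drastically simplified ACL rule: a set of source tags and a set of destination tags."""
    src: frozenset
    dst: frozenset


def allowed(rule, src_tags, dst_tags):
    """True if the rule names at least one source tag and at least one destination tag."""
    return bool(rule.src & set(src_tags)) and bool(rule.dst & set(dst_tags))


# Before this PR: anything tagged tag:k8s or tag:homelab was reachable.
broad = Rule(frozenset({PROXY_TAG}), frozenset({"tag:k8s", "tag:homelab"}))

# After this PR: only nodes that explicitly carry tag:flyio-target are reachable.
restricted = Rule(frozenset({PROXY_TAG}), frozenset({"tag:flyio-target"}))

plain_ingress = {"tag:k8s"}                         # no opt-in
opted_in_ingress = {"tag:k8s", "tag:flyio-target"}  # annotated Ingress
```

Under the restricted rule, `allowed(restricted, {PROXY_TAG}, plain_ingress)` is false while the same call with `opted_in_ingress` is true, which is the reason the docs, loki, and prometheus Ingresses need the extra tag.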

Manual step (not in PR)

Update the k8s operator OAuth client in the Tailscale admin console to include tag:flyio-target in its scope. Without this, the operator cannot assign the new tag to Ingress proxy nodes.

Deployment order

  1. Pulumi ACLs: mise run tailnet-preview && mise run tailnet-up
  2. OAuth client: manual update in Tailscale admin console
  3. K8s Ingresses: argocd app sync apps && argocd app sync docs loki prometheus
  4. Fly.io proxy: mise run fly-deploy
  5. Verify: mise run services-check, check Grafana dashboards
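The ordering matters because each step depends on the one before it (for example, the operator cannot apply the new tag until the OAuth client scope is updated). A minimal sketch of a runner that enforces the order and stops at the first failure is shown here. The helper name `run_in_order` and its `run`/`confirm` callbacks are hypothetical and not part of the repository; only the commands themselves come from the list above.

```python
# (label, command) pairs in deployment order; None marks the manual
# admin-console step that cannot be scripted.
DEPLOY_STEPS = [
    ("Pulumi ACLs", "mise run tailnet-preview && mise run tailnet-up"),
    ("OAuth client", None),
    ("K8s Ingresses", "argocd app sync apps && argocd app sync docs loki prometheus"),
    ("Fly.io proxy", "mise run fly-deploy"),
    ("Verify", "mise run services-check"),
]


def run_in_order(steps, run, confirm):
    """Run steps in order and stop at the first failure.

    run(command) must return a process exit code (0 means success).
    confirm(label) must return True once the operator has done the manual step.
    Returns the label of the failing step, or None if every step succeeded.
    """
    for label, command in steps:
        if command is None:
            ok = confirm(label)
        else:
            ok = run(command) == 0
        if not ok:
            return label
    return None
```

In real use, `run` could be something like `lambda cmd: subprocess.call(cmd, shell=True)` and `confirm` a prompt that waits for the operator. The Grafana dashboard check in step 5 stays a manual inspection.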

Test plan

  • mise run tailnet-preview shows clean diff
  • argocd app diff docs, argocd app diff loki, argocd app diff prometheus show only annotation additions
  • After deploy: Grafana dashboards show continued log/metric flow
  • curl -sf https://docs.eblu.me returns 200
  • mise run services-check passes

🤖 Generated with Claude Code

Replace broad tag:k8s and tag:homelab grants with a new tag:flyio-target
tag that services must explicitly opt into. Alloy now pushes logs/metrics
directly to Loki and Prometheus via Tailscale Ingress, bypassing Caddy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The legacy per-Ingress StatefulSet proxy model silently ignores the
tailscale.com/tags annotation, so tag:flyio-target was never applied
to docs/loki/prometheus — breaking the restricted ACL. This adds a
ProxyGroup (type: Ingress, 2 replicas) and annotates all 12 Ingresses
with tailscale.com/proxy-group: "ingress" to enable per-Ingress tag
overrides and restore connectivity.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ProxyGroup CRD validation requires lowercase type values
(type: ingress, not type: Ingress).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove explicit `host:` field from Ingress rules. With ProxyGroup-based
Tailscale Ingresses, the Host header contains the FQDN (e.g.,
prometheus.tail8d86e.ts.net) which doesn't match the short name
(prometheus), causing 404s.
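The mismatch can be illustrated with a simplified model of Ingress host matching. This is a sketch only: it assumes exact, case-insensitive comparison and ignores wildcard hosts, and the function names are made up for illustration rather than taken from the Tailscale operator.

```python
from typing import Optional


def rule_matches(host_header: str, rule_host: Optional[str]) -> bool:
    """Simplified host matching: a rule without a host matches every request;
    otherwise the Host header must equal the rule's host exactly."""
    if rule_host is None:
        return True
    return host_header.lower() == rule_host.lower()


def status_for(host_header: str, rule_host: Optional[str]) -> int:
    """Return 200 when the single rule matches the request, 404 otherwise."""
    return 200 if rule_matches(host_header, rule_host) else 404


# Before the fix: the rule pins the short name, but the request carries the FQDN.
before = status_for("prometheus.tail8d86e.ts.net", "prometheus")

# After the fix: the host field is omitted, so the rule matches any Host header.
after = status_for("prometheus.tail8d86e.ts.net", None)
```

Here `before` evaluates to 404 and `after` to 200, mirroring the behaviour the commit describes.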

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add autoApprovers so ProxyGroup pods (tag:k8s) can auto-approve VIP
  service routes, as required by Tailscale multi-cluster Ingress docs
- Revert Alloy endpoints from direct Tailscale Ingress back to Caddy
  (*.ops.eblu.me) to decouple observability from VIP routing
- Update changelog to reflect final state

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Revert the Caddy endpoint change: flyio-proxy ACLs only allow
tag:flyio-target, so Alloy can't reach Caddy on indri (tag:homelab).
The direct Tailscale Ingress endpoints (loki.tail8d86e.ts.net and
prometheus.tail8d86e.ts.net) are tagged tag:flyio-target specifically
for this purpose.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- security-model: Replace "no public access" with Fly.io proxy description
- routing: Add *.eblu.me as third DNS domain for public services
- architecture: Add Fly.io to network layer and service routing table
- CLAUDE.md: Add public routing domain to routing table
- gandi: Add public CNAME records section
- tailscale-operator: Document ProxyGroup, VIP routing, per-Ingress tags
- flyio-proxy: Clarify why Alloy uses direct Tailscale endpoints (ACL)
- Remove hardcoded Tailscale IP (100.98.163.89) from docs, use DNS names

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
eblume merged commit e6cf7e47e0 into main 2026-02-08 21:54:19 -08:00

Reference
eblume/blumeops!126