blumeops/docs/explanation/security-model.md
Erich Blume e6cf7e47e0
2026-02-08 21:54:18 -08:00


title: Security Model
tags: explanation, security

Security Model

Note: This article was drafted by AI and reviewed by Erich. I plan to rewrite all explanatory content in my own words - these serve as placeholders to establish the documentation structure.

How BlumeOps handles network security, secrets, and access control.

Network Security: Tailscale

The foundational security decision is to use Tailscale as the network layer.

Zero Trust Networking

BlumeOps infrastructure has no public IP addresses and no port forwarding. Most services are accessible only over Tailscale, which provides:

  • Encrypted by default - WireGuard encryption for all traffic
  • Identity-based access - ACLs based on user/device identity, not IP addresses
  • Minimal public surface - only selected services are exposed via flyio-proxy

Public Access via Fly.io

A small number of services are exposed to the internet through a reverse proxy on Fly.io that tunnels back to the homelab over Tailscale. The proxy's ACLs are restricted to tag:flyio-target, so it can reach only endpoints that have been explicitly tagged; a compromised proxy cannot route to arbitrary services on the tailnet. See flyio-proxy for details and expose-service-publicly for the security considerations.
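
As a rough sketch of what opting in can look like, a service exposed through a Tailscale Ingress carries the tag as an annotation on the Ingress. The tag names come from this document; the resource name, namespace, backend service, and port below are hypothetical placeholders.

```yaml
# Sketch only: name, namespace, backend service, and port are hypothetical.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: docs
  namespace: docs
  annotations:
    # Tag the Ingress proxy node so the Fly.io proxy's ACL rule can reach it.
    tailscale.com/tags: "tag:k8s,tag:flyio-target"
spec:
  ingressClassName: tailscale
  defaultBackend:
    service:
      name: docs
      port:
        number: 80
  tls:
    - hosts:
        - docs   # hostname the proxy node takes on the tailnet
```

An Ingress without tag:flyio-target stays unreachable from the proxy. Note that the Tailscale Kubernetes operator can assign only tags that its OAuth client is allowed to use, so the tag must be included in that client's scope.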

Defense in Depth

Even within the tailnet, access is restricted:

Internet ──▶ Fly.io proxy ──▶ tag:flyio-target only (docs, observability)

Tailnet:
  Admin ────────▶ All services
  Member ───────▶ User-facing services only
  Homelab tag ──▶ NAS (for backups)

See tailscale for the full ACL matrix.
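
To make the Fly.io restriction concrete, here is a minimal sketch of the relevant policy rule plus a policy test, written as a Pulumi YAML resource. It assumes the Pulumi Tailscale provider's tailscale:Acl resource; the proxy's own tag (tag:flyio-proxy), the tag owners, and the port are assumptions for illustration, and the real policy may be structured differently.

```yaml
# Fragment only: a real policy also carries the admin, member, and homelab rules.
resources:
  tailnetAcl:
    type: tailscale:Acl
    properties:
      acl:
        fn::toJSON:
          tagOwners:
            "tag:flyio-proxy": ["autogroup:admin"]    # hypothetical proxy tag
            "tag:flyio-target": ["autogroup:admin"]   # placeholder owner
            "tag:k8s": ["autogroup:admin"]            # placeholder owner
          acls:
            # The only rule that mentions the proxy: it may reach tagged targets.
            - action: accept
              src: ["tag:flyio-proxy"]
              dst: ["tag:flyio-target:443"]
          tests:
            # Policy test: fails the update if the proxy gains broader access.
            - src: "tag:flyio-proxy"
              accept: ["tag:flyio-target:443"]
              deny: ["tag:k8s:443"]
```

Because Tailscale ACLs deny by default, omitting any broader grant is what confines the proxy; the test entry guards against a later edit widening it by accident.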

Secrets Management

Secrets follow a hierarchy:

Source of Truth: 1Password

All secrets originate in 1Password's blumeops vault:

  • API keys, tokens, passwords
  • SSH keys and certificates
  • OAuth credentials

Kubernetes: External Secrets Operator

external-secrets syncs secrets from 1Password to Kubernetes:

1Password ──▶ 1Password Connect ──▶ ExternalSecret ──▶ K8s Secret

Services reference native Kubernetes Secrets; they don't know about 1Password.
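
A minimal sketch of one such sync follows. The resource names, the store name, and the 1Password item and field names are hypothetical, and the apiVersion depends on the installed operator version.

```yaml
# Sketch only: all names below are placeholders.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: example-app-credentials
  namespace: example-app
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: onepassword               # store backed by 1Password Connect
  target:
    name: example-app-credentials   # native K8s Secret the service consumes
  data:
    - secretKey: password           # key inside the K8s Secret
      remoteRef:
        key: example-app            # 1Password item title
        property: password          # field within that item
```

The workload then references example-app-credentials like any other Secret, which keeps the 1Password dependency out of the service's own manifests.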

Ansible: op CLI

Ansible playbooks fetch secrets at runtime via the op CLI:

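- name: Fetch secret
  ansible.builtin.command: op item get <id> --fields password --reveal
  delegate_to: localhost    # run op on the control machine, not the target
  register: secret_result   # the value is available as secret_result.stdout
  changed_when: false       # reading a secret changes nothing
  no_log: true              # keep the value out of task output and logs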

Secrets are held in memory as Ansible facts, never written to disk.

Git Repository

The repository is public. Secrets must never be committed:

  • .gitignore excludes sensitive patterns
  • Pre-commit hooks scan for potential secrets (TruffleHog)
  • All config files use references to secrets, not values
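
One way to wire up such a scan is a local pre-commit hook that shells out to TruffleHog. This is a sketch modelled on the pattern in TruffleHog's own documentation; the repository's actual hook configuration may differ, and it assumes the trufflehog binary is installed on the machine.

```yaml
# .pre-commit-config.yaml (sketch; the real configuration may differ)
repos:
  - repo: local
    hooks:
      - id: trufflehog
        name: TruffleHog
        description: Detect secrets before they are committed.
        # --fail makes the hook exit non-zero when a finding is reported.
        entry: bash -c 'trufflehog git file://. --since-commit HEAD --fail'
        language: system
```

A non-zero exit from the hook blocks the commit, so a detected secret never reaches the public history.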

Access Control Philosophy

Principle of Least Privilege

Services and devices get minimum necessary access:

| Entity | Access |
| --- | --- |
| Admin users | Everything |
| Member users | User-facing services only |
| Homelab servers | Only what they need (NAS for backups) |
| K8s pods | No Tailscale access (use Caddy proxy) |

Tagged Devices vs User Devices

Important Tailscale concept:

  • User devices (like gilbert) have user identity and inherit user ACLs
  • Tagged devices (like indri with tag:homelab) lose user identity and are governed only by rules written for their tags

Don't tag user devices - it breaks user-based access rules.

Authentication Patterns

Service-to-Service

Internal services use:

  • Kubernetes service discovery (no auth needed within cluster)
  • Tailscale identity for cross-host communication

User-to-Service

Users authenticate via:

  • Service-specific credentials (stored in 1Password)
  • Tailscale identity, for services that support it (planned for the future)

AI/Automation Access

Claude Code and automation use:

  • SSH keys for git operations
  • ArgoCD tokens for deployments
  • 1Password CLI for secret retrieval (requires user approval)

What's Not Protected

An honest assessment of the security boundaries:

  • Local network attacks - Anyone on the home Wi-Fi could potentially access the NAS directly
  • Physical access - No disk encryption on servers (trade-off for reliability)
  • Supply chain - Container images from upstream registries
  • Operator error - Misconfigured ACLs or leaked credentials

The model assumes a trusted home network and focuses on protecting against internet-based attacks.