blumeops/pulumi/tailscale/__main__.py
Erich Blume 14ca0160ba Migrate devpi from minikube to indri (launchd) (#341)
## Summary

Devpi was crash-looping under memory pressure on the minikube StatefulSet, breaking the Python toolchain across the repo (`mise run docs-mikado`, `prek`, every `uv pip install`). It moves to indri as a native LaunchAgent.

## What changed

- **New Ansible role** `ansible/roles/devpi/`: installs `devpi-server` + `devpi-web` into a uv-managed venv, initializes the server-dir on first run using the root password from 1Password, and runs as a LaunchAgent (`mcquack.eblume.devpi`) bound to `127.0.0.1:3141`. It bootstraps from upstream PyPI, so devpi can install itself on a fresh box.
- **Caddy**: `pypi.ops.eblu.me` now proxies to `http://localhost:3141`.
- **Playbook**: `indri.yml` gains pre_tasks for the root password and the new role.
- **service-versions.yaml**: devpi flipped from `type: argocd` to `type: ansible`.
- **ArgoCD**: removed `apps/devpi.yaml` and `manifests/devpi/`. The in-cluster Application, namespace, and PVC have been deleted.
- **Docs**: new how-to `docs/how-to/operations/devpi-on-indri.md`; `restart-indri.md` lists devpi in the LaunchAgent stop list.

## Already deployed (live on indri)

- Service running: `launchctl list mcquack.eblume.devpi` → PID 53888
- `curl https://pypi.ops.eblu.me/+api` returns 200
- `mise run docs-mikado` works again
- 1.0G of cached PyPI data was migrated from the PVC to `~erichblume/devpi/server-dir/`
- Minikube namespace and PVC fully reclaimed

## Test plan

- [ ] `mise run services-check` (after merge)
- [ ] CI workflows that use devpi succeed
- [ ] No regressions in tools that depend on `pypi.ops.eblu.me` (prek, uv-script tasks, dagger pipelines)

## Context

This is the C1 prelude to a planned C2 chain (`mikado/retire-minikube-indri`) to retire minikube on indri entirely. Doing devpi as a standalone C1 was the right call because (a) it was urgent — it was breaking the toolchain — and (b) it shakes out the migration recipe before we commit to a multi-leaf chain.

Reviewed-on: #341
2026-04-29 13:38:36 -07:00


"""Pulumi program to manage tail8d86e.ts.net tailnet configuration.

This program manages:
- ACL policy (grants, SSH rules, tag owners, tests)
- Device tags for infrastructure classification

Devices are tagged based on their role:
- tag:homelab - Server infrastructure (indri, ringtail)
- tag:workstation - Development machines that can manage homelab (gilbert)
- tag:nas - Network-attached storage (sifaka)
- tag:blumeops - Resources managed by this IaC
- Service tags (grafana, forge, etc.) - Fine-grained service access control
"""

import hashlib
from pathlib import Path

import pulumi
import pulumi_tailscale as tailscale

# Read the HuJSON policy file
policy_path = Path(__file__).parent / "policy.hujson"
policy_content = policy_path.read_text()

# Compute policy hash for change tracking
policy_hash = hashlib.sha256(policy_content.encode()).hexdigest()[:12]

# Manage the ACL - this completely overwrites the tailnet's ACL policy
acl = tailscale.Acl(
    "tailnet-acl",
    acl=policy_content,
)

# ============== Device Tags ==============
# Manage tags for devices in the tailnet.
# Tags control access via the ACL policy in policy.hujson.

# indri - Mac Mini M1, primary homelab server
# Hosts forge, loki, zot registry, and the k8s control plane.
# Other services (grafana, kiwix, etc.) run in k8s with their own Tailscale devices.
indri = tailscale.get_device(name="indri.tail8d86e.ts.net")
indri_tags = tailscale.DeviceTags(
    "indri-tags",
    device_id=indri.node_id,
    tags=[
        "tag:homelab",  # Server role - allows SSH from workstations
        "tag:blumeops",  # Managed by this IaC
        # Service tags for services still hosted directly on indri
        "tag:forge",
        "tag:loki",
        "tag:registry",  # Zot container registry
        "tag:k8s-api",  # Kubernetes API server (minikube)
        "tag:flyio-target",  # Fly proxy routes through Caddy on indri
    ],
)

# NOTE: gilbert (MacBook Air M4) is NOT tagged via Pulumi.
# Tagging a user-owned device converts it to a "tagged device", which loses
# user identity and breaks user-based SSH rules. gilbert remains user-owned
# so blume.erich@gmail.com can SSH to homelab via the ACL rules.

# sifaka - Synology NAS, backup target
# Homelab and workstations can access for backups
sifaka = tailscale.get_device(name="sifaka.tail8d86e.ts.net")
sifaka_tags = tailscale.DeviceTags(
    "sifaka-tags",
    device_id=sifaka.node_id,
    tags=[
        "tag:nas",  # NAS role - accessible by homelab and workstations
        "tag:blumeops",  # Managed by this IaC
    ],
)

# ringtail - NixOS gaming/compute workstation
# Managed by this IaC after initial bootstrap via auth key.
ringtail = tailscale.get_device(name="ringtail.tail8d86e.ts.net")
ringtail_tags = tailscale.DeviceTags(
    "ringtail-tags",
    device_id=ringtail.node_id,
    tags=[
        "tag:homelab",  # Server role - allows SSH from workstations and homelab peers
        "tag:blumeops",  # Managed by this IaC
    ],
)

# ============== Auth Keys ==============
# Auth key for Fly.io proxy container (public reverse proxy)
flyio_key = tailscale.TailnetKey(
    "flyio-proxy-key",
    reusable=True,
    ephemeral=True,
    preauthorized=True,
    tags=["tag:flyio-proxy"],
    expiry=90 * 24 * 60 * 60,  # 90 days, in seconds (7776000)
)

# ============== Exports ==============
pulumi.export("acl_id", acl.id)
pulumi.export("policy_hash", policy_hash)
pulumi.export("flyio_authkey", flyio_key.key)
pulumi.export("indri_device_id", indri.node_id)
pulumi.export("indri_tags", indri_tags.tags)
pulumi.export("sifaka_device_id", sifaka.node_id)
pulumi.export("sifaka_tags", sifaka_tags.tags)
pulumi.export("ringtail_device_id", ringtail.node_id)
pulumi.export("ringtail_tags", ringtail_tags.tags)
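
The `policy_hash` export above is the first 12 hex characters of the SHA-256 digest of the policy text. The standalone sketch below isolates that computation so its behavior can be checked without a Pulumi stack. The helper name `short_hash` and the two sample policy strings are hypothetical and exist only for illustration; they are not taken from `policy.hujson`.

```python
import hashlib


def short_hash(text: str, length: int = 12) -> str:
    """Return the first `length` hex characters of the SHA-256 digest of `text`."""
    return hashlib.sha256(text.encode()).hexdigest()[:length]


# Hypothetical policy bodies that differ by a single tag owner entry.
before = '{"tagOwners": {"tag:homelab": ["autogroup:admin"]}}'
after = (
    '{"tagOwners": {"tag:homelab": ["autogroup:admin"], '
    '"tag:nas": ["autogroup:admin"]}}'
)

# The same text always yields the same 12-character prefix; any edit to the
# text yields a different digest, so the exported value shifts with the policy.
print(short_hash(before))
print(short_hash(after))
```

Because the hash covers the raw file text, a comment-only or whitespace-only edit to the policy also changes the exported value, even when the effective ACL is the same.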