## Summary

Devpi was crash-looping under memory pressure on the minikube StatefulSet, breaking the Python toolchain across the repo (`mise run docs-mikado`, `prek`, every `uv pip install`). It moves to indri as a native LaunchAgent.

## What changed

- **New ansible role** `ansible/roles/devpi/`: installs `devpi-server` + `devpi-web` into a uv-managed venv, initializes the server-dir on first run via 1Password root password, runs as a LaunchAgent (`mcquack.eblume.devpi`) bound to `127.0.0.1:3141`. Bootstraps from upstream PyPI (so devpi can install itself on a fresh box).
- **Caddy**: `pypi.ops.eblu.me` now proxies to `http://localhost:3141`.
- **Playbook**: `indri.yml` gains pre_tasks for the root password and the new role.
- **service-versions.yaml**: devpi flipped from `type: argocd` to `type: ansible`.
- **ArgoCD**: removed `apps/devpi.yaml` and `manifests/devpi/`. The in-cluster Application, namespace, and PVC have been deleted.
- **Docs**: new how-to `docs/how-to/operations/devpi-on-indri.md`; `restart-indri.md` lists devpi in the LaunchAgent stop list.

## Already deployed (live on indri)

- Service running: `launchctl list mcquack.eblume.devpi` → PID 53888
- `curl https://pypi.ops.eblu.me/+api` returns 200 ✅
- `mise run docs-mikado` works again ✅
- 1.0G of cached PyPI data was migrated from the PVC to `~erichblume/devpi/server-dir/`
- Minikube namespace and PVC fully reclaimed

## Test plan

- [ ] `mise run services-check` (after merge)
- [ ] CI workflows that use devpi succeed
- [ ] No regressions in tools that depend on `pypi.ops.eblu.me` (prek, uv-script tasks, dagger pipelines)

## Context

This is the C1 prelude to a planned C2 chain (`mikado/retire-minikube-indri`) to retire minikube on indri entirely. Doing devpi as a standalone C1 was the right call because (a) it was urgent, since it was breaking the toolchain, and (b) it shakes out the migration recipe before we commit to a multi-leaf chain.

Reviewed-on: #341
110 lines
3.6 KiB
Python
"""Pulumi program to manage tail8d86e.ts.net tailnet configuration.

This program manages:
- ACL policy (grants, SSH rules, tag owners, tests)
- Device tags for infrastructure classification

Devices are tagged based on their role:
- tag:homelab - Server infrastructure (indri, ringtail)
- tag:workstation - Development machines that can manage homelab (gilbert)
- tag:nas - Network-attached storage (sifaka)
- tag:blumeops - Resources managed by this IaC
- Service tags (grafana, forge, etc.) - Fine-grained service access control
"""

import hashlib

import pulumi
import pulumi_tailscale as tailscale
from pathlib import Path

# Read the HuJSON policy file
policy_path = Path(__file__).parent / "policy.hujson"
policy_content = policy_path.read_text()

# Compute policy hash for change tracking
policy_hash = hashlib.sha256(policy_content.encode()).hexdigest()[:12]

# Manage the ACL - this completely overwrites the tailnet's ACL policy
acl = tailscale.Acl(
    "tailnet-acl",
    acl=policy_content,
)

# ============== Device Tags ==============
# Manage tags for devices in the tailnet.
# Tags control access via the ACL policy in policy.hujson.

# indri - Mac Mini M1, primary homelab server
# Hosts forge, loki, zot registry, and the k8s control plane.
# Other services (grafana, kiwix, etc.) run in k8s with their own Tailscale devices.
indri = tailscale.get_device(name="indri.tail8d86e.ts.net")
indri_tags = tailscale.DeviceTags(
    "indri-tags",
    device_id=indri.node_id,
    tags=[
        "tag:homelab",  # Server role - allows SSH from workstations
        "tag:blumeops",  # Managed by this IaC
        # Service tags for services still hosted directly on indri
        "tag:forge",
        "tag:loki",
        "tag:registry",  # Zot container registry
        "tag:k8s-api",  # Kubernetes API server (minikube)
        "tag:flyio-target",  # Fly proxy routes through Caddy on indri
    ],
)

# NOTE: gilbert (MacBook Air M4) is NOT tagged via Pulumi
# Tagging a user-owned device converts it to a "tagged device" which loses
# user identity, breaking user-based SSH rules. gilbert remains user-owned
# so blume.erich@gmail.com can SSH to homelab via the ACL rules.

# sifaka - Synology NAS, backup target
# Homelab and workstations can access for backups
sifaka = tailscale.get_device(name="sifaka.tail8d86e.ts.net")
sifaka_tags = tailscale.DeviceTags(
    "sifaka-tags",
    device_id=sifaka.node_id,
    tags=[
        "tag:nas",  # NAS role - accessible by homelab and workstations
        "tag:blumeops",  # Managed by this IaC
    ],
)

# ringtail - NixOS gaming/compute workstation
# Managed by this IaC after initial bootstrap via auth key.
ringtail = tailscale.get_device(name="ringtail.tail8d86e.ts.net")
ringtail_tags = tailscale.DeviceTags(
    "ringtail-tags",
    device_id=ringtail.node_id,
    tags=[
        "tag:homelab",  # Server role - allows SSH from workstations and homelab peers
        "tag:blumeops",  # Managed by this IaC
    ],
)

# ============== Auth Keys ==============

# Auth key for Fly.io proxy container (public reverse proxy)
flyio_key = tailscale.TailnetKey(
    "flyio-proxy-key",
    reusable=True,
    ephemeral=True,
    preauthorized=True,
    tags=["tag:flyio-proxy"],
    expiry=7776000,  # 90 days
)

# ============== Exports ==============
pulumi.export("acl_id", acl.id)
pulumi.export("policy_hash", policy_hash)
pulumi.export("flyio_authkey", flyio_key.key)

pulumi.export("indri_device_id", indri.node_id)
pulumi.export("indri_tags", indri_tags.tags)

pulumi.export("sifaka_device_id", sifaka.node_id)
pulumi.export("sifaka_tags", sifaka_tags.tags)

pulumi.export("ringtail_device_id", ringtail.node_id)
pulumi.export("ringtail_tags", ringtail_tags.tags)
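
The `policy_hash` export in the program is a 12-character prefix of the SHA-256 digest of `policy.hujson`. The snippet below is a minimal standalone sketch of that change-tracking idea. It assumes a hypothetical helper name (`short_policy_hash`) and made-up policy strings, neither of which appears in the program itself, and it only illustrates that editing the policy text yields a different short hash while identical text yields the same one.

```python
import hashlib


def short_policy_hash(policy_text: str, length: int = 12) -> str:
    """Return the first `length` hex characters of the SHA-256 digest.

    Hypothetical helper: it mirrors the expression used for `policy_hash`,
    but the function name and `length` parameter are illustrative only.
    """
    return hashlib.sha256(policy_text.encode()).hexdigest()[:length]


# Made-up policy snippets, NOT the real contents of policy.hujson.
policy_before = '{"grants": []}'
policy_after = '{"grants": [], "tests": []}'

hash_before = short_policy_hash(policy_before)
hash_after = short_policy_hash(policy_after)

# Identical input always maps to the same short hash; edited input
# maps to a different one (barring an astronomically unlikely collision).
unchanged = short_policy_hash(policy_before) == hash_before
changed = hash_before != hash_after
```

A 12-character prefix is short enough to read in stack outputs while still making accidental collisions between successive policy revisions very unlikely. It is a change indicator, not a security control.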