Migrate devpi from minikube to indri (launchd) (#341)

## Summary

Devpi was crash-looping under memory pressure on the minikube StatefulSet, breaking the Python toolchain across the repo (`mise run docs-mikado`, `prek`, every `uv pip install`). This PR moves it to indri as a native LaunchAgent.

## What changed

- **New ansible role** `ansible/roles/devpi/`: installs `devpi-server` + `devpi-web` into a uv-managed venv, initializes the server-dir on first run using the root password from 1Password, and runs devpi as a LaunchAgent (`mcquack.eblume.devpi`) bound to `127.0.0.1:3141`. The venv bootstraps from upstream PyPI, so devpi can install itself on a fresh box.
- **Caddy**: `pypi.ops.eblu.me` now proxies to `http://localhost:3141`.
- **Playbook**: `indri.yml` gains pre_tasks for the root password and the new role.
- **service-versions.yaml**: devpi flipped from `type: argocd` to `type: ansible`.
- **ArgoCD**: removed `apps/devpi.yaml` and `manifests/devpi/`. The in-cluster Application, namespace, and PVC have been deleted.
- **Docs**: new how-to `docs/how-to/operations/devpi-on-indri.md`; `restart-indri.md` lists devpi in the LaunchAgent stop list.

## Already deployed (live on indri)

- Service running: `launchctl list mcquack.eblume.devpi` → PID 53888
- `curl https://pypi.ops.eblu.me/+api` returns 200
- `mise run docs-mikado` works again
- 1.0G of cached PyPI data was migrated from the PVC to `~erichblume/devpi/server-dir/`
- Minikube namespace and PVC fully reclaimed

## Test plan

- [ ] `mise run services-check` (after merge)
- [ ] CI workflows that use devpi succeed
- [ ] No regressions in tools that depend on `pypi.ops.eblu.me` (prek, uv-script tasks, dagger pipelines)

## Context

This is the C1 prelude to a planned C2 chain (`mikado/retire-minikube-indri`) to retire minikube on indri entirely. Doing devpi as a standalone C1 was the right call because (a) it was urgent — it was breaking the toolchain — and (b) it shakes out the migration recipe before we commit to a multi-leaf chain.

Reviewed-on: #341
Erich Blume 2026-04-29 13:38:36 -07:00
commit 14ca0160ba
24 changed files with 260 additions and 289 deletions

@@ -0,0 +1 @@
Migrated devpi (PyPI mirror at `pypi.ops.eblu.me`) from a minikube StatefulSet to a launchd-managed service on indri. devpi-server now runs in a uv-managed venv with pinned `devpi-server` and `devpi-web` versions, listens on `127.0.0.1:3141`, and is fronted by Caddy. The minikube StatefulSet was crash-looping under memory pressure (and breaking the Python toolchain everywhere); the new layout removes a layer of dependency on cluster health for critical-path tooling. See [[devpi-on-indri]].

@@ -0,0 +1,74 @@
---
title: Devpi on Indri
modified: 2026-04-29
last-reviewed: 2026-04-29
tags:
- how-to
- operations
---
# Devpi on Indri
How devpi (the PyPI caching mirror at `pypi.ops.eblu.me`) is deployed on indri as a launchd-managed native service. Replaces the prior minikube StatefulSet.
## Why native, not Kubernetes
Devpi has no runtime dependencies beyond a Python interpreter, a writable directory, and outbound HTTPS to upstream PyPI. Running it on indri natively removes a layer of operational complexity, frees minikube resources, and decouples this critical-path tooling (used by every Python build, including `mise run docs-mikado` itself) from cluster health.
## Layout
| Concern | Path / detail |
|---|---|
| Service binary | `/Users/erichblume/devpi/venv/bin/devpi-server` |
| Server-dir (data) | `/Users/erichblume/devpi/server-dir/` |
| Logs | `/Users/erichblume/Library/Logs/mcquack.devpi.{out,err}.log` |
| LaunchAgent label | `mcquack.eblume.devpi` |
| LaunchAgent plist | `~/Library/LaunchAgents/mcquack.eblume.devpi.plist` |
| Listen address | `127.0.0.1:3141` (loopback only) |
| Public URL | `https://pypi.ops.eblu.me` (via Caddy reverse proxy) |
| Root password secret | 1Password item `devpi`, field `root password` |
The venv is built by ansible from pinned `devpi-server` and `devpi-web` versions; bumping versions is a config change in `ansible/roles/devpi/defaults/main.yml`.
## Deploy
```fish
mise run provision-indri -- --tags devpi
```
Ansible will:
1. Fetch the root password from 1Password (in playbook `pre_tasks`)
2. Create the venv at `~/devpi/venv` if absent and install/upgrade `devpi-server` + `devpi-web` to the pinned versions
3. Initialize the server-dir (only on first run, when `.serverversion` is missing)
4. Render and load the LaunchAgent plist
5. Restart the service if the plist or config changed
Caddy already proxies `pypi.ops.eblu.me` → `127.0.0.1:3141`; nothing else routes traffic.
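The first-run guard in step 3 can be sketched in shell. This is only an illustration of the idea, not the role's actual implementation (which lives in ansible): the `needs_init` helper and the `DEVPI_SERVER_DIR` variable are hypothetical names introduced here, and the only fact taken from the steps above is that a missing `.serverversion` file marks an uninitialized server-dir.

```bash
#!/usr/bin/env bash
# Sketch of the "initialize only on first run" check from step 3.
# needs_init and DEVPI_SERVER_DIR are hypothetical names for illustration.

# Succeeds (exit 0) when the server-dir has not been initialized yet,
# i.e. the .serverversion marker file is missing.
needs_init() {
  local server_dir="$1"
  [ ! -f "${server_dir}/.serverversion" ]
}

# Report what a provisioning run would do, without changing anything.
server_dir="${DEVPI_SERVER_DIR:-$HOME/devpi/server-dir}"
if needs_init "$server_dir"; then
  echo "server-dir not initialized: ${server_dir}"
else
  echo "server-dir already initialized, skipping init"
fi
```

Keying the check on a marker file is what makes re-running the playbook safe: an existing server-dir, including its cached packages, is never re-initialized.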
## Verify
```fish
ssh indri 'launchctl list mcquack.eblume.devpi'
curl -fsS https://pypi.ops.eblu.me/+api | jq
uv pip install --index-url https://pypi.ops.eblu.me/root/pypi/+simple/ requests
```
## Logs
```fish
ssh indri 'tail -f ~/Library/Logs/mcquack.devpi.err.log'
```
## Bumping devpi versions
Edit `devpi_server_version` / `devpi_web_version` in `ansible/roles/devpi/defaults/main.yml`, then re-run the playbook with `--tags devpi`. The role rebuilds the venv in-place; the server-dir survives.
## Backup
The server-dir is **not** in `borgmatic_source_directories` and is not backed up. The PyPI cache (`+files/`) is re-fetchable from upstream on first request; the local `eblume/dev` index can be republished from source. If retention becomes important, add `/Users/erichblume/devpi/server-dir/` to the borgmatic source list.
## Related
- [[restart-indri]] — devpi is one of the LaunchAgents to stop on graceful shutdown
- [[connect-to-postgres]] — pattern for indri-native services (different stack, similar shape)

@@ -235,25 +235,7 @@ mise run services-check
## Post-Rebuild: Cold Cache Failures
### Devpi (PyPI Cache)
After a rebuild, devpi's package cache is empty. The first Dagger-based container build will trigger a flood of concurrent package downloads. Devpi uses lazy caching — it serves package metadata (simple index) immediately from upstream PyPI but fetches wheel files on demand. Under heavy concurrent load with a cold cache, the upstream fetch can race with the client request, causing devpi to return `no such file` (HTTP 404) for packages it knows about but hasn't finished downloading yet.
**Why devpi, not PyPI?** The repo's `uv.lock` was generated with devpi as the index, so every package source URL points at `pypi.ops.eblu.me`. Dagger's Python SDK runtime does a locked install (`uv sync`), not fresh resolution — it fetches from whatever URLs are in the lockfile. This is intentional (supply chain control), but means all builds — local and CI — depend on devpi being available and warm.
**Symptoms:** Forgejo Actions Dagger builds fail during module initialization with errors like:
```
Failed to download `googleapis-common-protos==1.74.0`
HTTP status client error (404 Not Found) for url (https://pypi.ops.eblu.me/root/pypi/+f/...)
```
**Fix:** Re-run the failed build. The first attempt warms the cache; subsequent builds succeed. Alternatively, warm the cache manually before triggering CI builds:
```bash
# From any machine that can reach pypi.ops.eblu.me, install the Dagger SDK
# to pre-populate the most common packages:
pip install --dry-run --index-url https://pypi.ops.eblu.me/root/pypi/+simple/ dagger-io
```
Devpi runs natively on indri (see [[devpi-on-indri]]) and is unaffected by minikube rebuilds, so the historical "devpi cold cache after rebuild" failure mode no longer applies. If devpi itself goes cold (fresh server-dir), the same lazy-cache race can still cause `404` on the first Dagger build under concurrent load — re-run the build to warm the cache, or pre-warm with `uv pip install --dry-run --index-url https://pypi.ops.eblu.me/root/pypi/+simple/ dagger-io`.
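The "re-run to warm the cache" advice can be automated with a small retry wrapper. The sketch below is an assumption about how one might do that, not something the repo ships: `retry` and `RETRY_DELAY` are hypothetical names, and the wrapper simply re-runs whatever command it is given.

```bash
#!/usr/bin/env bash
# Generic retry wrapper (hypothetical helper, not part of the repo).
# Usage: retry <max-attempts> <command> [args...]
retry() {
  local attempts="$1"; shift
  local delay="${RETRY_DELAY:-5}"   # seconds to wait between attempts
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "failed after ${n} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${n} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}

# Example: pre-warm the cache, tolerating a cold-cache 404 on early attempts.
# retry 3 uv pip install --dry-run \
#   --index-url https://pypi.ops.eblu.me/root/pypi/+simple/ dagger-io
```

A fixed delay is enough here because a failed attempt itself triggers the upstream fetch, so the next attempt usually finds the file cached.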
## Related

@@ -41,6 +41,7 @@ Native services managed by launchd will stop automatically during macOS shutdown
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.forgejo.plist'
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.caddy.plist'
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.zot.plist'
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.devpi.plist' # see [[devpi-on-indri]]
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.jellyfin.plist'
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.alloy.plist'
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.borgmatic.plist'

@@ -33,7 +33,7 @@ ACLs managed via Pulumi in `pulumi/tailscale/policy.hujson`.
| `tag:loki` | indri | Loki log aggregation |
| `tag:k8s-api` | indri | Kubernetes API server (minikube) |
| `tag:k8s-operator` | (operator pod) | Tailscale operator for k8s — see [[tailscale-operator]] |
| `tag:k8s` | (Ingress proxy pods) | Kubernetes Tailscale Ingress nodes; each also carries a per-service tag (`tag:grafana`, `tag:kiwix`, `tag:devpi`, `tag:feed`, `tag:pg`) |
| `tag:k8s` | (Ingress proxy pods) | Kubernetes Tailscale Ingress nodes; each also carries a per-service tag (`tag:grafana`, `tag:kiwix`, `tag:feed`, `tag:pg`) |
| `tag:ci-gateway` | (ephemeral CI containers) | CI containers pushing images to registry |
| `tag:flyio-proxy` | (Fly.io proxy container) | Public reverse proxy |
| `tag:flyio-target` | indri, designated Ingress endpoints | Endpoints reachable by the Fly.io proxy (indri for Caddy routing, Ingress pods for Alloy metrics/logs) |

@@ -1,7 +1,7 @@
---
title: Devpi
modified: 2026-03-23
last-reviewed: 2026-03-23
modified: 2026-04-29
last-reviewed: 2026-04-29
tags:
- service
- python
@@ -9,31 +9,37 @@ tags:
# devpi (PyPI Proxy)
PyPI caching proxy and private package index.
PyPI caching proxy and private package index. Runs natively on [[indri]] as a LaunchAgent (not in-cluster). See [[devpi-on-indri]] for deploy and operations.
## Quick Reference
| Property | Value |
|----------|-------|
| **URL** | https://pypi.ops.eblu.me |
| **Namespace** | `devpi` |
| **ArgoCD App** | `devpi` |
| **Storage** | 50Gi PVC |
| **Image** | `registry.ops.eblu.me/blumeops/devpi` (see `argocd/manifests/devpi/kustomization.yaml` for current tag) |
| **URL** | `https://pypi.ops.eblu.me` |
| **Listen** | `127.0.0.1:3141` (loopback only; reached via Caddy) |
| **Service** | LaunchAgent `mcquack.eblume.devpi` on indri |
| **Server-dir** | `/Users/erichblume/devpi/server-dir/` |
| **Runtime** | uv-managed venv at `/Users/erichblume/devpi/venv/` |
| **Ansible role** | `ansible/roles/devpi/` |
| **Versions** | Pinned in `ansible/roles/devpi/defaults/main.yml` (`devpi_server_version`, `devpi_web_version`) |
## Indices
| Index | Purpose |
|-------|---------|
| `root/pypi` | PyPI mirror/cache (auto-created) |
| `eblume/dev` | Private packages (inherits from root/pypi) |
| `root/pypi` | PyPI mirror/cache (auto-created by `devpi-init`) |
| `eblume/dev` | Private packages (inherits from `root/pypi`) |
## Credentials
Root password stored in 1Password (blumeops vault), injected via ExternalSecret.
Root password stored in 1Password (`blumeops` vault, item `devpi`, field `root password`). Fetched via `op read` in the `ansible/playbooks/indri.yml` `pre_tasks` and passed to the role on first init.
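For a manual lookup, the vault, item, and field named above map onto a 1Password secret reference of the form `op://<vault>/<item>/<field>`. The helper below is a hypothetical convenience for building that reference, not something the playbook defines; the exact reference the playbook passes to `op read` is an assumption.

```bash
#!/usr/bin/env bash
# Build a 1Password secret reference from vault, item, and field names.
# secret_ref is a hypothetical helper introduced for illustration.
secret_ref() {
  printf 'op://%s/%s/%s' "$1" "$2" "$3"
}

# Example (requires a signed-in `op` CLI; quote the reference because the
# field name contains a space):
# op read "$(secret_ref blumeops devpi 'root password')"
```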
## Backup
The server-dir is **not** backed up. The PyPI cache (`+files/`) is re-fetchable from upstream on first request. The local `eblume/dev` index metadata is small but also not critical to retain — packages can be republished from source. If retention becomes important, add `/Users/erichblume/devpi/server-dir/` to `borgmatic_source_directories`.
## Related
- [[use-pypi-proxy]] - Client configuration and package uploads
- [[argocd]] - Deployment
- [[1password]] - Secrets management
- [[devpi-on-indri]] — Deploy, verify, and version-bump procedures
- [[use-pypi-proxy]] — Client configuration and package uploads
- [[1password]] - Secrets management

@ -62,7 +62,7 @@ Other data lives directly on [[sifaka]] (music via [[navidrome]], video via [[je
| ZIM archives (`~/transmission/`) | Re-downloadable via torrent |
| Prometheus metrics | Ephemeral, in k8s PVC |
| Loki logs | Ephemeral, in k8s PVC |
| devpi cache | Re-fetchable from PyPI |
| devpi cache (`~/devpi/server-dir/` on indri) | Re-fetchable from PyPI on first request |
## Retention Policy