Migrate devpi from minikube to indri (launchd) (#341)
## Summary

Devpi was crash-looping under memory pressure on the minikube StatefulSet, breaking the Python toolchain across the repo (`mise run docs-mikado`, `prek`, every `uv pip install`). It moves to indri as a native LaunchAgent.

## What changed

- **New ansible role** `ansible/roles/devpi/`: installs `devpi-server` + `devpi-web` into a uv-managed venv, initializes the server-dir on first run via 1Password root password, runs as a LaunchAgent (`mcquack.eblume.devpi`) bound to `127.0.0.1:3141`. Bootstraps from upstream PyPI (so devpi can install itself on a fresh box).
- **Caddy**: `pypi.ops.eblu.me` now proxies to `http://localhost:3141`.
- **Playbook**: `indri.yml` gains pre_tasks for the root password and the new role.
- **service-versions.yaml**: devpi flipped from `type: argocd` to `type: ansible`.
- **ArgoCD**: removed `apps/devpi.yaml` and `manifests/devpi/`. The in-cluster Application, namespace, and PVC have been deleted.
- **Docs**: new how-to `docs/how-to/operations/devpi-on-indri.md`; `restart-indri.md` lists devpi in the LaunchAgent stop list.

## Already deployed (live on indri)

- Service running: `launchctl list mcquack.eblume.devpi` → PID 53888
- `curl https://pypi.ops.eblu.me/+api` returns 200 ✅
- `mise run docs-mikado` works again ✅
- 1.0G of cached PyPI data was migrated from the PVC to `~erichblume/devpi/server-dir/`
- Minikube namespace and PVC fully reclaimed

## Test plan

- [ ] `mise run services-check` (after merge)
- [ ] CI workflows that use devpi succeed
- [ ] No regressions in tools that depend on `pypi.ops.eblu.me` (prek, uv-script tasks, dagger pipelines)

## Context

This is the C1 prelude to a planned C2 chain (`mikado/retire-minikube-indri`) to retire minikube on indri entirely. Doing devpi as a standalone C1 was the right call because (a) it was urgent, since it was breaking the toolchain, and (b) it shakes out the migration recipe before we commit to a multi-leaf chain.

Reviewed-on: #341
parent
f4a24595b1
commit
14ca0160ba
24 changed files with 260 additions and 289 deletions
1
docs/changelog.d/migrate-devpi-to-indri.infra.md
Normal file
|
|
@ -0,0 +1 @@
|
|||
Migrated devpi (PyPI mirror at `pypi.ops.eblu.me`) from a minikube StatefulSet to a launchd-managed service on indri. devpi-server now runs in a uv-managed venv with pinned `devpi-server` and `devpi-web` versions, listens on `127.0.0.1:3141`, and is fronted by Caddy. The minikube StatefulSet was crash-looping under memory pressure (and breaking the Python toolchain everywhere); the new layout removes a layer of dependency on cluster health for critical-path tooling. See [[devpi-on-indri]].
|
||||
74
docs/how-to/operations/devpi-on-indri.md
Normal file
|
|
@ -0,0 +1,74 @@
|
|||
---
|
||||
title: Devpi on Indri
|
||||
modified: 2026-04-29
|
||||
last-reviewed: 2026-04-29
|
||||
tags:
|
||||
- how-to
|
||||
- operations
|
||||
---
|
||||
|
||||
# Devpi on Indri
|
||||
|
||||
How devpi (the PyPI caching mirror at `pypi.ops.eblu.me`) is deployed on indri as a launchd-managed native service. Replaces the prior minikube StatefulSet.
|
||||
|
||||
## Why native, not Kubernetes
|
||||
|
||||
Devpi has no runtime dependencies beyond a Python interpreter, a writable directory, and outbound HTTPS to upstream PyPI. Running it on indri natively removes a layer of operational complexity, frees minikube resources, and decouples this critical-path tooling (used by every Python build, including `mise run docs-mikado` itself) from cluster health.
|
||||
|
||||
## Layout
|
||||
|
||||
| Concern | Path / detail |
|
||||
|---|---|
|
||||
| Service binary | `/Users/erichblume/devpi/venv/bin/devpi-server` |
|
||||
| Server-dir (data) | `/Users/erichblume/devpi/server-dir/` |
|
||||
| Logs | `/Users/erichblume/Library/Logs/mcquack.devpi.{out,err}.log` |
|
||||
| LaunchAgent label | `mcquack.eblume.devpi` |
|
||||
| LaunchAgent plist | `~/Library/LaunchAgents/mcquack.eblume.devpi.plist` |
|
||||
| Listen address | `127.0.0.1:3141` (loopback only) |
|
||||
| Public URL | `https://pypi.ops.eblu.me` (via Caddy reverse proxy) |
|
||||
| Root password secret | 1Password item `devpi`, field `root password` |
|
||||
|
||||
The venv is built fresh by ansible from a pinned `devpi-server` and `devpi-web` version; bumping versions is a config change in `ansible/roles/devpi/defaults/main.yml`.
|
||||
|
||||
## Deploy
|
||||
|
||||
```fish
|
||||
mise run provision-indri -- --tags devpi
|
||||
```
|
||||
|
||||
Ansible will:
|
||||
|
||||
1. Fetch the root password from 1Password (in playbook `pre_tasks`)
|
||||
2. Create the venv at `~/devpi/venv` if absent and install/upgrade `devpi-server` + `devpi-web` to the pinned versions
|
||||
3. Initialize the server-dir (only on first run, when `.serverversion` is missing)
|
||||
4. Render and load the LaunchAgent plist
|
||||
5. Restart the service if the plist or config changed
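Step 3 is what makes repeated runs safe. A minimal sketch of that guard in shell, assuming the role's check amounts to testing for the marker file. `needs_init` and `init_server_dir` are hypothetical names, not taken from the role, and the `devpi-init` flags should be confirmed against the pinned version:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the ansible role): succeeds when the
# server-dir has not been initialised yet, i.e. `.serverversion` is missing.
needs_init() {
  local server_dir="$1"
  [ ! -e "${server_dir}/.serverversion" ]
}

# Hypothetical wrapper: run devpi-init only on first run.
# Assumes devpi-init is on PATH and accepts --serverdir / --root-passwd.
init_server_dir() {
  local server_dir="$1" root_password="$2"
  if needs_init "${server_dir}"; then
    devpi-init --serverdir "${server_dir}" --root-passwd "${root_password}"
  else
    echo "server-dir already initialised; skipping"
  fi
}
```

Because the marker file is the only state consulted, a second run against an initialised server-dir does nothing and never touches the cached data.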
|
||||
|
||||
Caddy already proxies `pypi.ops.eblu.me` → `127.0.0.1:3141`; nothing else routes traffic.
|
||||
|
||||
## Verify
|
||||
|
||||
```fish
|
||||
ssh indri 'launchctl list mcquack.eblume.devpi'
|
||||
curl -fsS https://pypi.ops.eblu.me/+api | jq
|
||||
uv pip install --index-url https://pypi.ops.eblu.me/root/pypi/+simple/ requests
|
||||
```
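The `curl` check above can be wrapped for use in scripts. This is a sketch only: `devpi_healthy` and `report_devpi` are hypothetical helper names, and the 10-second timeout is an assumed value rather than anything the repo defines.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when GET <base_url>/+api does not return
# an HTTP error. Defaults to the public URL from the Layout table.
devpi_healthy() {
  local base_url="${1:-https://pypi.ops.eblu.me}"
  # -f: exit non-zero on HTTP status >= 400; -sS: no progress bar, keep errors.
  curl -fsS -o /dev/null --max-time 10 "${base_url}/+api"
}

# Hypothetical wrapper that prints a one-line verdict and propagates failure.
report_devpi() {
  if devpi_healthy "$@"; then
    echo "devpi: ok"
  else
    echo "devpi: unreachable"
    return 1
  fi
}
```

Passing `http://127.0.0.1:3141` as the argument while logged in on indri would check the service directly, bypassing Caddy.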
|
||||
|
||||
## Logs
|
||||
|
||||
```fish
|
||||
ssh indri 'tail -f ~/Library/Logs/mcquack.devpi.err.log'
|
||||
```
|
||||
|
||||
## Bumping devpi versions
|
||||
|
||||
Edit `devpi_server_version` / `devpi_web_version` in `ansible/roles/devpi/defaults/main.yml`, then re-run the playbook with `--tags devpi`. The role rebuilds the venv in-place; the server-dir survives.
|
||||
|
||||
## Backup
|
||||
|
||||
The server-dir is **not** in `borgmatic_source_directories` and is not backed up. The PyPI cache (`+files/`) is re-fetchable from upstream on first request; the local `eblume/dev` index can be republished from source. If retention becomes important, add `/Users/erichblume/devpi/server-dir/` to the borgmatic source list.
|
||||
|
||||
## Related
|
||||
|
||||
- [[restart-indri]] — devpi is one of the LaunchAgents to stop on graceful shutdown
|
||||
- [[connect-to-postgres]] — pattern for indri-native services (different stack, similar shape)
|
||||
|
|
@ -235,25 +235,7 @@ mise run services-check
|
|||
|
||||
## Post-Rebuild: Cold Cache Failures
|
||||
|
||||
### Devpi (PyPI Cache)
|
||||
|
||||
After a rebuild, devpi's package cache is empty. The first Dagger-based container build will trigger a flood of concurrent package downloads. Devpi uses lazy caching — it serves package metadata (simple index) immediately from upstream PyPI but fetches wheel files on demand. Under heavy concurrent load with a cold cache, the upstream fetch can race with the client request, causing devpi to return `no such file` (HTTP 404) for packages it knows about but hasn't finished downloading yet.
|
||||
|
||||
**Why devpi, not PyPI?** The repo's `uv.lock` was generated with devpi as the index, so every package source URL points at `pypi.ops.eblu.me`. Dagger's Python SDK runtime does a locked install (`uv sync`), not fresh resolution — it fetches from whatever URLs are in the lockfile. This is intentional (supply chain control), but means all builds — local and CI — depend on devpi being available and warm.
|
||||
|
||||
**Symptoms:** Forgejo Actions Dagger builds fail during module initialization with errors like:
|
||||
```
|
||||
Failed to download `googleapis-common-protos==1.74.0`
|
||||
HTTP status client error (404 Not Found) for url (https://pypi.ops.eblu.me/root/pypi/+f/...)
|
||||
```
|
||||
|
||||
**Fix:** Re-run the failed build. The first attempt warms the cache; subsequent builds succeed. Alternatively, warm the cache manually before triggering CI builds:
|
||||
|
||||
```bash
|
||||
# From any machine that can reach pypi.ops.eblu.me, install the Dagger SDK
|
||||
# to pre-populate the most common packages:
|
||||
pip install --dry-run --index-url https://pypi.ops.eblu.me/root/pypi/+simple/ dagger-io
|
||||
```
|
||||
Devpi runs natively on indri (see [[devpi-on-indri]]) and is unaffected by minikube rebuilds, so the historical "devpi cold cache after rebuild" failure mode no longer applies. If devpi itself goes cold (fresh server-dir), the same lazy-cache race can still cause `404` on the first Dagger build under concurrent load — re-run the build to warm the cache, or pre-warm with `uv pip install --dry-run --index-url https://pypi.ops.eblu.me/root/pypi/+simple/ dagger-io`.
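The "re-run the build" advice can be automated with a small retry wrapper. This is a sketch, not something the repo ships: `retry` is a hypothetical helper, and the attempt count and delay are arbitrary values to tune.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command up to <max_attempts> times, sleeping
# <delay_seconds> between failed attempts. Returns 0 on the first success
# and 1 if every attempt fails.
retry() {
  local max_attempts="$1" delay_seconds="$2"
  shift 2
  local attempt=1
  while [ "${attempt}" -le "${max_attempts}" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt ${attempt}/${max_attempts} failed" >&2
    if [ "${attempt}" -lt "${max_attempts}" ]; then
      sleep "${delay_seconds}"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

For example, `retry 3 30 uv pip install --dry-run --index-url https://pypi.ops.eblu.me/root/pypi/+simple/ dagger-io` would repeat the pre-warm command up to three times, 30 seconds apart.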
|
||||
|
||||
## Related
|
||||
|
||||
|
|
|
|||
|
|
@ -41,6 +41,7 @@ Native services managed by launchd will stop automatically during macOS shutdown
|
|||
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.forgejo.plist'
|
||||
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.caddy.plist'
|
||||
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.zot.plist'
|
||||
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.devpi.plist' # see [[devpi-on-indri]]
|
||||
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.jellyfin.plist'
|
||||
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.alloy.plist'
|
||||
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.borgmatic.plist'
|
||||
|
|
|
|||
|
|
@ -33,7 +33,7 @@ ACLs managed via Pulumi in `pulumi/tailscale/policy.hujson`.
|
|||
| `tag:loki` | indri | Loki log aggregation |
|
||||
| `tag:k8s-api` | indri | Kubernetes API server (minikube) |
|
||||
| `tag:k8s-operator` | (operator pod) | Tailscale operator for k8s — see [[tailscale-operator]] |
|
||||
| `tag:k8s` | (Ingress proxy pods) | Kubernetes Tailscale Ingress nodes; each also carries a per-service tag (`tag:grafana`, `tag:kiwix`, `tag:devpi`, `tag:feed`, `tag:pg`) |
|
||||
| `tag:k8s` | (Ingress proxy pods) | Kubernetes Tailscale Ingress nodes; each also carries a per-service tag (`tag:grafana`, `tag:kiwix`, `tag:feed`, `tag:pg`) |
|
||||
| `tag:ci-gateway` | (ephemeral CI containers) | CI containers pushing images to registry |
|
||||
| `tag:flyio-proxy` | (Fly.io proxy container) | Public reverse proxy |
|
||||
| `tag:flyio-target` | indri, designated Ingress endpoints | Endpoints reachable by the Fly.io proxy (indri for Caddy routing, Ingress pods for Alloy metrics/logs) |
|
||||
|
|
|
|||
|
|
@ -1,7 +1,7 @@
|
|||
---
|
||||
title: Devpi
|
||||
modified: 2026-03-23
|
||||
last-reviewed: 2026-03-23
|
||||
modified: 2026-04-29
|
||||
last-reviewed: 2026-04-29
|
||||
tags:
|
||||
- service
|
||||
- python
|
||||
|
|
@ -9,31 +9,37 @@ tags:
|
|||
|
||||
# devpi (PyPI Proxy)
|
||||
|
||||
PyPI caching proxy and private package index.
|
||||
PyPI caching proxy and private package index. Runs natively on [[indri]] as a LaunchAgent (not in-cluster). See [[devpi-on-indri]] for deploy and operations.
|
||||
|
||||
## Quick Reference
|
||||
|
||||
| Property | Value |
|
||||
|----------|-------|
|
||||
| **URL** | https://pypi.ops.eblu.me |
|
||||
| **Namespace** | `devpi` |
|
||||
| **ArgoCD App** | `devpi` |
|
||||
| **Storage** | 50Gi PVC |
|
||||
| **Image** | `registry.ops.eblu.me/blumeops/devpi` (see `argocd/manifests/devpi/kustomization.yaml` for current tag) |
|
||||
| **URL** | `https://pypi.ops.eblu.me` |
|
||||
| **Listen** | `127.0.0.1:3141` (loopback only; reached via Caddy) |
|
||||
| **Service** | LaunchAgent `mcquack.eblume.devpi` on indri |
|
||||
| **Server-dir** | `/Users/erichblume/devpi/server-dir/` |
|
||||
| **Runtime** | uv-managed venv at `/Users/erichblume/devpi/venv/` |
|
||||
| **Ansible role** | `ansible/roles/devpi/` |
|
||||
| **Versions** | Pinned in `ansible/roles/devpi/defaults/main.yml` (`devpi_server_version`, `devpi_web_version`) |
|
||||
|
||||
## Indices
|
||||
|
||||
| Index | Purpose |
|
||||
|-------|---------|
|
||||
| `root/pypi` | PyPI mirror/cache (auto-created) |
|
||||
| `eblume/dev` | Private packages (inherits from root/pypi) |
|
||||
| `root/pypi` | PyPI mirror/cache (auto-created by `devpi-init`) |
|
||||
| `eblume/dev` | Private packages (inherits from `root/pypi`) |
|
||||
|
||||
## Credentials
|
||||
|
||||
Root password stored in 1Password (blumeops vault), injected via ExternalSecret.
|
||||
Root password stored in 1Password (`blumeops` vault, item `devpi`, field `root password`). Fetched via `op read` in the `ansible/playbooks/indri.yml` `pre_tasks` and passed to the role on first init.
|
||||
|
||||
## Backup
|
||||
|
||||
The server-dir is **not** backed up. The PyPI cache (`+files/`) is re-fetchable from upstream on first request. The local `eblume/dev` index metadata is small but also not critical to retain — packages can be republished from source. If retention becomes important, add `/Users/erichblume/devpi/server-dir/` to `borgmatic_source_directories`.
|
||||
|
||||
## Related
|
||||
|
||||
- [[use-pypi-proxy]] - Client configuration and package uploads
|
||||
- [[argocd]] - Deployment
|
||||
- [[1password]] - Secrets management
|
||||
- [[devpi-on-indri]] — Deploy, verify, and version-bump procedures
|
||||
- [[use-pypi-proxy]] — Client configuration and package uploads
|
||||
- [[1password]] — Secrets management
|
||||
|
|
|
|||
|
|
@ -62,7 +62,7 @@ Other data lives directly on [[sifaka]] (music via [[navidrome]], video via [[je
|
|||
| ZIM archives (`~/transmission/`) | Re-downloadable via torrent |
|
||||
| Prometheus metrics | Ephemeral, in k8s PVC |
|
||||
| Loki logs | Ephemeral, in k8s PVC |
|
||||
| devpi cache | Re-fetchable from PyPI |
|
||||
| devpi cache (`~/devpi/server-dir/` on indri) | Re-fetchable from PyPI on first request |
|
||||
|
||||
## Retention Policy
|
||||
|
||||
|
|
|
|||