P5: Migrate devpi to Kubernetes #34

Merged
eblume merged 12 commits from feature/p5-devpi into main 2026-01-20 14:55:37 -08:00
18 changed files with 474 additions and 75 deletions


@ -33,35 +33,105 @@ The user will review your work as you go, and will merge the pr as the last step
4. Use `Brewfile` and `mise.toml` to install tools needed on the development workstation (typically hostnamed "gilbert", username "eblume").
5. Services are typically hosted on hostname "indri" and are launched from LaunchAgents of the user `erichblume`. If a service is available from `brew services` that is typically used, otherwise there is a utility called `mcquack` (`mcquack --help`) hosted at `https://forge.tail8d86e.ts.net/eblume/mcquack` - but you can just edit the mcquack launchagents directly via ansible.
5. Services are hosted either on indri directly (via ansible) or in Kubernetes (via ArgoCD). See the "Service Deployment" section below for details.
6. Try to always test changes before applying them. Use syntax checkers, do dry runs (`--check --diff`), run commands manually via `ssh indri 'some command'`, etc.
7. **Wait for user review before deploying.** After creating a PR, do not run `mise run provision-indri` or other deployment commands until the user has had a chance to review the changes. The user will indicate when they're ready to deploy.
7. **Wait for user review before deploying.** After creating a PR, do not run deployment commands until the user has had a chance to review the changes. The user will indicate when they're ready to deploy.
8. After deploying changes, try to verify the result. Use `mise run indri-services-check` to do a general service health check.
## Project structure
Some important places you can look:
## Project Structure
```
./mise-tasks/ # management and utility scripts run via `mise run`
./ansible/playbooks/indri.yml # primary blumeops provisioning script
./ansible/roles/ # role dirs here give good overview of services
./pulumi/ # python (via uv) pulumi script for provisioning the tailnet and other cloud resources
~/code/personal/ # projects managed by the user
~/code/3rd/ # external projects, mirrored or downloaded
~/code/work # FORBIDDEN, never go here, avoid searching it
./mise-tasks/ # management and utility scripts run via `mise run`
./ansible/playbooks/ # ansible playbooks (indri.yml is primary)
./ansible/roles/ # ansible roles for indri-hosted services
./argocd/apps/ # ArgoCD Application definitions (app-of-apps pattern)
./argocd/manifests/ # Kubernetes manifests for each service
./pulumi/ # Pulumi IaC for tailnet ACLs and cloud resources
./plans/ # Migration and project planning documents
~/code/personal/ # projects managed by the user
~/code/3rd/ # external projects, mirrored or downloaded
~/code/work # FORBIDDEN, never go here, avoid searching it
```
## Service Deployment
### Kubernetes Services (via ArgoCD)
Most services are migrating to Kubernetes. These are managed via ArgoCD using the app-of-apps pattern:
- **Application definitions**: `argocd/apps/<service>.yaml`
- **Manifests**: `argocd/manifests/<service>/`
- **Sync policy**: Manual sync (no auto-sync on git push)
**PR workflow for k8s services:**
1. Create feature branch and add/modify manifests
2. Push branch to forge
3. Sync the `apps` application to pick up new Application definitions:
```fish
argocd app sync apps
```
4. Point the service app at the feature branch for testing:
```fish
argocd app set <service> --revision feature/branch-name
argocd app sync <service>
```
5. Test the deployment
6. After PR merge, reset to main and resync:
```fish
argocd app set <service> --revision main
argocd app sync <service>
```
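The branch-testing steps above can be sketched as one small bash helper. This is only an illustration, not part of the repository: the function name `argocd_point_at` and the `ARGOCD_RUN` switch are hypothetical, and by default the helper just prints the `argocd` commands so they can be reviewed first.

```bash
# Hypothetical helper: point a service app at a revision and sync it.
# Prints the commands by default; set ARGOCD_RUN=1 to actually execute them.
argocd_point_at() {
  local service="$1" revision="$2"
  local prefix="echo"
  if [ "${ARGOCD_RUN:-0}" = "1" ]; then
    prefix=""
  fi
  # Pick up new or changed Application definitions first.
  $prefix argocd app sync apps
  # Then move the service app to the requested revision and sync it.
  $prefix argocd app set "$service" --revision "$revision"
  $prefix argocd app sync "$service"
}
```

For example, `argocd_point_at devpi feature/p5-devpi` previews the commands for testing a branch, and `ARGOCD_RUN=1 argocd_point_at devpi main` would reset the app after the merge.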
**Useful commands:**
```fish
argocd app list # List all apps
argocd app get <app> # Get app details
argocd app diff <app> # Preview changes before sync
argocd app sync <app> # Sync an app
kubectl --context=minikube-indri get pods -n <namespace> # Check pods
kubectl --context=minikube-indri logs -n <namespace> <pod> # View logs
```
Note: The user has fish abbreviations `ki` for `kubectl --context=minikube-indri` and `k9i` for `k9s --context=minikube-indri`, but these only work in interactive shells.
### Indri Services (via Ansible)
Some services remain on indri outside of Kubernetes:
- **Zot Registry** - Container registry (k8s depends on it)
- **Prometheus/Loki** - Observability (must survive k8s failures)
- **Borgmatic** - Backup system
- **Grafana Alloy** - Metrics/logs collector
- **Transmission** - BitTorrent for kiwix downloads
**Deployment:**
```fish
mise run provision-indri # Full playbook
mise run provision-indri -- --tags <role> # Specific role
mise run provision-indri -- --check --diff # Dry run
```
### Tailscale Service Hostnames
When migrating a service from indri to k8s, the Tailscale hostname must be freed:
1. Stop the service on indri
2. Clear the tailscale serve entry: `ssh indri 'tailscale serve clear svc:<name>'`
3. Delete the device from Tailscale admin console (user action required)
4. Deploy the k8s Ingress - it will claim the hostname
Use `ssh indri 'tailscale serve status --json'` to check current serve entries (the non-JSON output may be empty even when entries exist).
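A rough way to confirm that a hostname has been freed is to search the JSON output for the service name. The sketch below assumes the name appears in that JSON as a quoted string such as `"svc:pypi"`; this has not been checked against the Tailscale output format, and `serve_entry_present` is a hypothetical helper name.

```bash
# Hypothetical helper: read `tailscale serve status --json` output on stdin
# and succeed if the given service name appears in it as a quoted string.
# Assumption: entries show up as "svc:<name>" somewhere in the JSON.
serve_entry_present() {
  local name="$1"
  grep -Fq "\"svc:${name}\""
}
```

Usage might look like `ssh indri 'tailscale serve status --json' | serve_entry_present pypi && echo "svc:pypi is still claimed"`.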
## Third-Party Projects
When a task requires cloning or using a third-party git repository (e.g., for building from source), **ask the user to mirror it on forge first**, then clone from the mirror:
- Mirror location: `https://forge.tail8d86e.ts.net/eblume/<project>.git`
- Clone to: `~/code/3rd/<project>/`
This avoids external dependencies and ensures the project is available even if the upstream is unreachable. Example mirrors:
- `https://forge.tail8d86e.ts.net/eblume/zot.git` (container registry)
- `https://forge.tail8d86e.ts.net/eblume/devpi.git` (PyPI proxy)
This avoids external dependencies and ensures the project is available even if the upstream is unreachable.
## Task Discovery


@ -42,10 +42,7 @@
tags: borgmatic_metrics
- role: forgejo
tags: forgejo
- role: devpi
tags: devpi
- role: devpi_metrics
tags: devpi_metrics
# NOTE: devpi and devpi_metrics roles removed - now hosted in k8s (see argocd/apps/devpi.yaml)
- role: zot
tags: zot
- role: zot_metrics


@ -43,12 +43,7 @@ alloy_brew_logs:
# NOTE: postgresql and miniflux removed - now hosted in k8s
alloy_mcquack_logs:
- path: /Users/erichblume/Library/Logs/mcquack.devpi.out.log
service: devpi
stream: stdout
- path: /Users/erichblume/Library/Logs/mcquack.devpi.err.log
service: devpi
stream: stderr
# NOTE: devpi logs removed - now hosted in k8s
- path: /Users/erichblume/Library/Logs/mcquack.kiwix-serve.out.log
service: kiwix
stream: stdout


@ -11,13 +11,13 @@ borgmatic_schedule_hour: 2
borgmatic_schedule_minute: 0
# Source directories to back up
# NOTE: devpi removed - now hosted in k8s (PVC handles persistence)
borgmatic_source_directories:
- /Users/erichblume/code/personal/zk
- /opt/homebrew/var/forgejo
- /Users/erichblume/.config/borgmatic
- /Users/erichblume/Documents
- /Users/erichblume/Pictures
- /Users/erichblume/devpi
- /opt/homebrew/var/loki
# Backup repository
@ -28,9 +28,7 @@ borgmatic_repositories:
append_only: true
# Exclude patterns
borgmatic_exclude_patterns:
# Exclude mirrored PyPI cache (only backup private packages)
- /Users/erichblume/devpi/+files/root/pypi
borgmatic_exclude_patterns: []
# Encryption passcommand (reads borg passphrase)
borgmatic_encryption_passcommand: cat /Users/erichblume/.borg/config.yaml


@ -2,6 +2,22 @@
# Uses host.containers.internal which is stable across restarts
# Applied by ansible minikube role
# Direct access to Zot for private images (blumeops/*)
[[registry]]
prefix = "host.containers.internal:5050"
location = "host.containers.internal:5050"
insecure = true
# Tailscale hostname for Zot - redirects to local access
# Allows manifests to use registry.tail8d86e.ts.net which is cleaner
[[registry]]
prefix = "registry.tail8d86e.ts.net"
location = "registry.tail8d86e.ts.net"
[[registry.mirror]]
location = "host.containers.internal:5050"
insecure = true
[[registry]]
prefix = "docker.io"
location = "docker.io"


@ -3,7 +3,7 @@
# Each service maps a Tailscale service name to local endpoints
tailscale_serve_services:
# NOTE: svc:grafana, svc:pg, svc:feed removed - now hosted in k8s
# NOTE: svc:grafana, svc:pg, svc:feed, svc:pypi removed - now hosted in k8s
- name: svc:forge
https:
@ -18,11 +18,6 @@ tailscale_serve_services:
port: 443
upstream: http://localhost:5501
- name: svc:pypi
https:
port: 443
upstream: http://127.0.0.1:3141
- name: svc:registry
https:
port: 443


@ -15,9 +15,7 @@ spec:
server: https://kubernetes.default.svc
namespace: argocd
syncPolicy:
automated:
prune: true
# selfHeal disabled: allows manual revision changes on child apps during development
# Sync apps app manually when adding/removing Application manifests
syncOptions:
- CreateNamespace=true
# Manual sync only - no automated sync on git push
# To pick up new apps: argocd app sync apps

argocd/apps/devpi.yaml

@ -0,0 +1,30 @@
# devpi PyPI Caching Proxy
# Provides PyPI cache and private package hosting
#
# After first deployment, initialize devpi:
# kubectl -n devpi exec -it devpi-0 -- devpi-init --serverdir /devpi --root-passwd <password>
# kubectl -n devpi rollout restart statefulset devpi
#
# Then create user/index:
# uvx devpi use https://pypi.tail8d86e.ts.net
# uvx devpi login root
# uvx devpi user -c eblume email=blume.erich@gmail.com
# uvx devpi index -c eblume/dev bases=root/pypi
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: devpi
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/devpi
  destination:
    server: https://kubernetes.default.svc
    namespace: devpi
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
    # Manual sync only - no automated sync on git push


@ -0,0 +1,19 @@
FROM python:3.12-slim
# Install devpi-server and devpi-web
RUN pip install --no-cache-dir devpi-server devpi-web
# Create non-root user
RUN useradd -r -u 1000 devpi && mkdir -p /devpi && chown devpi:devpi /devpi
# Add startup script
COPY --chown=devpi:devpi start.sh /usr/local/bin/start.sh
RUN chmod +x /usr/local/bin/start.sh
USER devpi
WORKDIR /devpi
# Expose default port
EXPOSE 3141
ENTRYPOINT ["/usr/local/bin/start.sh"]


@ -0,0 +1,72 @@
# devpi PyPI Caching Proxy
devpi-server running in Kubernetes, providing:
- PyPI caching proxy at `root/pypi`
- Private package hosting at `eblume/dev`
## Setup
### 1. Create the root password secret
```fish
kubectl create namespace devpi
op inject -i argocd/manifests/devpi/secret-root.yaml.tpl | kubectl apply -f -
```
### 2. Deploy via ArgoCD
```fish
argocd app sync apps
argocd app sync devpi
```
The container will auto-initialize on first startup using the root password from the secret.
### 3. Create user and index (first time only)
After the pod is running:
```fish
# Login to devpi as root
uvx --from devpi-client devpi use https://pypi.tail8d86e.ts.net
uvx --from devpi-client devpi login root
# Enter root password when prompted
# Create eblume user (prompts for password - use the one from 1Password)
uvx --from devpi-client devpi user -c eblume email=blume.erich@gmail.com
# Create private index inheriting from PyPI
uvx --from devpi-client devpi index -c eblume/dev bases=root/pypi
```
## Usage
### As pip index (caching proxy)
Configure `~/.config/pip/pip.conf`:
```ini
[global]
index-url = https://pypi.tail8d86e.ts.net/root/pypi/+simple/
trusted-host = pypi.tail8d86e.ts.net
```
### Upload private packages
```fish
cd ~/code/personal/your-package
uv build
uv publish --publish-url https://pypi.tail8d86e.ts.net/eblume/dev/
```
## URLs
- Web UI: https://pypi.tail8d86e.ts.net
- PyPI cache: https://pypi.tail8d86e.ts.net/root/pypi/+simple/
- Private index: https://pypi.tail8d86e.ts.net/eblume/dev/+simple/
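Both index URLs above follow the same `<base>/<user>/<index>/+simple/` pattern, so a small helper can build them. This is a minimal sketch: the function name `devpi_simple_url` and the `DEVPI_BASE_URL` override are hypothetical and are not used anywhere else in this repository.

```bash
# Hypothetical helper: build the "+simple" URL for a devpi index.
# DEVPI_BASE_URL is an illustrative override, not an existing setting.
devpi_simple_url() {
  local index="${1:-root/pypi}"
  local base="${DEVPI_BASE_URL:-https://pypi.tail8d86e.ts.net}"
  # Strip any trailing slash from the base so the result has no "//".
  printf '%s/%s/+simple/\n' "${base%/}" "$index"
}
```

A one-off install from the private index could then be written as `pip install --index-url "$(devpi_simple_url eblume/dev)" mcquack`.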
## Credentials
Stored in 1Password vault `blumeops`, item `kyhzfifryqnuk7jeyibmmjvxxm`:
- `root password` - devpi root user
- `password` - eblume user password


@ -0,0 +1,17 @@
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: devpi-tailscale
  namespace: devpi
  annotations:
    tailscale.com/proxy-class: "crio-compat"
spec:
  ingressClassName: tailscale
  defaultBackend:
    service:
      name: devpi
      port:
        number: 3141
  tls:
    - hosts:
        - pypi


@ -0,0 +1,9 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: devpi
resources:
  - statefulset.yaml
  - service.yaml
  - ingress-tailscale.yaml


@ -0,0 +1,12 @@
# Template for devpi root password secret
# Create the secret before deploying:
# kubectl create namespace devpi
# op inject -i argocd/manifests/devpi/secret-root.yaml.tpl | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: devpi-root
  namespace: devpi
type: Opaque
stringData:
  password: "{{ op://vg6xf6vvfmoh5hqjjhlhbeoaie/kyhzfifryqnuk7jeyibmmjvxxm/root password }}"


@ -0,0 +1,13 @@
apiVersion: v1
kind: Service
metadata:
  name: devpi
  namespace: devpi
spec:
  selector:
    app: devpi
  ports:
    - name: http
      port: 3141
      targetPort: 3141
      protocol: TCP


@ -0,0 +1,31 @@
#!/bin/bash
set -e

SERVERDIR="${DEVPI_SERVERDIR:-/devpi}"
HOST="${DEVPI_HOST:-0.0.0.0}"
# Note: Can't use DEVPI_PORT - Kubernetes auto-sets it for service discovery
PORT="${DEVPI_LISTEN_PORT:-3141}"
OUTSIDE_URL="${DEVPI_OUTSIDE_URL:-}"

# Check if devpi is initialized
if [ ! -f "$SERVERDIR/.serverversion" ]; then
    echo "Initializing devpi server..."
    if [ -z "$DEVPI_ROOT_PASSWORD" ]; then
        echo "ERROR: DEVPI_ROOT_PASSWORD environment variable must be set for initialization"
        exit 1
    fi
    devpi-init --serverdir "$SERVERDIR" --root-passwd "$DEVPI_ROOT_PASSWORD"
    echo "Devpi initialized successfully"
fi

# Build the command as an array so each argument stays a single word,
# even if a value contains spaces
CMD=(devpi-server --serverdir "$SERVERDIR" --host "$HOST" --port "$PORT")
if [ -n "$OUTSIDE_URL" ]; then
    CMD+=(--outside-url "$OUTSIDE_URL")
fi

echo "Starting devpi-server..."
exec "${CMD[@]}"


@ -0,0 +1,62 @@
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: devpi
  namespace: devpi
spec:
  serviceName: devpi
  replicas: 1
  selector:
    matchLabels:
      app: devpi
  template:
    metadata:
      labels:
        app: devpi
    spec:
      securityContext:
        fsGroup: 1000
      containers:
        - name: devpi
          image: registry.tail8d86e.ts.net/blumeops/devpi:latest
          env:
            - name: DEVPI_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: devpi-root
                  key: password
            - name: DEVPI_OUTSIDE_URL
              value: "https://pypi.tail8d86e.ts.net"
          ports:
            - containerPort: 3141
              name: http
          volumeMounts:
            - name: data
              mountPath: /devpi
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "2Gi" # High limit for initial PyPI index build, reclaimed after
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /+api
              port: 3141
            initialDelaySeconds: 30
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /+api
              port: 3141
            initialDelaySeconds: 10
            periodSeconds: 10
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 50Gi


@ -0,0 +1,102 @@
# Phase 5: devpi Migration to Kubernetes
**Goal**: Migrate devpi PyPI caching proxy from indri to k8s
**Status**: Complete (2026-01-20)
**Prerequisites**: [Phase 4](P4_miniflux.complete.md) complete
---
## Summary
Successfully migrated devpi from mcquack LaunchAgent on indri to Kubernetes:
- Custom container image with devpi-server + devpi-web + auto-init startup script
- StatefulSet with 50Gi PVC for data persistence
- Tailscale Ingress at `pypi.tail8d86e.ts.net`
- Root password from 1Password secret, auto-initialized on first run
- Verified pip caching proxy and mcquack package upload
---
## Key Learnings
### Registry Mirror Configuration
- Minikube's CRI-O can't resolve Tailscale hostnames directly
- Added registry mirror config to redirect `registry.tail8d86e.ts.net` to `host.containers.internal:5050`
- Also added direct insecure registry entry for `host.containers.internal:5050`
- Config in `ansible/roles/minikube/files/zot-mirror.conf`
### Memory Requirements
- devpi-web's Whoosh search indexer needs significant memory during PyPI index build
- Initial 512Mi limit caused OOMKills
- Solution: High limit (2Gi) with low request (256Mi) - memory reclaimed after indexing
### Environment Variable Conflicts
- Kubernetes auto-sets `DEVPI_PORT` for service discovery
- Conflicted with our port config - renamed to `DEVPI_LISTEN_PORT`
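The conflict can be reproduced outside the cluster. For a Service named `devpi`, Kubernetes injects Docker-link-style variables into pods, including `DEVPI_PORT` with a value of the form `tcp://<cluster-ip>:<port>`. The sketch below simulates that with a made-up ClusterIP; the helper names `resolve_listen_port` and `is_valid_port` are hypothetical and exist only for this illustration.

```bash
# Simulate the variable Kubernetes injects for a Service named "devpi".
# The IP address here is an invented example value.
DEVPI_PORT="tcp://10.96.0.10:3141"

# Hypothetical helper mirroring the PORT line in start.sh: it reads the
# renamed variable and never looks at DEVPI_PORT.
resolve_listen_port() {
  echo "${DEVPI_LISTEN_PORT:-3141}"
}

# Hypothetical check: a usable port is a non-empty string of digits.
is_valid_port() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# The injected value is a URL, not a port number, so it fails the check.
is_valid_port "$DEVPI_PORT" || echo "DEVPI_PORT is not a port number: $DEVPI_PORT"
# The renamed variable (or its default) stays a plain port number.
is_valid_port "$(resolve_listen_port)" && echo "listening on $(resolve_listen_port)"
```

Reading the port from a name Kubernetes does not generate keeps the startup script independent of whatever Services exist in the namespace.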
### Tailscale Serve Cleanup
- Use `tailscale serve status --json` to see entries (non-JSON output can be empty)
- Use `tailscale serve clear svc:<name>` to remove entries
### ArgoCD Workflow
- Changed `apps` to manual sync (was auto-sync with prune)
- Workflow: sync apps → set revision to feature branch → sync service → test → reset to main after merge
---
## Verification Checklist
- [x] devpi pod healthy in k8s
- [x] https://pypi.tail8d86e.ts.net accessible
- [x] Web interface shows root/pypi index
- [x] `pip install <package>` works through proxy
- [x] mcquack v1.0.0 uploaded to eblume/dev
- [x] `pip install --index-url https://pypi.tail8d86e.ts.net/eblume/dev/+simple/ mcquack` works
- [x] Old devpi service removed from indri
- [ ] zk documentation updated (deferred - no existing devpi card)
---
## Files Changed
### New Files
| Path | Purpose |
|------|---------|
| `argocd/apps/devpi.yaml` | ArgoCD Application definition |
| `argocd/manifests/devpi/Dockerfile` | Container image with startup script |
| `argocd/manifests/devpi/start.sh` | Auto-init startup script |
| `argocd/manifests/devpi/statefulset.yaml` | StatefulSet with PVC |
| `argocd/manifests/devpi/service.yaml` | ClusterIP Service |
| `argocd/manifests/devpi/ingress-tailscale.yaml` | Tailscale Ingress |
| `argocd/manifests/devpi/kustomization.yaml` | Kustomize configuration |
| `argocd/manifests/devpi/secret-root.yaml.tpl` | 1Password secret template |
| `argocd/manifests/devpi/README.md` | Setup documentation |
### Modified Files
| Path | Change |
|------|--------|
| `CLAUDE.md` | Added k8s/ArgoCD workflow documentation |
| `ansible/playbooks/indri.yml` | Removed devpi and devpi_metrics roles |
| `ansible/roles/tailscale_serve/defaults/main.yml` | Removed svc:pypi |
| `ansible/roles/alloy/defaults/main.yml` | Removed devpi log collection |
| `ansible/roles/borgmatic/defaults/main.yml` | Removed devpi backup paths |
| `ansible/roles/minikube/files/zot-mirror.conf` | Added registry mirror for Tailscale hostname |
| `argocd/apps/apps.yaml` | Changed to manual sync policy |
### Roles Kept (not deleted)
- `ansible/roles/devpi/` - Kept for reference
- `ansible/roles/devpi_metrics/` - Kept for reference
---
## Post-Merge Cleanup
After PR merge, reset ArgoCD apps to main:
```fish
argocd app set apps --revision main
argocd app sync apps
argocd app set devpi --revision main
argocd app sync devpi
```


@ -1,37 +0,0 @@
# Phase 5: devpi Migration
**Goal**: Migrate devpi to k8s
**Status**: Pending
**Prerequisites**: [Phase 4](P4_miniflux.md) complete
---
## Steps
### 1. Build devpi container
- Dockerfile with devpi-server + devpi-web
- Push to local Zot registry
---
### 2. Deploy as StatefulSet
- PVC for data (50Gi)
- Migrate existing data (excluding PyPI cache)
---
### 3. Configure Tailscale LoadBalancer
Tag: `svc:pypi`
---
### 4. Update pip.conf on gilbert
---
### 5. Stop mcquack devpi