## Summary
C2 Mikado chain to move the entire Immich stack (server, ML, valkey,
postgres) off `minikube-indri` and onto `k3s-ringtail`. Immich is the
largest single tenant on minikube (~1.5 GiB resident) and minikube is
currently memory-saturated (97% RAM, swapping). This is the first
concrete chain in the broader indri-k8s decommission effort.
This PR contains the planning layer only — 7 cards (1 goal + 6
prerequisites). Implementation cycles follow per the Mikado Branch
Invariant.
## Goal end-state
- Immich `server`, `machine-learning`, `valkey` on ringtail.
- ML pod uses ringtail's RTX 4080 (performance win — currently
CPU-only).
- CNPG `immich-pg` (PG17 + VectorChord) runs on ringtail.
- Library still on sifaka NFS — ringtail mounts the same path.
- `photos.ops.eblu.me` reroutes through Caddy → ringtail ingress.
- Minikube `immich` and `immich-pg` are removed.
## Cards
| Card | Depends on |
|---|---|
| `migrate-immich-to-ringtail` (goal) | all six below |
| `cnpg-on-ringtail` | — |
| `immich-pg-on-ringtail` | cnpg-on-ringtail |
| `immich-pg-data-migration` | immich-pg-on-ringtail |
| `sifaka-nfs-from-ringtail` | — |
| `immich-app-on-ringtail` | immich-pg-on-ringtail, sifaka-nfs-from-ringtail |
| `immich-cutover-and-decommission` | immich-pg-data-migration, immich-app-on-ringtail |
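The dependency table above can be sanity-checked by feeding it to `tsort`, which prints one valid implementation order and reports a loop if the graph has a cycle. This is a minimal sketch: the `card_edges` and `card_order` helpers are hypothetical and not part of the repo's `docs-mikado` tooling; the edges are copied from the table.

```shell
# Each line is "<prerequisite> <dependent>", copied from the Cards table.
card_edges() {
  cat <<'EOF'
cnpg-on-ringtail immich-pg-on-ringtail
immich-pg-on-ringtail immich-pg-data-migration
immich-pg-on-ringtail immich-app-on-ringtail
sifaka-nfs-from-ringtail immich-app-on-ringtail
immich-pg-data-migration immich-cutover-and-decommission
immich-app-on-ringtail immich-cutover-and-decommission
cnpg-on-ringtail migrate-immich-to-ringtail
immich-pg-on-ringtail migrate-immich-to-ringtail
immich-pg-data-migration migrate-immich-to-ringtail
sifaka-nfs-from-ringtail migrate-immich-to-ringtail
immich-app-on-ringtail migrate-immich-to-ringtail
immich-cutover-and-decommission migrate-immich-to-ringtail
EOF
}

# Print the cards in an order that respects every dependency.
# tsort exits non-zero if the edges contain a cycle.
card_order() {
  card_edges | tsort
}
```

Running `card_order` lists all seven cards. The two leaf cards (`cnpg-on-ringtail`, `sifaka-nfs-from-ringtail`) may appear in either order, but the cutover card always comes sixth and the goal card last.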
## Key constraints
- **No data loss.** Downtime is acceptable; data loss is not. Two
surfaces matter: postgres (ML embeddings, face data — slow to
re-derive) and the library files (don't move, but NFS access from
ringtail must be verified).
- **Migration method:** Option A is a CNPG `externalCluster`
basebackup → promote. Option B is `pg_dump`/`pg_restore` as a
documented fallback. Either way, dry-run against a scratch
cluster first.
- **Why pg moves too** (not cross-cluster): keeping pg on minikube
would block the whole decommission, and Immich is chatty with pg
so tailnet round-trips would hurt.
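The Option B fallback could be scripted roughly as follows. This is a minimal sketch, not the chain's actual procedure: the `run`, `dump_db`, and `restore_db` helpers, the `DRY_RUN` switch, and the connection URIs are hypothetical, and only `pg_dump` and `pg_restore` with the flags shown are standard PostgreSQL tooling. It assumes the target database already exists and that the target image provides the same extensions (VectorChord) as the source.

```shell
# With DRY_RUN=1, print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# dump_db <source-connection-uri> <dump-file>
# Writes a custom-format archive, the format pg_restore reads.
dump_db() {
  run pg_dump --format=custom --file="$2" --dbname="$1"
}

# restore_db <target-connection-uri> <dump-file>
# Stops at the first error instead of continuing with a partial restore.
restore_db() {
  run pg_restore --exit-on-error --dbname="$1" "$2"
}
```

For the dry run against a scratch cluster, `DRY_RUN=1 dump_db "$SRC_URI" immich.dump` prints the exact command first; dropping `DRY_RUN` executes it. `SRC_URI` here is a placeholder for whatever connection string the source cluster exposes.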
## Test plan
- [ ] Plan review — does the dependency graph make sense?
- [ ] `mise run docs-mikado migrate-immich-to-ringtail` shows the
chain correctly.
- [ ] Per-card implementation cycles land separately (commit
convention enforced by hook).
Reviewed-on: #356
| title | modified | last-reviewed | tags |
|---|---|---|---|
| Migrate Immich to Ringtail | 2026-05-13 | 2026-05-13 | |
# Migrate Immich to Ringtail
Move the entire Immich stack (server, ML, valkey, postgres) off
minikube-indri and onto k3s-ringtail. This is the first concrete
chain in the broader indri-k8s decommission: minikube is
memory-saturated (97% RAM, swapping), and Immich is the single
largest tenant (~1.5 GiB resident).
## End state
- Immich `server`, `machine-learning`, and `valkey` Deployments run on ringtail k3s in the `immich` namespace.
- The `immich-machine-learning` pod uses ringtail's RTX 4080 via the `nvidia-device-plugin` (a performance win; it is currently CPU-only on minikube).
- A CNPG `immich-pg` Cluster (PostgreSQL 17 + VectorChord) runs in a `databases` namespace on ringtail, owned by the `cnpg-system` operator on ringtail.
- The photo library still lives on sifaka at `/volume1/photos`, mounted via NFS from ringtail pods (RWX).
- Routing: `photos.ops.eblu.me` (Caddy on indri) proxies to a Tailscale ProxyGroup ingress on ringtail. No public surface today.
- The ArgoCD `immich` app's `destination.server` points at `https://ringtail.tail8d86e.ts.net:6443`. The old minikube manifests are removed.
## Non-goals
- Public exposure via Fly. Immich stays tailnet-only.
- Changing the immich version or runtime configuration. This is a lift-and-shift; bumps come later.
- Backing up to a different target. borgmatic keeps running on indri (it pulls via Tailscale and uses sifaka SMB for the library).
## Critical constraint: no data loss
Downtime is acceptable (Immich is a single-user system; we can take it offline for the cutover). Data loss is not. Two surfaces matter:
- **Postgres:** face data, ML embeddings (vectors), album state, sharing, etc. Re-derivable in theory; weeks of recompute in practice. See immich-pg-data-migration.
- **Library files:** `/volume1/photos`. Not moving, but the NFS path must be verified as accessible from ringtail before cutover. See sifaka-nfs-from-ringtail.
borgmatic backs both up to sifaka + BorgBase nightly; restore is possible but slow. Treat it as a fallback, not a plan.
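The library check named above can be sketched as a small probe, run wherever the sifaka export is mounted (for example inside a pod on ringtail). The `check_library_mount` helper and the optional write probe are hypothetical; the source only requires that the path be verified before cutover.

```shell
# check_library_mount <dir> [write]
# Confirms the directory exists and is readable. With "write", it also
# creates and removes a probe file to confirm the mount accepts writes.
check_library_mount() {
  local dir="$1" mode="${2:-read}"
  [ -d "$dir" ] || { echo "missing: $dir" >&2; return 1; }
  [ -r "$dir" ] && [ -x "$dir" ] || { echo "not readable: $dir" >&2; return 1; }
  if [ "$mode" = "write" ]; then
    local probe="$dir/.nfs-probe-$$"
    ( : > "$probe" ) 2>/dev/null || { echo "not writable: $dir" >&2; return 1; }
    rm -f "$probe"
  fi
  echo "ok: $dir"
}
```

The read-only mode is the safer default against a live photo library. The write probe leaves nothing behind on success, but it does touch the share, so it is opt-in.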
## Why postgres on ringtail (not cross-cluster)
`immich-pg` already has a Tailscale Service we could point ringtail
at, leaving the DB on minikube. We're not doing that because:
- The whole goal is to retire minikube — keeping pg there blocks it.
- Immich is chatty against pg; tailnet round-trips would hurt.
- CNPG is the same operator on both sides — a Cluster CR on ringtail is mechanically equivalent.
## Approach
This is a C2 Mikado chain. The prerequisite cards each represent a distinct surface that has to work before cutover. See agent-change-process#C2 — Mikado Chain for the discipline.
### Workflow note: registering new ArgoCD apps during the chain
This chain adds three new ArgoCD Application definitions in
`argocd/apps/`: `cloudnative-pg-ringtail`, `databases-ringtail`,
and (later) `immich-ringtail`. The usual C1/C2 pattern of
`argocd app set <app> --revision <branch> && argocd app sync <app>`
does NOT work for the app-of-apps `apps` Application itself, because
`apps` self-manages: it re-reads `apps.yaml` (which declares
`targetRevision: main`) on every sync and reverts the override. As a
result, new app definitions added on a feature branch are never
visible to the cluster via `apps`.

Use `kubectl apply` to register each new Application directly:

    kubectl --context=minikube-indri apply -f argocd/apps/<new-app>.yaml

This creates the Application resource out-of-band, bypassing `apps`.

For apps whose source lives in this repo (e.g.
`databases-ringtail` and `immich-ringtail`, whose manifest paths exist
only on the branch until merge), follow the apply with a branch override:

    argocd app set <new-app> --revision mikado/migrate-immich-to-ringtail
    argocd app sync <new-app>

For apps whose source is an external repo at a pinned tag (e.g.
`cloudnative-pg-ringtail` → `mirrors/cloudnative-pg` `v1.27.1`), no
override is needed because the source revision is independent of this PR.

After PR merge:

    argocd app set <new-app> --revision main
    argocd app sync <new-app>

`apps` itself, on its next sync from main, will discover the new
Application definitions in `argocd/apps/` and adopt the already-running
resources without disruption, provided their in-cluster spec matches
the on-disk definitions (which it does because we applied the same
file).
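The register-then-override steps above can be wrapped in two small helpers. This is a sketch only: `run`, `register_app`, `repoint_to_main`, the `DRY_RUN` switch, and the `ARGO_CONTEXT` variable are hypothetical names, while the `kubectl` and `argocd` invocations are the ones given in this note.

```shell
# Context of the cluster that hosts ArgoCD (per the commands in this note).
ARGO_CONTEXT="${ARGO_CONTEXT:-minikube-indri}"

# With DRY_RUN=1, print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# register_app <app> [branch]
# Applies the Application out-of-band. If a branch is given (source lives
# in this repo), also overrides the revision and syncs. Without a branch
# (external source at a pinned tag), only the apply runs.
register_app() {
  local app="$1" branch="${2:-}"
  run kubectl --context="$ARGO_CONTEXT" apply -f "argocd/apps/${app}.yaml"
  if [ -n "$branch" ]; then
    run argocd app set "$app" --revision "$branch"
    run argocd app sync "$app"
  fi
}

# repoint_to_main <app>
# After the PR merges, move the app back to main.
repoint_to_main() {
  run argocd app set "$1" --revision main
  run argocd app sync "$1"
}
```

For example, `DRY_RUN=1 register_app databases-ringtail mikado/migrate-immich-to-ringtail` prints the three commands without touching the cluster, and `register_app cloudnative-pg-ringtail` performs only the apply.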
## Related
- shower-on-ringtail — a previous migration to ringtail (simpler: no upstream cluster, SQLite, no GPU)
- connect-to-postgres — getting a psql session against CNPG
- ringtail — the target cluster
- cnpg-on-ringtail, immich-pg-on-ringtail, immich-pg-data-migration, sifaka-nfs-from-ringtail, immich-app-on-ringtail, immich-cutover-and-decommission — the prerequisite cards