## Summary
C2 Mikado chain to move the entire Immich stack (server, ML, valkey,
postgres) off `minikube-indri` and onto `k3s-ringtail`. Immich is the
largest single tenant on minikube (~1.5 GiB resident) and minikube is
currently memory-saturated (97% RAM, swapping). This is the first
concrete chain in the broader indri-k8s decommission effort.
This PR contains the planning layer only — 7 cards (1 goal + 6
prerequisites). Implementation cycles follow per the Mikado Branch
Invariant.
## Goal end-state
- Immich `server`, `machine-learning`, `valkey` on ringtail.
- ML pod uses ringtail's RTX 4080 (performance win — currently
CPU-only).
- CNPG `immich-pg` (PG17 + VectorChord) runs on ringtail.
- Library still on sifaka NFS — ringtail mounts the same path.
- `photos.ops.eblu.me` reroutes through Caddy → ringtail ingress.
- Minikube `immich` and `immich-pg` are removed.
## Cards
| Card | Depends on |
|---|---|
| `migrate-immich-to-ringtail` (goal) | all six below |
| `cnpg-on-ringtail` | — |
| `immich-pg-on-ringtail` | cnpg-on-ringtail |
| `immich-pg-data-migration` | immich-pg-on-ringtail |
| `sifaka-nfs-from-ringtail` | — |
| `immich-app-on-ringtail` | immich-pg-on-ringtail, sifaka-nfs-from-ringtail |
| `immich-cutover-and-decommission` | immich-pg-data-migration, immich-app-on-ringtail |
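
One way to sanity-check the graph above is to feed its edges to POSIX `tsort` and read off a valid implementation order. This is only a sketch: the edge list is transcribed by hand from the table, and the function name `card_order` is illustrative, not part of the repo tooling.

```shell
#!/bin/sh
# Each input line is "prerequisite dependent": the left card must land
# before the right one. Edges are copied from the Cards table.
card_order() {
  tsort <<'EOF'
cnpg-on-ringtail immich-pg-on-ringtail
immich-pg-on-ringtail immich-pg-data-migration
immich-pg-on-ringtail immich-app-on-ringtail
sifaka-nfs-from-ringtail immich-app-on-ringtail
immich-pg-data-migration immich-cutover-and-decommission
immich-app-on-ringtail immich-cutover-and-decommission
cnpg-on-ringtail migrate-immich-to-ringtail
immich-pg-on-ringtail migrate-immich-to-ringtail
immich-pg-data-migration migrate-immich-to-ringtail
sifaka-nfs-from-ringtail migrate-immich-to-ringtail
immich-app-on-ringtail migrate-immich-to-ringtail
immich-cutover-and-decommission migrate-immich-to-ringtail
EOF
}

# Prints one card per line; tsort reports an error if the graph has a cycle.
card_order
```

The order of independent cards (for example `cnpg-on-ringtail` versus `sifaka-nfs-from-ringtail`) is not fixed, so only the relative positions are meaningful.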
## Key constraints
- **No data loss.** Downtime is acceptable; data loss is not. Two
surfaces matter: postgres (ML embeddings, face data — slow to
re-derive) and the library files (don't move, but NFS access from
ringtail must be verified).
- **Migration method:** Option A is a CNPG `externalCluster`
basebackup → promote. Option B is `pg_dump`/`pg_restore` as a
documented fallback. Either way, dry-run against a scratch
cluster first.
- **Why pg moves too** (not cross-cluster): keeping pg on minikube
would block the whole decommission, and Immich is chatty with pg
so tailnet round-trips would hurt.
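
For orientation, the Option B fallback could look roughly like the sketch below. It assumes both clusters are reachable through libpq connection URIs supplied by the operator; the placeholder URIs, the dump path, and the helper names `run` and `dump_and_restore` are hypothetical. The restore flags that are actually needed will depend on how the target database is bootstrapped, so treat this as a starting point for the scratch-cluster dry run rather than the chosen procedure.

```shell
#!/bin/sh
# Option B sketch: logical dump from the source, restore into the target.
# Commands are only printed unless APPLY=1 is set in the environment.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '+ %s\n' "$*"
  fi
}

# ASSUMPTION: $1 and $2 are connection URIs for source and target;
# $3 is an arbitrary local path for the custom-format archive.
dump_and_restore() {
  src_uri=$1
  dst_uri=$2
  dump_file=${3:-/tmp/immich.dump}
  run pg_dump --format=custom --file="$dump_file" "$src_uri" || return 1
  run pg_restore --exit-on-error --dbname="$dst_uri" "$dump_file"
}

# Placeholder URIs: point the second one at a scratch cluster first.
dump_and_restore \
  "postgresql://SOURCE_PLACEHOLDER/immich" \
  "postgresql://SCRATCH_PLACEHOLDER/immich"
```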
## Test plan
- [ ] Plan review — does the dependency graph make sense?
- [ ] `mise run docs-mikado migrate-immich-to-ringtail` shows the
chain correctly.
- [ ] Per-card implementation cycles land separately (commit
convention enforced by hook).
Reviewed-on: #356
| title | modified | last-reviewed | tags |
|---|---|---|---|
| Immich Cutover and Decommission | 2026-05-13 | 2026-05-13 | |
# Immich Cutover and Decommission
The user-visible flip. By the time this card opens, the ringtail stack has been proven against a copy of the data. This card does the real cutover.
## Pre-cutover checklist
- immich-pg-data-migration dry-run succeeded; method is chosen.
- Ringtail immich stack has been brought up against the test pg, pods healthy, UI loaded (immich-app-on-ringtail#Verification).
- Borgmatic just ran successfully (a fresh nightly archive is a belt-and-suspenders fallback, on top of the live source pg).
- User has been told to stop uploading from the iOS app for the cutover window.
## Cutover sequence
- **Quiesce source.** `kubectl --context=minikube-indri -n immich scale deploy/immich-server --replicas=0` and same for ML. Leave valkey + pg running. Confirm no client traffic on the source pg via `pg_stat_activity`.
- **Tear down the minikube Tailscale ingress.** The `photos` Tailscale device name must be freed before ringtail's ingress can claim it (Tailscale enforces uniqueness across the tailnet). `kubectl --context=minikube-indri -n immich delete ingress immich-tailscale` and wait for the corresponding `tailscale-` LB StatefulSet pod to terminate. Verify the `photos` device is gone: `tailscale status | grep -i photos` from any tailnet host.
- **Final sync.** Per chosen method in immich-pg-data-migration:
  - Option A: promote the ringtail replica.
  - Option B: take final `pg_dump`, restore to ringtail `immich-pg`.
- **Verify.** Run the row-count and schema-diff checks from immich-pg-data-migration#Verification on the real run.
- **Flip the ringtail ingress to `photos`.** Update `argocd/manifests/immich-ringtail/ingress-tailscale.yaml`: `tls.hosts: [photos]` (was `[photos-ringtail]` during staging per immich-app-on-ringtail). Commit, `argocd app sync immich-ringtail`. Wait for the `photos` device to register on the tailnet again.
- **Bring up ringtail immich against the now-promoted pg** (`argocd app sync immich-ringtail`). Wait for Ready.
- **Flip routing.** Update Caddy on indri (`ansible/roles/caddy/defaults/main.yml`): `photos.ops.eblu.me` upstream changes to the ringtail Tailscale ingress hostname (`photos`: same MagicDNS name, now pointing to the ringtail proxy). `mise run provision-indri -- --tags caddy`.
- **Smoke test.** Open `photos.ops.eblu.me` in a browser. Sign in. Scroll the timeline. Open an album. Trigger an ML search.
- **Update borgmatic.** If the Tailscale hostname for pg changed, update `borgmatic.cfg` on indri to point at the ringtail `immich-pg-tailscale` service. Run a manual backup to verify.
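
The first two steps of the sequence above (quiesce, then free the `photos` device name) can be sketched as a dry-run script. Several names here are assumptions rather than facts from the repo: the ML Deployment name `immich-machine-learning`, the database name `immich`, the `SRC_PG_URI` variable, and the helper functions themselves. The sketch also assumes the hostname is the second column of `tailscale status` output, and matches it exactly so that a staging `photos-ringtail` device is not mistaken for `photos`.

```shell
#!/bin/sh
# Sketch of cutover steps 1-2. Commands are only printed unless APPLY=1.
SRC_CTX=minikube-indri
NS=immich
# ASSUMPTION: hypothetical connection URI for the source pg.
SRC_PG_URI=${SRC_PG_URI:-postgresql://SOURCE_PLACEHOLDER/immich}
# Counts client backends still attached to the (assumed) immich database.
ACTIVITY_SQL="SELECT count(*) FROM pg_stat_activity WHERE datname = 'immich' AND pid <> pg_backend_pid();"

run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '+ %s\n' "$*"
  fi
}

quiesce_source() {
  run kubectl --context="$SRC_CTX" -n "$NS" scale deploy/immich-server --replicas=0
  # ASSUMPTION: the ML Deployment is named immich-machine-learning.
  run kubectl --context="$SRC_CTX" -n "$NS" scale deploy/immich-machine-learning --replicas=0
  # Expect 0 once the server and ML pods are gone.
  run psql "$SRC_PG_URI" -Atc "$ACTIVITY_SQL"
}

free_photos_device() {
  run kubectl --context="$SRC_CTX" -n "$NS" delete ingress immich-tailscale
  # Only meaningful on a real run: fail while a device named exactly
  # "photos" is still listed on the tailnet.
  if [ "${APPLY:-0}" = "1" ] &&
    tailscale status | awk '$2 == "photos" { found = 1 } END { exit !found }'; then
    echo "photos device still registered; wait and re-check" >&2
    return 1
  fi
}

quiesce_source
free_photos_device
```

Printing by default keeps the script safe to read through during plan review; the remaining steps are left manual because they depend on the migration method chosen in immich-pg-data-migration.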
## After cutover
- `argocd app set immich --revision <branch>` is no longer relevant; the minikube `immich` app gets deleted entirely.
- Delete `argocd/apps/immich.yaml`, `argocd/manifests/immich/`, and the minikube `argocd/manifests/databases/immich-pg.yaml` + `external-secret-immich-borgmatic.yaml` + `service-immich-pg-tailscale.yaml`.
- Rename `immich-ringtail` back to `immich` (the `-ringtail` suffix was scaffolding for the dual-cluster window; once minikube is empty of immich, the unsuffixed name is clean).
- Confirm the minikube `immich-pg` PVC is no longer used, then delete it (the PV with `Retain` policy will persist; clean that up too).
## Verification (definition of done)
- `photos.ops.eblu.me` works for a real session, including ML search.
- Source minikube has no `immich` pods, no `immich-pg`, no PVCs.
- Memory pressure on minikube has dropped (≥1.5 GiB reclaimed). Check `docker stats minikube` on indri.
- Nightly borgmatic run after the cutover completes successfully, with the immich-pg archive showing the new source.
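
The memory criterion above can be checked mechanically by comparing a reading taken before the cutover with one taken after. The sketch below is a hedged example: the helper names `to_gib` and `reclaimed_enough` are hypothetical, and it assumes readings look like the first field of `docker stats --no-stream --format '{{.MemUsage}}' minikube` (for example `6.9GiB`), handling only `GiB` and `MiB` suffixes.

```shell
#!/bin/sh
# Convert a memory figure such as "6.9GiB" or "512MiB" to GiB (two decimals).
# Any other suffix is rejected with a non-zero exit status.
to_gib() {
  printf '%s\n' "$1" | awk '
    /GiB$/ { sub(/GiB$/, ""); printf "%.2f\n", $0; next }
    /MiB$/ { sub(/MiB$/, ""); printf "%.2f\n", $0 / 1024; next }
    { exit 1 }'
}

# Succeeds when (before - after) is at least the target in GiB (default 1.5).
reclaimed_enough() {
  before=$(to_gib "$1") || return 2
  after=$(to_gib "$2") || return 2
  awk -v b="$before" -v a="$after" -v t="${3:-1.5}" \
    'BEGIN { exit !((b - a) >= t) }'
}

# Illustrative readings only, not measurements from indri.
if reclaimed_enough 7.4GiB 5.6GiB; then
  echo "target met"
fi
```

Because `docker stats` samples a moving value, a single pair of readings is only indicative; taking both under similar load gives a fairer comparison.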
## Rollback (within the cutover window)
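If the smoke test fails: flip Caddy back, scale ringtail immich to 0, scale source immich back up. Because the cutover removed the minikube `immich-tailscale` ingress and handed the `photos` name to ringtail, flipping back presumably also means returning the ringtail ingress to `photos-ringtail` and restoring the minikube ingress so it can reclaim `photos`. Source pg was never destroyed. File a plan reset on the relevant prerequisite card and try again next session.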
## Out of scope
- Decommissioning all of minikube. This chain just removes immich. Other tenants migrate in their own chains as part of the broader indri-k8s decommission. See migrate-immich-to-ringtail for context.