C2: migrate immich from minikube to ringtail (mikado chain) (#356)
## Summary
C2 Mikado chain to move the entire Immich stack (server, ML, valkey,
postgres) off `minikube-indri` and onto `k3s-ringtail`. Immich is the
largest single tenant on minikube (~1.5 GiB resident) and minikube is
currently memory-saturated (97% RAM, swapping). This is the first
concrete chain in the broader indri-k8s decommission effort.
This PR contains the planning layer only — 7 cards (1 goal + 6
prerequisites). Implementation cycles follow per the Mikado Branch
Invariant.
## Goal end-state
- Immich `server`, `machine-learning`, `valkey` on ringtail.
- ML pod uses ringtail's RTX 4080 (performance win — currently
CPU-only).
- CNPG `immich-pg` (PG17 + VectorChord) runs on ringtail.
- Library still on sifaka NFS — ringtail mounts the same path.
- `photos.ops.eblu.me` reroutes through Caddy → ringtail ingress.
- Minikube `immich` and `immich-pg` are removed.
## Cards
| Card | Depends on |
|---|---|
| `migrate-immich-to-ringtail` (goal) | all six below |
| `cnpg-on-ringtail` | — |
| `immich-pg-on-ringtail` | cnpg-on-ringtail |
| `immich-pg-data-migration` | immich-pg-on-ringtail |
| `sifaka-nfs-from-ringtail` | — |
| `immich-app-on-ringtail` | immich-pg-on-ringtail, sifaka-nfs-from-ringtail |
| `immich-cutover-and-decommission` | immich-pg-data-migration, immich-app-on-ringtail |
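The dependency table can be turned into a concrete working order. The following is a minimal sketch, assuming a POSIX `tsort` is available; the `card_order` function name is hypothetical and not part of the repo. Each input line is a `<prerequisite> <dependent>` pair copied from the table above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the cards in an order that respects the
# "Depends on" column. Each line below is "<prerequisite> <dependent>".
card_order() {
  tsort <<'EOF'
cnpg-on-ringtail immich-pg-on-ringtail
immich-pg-on-ringtail immich-pg-data-migration
immich-pg-on-ringtail immich-app-on-ringtail
sifaka-nfs-from-ringtail immich-app-on-ringtail
immich-pg-data-migration immich-cutover-and-decommission
immich-app-on-ringtail immich-cutover-and-decommission
cnpg-on-ringtail migrate-immich-to-ringtail
immich-pg-on-ringtail migrate-immich-to-ringtail
immich-pg-data-migration migrate-immich-to-ringtail
sifaka-nfs-from-ringtail migrate-immich-to-ringtail
immich-app-on-ringtail migrate-immich-to-ringtail
immich-cutover-and-decommission migrate-immich-to-ringtail
EOF
}

card_order
```

Any order `tsort` prints is valid. Cards with no edge between them, such as `cnpg-on-ringtail` and `sifaka-nfs-from-ringtail`, may appear in either order, which is also a hint that they can be worked in parallel.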
## Key constraints
- **No data loss.** Downtime is acceptable; data loss is not. Two
surfaces matter: postgres (ML embeddings, face data — slow to
re-derive) and the library files (don't move, but NFS access from
ringtail must be verified).
- **Migration method:** Option A is a CNPG `externalCluster`
basebackup → promote. Option B is `pg_dump`/`pg_restore` as a
documented fallback. Either way, dry-run against a scratch
cluster first.
- **Why pg moves too** (not cross-cluster): keeping pg on minikube
would block the whole decommission, and Immich is chatty with pg
so tailnet round-trips would hurt.
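One way to check the postgres surface after a dry run or the real migration is to compare row counts between source and target. The following is a minimal bash sketch, assuming `psql` can reach both clusters; the function names and the connection URIs are hypothetical, and the table list is taken from this PR's changelog entry. Equal counts are a sanity check, not proof of identical data.

```bash
#!/usr/bin/env bash
# Tables named in this PR's changelog entry.
TABLES="asset user album smart_search activity asset_face"

# Hypothetical helper. Args: connection URI, table name.
# -A (unaligned) and -t (tuples only) make psql print just the number.
count_rows() {
  psql "$1" -At -c "SELECT count(*) FROM \"$2\""
}

# Hypothetical helper. Args: source URI, target URI.
# Returns 0 if every table matches, 1 on a mismatch, 2 if a query fails.
compare_counts() {
  local rc=0 t s d
  for t in $TABLES; do
    s="$(count_rows "$1" "$t")" || return 2
    d="$(count_rows "$2" "$t")" || return 2
    if [ "$s" = "$d" ]; then
      echo "ok    $t $s"
    else
      echo "DIFF  $t $s != $d"
      rc=1
    fi
  done
  return "$rc"
}
```

It would be called as `compare_counts "$SRC_URI" "$DST_URI"` with the source quiesced, since counts on a live source can drift between the two queries.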
## Test plan
- [ ] Plan review — does the dependency graph make sense?
- [ ] `mise run docs-mikado migrate-immich-to-ringtail` shows the
chain correctly.
- [ ] Per-card implementation cycles land separately (commit
convention enforced by hook).
Reviewed-on: #356
parent bc8ceb502b
commit 947e4310c3
32 changed files with 820 additions and 265 deletions
27
argocd/apps/cloudnative-pg-ringtail.yaml
Normal file
@@ -0,0 +1,27 @@
# CloudNativePG Operator for ringtail k3s cluster
# Deploys the operator only; PostgreSQL clusters are created separately
#
# Sibling of cloudnative-pg.yaml (minikube). Same mirror, same release,
# different destination. Both apps will coexist during the immich
# migration; the minikube one is removed at the end of the broader
# indri-k8s decommission.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cloudnative-pg-ringtail
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ssh://forgejo@forge.ops.eblu.me:2222/mirrors/cloudnative-pg.git
    targetRevision: v1.27.1
    path: releases
    directory:
      include: 'cnpg-1.27.1.yaml'
  destination:
    server: https://ringtail.tail8d86e.ts.net:6443
    namespace: cnpg-system
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true  # Required for large CRDs that exceed annotation size limit
26
argocd/apps/databases-ringtail.yaml
Normal file
@@ -0,0 +1,26 @@
# Databases on ringtail k3s.
#
# Today: only immich-pg (CNPG Cluster) + its borgmatic ExternalSecret.
# More databases may move here as the indri-k8s decommission proceeds.
#
# Prerequisites:
#   - cloudnative-pg-ringtail (operator must exist before the Cluster CR)
#   - external-secrets-ringtail + 1password-connect-ringtail (for the
#     immich-pg-borgmatic ExternalSecret to sync)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: databases-ringtail
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/databases-ringtail
  destination:
    server: https://ringtail.tail8d86e.ts.net:6443
    namespace: databases
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
31
argocd/apps/immich-ringtail.yaml
Normal file
@@ -0,0 +1,31 @@
# Immich on ringtail k3s.
#
# Staging deployment; the minikube `immich` app remains in parallel
# until cutover. See [[immich-cutover-and-decommission]] for the
# routing flip + minikube cleanup.
#
# Prerequisites:
#   - cnpg-on-ringtail + databases-ringtail (postgres)
#   - 1password-connect-ringtail + external-secrets-ringtail (not used
#     by this app today — immich-db Secret is created manually,
#     matching the minikube pattern)
#   - The immich-db Secret in the immich namespace, holding the
#     password for the `immich` postgres role (copied from the source
#     immich-pg-app Secret at migration time).
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: immich-ringtail
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/immich-ringtail
  destination:
    server: https://ringtail.tail8d86e.ts.net:6443
    namespace: immich
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
@@ -1,30 +0,0 @@
# Immich - Self-hosted photo and video management
# High-performance Google Photos/iCloud alternative with AI features
#
# Kustomize manifests in argocd/manifests/immich/
# Components: server, machine-learning, valkey (Redis)
#
# Prerequisites:
#   1. Create immich namespace and secrets:
#      kubectl create namespace immich
#      kubectl --context=minikube-indri create secret generic immich-db -n immich \
#        --from-literal=password="$(kubectl --context=minikube-indri -n databases get secret immich-pg-app -o jsonpath='{.data.password}' | base64 -d)"
#   2. Create immich-pg database and user (see immich-pg app)
#   3. NFS share on sifaka at /volume1/photos with read/write for indri
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: immich
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/immich
  destination:
    server: https://kubernetes.default.svc
    namespace: immich
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
@@ -1,9 +1,12 @@
# ExternalSecret for borgmatic backup user password on immich-pg cluster
# (ringtail k3s).
#
# Mirror of argocd/manifests/databases/external-secret-immich-borgmatic.yaml.
# The onepassword-blumeops ClusterSecretStore exists on ringtail via the
# external-secrets-ringtail app.
#
# Reuses the same 1Password item as blumeops-pg-borgmatic.
# 1Password item: "borgmatic" in blumeops vault
# Field: "db-password"
#
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
@@ -23,7 +26,7 @@ spec:
username: borgmatic
password: "{{ .password }}"
data:
- secretKey: password
remoteRef:
key: borgmatic
property: db-password
- secretKey: password
remoteRef:
key: borgmatic
property: db-password
53
argocd/manifests/databases-ringtail/immich-pg.yaml
Normal file
@@ -0,0 +1,53 @@
# PostgreSQL Cluster for Immich on ringtail k3s.
#
# Initially bootstrapped via CNPG pg_basebackup from the minikube
# immich-pg cluster on 2026-05-13, then promoted to primary. The
# externalClusters + bootstrap.pg_basebackup blocks have been pruned
# from this manifest now that the migration is complete — leaving
# them around is a footgun (re-enabling replica.enabled=true would
# try to demote this cluster against a stale source). See
# [[immich-pg-data-migration]] for the procedure used.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: immich-pg
  namespace: databases
spec:
  instances: 1
  imageName: ghcr.io/tensorchord/cloudnative-vectorchord:17-0.5.0

  storage:
    size: 10Gi
    storageClass: local-path

  # Managed roles
  managed:
    roles:
      - name: borgmatic
        login: true
        connectionLimit: -1
        ensure: present
        inherit: true
        inRoles:
          - pg_read_all_data
        passwordSecret:
          name: immich-pg-borgmatic

  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "1Gi"
      cpu: "500m"

  postgresql:
    shared_preload_libraries:
      - "vchord.so"
    parameters:
      max_connections: "50"
      shared_buffers: "128MB"
      password_encryption: "scram-sha-256"
    pg_hba:
      - host all all 0.0.0.0/0 scram-sha-256
      - host all all ::/0 scram-sha-256
9
argocd/manifests/databases-ringtail/kustomization.yaml
Normal file
@@ -0,0 +1,9 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: databases

resources:
  - immich-pg.yaml
  - external-secret-immich-borgmatic.yaml
  - service-immich-pg-tailscale.yaml
@@ -1,6 +1,8 @@
-# Tailscale LoadBalancer for immich-pg PostgreSQL access
-# Canonical hostname: immich-pg.tail8d86e.ts.net
-# Caddy L4 proxies pg.ops.eblu.me:5433 → this service for borgmatic backups
+# Tailscale LoadBalancer for immich-pg PostgreSQL access on ringtail.
+# Canonical hostname: immich-pg.tail8d86e.ts.net (claimed from the
+# minikube side after the minikube service was removed during the
+# immich-to-ringtail migration). Borgmatic on indri uses this
+# hostname for nightly backups.
 apiVersion: v1
 kind: Service
 metadata:
@@ -1,69 +0,0 @@
# PostgreSQL Cluster for Immich
# Uses VectorChord (successor to pgvecto.rs) for AI-powered vector search
# See: https://github.com/immich-app/immich/discussions/9060
# Managed by CloudNativePG operator
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: immich-pg
  namespace: databases
spec:
  instances: 1
  # VectorChord image for PostgreSQL 17 with VectorChord 0.5.0
  # Immich v2.4.1 requires VectorChord >=0.3 <0.6
  # See: https://github.com/tensorchord/VectorChord
  imageName: ghcr.io/tensorchord/cloudnative-vectorchord:17-0.5.0

  storage:
    size: 10Gi
    storageClass: standard

  # Bootstrap creates initial database and owner
  bootstrap:
    initdb:
      database: immich
      owner: immich
      postInitSQL:
        # Extensions required by Immich
        - CREATE EXTENSION IF NOT EXISTS vector;
        - CREATE EXTENSION IF NOT EXISTS vchord CASCADE;
        - CREATE EXTENSION IF NOT EXISTS cube CASCADE;
        - CREATE EXTENSION IF NOT EXISTS earthdistance CASCADE;

  # Managed roles
  # Note: connectionLimit, ensure, inherit are CNPG defaults added to prevent ArgoCD drift
  managed:
    roles:
      # borgmatic read-only user for backups
      - name: borgmatic
        login: true
        connectionLimit: -1
        ensure: present
        inherit: true
        inRoles:
          - pg_read_all_data
        passwordSecret:
          name: immich-pg-borgmatic

  # Resource limits for minikube environment
  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "1Gi"
      cpu: "500m"

  # PostgreSQL configuration
  postgresql:
    # VectorChord requires vchord.so in shared_preload_libraries
    shared_preload_libraries:
      - "vchord.so"
    parameters:
      max_connections: "50"
      shared_buffers: "128MB"
      password_encryption: "scram-sha-256"
    pg_hba:
      # Allow connections from k8s pods
      - host all all 0.0.0.0/0 scram-sha-256
      - host all all ::/0 scram-sha-256
@@ -5,13 +5,10 @@ namespace: databases

 resources:
   - blumeops-pg.yaml
-  - immich-pg.yaml
   - service-tailscale.yaml
-  - service-immich-pg-tailscale.yaml
   - service-metrics-tailscale.yaml
   - external-secret-eblume.yaml
   - external-secret-borgmatic.yaml
-  - external-secret-immich-borgmatic.yaml
   - external-secret-teslamate.yaml
   - external-secret-authentik.yaml
   - external-secret-paperless.yaml
@@ -16,11 +16,16 @@ spec:
         app: immich
         component: machine-learning
     spec:
+      runtimeClassName: nvidia
       securityContext:
         seccompProfile:
           type: RuntimeDefault
       containers:
         - name: machine-learning
+          # ringtail uses the -cuda tag (set in kustomization.yaml)
+          # to take advantage of the RTX 4080 via the nvidia
+          # device plugin. Time-slicing is configured for 4 replicas
+          # so frigate + ollama + this pod can share.
           image: ghcr.io/immich-app/immich-machine-learning:kustomized
           ports:
             - name: http
@@ -57,6 +62,7 @@ spec:
               cpu: "100m"
             limits:
               memory: "4Gi"
+              nvidia.com/gpu: "1"
       volumes:
         - name: cache
           persistentVolumeClaim:
@@ -1,6 +1,9 @@
-# Tailscale Ingress for Immich
-# Exposes Immich at photos.tail8d86e.ts.net
-# Caddy will proxy photos.ops.eblu.me to this endpoint
+# Tailscale ProxyGroup Ingress for Immich on ringtail.
+#
+# Production hostname: photos.tail8d86e.ts.net
+# (during the cutover window this was photos-ringtail; the minikube
+# ingress was torn down before this was renamed to photos to avoid
+# the Tailscale device-name collision.)
 apiVersion: networking.k8s.io/v1
 kind: Ingress
 metadata:
@@ -16,12 +19,6 @@ metadata:
     gethomepage.dev/description: "Photo management"
     gethomepage.dev/href: "https://photos.ops.eblu.me"
     gethomepage.dev/pod-selector: "app=immich,component=server"
-    # TODO: Add Immich widget - requires API key from Account Settings > API Keys
-    # See: https://gethomepage.dev/widgets/services/immich/
-    # gethomepage.dev/widget.type: "immich"
-    # gethomepage.dev/widget.url: "https://photos.ops.eblu.me"
-    # gethomepage.dev/widget.key: "{{HOMEPAGE_VAR_IMMICH_API_KEY}}"
-    # gethomepage.dev/widget.version: "2"
 spec:
   ingressClassName: tailscale
   rules:
@@ -1,7 +1,8 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: immich

resources:
  - deployment-server.yaml
  - deployment-ml.yaml
@@ -13,11 +14,15 @@ resources:
   - pv-nfs.yaml
   - pvc.yaml
   - ingress-tailscale.yaml
+
 images:
   - name: ghcr.io/immich-app/immich-server
     newTag: v2.6.3
   - name: ghcr.io/immich-app/immich-machine-learning
-    newTag: v2.6.3
+    # CUDA variant of the same release — ringtail has an RTX 4080
+    newTag: v2.6.3-cuda
+  # Using upstream multi-arch valkey image directly; the
+  # registry.ops.eblu.me/blumeops/valkey mirror is arm64-only (built
+  # on indri) and would crashloop on ringtail.
   - name: docker.io/valkey/valkey
-    newName: registry.ops.eblu.me/blumeops/valkey
-    newTag: v8.1.6-r0-fabca04
+    newTag: "8.1.6"
29
argocd/manifests/immich-ringtail/pv-nfs.yaml
Normal file
@@ -0,0 +1,29 @@
# NFS PersistentVolume for Immich photo library on ringtail k3s.
#
# Mirror of argocd/manifests/immich/pv-nfs.yaml (minikube) but with
# a distinct name (minikube and ringtail are separate clusters, so PV
# names don't collide cluster-side, but using the same name in two
# manifests is confusing).
#
# The sifaka NFS export for /volume1/photos already permits
# 192.168.1.0/24 + 100.64.0.0/10. Ringtail's wired IP (192.168.1.21)
# falls in the first CIDR, so no DSM rule changes are needed.
#
# Verified 2026-05-13: ringtail pod can read existing dirs, write
# new files, and delete them. DNS resolves sifaka to 192.168.1.203
# (LAN), so NFS traffic stays off the tailnet — avoids the known
# sifaka-tailscale-userspace bite.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: immich-library-nfs-pv-ringtail
spec:
  capacity:
    storage: 2Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  nfs:
    server: sifaka
    path: /volume1/photos
@@ -1,5 +1,5 @@
-# PersistentVolumeClaim for Immich photo library
-# Binds to the NFS PV for sifaka:/volume1/photos
+# PersistentVolumeClaim for Immich photo library on ringtail.
+# Binds to immich-library-nfs-pv-ringtail (sifaka:/volume1/photos).
 apiVersion: v1
 kind: PersistentVolumeClaim
 metadata:
@@ -9,7 +9,7 @@ spec:
   accessModes:
     - ReadWriteMany
   storageClassName: ""
-  volumeName: immich-library-nfs-pv
+  volumeName: immich-library-nfs-pv-ringtail
   resources:
     requests:
       storage: 2Ti
@@ -1,115 +0,0 @@
# Immich

Self-hosted photo and video management solution with AI-powered search and face recognition.

## Prerequisites

1. **NFS Share**: Create `/volume1/photos` on sifaka with NFS permissions for indri
2. **PostgreSQL**: The `immich-pg` cluster (with pgvecto.rs) must be healthy
3. **Secrets**: Create the database password secret

## Deployment Order

1. Sync `blumeops-pg` (to get CloudNativePG operator if not already running)
2. Wait for `immich-pg` cluster to be healthy
3. Create secrets (see below)
4. Sync `immich` (deploys all resources: storage, services, deployments)
5. Run `mise run provision-indri -- --tags caddy` to update Caddy config

## Components

| Component | Deployment | Service | Port |
|-----------|------------|---------|------|
| Server (web/API) | `immich-server` | `immich-server` | 2283 |
| Machine Learning | `immich-machine-learning` | `immich-machine-learning` | 3003 |
| Valkey (Redis) | `immich-valkey` | `immich-valkey` | 6379 |

## Secret Setup

The `immich-db` secret contains the database password, which is auto-generated by CloudNativePG
in the `immich-pg-app` secret. To create or regenerate the secret:

```bash
# Create namespace if needed
kubectl --context=minikube-indri create namespace immich

# Copy password from CNPG secret to immich namespace
kubectl --context=minikube-indri create secret generic immich-db -n immich \
  --from-literal=password="$(kubectl --context=minikube-indri -n databases get secret immich-pg-app -o jsonpath='{.data.password}' | base64 -d)"
```

Note: This secret is not managed by ExternalSecrets since the source of truth is the CNPG-generated secret.

## Access

- **URL**: https://photos.ops.eblu.me (after Caddy is updated)
- **Tailscale**: https://photos.tail8d86e.ts.net (direct)

## First-Time Setup

1. Navigate to https://photos.ops.eblu.me
2. Create an admin account
3. Configure external library (optional - for importing existing photos)

## External Library (iCloud Photos)

To import existing photos from iCloud sync on indri:

1. In Immich Admin > External Libraries, create a new library
2. Set the import path to the location where iCloud photos sync
3. Configure scan schedule or trigger manual scan

## Architecture

```
┌─────────────────┐     ┌─────────────────┐
│  immich-server  │────▶│   immich-pg     │
│   (web/api)     │     │  (PostgreSQL    │
└────────┬────────┘     │  + pgvecto.rs)  │
         │              └─────────────────┘
         │
┌────────▼────────┐     ┌─────────────────┐
│   immich-ml     │     │     valkey      │
│ (ML inference)  │     │  (Redis cache)  │
└─────────────────┘     └─────────────────┘
         │
┌────────▼────────┐
│   sifaka NFS    │
│  /volume1/photos│
└─────────────────┘
```

## Version Management

Image versions are controlled via `kustomization.yaml`:

```yaml
images:
  - name: ghcr.io/immich-app/immich-server
    newTag: v2.6.3
  - name: ghcr.io/immich-app/immich-machine-learning
    newTag: v2.6.3
  - name: docker.io/valkey/valkey
    newTag: "8.1-alpine"
```

To upgrade, update `newTag` values and sync via ArgoCD.

## Troubleshooting

```bash
# Check pods
kubectl --context=minikube-indri -n immich get pods

# Check immich-pg cluster
kubectl --context=minikube-indri -n databases get cluster immich-pg

# View server logs
kubectl --context=minikube-indri -n immich logs -l app=immich,component=server

# View ML logs
kubectl --context=minikube-indri -n immich logs -l app=immich,component=machine-learning

# Check PVC binding
kubectl --context=minikube-indri -n immich get pvc
```
@@ -1,22 +0,0 @@
# NFS PersistentVolume for Immich photo library
# Requires: NFS share on sifaka at /volume1/photos with NFS permissions for indri
#
# To create on Synology:
# 1. Control Panel > Shared Folder > Create
# 2. Name: photos, Location: Volume 1
# 3. Control Panel > File Services > NFS > NFS Rules
# 4. Add rule for "photos" share: Hostname=indri, Privilege=Read/Write, Squash=No mapping
apiVersion: v1
kind: PersistentVolume
metadata:
  name: immich-library-nfs-pv
spec:
  capacity:
    storage: 2Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  nfs:
    server: sifaka
    path: /volume1/photos
@@ -11,4 +11,4 @@ data:
       timeSlicing:
         resources:
           - name: nvidia.com/gpu
-            replicas: 2
+            replicas: 4
13
docs/changelog.d/migrate-immich-to-ringtail.infra.md
Normal file
@@ -0,0 +1,13 @@
Move the entire Immich stack — server, machine-learning, valkey,
and the PostgreSQL+VectorChord cluster — off `minikube-indri` and
onto `k3s-ringtail`. Postgres data migrated zero-loss via CNPG
`pg_basebackup` (replica catch-up then promote); row counts on
`asset`, `user`, `album`, `smart_search`, `activity`, `asset_face`
verified equal between source and replica before cutover. The ML
pod now uses ringtail's RTX 4080 via the nvidia-device-plugin
(time-slicing bumped 2 → 4 to share with frigate + ollama). Caddy
routing at `photos.ops.eblu.me` is unchanged (still
`photos.tail8d86e.ts.net`, the device just lives on ringtail now).
Borgmatic backups continue against the same `immich-pg` tailnet
hostname. First concrete chain in the broader indri-k8s
decommission effort.
52
docs/how-to/immich/cnpg-on-ringtail.md
Normal file
@@ -0,0 +1,52 @@
---
title: CNPG Operator on Ringtail
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
  - how-to
  - operations
  - postgres
  - ringtail
---

# CNPG Operator on Ringtail

Bring up the `cloudnative-pg` operator on `k3s-ringtail`. Today the
operator only exists on `minikube-indri` (see
`argocd/apps/cloudnative-pg.yaml`, destination `kubernetes.default.svc`).

Prerequisite of [[migrate-immich-to-ringtail]]; consumed by
[[immich-pg-on-ringtail]].

## What to do

- Add a sibling `argocd/apps/cloudnative-pg-ringtail.yaml` pointing
  at the same mirror (`mirrors/cloudnative-pg`, tag `v1.27.1`),
  destination `https://ringtail.tail8d86e.ts.net:6443`,
  namespace `cnpg-system`.
- Mirror the `ServerSideApply=true` and `CreateNamespace=true` sync
  options (the CRDs exceed the annotation size limit).
- Sync `apps` then `cloudnative-pg-ringtail`. Verify the operator
  pod is running on ringtail.

## Verification

```fish
kubectl --context=k3s-ringtail -n cnpg-system get pods
kubectl --context=k3s-ringtail get crd clusters.postgresql.cnpg.io
```
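
To block until the operator is actually usable rather than polling by hand, something like the following bash sketch could work. It assumes the operator Deployment is named `cnpg-controller-manager` (the name used in upstream CNPG release manifests; verify it against `cnpg-1.27.1.yaml`), and `wait_for_cnpg` is a hypothetical helper name, not something in the repo.

```bash
#!/usr/bin/env bash
# Hypothetical helper. Arg: kubectl context (defaults to k3s-ringtail).
# Waits for the operator rollout, then for the Cluster CRD to be served.
wait_for_cnpg() {
  local ctx="${1:-k3s-ringtail}"
  # Assumption: upstream Deployment name; adjust if the manifest differs.
  kubectl --context="$ctx" -n cnpg-system \
    rollout status deployment/cnpg-controller-manager --timeout=120s &&
  kubectl --context="$ctx" \
    wait --for=condition=Established crd/clusters.postgresql.cnpg.io --timeout=60s
}
```

A non-zero exit from `wait_for_cnpg` means either the rollout or the CRD registration did not finish within the timeout, so the `immich-pg` Cluster CR should not be synced yet.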

## Why a separate app

Each ArgoCD app targets a single cluster via `destination.server`.
We could parameterize with ApplicationSets, but blumeops' convention
is to duplicate the manifest with a `-ringtail` suffix (see
`alloy-ringtail`, `external-secrets-ringtail`, etc.). Keep the
convention.

## Out of scope

- Postgres clusters themselves (`immich-pg`, etc.) — those come from
  [[immich-pg-on-ringtail]].
- Removing the minikube cnpg operator. That happens at the very end
  of the indri-k8s decommission, not in this chain.
91
docs/how-to/immich/immich-app-on-ringtail.md
Normal file
@@ -0,0 +1,91 @@
---
title: Immich App on Ringtail
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
  - how-to
  - operations
  - immich
---

# Immich App on Ringtail

Bring up `immich-server`, `immich-machine-learning`, and
`immich-valkey` on ringtail. This card stands the stack up against
the *new* pg cluster — it does not move user traffic. Cutover lives
in [[immich-cutover-and-decommission]].

## What to do

- New manifest dir `argocd/manifests/immich-ringtail/` (the suffix
  matches the `-ringtail` convention used by other apps). Port from
  `argocd/manifests/immich/`:
  - `deployment-server.yaml` — point `DB_HOSTNAME` at the ringtail
    pg service.
  - `deployment-ml.yaml` — use `runtimeClassName: nvidia` + a
    `resources.limits` for `nvidia.com/gpu: 1`. Use the `-cuda` tag
    of the immich-ml image (set in kustomization). Ringtail is
    single-node, so no node selector needed. See
    `argocd/manifests/frigate/` for the existing GPU pod pattern.

    **GPU contention discovery:** ringtail's `nvidia-device-plugin`
    is configured with `timeSlicing.replicas: 2`. Frigate + Ollama
    already consume both virtual slices. Adding immich-ml requires
    bumping the count to >= 3. Edit
    `argocd/manifests/nvidia-device-plugin/configmap.yaml` (or
    wherever the device-plugin config lives) and re-sync the
    `nvidia-device-plugin` ArgoCD app. The plugin pod restarts and
    the new advertised count appears as the node's
    `nvidia.com/gpu` allocatable.
  - `deployment-valkey.yaml` — straight port, BUT use the upstream
    multi-arch `docker.io/valkey/valkey:<version>` image — do NOT
    use the `registry.ops.eblu.me/blumeops/valkey` rewrite in the
    kustomization. That mirror was built on indri (arm64) and is
    single-arch; pulling it on ringtail (amd64) gets `exec format
    error` in CrashLoopBackOff. The mirror should eventually carry
    a multi-arch tag, at which point the rewrite can return.
  - `service*.yaml` — straight port.
  - `pvc-ml-cache.yaml` — straight port (empty `local-path` PVC).
  - `pv-nfs.yaml` + `pvc.yaml` — already covered by
    [[sifaka-nfs-from-ringtail]] (may live in this dir or theirs).
  - `ingress-tailscale.yaml` — ProxyGroup ingress, **must not** set
    an explicit `host:` (or use `host: *`) per the lesson on
    ProxyGroup VIP routing.
    **Hostname collision warning:** the minikube ingress claims the
    Tailscale device name `photos` (`tls.hosts: [photos]`). Two
    devices on the tailnet cannot share that name. While the
    ringtail deployment is being staged it must use a *different*
    `tls.hosts` value (e.g. `photos-ringtail`) so it can coexist
    with the running minikube one. The flip to `photos` happens at
    cutover time, *after* the minikube ingress has been removed.
    See [[immich-cutover-and-decommission#Cutover sequence]].
  - `kustomization.yaml` — same `images:` block (server, ML, valkey).
- New ArgoCD app `argocd/apps/immich-ringtail.yaml` targeting
  ringtail, namespace `immich`. **Manual sync only** until the
  cutover.
- Existing `argocd/apps/immich.yaml` (minikube) stays untouched
  during this card — both apps exist briefly.
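
The GPU contention note above comes down to simple arithmetic: the node must advertise more `nvidia.com/gpu` slices than are already claimed. Below is a bash sketch, assuming ringtail is the only node in the `k3s-ringtail` context (as stated above); `gpu_slices` and `have_free_slice` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Hypothetical helper. Arg: kubectl context (defaults to k3s-ringtail).
# Prints the node's advertised nvidia.com/gpu allocatable count.
# Assumption: single-node cluster, so items[0] is ringtail.
gpu_slices() {
  kubectl --context="${1:-k3s-ringtail}" get nodes \
    -o jsonpath='{.items[0].status.allocatable.nvidia\.com/gpu}'
}

# Hypothetical helper. Args: advertised slices, slices already in use.
# Succeeds only if at least one slice is left for a new pod.
have_free_slice() {
  [ "$1" -gt "$2" ]
}
```

With Frigate and Ollama holding two slices, `have_free_slice "$(gpu_slices)" 2` would fail at `timeSlicing.replicas: 2` and pass once the count is raised to 3 or more and the plugin pod has restarted.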

## Bring it up against a copy of the DB

Use the throwaway/test path from [[immich-pg-data-migration#Dry run
before real cutover]]: point the ringtail immich at the *test* pg
cluster first, verify the pod boots, the web UI loads (via
`kubectl port-forward`), assets list, ML embeddings query. Then
tear it down.

## Verification

- All three pods Ready.
- ML pod has a GPU attached: `nvidia-smi` inside the container shows
  the 4080.
- `immich-server` connects to pg and valkey (no `ECONNREFUSED` in
  logs).
- A `kubectl port-forward` to the server service shows the Immich
  web UI.
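
The checks above could be scripted roughly as follows. This is a bash sketch under assumptions: the Deployment names `immich-server` and `immich-machine-learning` and the `app=immich` label are carried over from the minikube manifests, all three pods are assumed to carry that label, and `verify_immich` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper. Arg: kubectl context (defaults to k3s-ringtail).
verify_immich() {
  local ctx="${1:-k3s-ringtail}"
  local k=(kubectl --context="$ctx" -n immich)

  # 1. Pods Ready (assumes server, ML and valkey all carry app=immich).
  "${k[@]}" wait --for=condition=Ready pod -l app=immich --timeout=180s || return 1

  # 2. GPU visible inside the ML container; -L lists the detected GPUs.
  "${k[@]}" exec deploy/immich-machine-learning -- nvidia-smi -L || return 1

  # 3. No refused connections to pg or valkey in recent server logs.
  if "${k[@]}" logs deploy/immich-server --tail=500 | grep -q ECONNREFUSED; then
    echo "immich-server logged ECONNREFUSED" >&2
    return 1
  fi
}
```

The web UI check stays manual: `kubectl --context=k3s-ringtail -n immich port-forward svc/immich-server 2283:2283` and then open `http://localhost:2283`, assuming the service keeps the port 2283 it used on minikube.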

## Out of scope

- Public/tailnet routing flip. Caddy still points at the minikube
  Tailscale ingress until [[immich-cutover-and-decommission]].
- Removing the minikube immich. Same.
103
docs/how-to/immich/immich-cutover-and-decommission.md
Normal file
|
|
@ -0,0 +1,103 @@
|
|||
---
|
||||
title: Immich Cutover and Decommission
|
||||
modified: 2026-05-13
|
||||
last-reviewed: 2026-05-13
|
||||
tags:
|
||||
- how-to
|
||||
- operations
|
||||
- immich
|
||||
- migration
|
||||
---
|
||||
|
||||
# Immich Cutover and Decommission
|
||||
|
||||
The user-visible flip. By the time this card opens, the ringtail
|
||||
stack has been proven against a copy of the data. This card does the
|
||||
real cutover.
|
||||
|
||||
## Pre-cutover checklist

- [[immich-pg-data-migration]] dry-run succeeded; method is chosen.
- Ringtail immich stack has been brought up against the test pg,
  pods healthy, UI loaded ([[immich-app-on-ringtail#Verification]]).
- Borgmatic just ran successfully (a fresh nightly archive is a
  belt-and-suspenders fallback, on top of the live source pg).
- User has been told to stop uploading from the iOS app for the
  cutover window.

## Cutover sequence

1. **Quiesce source.** `kubectl --context=minikube-indri -n immich
   scale deploy/immich-server --replicas=0` and same for ML. Leave
   valkey + pg running. Confirm no client traffic on the source pg
   via `pg_stat_activity`.
2. **Tear down the minikube Tailscale ingress.** The `photos`
   Tailscale device name must be freed before ringtail's ingress can
   claim it (Tailscale enforces uniqueness across the tailnet).
   `kubectl --context=minikube-indri -n immich delete ingress
   immich-tailscale` and wait for the corresponding `tailscale`-LB
   StatefulSet pod to terminate. Verify the `photos` device is gone:
   `tailscale status | grep -i photos` from any tailnet host.
3. **Final sync.** Per chosen method in
   [[immich-pg-data-migration]]:
   - Option A: promote the ringtail replica.
   - Option B: take final `pg_dump`, restore to ringtail
     `immich-pg`.
4. **Verify.** Run the row-count and schema-diff checks from
   [[immich-pg-data-migration#Verification on the real run]].
5. **Flip the ringtail ingress to `photos`.** Update
   `argocd/manifests/immich-ringtail/ingress-tailscale.yaml`:
   `tls.hosts: [photos]` (was `[photos-ringtail]` during staging per
   [[immich-app-on-ringtail]]). Commit, `argocd app sync
   immich-ringtail`. Wait for the `photos` device to register on the
   tailnet again.
6. **Bring up ringtail immich** against the now-promoted pg
   (`argocd app sync immich-ringtail`). Wait for Ready.
7. **Flip routing.** Update Caddy on indri
   (`ansible/roles/caddy/defaults/main.yml`): `photos.ops.eblu.me`
   upstream changes to the ringtail Tailscale ingress hostname
   (`photos` — same MagicDNS name, now pointing to the ringtail
   proxy). `mise run provision-indri -- --tags caddy`.
8. **Smoke test.** Open `photos.ops.eblu.me` in a browser. Sign in.
   Scroll the timeline. Open an album. Trigger an ML search.
9. **Update borgmatic.** If the Tailscale hostname for pg changed,
   update `borgmatic.cfg` on indri to point at the ringtail
   `immich-pg-tailscale` service. Run a manual backup to verify.

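
The hostname flip in step 5 can be sketched as follows, assuming the Tailscale Kubernetes operator's `Ingress` support, where the first label of `tls.hosts` becomes the tailnet device name. The backend Service name and port are assumptions, and any ProxyGroup annotation the real manifest carries is omitted.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: immich-tailscale
  namespace: immich
spec:
  ingressClassName: tailscale
  defaultBackend:
    service:
      name: immich-server      # assumption: Service name matches the Deployment
      port:
        number: 2283           # assumption: Immich's default HTTP port
  tls:
    - hosts:
        - photos               # was photos-ringtail during staging
```

Because the device name is derived from this one field, the cutover commit can be a one-line diff, which also makes it easy to revert.
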
## After cutover

- `argocd app set immich --revision <branch>` is no longer relevant;
  the minikube `immich` app gets deleted entirely.
- Delete `argocd/apps/immich.yaml`, `argocd/manifests/immich/`, and
  the minikube `argocd/manifests/databases/immich-pg.yaml` +
  `external-secret-immich-borgmatic.yaml` +
  `service-immich-pg-tailscale.yaml`.
- Rename `immich-ringtail` back to `immich` (the `-ringtail` suffix
  was scaffolding for the dual-cluster window; once minikube is
  empty of immich, the unsuffixed name is clean).
- Confirm the minikube `immich-pg` PVC is no longer used, then
  delete it (the PV with `Retain` policy will persist — clean that
  up too).

## Verification (definition of done)

- `photos.ops.eblu.me` works for a real session, including ML search.
- Source minikube has no `immich` pods, no `immich-pg`, no PVCs.
- Memory pressure on minikube has dropped (≥1.5 GiB reclaimed). Check
  `docker stats minikube` on indri.
- Nightly borgmatic run after the cutover completes successfully,
  with the immich-pg archive showing the new source.

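## Rollback (within the cutover window)

If the smoke test fails: flip Caddy back, scale ringtail immich to 0,
and scale source immich back up. Because step 2 removed the minikube
Tailscale ingress and step 5 moved the `photos` name to ringtail,
also revert the ringtail ingress to `photos-ringtail` and restore the
minikube `immich-tailscale` ingress so that `photos` resolves to
minikube again. Source pg was never destroyed. File a plan reset on
the relevant prerequisite card and try again next session.
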
## Out of scope

- Decommissioning all of minikube. This chain just removes immich.
  Other tenants migrate in their own chains as part of the broader
  indri-k8s decommission. See [[migrate-immich-to-ringtail]] for
  context.

docs/how-to/immich/immich-pg-data-migration.md (normal file, 79 lines)
@@ -0,0 +1,79 @@

---
title: Immich Postgres Data Migration
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
  - how-to
  - operations
  - postgres
  - immich
  - critical
---

# Immich Postgres Data Migration

**This is the data-loss surface of the migration.** Pick a method,
prove it on a throwaway copy first, then run the real cutover.

## Decision: pick one

### Option A — CNPG `externalCluster` bootstrap (preferred)

Stand the ringtail cluster up as a streaming replica of the minikube
cluster via `bootstrap.pg_basebackup.source`, with the source
declared under `externalClusters`. The replica catches up online;
when it is ready, promote it and point Immich at it. This is CNPG's
documented PG-to-PG migration path, and it loses no data as long as
the replica has replayed everything up to the WAL position at app
stop before it is promoted.

Requires: network path from ringtail to minikube's pg over the
tailnet (the existing `immich-pg-tailscale` Service works), and a
superuser secret minikube-side exposed to ringtail's basebackup.

Pitfall to plan around: the ringtail Cluster CR will need its
`bootstrap` block rewritten *after* promotion (CNPG doesn't
gracefully drop the externalCluster reference). Account for this in
[[immich-pg-on-ringtail]]; it may force a reset of that card.

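
A hedged sketch of how Option A might be expressed in the ringtail Cluster CR, using CNPG's `bootstrap.pg_basebackup`, `replica`, and `externalClusters` fields. The external cluster name, the tailnet hostname, the Secret name, `sslmode`, and `instances` are all hypothetical or assumed; the image and storage values are the ones from [[immich-pg-on-ringtail]].

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: immich-pg
  namespace: databases
spec:
  instances: 1                            # assumption
  imageName: ghcr.io/tensorchord/cloudnative-vectorchord:17-0.5.0
  storage:
    size: 10Gi
    storageClass: local-path
  postgresql:
    shared_preload_libraries:
      - "vchord.so"
  bootstrap:
    pg_basebackup:
      source: immich-pg-minikube          # must match an externalClusters entry
  replica:
    enabled: true                         # keep streaming after the base backup
    source: immich-pg-minikube
  externalClusters:
    - name: immich-pg-minikube            # hypothetical name for the source
      connectionParameters:
        host: immich-pg.example.ts.net    # placeholder for the source's tailnet name
        user: postgres                    # the superuser mentioned above
        dbname: postgres
        sslmode: require                  # assumption
      password:
        name: immich-pg-minikube-superuser   # hypothetical Secret on ringtail
        key: password
```

In this shape, promotion is the edit that sets `replica.enabled` to `false`. That edit is also the natural moment to deal with the `bootstrap` rewrite noted in the pitfall.
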
### Option B — pg_dump / pg_restore

Stop immich, `pg_dump -Fc` from minikube, scp to ringtail, restore.
Simpler but full downtime for the whole dump+restore window
(measure on a copy first — VectorChord indexes are slow to rebuild).
Smaller blast radius; no streaming-replication moving parts.

Use this if Option A hits any blocker. Data loss should still be
zero if the source is stopped first.

### Option C — leave pg on minikube

Rejected. See goal card
[[migrate-immich-to-ringtail#Why postgres on ringtail (not cross-cluster)]].

## Dry run before real cutover

Whichever option wins:

1. Snapshot the minikube `immich-pg` PVC or take a fresh `pg_dump`
   into a scratch location.
2. Restore into a *separate* ringtail CNPG cluster (different name,
   e.g. `immich-pg-test`) and point a scratch immich-server pod at
   it.
3. Verify: pod boots, can list assets, ML embeddings query without
   error, face thumbnails render. VectorChord-backed queries should
   not error.
4. Tear the scratch cluster down before doing the real one.

## Verification on the real run

- Row counts match for `assets`, `albums`, `users`, `face`,
  `asset_face`, `smart_search` (the embedding table). Script this,
  and check the table names against the source schema (`\dt`) first,
  since they can differ between Immich releases.
- `pg_dump --schema-only --no-owner` diff between source and dest
  should be empty modulo CNPG-managed roles.
- Immich's version and statistics endpoints
  (`/api/server-info/version` and `/api/server-info/statistics`;
  newer releases may serve them under `/api/server/...`) return sane
  numbers.

## Rollback

If the cutover fails verification: stop the ringtail immich, repoint
ArgoCD `immich.destination` back to minikube, re-sync. Source pg was
never deleted. Document what failed and reset the chain.

docs/how-to/immich/immich-pg-on-ringtail.md (normal file, 69 lines)
@@ -0,0 +1,69 @@

---
title: Immich Postgres Cluster on Ringtail
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
  - how-to
  - operations
  - postgres
  - immich
---

# Immich Postgres Cluster on Ringtail

Stand up a fresh `immich-pg` CNPG Cluster on ringtail, ready to receive
data. **No data import yet** — that's [[immich-pg-data-migration]].

## What to do

- Create `argocd/manifests/databases-ringtail/`. For the namespace,
  verify what other ringtail pg clusters will use; if none yet,
  `databases` is fine.
- Port these from the minikube side:
  - `immich-pg.yaml` — CNPG Cluster CR. Same image
    (`ghcr.io/tensorchord/cloudnative-vectorchord:17-0.5.0`), same
    extensions, same managed `borgmatic` role. Bump `storage.size` if
    the minikube 10 GiB looks tight (check actual usage first).
    `storageClass: local-path` on ringtail (default).
  - `external-secret-immich-borgmatic.yaml` — same 1Password item,
    same field, but referencing the ringtail `ClusterSecretStore`
    (`onepassword-blumeops` already exists per the
    `external-secrets-ringtail` app).
  - Service for in-cluster access (the operator creates `immich-pg-rw`
    etc. automatically; verify the app deployment uses those names).
  - A Tailscale Service if we want backups to keep working via the
    same hostname during the transition (see "Borgmatic implications"
    below).
- New ArgoCD app `argocd/apps/databases-ringtail.yaml` pointing at
  the new path, destination ringtail.

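
As a rough illustration of the ported `immich-pg.yaml`, the sketch below uses CNPG's `initdb` bootstrap and `managed.roles`. The image, storage class, size, role name, and `pg_read_all_data` membership come from this card; `instances`, the `owner`, the exact `postInitSQL` statements, and the password Secret name are assumptions, and the real minikube manifest remains the source of truth.

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: immich-pg
  namespace: databases
spec:
  instances: 1                      # assumption
  imageName: ghcr.io/tensorchord/cloudnative-vectorchord:17-0.5.0
  storage:
    size: 10Gi                      # matches minikube; bump if usage is tight
    storageClass: local-path
  postgresql:
    shared_preload_libraries:
      - "vchord.so"
  bootstrap:
    initdb:
      database: immich
      owner: immich                 # assumption
      postInitSQL:                  # runs against the postgres database
        - CREATE EXTENSION IF NOT EXISTS vchord CASCADE;
        - CREATE EXTENSION IF NOT EXISTS cube;
        - CREATE EXTENSION IF NOT EXISTS earthdistance CASCADE;
  managed:
    roles:
      - name: borgmatic
        ensure: present
        login: true
        inRoles:
          - pg_read_all_data
        passwordSecret:
          name: immich-pg-borgmatic   # assumption: the Secret rendered by the ExternalSecret
```
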
## Verification

- Cluster reaches `Ready`.
- `borgmatic` role exists, `rolcanlogin=t`, and is a member of
  `pg_read_all_data` (via `managed.roles[].inRoles`).
- ExternalSecret `immich-pg-borgmatic` syncs from 1Password
  (`Ready: True`) and the rendered Secret has `username=borgmatic`.
- The `vchord`, `vector`, `cube`, `earthdistance` extensions show
  installed in the `postgres` database (`\dx` from
  `psql -U postgres`). They are NOT installed in the `immich`
  database at this point — `postInitSQL` in CNPG's `initdb` block
  runs against the `postgres` superuser database. The Immich app
  itself creates the extensions in its own `immich` database at
  startup; do not be alarmed by their absence pre-immich-deploy.
  The `vchord.so` library is preloaded via
  `shared_preload_libraries` regardless, so `CREATE EXTENSION` at
  app startup just registers it in the right database.

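
The ExternalSecret checked above could be shaped roughly like this, assuming the External Secrets Operator's `ExternalSecret` resource with a templated target. The 1Password item and field names are hypothetical, and the `apiVersion` depends on the operator release installed on ringtail.

```yaml
apiVersion: external-secrets.io/v1beta1   # assumption; newer releases also serve v1
kind: ExternalSecret
metadata:
  name: immich-pg-borgmatic
  namespace: databases
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: onepassword-blumeops
  target:
    name: immich-pg-borgmatic
    template:
      type: kubernetes.io/basic-auth      # the shape CNPG expects for a role password
      data:
        username: borgmatic
        password: "{{ .password }}"
  data:
    - secretKey: password
      remoteRef:
        key: immich-pg-borgmatic          # hypothetical 1Password item name
        property: password                # hypothetical field name
```
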
## Borgmatic implications

`borgmatic.cfg` on indri targets `immich-pg-tailscale` over the
tailnet. During migration both clusters will exist briefly. Decide
upfront: back up the *source* pg until cutover, then flip borgmatic
to the ringtail Tailscale service. Document the flip in
[[immich-cutover-and-decommission]].

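
A sketch of the ringtail-side Tailscale Service, assuming the Tailscale Kubernetes operator's `LoadBalancer` class and CNPG's standard pod labels. The `tailscale.com/hostname` value is hypothetical; it has to differ from the minikube device name while both clusters exist.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: immich-pg-tailscale
  namespace: databases
  annotations:
    tailscale.com/hostname: immich-pg-ringtail   # hypothetical device name
spec:
  type: LoadBalancer
  loadBalancerClass: tailscale
  selector:
    cnpg.io/cluster: immich-pg
    cnpg.io/instanceRole: primary    # route only to the current primary
  ports:
    - name: postgres
      port: 5432
      targetPort: 5432
```

With a distinct hostname during the overlap, the borgmatic flip is a hostname change on indri rather than a race for the same tailnet name.
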
## Out of scope

- Importing data. That is [[immich-pg-data-migration]], which may
  drive a reset on this card if the migration approach (e.g. CNPG
  `externalCluster` bootstrap) requires changes to this Cluster CR.

docs/how-to/immich/migrate-immich-to-ringtail.md (normal file, 132 lines)
@@ -0,0 +1,132 @@

---
title: Migrate Immich to Ringtail
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
  - how-to
  - operations
  - immich
  - migration
---

# Migrate Immich to Ringtail

Move the entire Immich stack (server, ML, valkey, postgres) off
`minikube-indri` and onto `k3s-ringtail`. This is the first concrete
chain in the broader indri-k8s decommission: minikube is
memory-saturated (97% RAM, swapping), and Immich is the single
largest tenant (~1.5 GiB resident).

## End state

- Immich `server`, `machine-learning`, and `valkey` Deployments run on
  ringtail k3s in the `immich` namespace.
- The `immich-machine-learning` pod uses ringtail's RTX 4080 via the
  `nvidia-device-plugin` (performance win — currently CPU-only on
  minikube).
- A CNPG `immich-pg` Cluster (PostgreSQL 17 + VectorChord) runs in a
  `databases` namespace on ringtail, managed by the CNPG operator
  (`cnpg-system`) on ringtail.
- The photo library still lives on [[sifaka]] at `/volume1/photos`,
  mounted via NFS from ringtail pods (RWX).
- Routing: `photos.ops.eblu.me` (Caddy on indri) proxies to a
  Tailscale ProxyGroup ingress on ringtail. No public surface today.
- The ArgoCD `immich` app's `destination.server` points at
  `https://ringtail.tail8d86e.ts.net:6443`. The old minikube
  manifests are removed.

## Non-goals

- Public exposure via Fly. Immich stays tailnet-only.
- Changing the immich version or runtime configuration. This is a
  lift-and-shift; bumps come later.
- Backing up to a different target. [[borgmatic]] keeps running on
  indri (it pulls via Tailscale and uses sifaka SMB for the library).

## Critical constraint: no data loss

Downtime is acceptable (Immich is a single-user system; we can take
it offline for the cutover). **Data loss is not.** Two surfaces matter:

1. **Postgres** — face data, ML embeddings (vectors), album state,
   sharing, etc. Re-derivable in theory; weeks of recompute in
   practice. See [[immich-pg-data-migration]].
2. **Library files** — `/volume1/photos`. Not moving, but the NFS
   path must be verified accessible from ringtail before cutover.
   See [[sifaka-nfs-from-ringtail]].

[[borgmatic]] backs both up to sifaka + BorgBase nightly; restore is
possible but slow. Treat it as a fallback, not a plan.

## Why postgres on ringtail (not cross-cluster)

`immich-pg` already has a Tailscale Service we could point ringtail
at, leaving the DB on minikube. We're not doing that because:

- The whole goal is to retire minikube — keeping pg there blocks it.
- Immich is chatty against pg; tailnet round-trips would hurt.
- CNPG is the same operator on both sides — a Cluster CR on ringtail
  is mechanically equivalent.

## Approach

This is a C2 Mikado chain. The prerequisite cards each represent a
distinct surface that has to work before cutover. See
[[agent-change-process#C2 — Mikado Chain]] for the discipline.

## Workflow note: registering new ArgoCD apps during the chain

This chain adds three new ArgoCD `Application` definitions in
`argocd/apps/`: `cloudnative-pg-ringtail`, `databases-ringtail`,
and (later) `immich-ringtail`. The usual C1/C2 pattern of
`argocd app set <app> --revision <branch> && argocd app sync <app>`
does NOT work for the app-of-apps `apps` Application itself, because
`apps` self-manages: it re-reads `apps.yaml` (which declares
`targetRevision: main`) on every sync and reverts the override. As a
result, new app definitions added on a feature branch are never
visible to the cluster via `apps`.

**Use `kubectl apply` to register each new Application directly:**

```fish
kubectl --context=minikube-indri apply -f argocd/apps/<new-app>.yaml
```

This creates the Application resource out-of-band, bypassing `apps`.

For apps whose source lives in **this** repo (e.g.
`databases-ringtail`, `immich-ringtail` — manifest paths exist only
on the branch until merge), follow the apply with a branch override:

```fish
argocd app set <new-app> --revision mikado/migrate-immich-to-ringtail
argocd app sync <new-app>
```

For apps whose source is an **external** repo at a pinned tag (e.g.
`cloudnative-pg-ringtail` → `mirrors/cloudnative-pg` `v1.27.1`), no
override is needed — the source revision is independent of this PR.

After PR merge:

```fish
argocd app set <new-app> --revision main
argocd app sync <new-app>
```

`apps` itself, on its next sync from `main`, will discover the new
Application definitions in `argocd/apps/` and adopt the already-running
resources without disruption — provided their in-cluster spec matches
the on-disk definitions (which it does because we applied the same
file).

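
For orientation, the self-managing behavior described in this note comes from an app-of-apps shaped roughly like the sketch below. This is an illustration rather than the repo's actual `apps.yaml`: the `repoURL` is a placeholder, and `project` and the destination are assumptions. The point is that `targetRevision: main` is part of the definition that `apps` re-applies to itself, so a `--revision` override does not survive a sync.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: apps
  namespace: argocd
spec:
  project: default                              # assumption
  source:
    repoURL: https://example.invalid/REPO.git   # placeholder, not the real repo URL
    targetRevision: main        # re-applied on every self-sync
    path: argocd/apps           # contains apps.yaml plus one file per child app
  destination:
    server: https://kubernetes.default.svc      # assumption: in-cluster ArgoCD
    namespace: argocd
```
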
## Related

- [[shower-on-ringtail]] — a previous migration to ringtail (simpler:
  no upstream cluster, SQLite, no GPU)
- [[connect-to-postgres]] — getting a psql session against CNPG
- [[ringtail]] — the target cluster
- [[cnpg-on-ringtail]], [[immich-pg-on-ringtail]],
  [[immich-pg-data-migration]], [[sifaka-nfs-from-ringtail]],
  [[immich-app-on-ringtail]], [[immich-cutover-and-decommission]] —
  the prerequisite cards

docs/how-to/immich/sifaka-nfs-from-ringtail.md (normal file, 67 lines)
@@ -0,0 +1,67 @@

---
title: Sifaka NFS Photos from Ringtail
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
  - how-to
  - operations
  - storage
  - nfs
  - sifaka
---

# Sifaka NFS Photos from Ringtail

The Immich library lives at `sifaka:/volume1/photos` and is mounted
into the pod via an NFS PV (see `argocd/manifests/immich/pv-nfs.yaml`).
That PV is currently scoped to indri. We need ringtail to mount the
same path with the same RWX semantics, without breaking the existing
indri mount during the transition.

## What to verify / do

- Check `sifaka` DSM NFS rules for the `photos` share. Per
  [[shower-on-ringtail#NFS + SMB share on sifaka]] convention, rules
  use `192.168.1.0/24` + `100.64.0.0/10` with
  `all_squash`/`Map all users to admin`. The existing rule may
  already cover ringtail (it's on `192.168.1.21` per the recent
  static-IP pin). If so this card is a verification card.
- If the rule is locked to indri's IP: add an entry for ringtail
  (192.168.1.21) or widen to the subnet pattern above.
- Test mount from a ringtail debug pod (busybox or alpine with
  nfs-utils) against the `photos` share. Read a file. Write a temp
  file. Delete it.
- Watch for the known sifaka NFS-over-Tailscale gotcha: sifaka's
  Tailscale must be in TUN mode (not userspace) for NFS to work
  reliably over the tailnet. The NFS path here goes over the LAN
  (not tailnet), so this shouldn't bite, but worth confirming the
  NFS traffic is on `192.168.1.x` not `100.x`.

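
The test mount in the list above can be done without a PV at all. A minimal sketch, assuming an inline `nfs` volume on a throwaway pod; the pod name, namespace, and image tag are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nfs-photos-test          # hypothetical name
  namespace: default             # assumption
spec:
  restartPolicy: Never
  containers:
    - name: shell
      image: busybox:1.36
      command: ["sleep", "3600"]
      volumeMounts:
        - name: photos
          mountPath: /mnt/photos
  volumes:
    - name: photos
      nfs:
        server: sifaka           # resolved by the ringtail node, not inside the pod
        path: /volume1/photos
```

The kubelet on the node performs the NFS mount, so the container image does not need NFS tooling, but the node does need an NFS client and must be able to resolve `sifaka`. Once the pod is Running, exec into it to read a file, write a temp file, and delete it.
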
## PV + PVC on ringtail

- New `pv-nfs.yaml` mirroring the minikube one. PVs are
  cluster-scoped and each cluster has its own, so the same name can
  be reused; just duplicate the manifest. Same `server: sifaka`, same
  path, same `accessModes: [ReadWriteMany]`,
  `persistentVolumeReclaimPolicy: Retain`.
- New `pvc.yaml` in the ringtail `immich` namespace bound to it.
- The minikube PVC stays bound and active until cutover. Both
  clusters can have the share NFS-mounted simultaneously (NFS RWX
  permits this). Immich itself must not be running on both sides
  at once.

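
A sketch of the ringtail PV and PVC pair under the settings listed above. The resource names and the capacity figure are hypothetical (NFS does not enforce the capacity); the server, path, access mode, and reclaim policy are the ones this card specifies.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: immich-photos-nfs        # hypothetical name
spec:
  capacity:
    storage: 1Ti                 # placeholder; not enforced for NFS
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""           # opt out of dynamic provisioning
  nfs:
    server: sifaka
    path: /volume1/photos
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: immich-photos            # hypothetical name
  namespace: immich
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""           # must match the PV so local-path does not claim it
  volumeName: immich-photos-nfs  # bind explicitly to the PV above
  resources:
    requests:
      storage: 1Ti
```

Setting `storageClassName: ""` on both objects matters on ringtail because `local-path` is the default class there; without it the PVC would be dynamically provisioned instead of binding to the NFS PV.
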
## Verification

- A pod on ringtail can `ls /mnt/photos/` and see the same files
  as the indri pod.
- File written from ringtail pod is visible from indri pod and
  vice versa (proves there's no caching surprise).

## Out of scope

- Migrating photo files. Nothing moves; this is just adding a second
  NFS client.
- The `pvc-ml-cache.yaml` PVC (a separate ML model cache). That's
  not on NFS — it's a regular PVC. Recreated empty on ringtail in
  [[immich-app-on-ringtail]]; the first ML pod boot will repopulate
  it.