Compare commits

20 commits

Author SHA1 Message Date
b21d13fe20 C2(migrate-immich-to-ringtail): finalize chain — strip mikado frontmatter, add changelog
Immich is fully migrated off minikube-indri onto k3s-ringtail. All
six prerequisite cards plus the goal card converted to historical
documentation by removing status/branch/requires Mikado frontmatter.

Changelog fragment added at docs/changelog.d/migrate-immich-to-ringtail.infra.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:46:27 -07:00
7400807be3 C2(migrate-immich-to-ringtail): close immich-cutover-and-decommission
Sequence executed:
1. Quiesced source: immich-server + immich-machine-learning on
   minikube scaled to 0 (done in immich-pg-data-migration).
2. Deleted minikube immich-tailscale Ingress; waited for "photos"
   Tailscale device to deregister.
3. (Promote of ringtail pg was done in immich-pg-data-migration.)
4. Renamed ringtail ingress tls.hosts photos-ringtail -> photos.
5. Caddy was already pointing photos.ops.eblu.me ->
   photos.tail8d86e.ts.net so no Ansible change needed.
6. Smoke test: photos.ops.eblu.me/api/server/ping -> 200,
   /api/server/version -> {"major":2,"minor":6,"patch":3}.
7. Borgmatic continuity: added a ringtail immich-pg-tailscale
   Service (same FQDN as before, immich-pg.tail8d86e.ts.net).
   Verified borgmatic role can SELECT count(*) FROM asset over the
   tailnet (returned 12681, matches source).

Decommission:
- Deleted argocd Application "immich" with --cascade (clears
  Deployments, Services, etc. on minikube).
- Pruned blumeops-pg Application against the branch which removed
  the Cluster immich-pg, its ExternalSecret, and the old
  immich-pg-tailscale Service from minikube.
- Deleted leftover Released PVs on minikube.
- Deleted the empty immich namespace on minikube.

Did not verify minikube host memory drop directly (tailscale-ssh
re-auth was prompting at the time). Caller should confirm via
"docker stats minikube" once SSH is re-authenticated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:42:31 -07:00
7573a72318 C2(migrate-immich-to-ringtail): impl decommission minikube immich; add ringtail immich-pg tailscale service
GitOps decommission of immich + immich-pg on minikube:
- Delete argocd/apps/immich.yaml
- Delete argocd/manifests/immich/ entirely
- Delete argocd/manifests/databases/{immich-pg,external-secret-immich-borgmatic,service-immich-pg-tailscale}.yaml
- Remove those entries from databases/kustomization.yaml

Add ringtail-side immich-pg Tailscale LoadBalancer Service (hostname
"immich-pg") so borgmatic can keep using the same FQDN for nightly
backups. This claims the device name freed by deleting the minikube
service.
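
A minimal sketch of what such a Service could look like, assuming the
Tailscale Kubernetes operator's `loadBalancerClass: tailscale` and its
`tailscale.com/hostname` annotation. The selector labels and port name
are assumptions for illustration, not copied from the committed
manifest:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: immich-pg-tailscale
  namespace: databases
  annotations:
    # Device name claimed on the tailnet; yields immich-pg.tail8d86e.ts.net
    tailscale.com/hostname: "immich-pg"
spec:
  type: LoadBalancer
  loadBalancerClass: tailscale
  selector:
    # Assumption: CNPG's standard labels for the primary instance
    cnpg.io/cluster: immich-pg
    cnpg.io/instanceRole: primary
  ports:
    - name: postgres   # hypothetical port name
      port: 5432
      targetPort: 5432
      protocol: TCP
```

Because the hostname is tied to the Tailscale device rather than to a
cluster, borgmatic's connection settings should not need to change as
long as the old device has deregistered first.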

The ringtail manifest path stays as argocd/manifests/immich-ringtail/
and the ArgoCD app stays as immich-ringtail — renaming would force a
cascading delete + recreate, with a window where live resources
disappear.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:31:09 -07:00
aad76fc3e0 C2(migrate-immich-to-ringtail): impl rename ringtail immich ingress photos-ringtail -> photos
Minikube immich-tailscale Ingress was deleted; the "photos" Tailscale
device name is now free. Renaming the ringtail ingress claims it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:27:04 -07:00
18e6c7ef5d C2(migrate-immich-to-ringtail): close immich-app-on-ringtail
All three pods Running, 1/1 Ready:
- immich-server: v2.6.3, connected to ringtail pg + valkey
  ("/api/server/ping" returns 200, "/api/server/version" returns
  v2.6.3)
- immich-machine-learning: CUDA variant, RTX 4080 attached
  (nvidia-smi shows 8 GiB used / 16 GiB total — shared with
  frigate via time-slicing), gunicorn workers booted
- immich-valkey: upstream multi-arch docker.io/valkey/valkey:8.1.6

immich-db Secret in the immich namespace created manually with
source's immich-pg-app password (matches minikube pattern).
Tailscale ingress staging hostname: photos-ringtail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:23:24 -07:00
5a9596c7d9 C2(migrate-immich-to-ringtail): impl add immich Deployments + bump GPU time-slicing
- argocd/manifests/immich-ringtail/: full port of the immich stack
  (server, ML, valkey, services, ingress, pvc-ml-cache) from
  argocd/manifests/immich/, with ringtail-specific tweaks:
  - deployment-ml: runtimeClassName=nvidia, nvidia.com/gpu:1 limit,
    -cuda image tag
  - deployment-valkey + kustomization: drop the
    registry.ops.eblu.me/blumeops/valkey mirror (arm64-only), use
    upstream docker.io/valkey/valkey:8.1.6 (multi-arch)
  - ingress-tailscale: tls.hosts=[photos-ringtail] for staging
- argocd/apps/immich-ringtail.yaml: new ArgoCD app (manual sync,
  ringtail destination)
- argocd/manifests/nvidia-device-plugin/time-slicing-config.yaml:
  bump replicas 2 -> 4 so the ringtail GPU can be shared by
  frigate + ollama + immich-ml

The immich-db Secret in the immich namespace is created manually
(matching minikube pattern) — see argocd/apps/immich-ringtail.yaml
header for the procedure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:14:07 -07:00
674ca2ced9 C2(migrate-immich-to-ringtail): close immich-pg-data-migration
Migration via CNPG pg_basebackup (Option A) completed cleanly.

Sequence:
1. Stopped immich-server + immich-machine-learning on minikube
   (scaled to 0). valkey + source pg kept running.
2. Copied minikube's immich-pg-ca + immich-pg-replication secrets
   to ringtail as source-immich-pg-{ca,replication}.
3. Recreated the ringtail immich-pg Cluster with
   bootstrap.pg_basebackup, replica.enabled=true, externalClusters
   pointing at immich-pg.tail8d86e.ts.net via the streaming_replica
   TLS cert.
4. Basebackup completed in ~50s. Replica caught up streaming.
5. Verified row counts identical between source and replica:
   asset=12681, user=1, album=28, smart_search=9624,
   activity=0, asset_face=3917.
6. Promoted via replica.enabled=false. pg_is_in_recovery → false.
   Write test passed. All 7 expected extensions present in immich
   db (vector, vchord, cube, earthdistance, pg_trgm, unaccent,
   uuid-ossp).
7. Pruned bootstrap + externalClusters blocks; deleted out-of-band
   replication secrets.

Source minikube immich-pg is intact and untouched — recovery path
remains available until immich-cutover-and-decommission completes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:12:21 -07:00
e59bbc9348 C2(migrate-immich-to-ringtail): impl prune externalClusters + bootstrap from immich-pg manifest
Migration done, cluster promoted. Pruning the externalClusters block
and bootstrap.pg_basebackup reference eliminates the footgun where a
future replica.enabled=true would demote this primary against the
stale minikube source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:12:21 -07:00
431d538ab1 C2(migrate-immich-to-ringtail): impl promote ringtail immich-pg from replica to primary
Row counts verified equal between source (minikube) and replica
(ringtail) across asset (12681), user (1), album (28),
smart_search (9624), activity (0), asset_face (3917). Source immich
is scaled to 0 — no writes since the basebackup completed.

Flipping replica.enabled=false to promote. The externalClusters and
bootstrap.pg_basebackup blocks are left in place as documentation
(CNPG ignores them after initialization).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:12:21 -07:00
5752f00343 C2(migrate-immich-to-ringtail): impl bootstrap immich-pg via pg_basebackup from minikube
Replaces the initdb bootstrap with a pg_basebackup from the minikube
source over the tailnet (immich-pg.tail8d86e.ts.net). The ringtail
cluster starts in replica mode (replica.enabled=true), streaming WAL
from the source. Promotion happens by flipping replica.enabled=false
after the replica catches up and the source is quiesced.

Uses the source's streaming_replica TLS cert + CA, copied to ringtail
as out-of-band secrets (source-immich-pg-replication,
source-immich-pg-ca) — the standard CNPG-to-CNPG migration auth path.
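
A minimal sketch of the relevant Cluster fields, assuming the CNPG v1
API. The external cluster name `source-immich-pg`, the `sslmode`
value, and the secret key names (`tls.crt`, `tls.key`, `ca.crt`) are
assumptions for illustration; the committed manifest may differ:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: immich-pg
  namespace: databases
spec:
  instances: 1
  imageName: ghcr.io/tensorchord/cloudnative-vectorchord:17-0.5.0
  storage:
    size: 10Gi
    storageClass: local-path
  bootstrap:
    pg_basebackup:
      source: source-immich-pg   # hypothetical externalClusters entry name
  replica:
    enabled: true                # flip to false to promote
    source: source-immich-pg
  externalClusters:
    - name: source-immich-pg
      connectionParameters:
        host: immich-pg.tail8d86e.ts.net
        user: streaming_replica
        sslmode: verify-ca       # assumption; depends on the names in the server cert
      sslCert:
        name: source-immich-pg-replication
        key: tls.crt
      sslKey:
        name: source-immich-pg-replication
        key: tls.key
      sslRootCert:
        name: source-immich-pg-ca
        key: ca.crt
```

With a spec of this shape, promotion is a one-field change
(`replica.enabled: false`), which is why the source must be quiesced
before the flip.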

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:12:21 -07:00
be5255d685 C2(migrate-immich-to-ringtail): close sifaka-nfs-from-ringtail
Verified on k3s-ringtail:
- Sifaka NFS export /volume1/photos covers 192.168.1.0/24 +
  100.64.0.0/10. Ringtail at 192.168.1.21 is in scope; no DSM rule
  changes needed.
- nfs-test pod mounted the share, read existing library/ thumbs/
  backups/ encoded-video/ profile/, wrote a temp file, deleted it.
- DNS resolution: sifaka → 192.168.1.203 (LAN). NFS traffic stays
  off tailnet, avoiding the sifaka-tailscale-userspace concern.
- Committed PV + PVC bind on first apply (RWX, 2Ti).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:12:21 -07:00
9f8d627ce8 C2(migrate-immich-to-ringtail): impl add ringtail-side NFS PV/PVC for immich library
Mirrors argocd/manifests/immich/pv-nfs.yaml + pvc.yaml. PV renamed
to immich-library-nfs-pv-ringtail to avoid confusion with the
minikube side (PVs are cluster-scoped; both can coexist).

Initial kustomization.yaml in argocd/manifests/immich-ringtail/
holds just the storage bits today; deployments/services/ingress
will be added in immich-app-on-ringtail.

Verified: PVC binds to PV on k3s-ringtail; mount test from a
busybox pod read existing photo library dirs, wrote and deleted a
test file. DNS resolves sifaka to 192.168.1.203 so NFS traffic
stays on the LAN, off the tailnet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:12:21 -07:00
4c6695868d C2(migrate-immich-to-ringtail): close immich-pg-on-ringtail
Verified on k3s-ringtail:
- Cluster immich-pg reached "Cluster in healthy state" (1/1 instance)
- borgmatic role: rolcanlogin=t, member of pg_read_all_data
- ExternalSecret immich-pg-borgmatic: Ready=True, username=borgmatic
- Extensions vchord, vector, cube, earthdistance installed in postgres db
  (immich db extensions deferred to app startup per the card)

10 GiB local-path storage; same VectorChord image as minikube source.
Bootstrap is empty initdb today; will be rewritten when
immich-pg-data-migration picks its import method.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:12:21 -07:00
1d9d8867fb C2(migrate-immich-to-ringtail): impl add immich-pg cluster + app on ringtail
Mirror of argocd/manifests/databases/immich-pg.yaml on ringtail:
- Same VectorChord image (PG17 + VectorChord 0.5.0)
- Same extensions (vector, vchord, cube, earthdistance) via postInitSQL
- Same managed borgmatic role with pg_read_all_data
- 10 GiB local-path storage (matches minikube source)
- shared_preload_libraries: vchord.so
- Empty initdb today; bootstrap block will be rewritten when
  immich-pg-data-migration picks its import method.

ArgoCD app databases-ringtail targets ringtail/databases.
ExternalSecret reuses the onepassword-blumeops ClusterSecretStore that
already exists on ringtail via external-secrets-ringtail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:12:21 -07:00
e1fe5d2ea6 C2(migrate-immich-to-ringtail): close cnpg-on-ringtail
Verified: cnpg-controller-manager pod Ready on k3s-ringtail; CRDs
clusters.postgresql.cnpg.io etc. installed; ArgoCD app Synced/Healthy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:12:21 -07:00
b37ac0750f C2(migrate-immich-to-ringtail): impl add cloudnative-pg-ringtail ArgoCD app
Sibling of cloudnative-pg.yaml targeting k3s-ringtail. Same mirror
(mirrors/cloudnative-pg) and release (v1.27.1), same sync options.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:12:21 -07:00
bca5c40663 C2(migrate-immich-to-ringtail): plan capture GPU contention + valkey arch on immich-app-on-ringtail
Two discovered prereqs while bringing the immich stack up on ringtail:

1. nvidia-device-plugin time-slicing on ringtail advertises only 2
   virtual GPUs. Frigate + Ollama consume both. immich-ml's
   nvidia.com/gpu:1 cannot schedule until replicas is bumped to >= 3.
2. The registry.ops.eblu.me/blumeops/valkey image was built on indri
   (arm64) and is single-arch. Pulling on ringtail (amd64)
   crashloops with "exec format error". Use the upstream multi-arch
   docker.io/valkey/valkey image directly until the mirror gets a
   multi-arch tag.

Card body updated to capture both. Next impl incorporates the fixes.
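
A minimal sketch of the time-slicing change, assuming the
nvidia-device-plugin's `version: v1` config format. The ConfigMap
name, namespace, and data key are hypothetical; any `replicas` value
of 3 or more would unblock scheduling, and 4 leaves one slot spare:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config       # hypothetical name
  namespace: nvidia-device-plugin # hypothetical namespace
data:
  config.yaml: |                  # hypothetical key
    version: v1
    sharing:
      timeSlicing:
        resources:
          - name: nvidia.com/gpu
            # Number of virtual GPUs advertised per physical GPU.
            # Frigate + Ollama + immich-ml need at least 3.
            replicas: 4
```

Time-slicing only multiplies the schedulable `nvidia.com/gpu` count;
it does not partition GPU memory, so the pods still share the card's
VRAM.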

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:12:09 -07:00
355be3fbc4 C2(migrate-immich-to-ringtail): plan correct extension-verification on immich-pg-on-ringtail card
CNPG's bootstrap.initdb.postInitSQL runs against the postgres
superuser database, not the application database. Extensions
declared there end up in the postgres db, not the immich db. The
Immich app installs them in its own database at startup.

This matches the existing minikube cluster's behavior — same
Cluster CR, same effect. Adjusting the card's verification to
reflect reality rather than (incorrectly) requiring extensions to
be present in the immich db pre-app-deploy.
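
For contrast, a minimal sketch of how CNPG distinguishes the two
targets, assuming the v1 `bootstrap.initdb` fields. This is not what
the chain does (the Immich app installs its own extensions at
startup); the cluster name `example-pg` and the storage size are
hypothetical:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: example-pg   # hypothetical
spec:
  instances: 1
  storage:
    size: 1Gi        # hypothetical
  bootstrap:
    initdb:
      database: immich
      owner: immich
      # Runs as superuser against the `postgres` database.
      postInitSQL:
        - CREATE EXTENSION IF NOT EXISTS vector;
      # Runs as superuser against the application database (`immich`).
      postInitApplicationSQL:
        - CREATE EXTENSION IF NOT EXISTS vector;
```

Either way, `initdb` hooks run only when the cluster is first
created, so they cannot be used to fix up an existing database.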

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:25:30 -07:00
db37e7cc3e C2(migrate-immich-to-ringtail): plan capture two discovered concerns
1. Registering new ArgoCD apps from a feature branch: the app-of-apps
   "apps" Application is self-managing (re-reads apps.yaml on every
   sync, which pins targetRevision: main). So setting its revision to
   a branch doesn't stick across syncs, and new app definitions on a
   branch are invisible to the cluster via the normal flow. The goal
   card now documents the kubectl-apply + per-new-app `argocd app set
   --revision <branch>` workaround.

2. Tailscale device-name collision on cutover. The minikube immich
   ingress claims tailnet hostname "photos" (tls.hosts: [photos]).
   The ringtail ingress can't claim the same name while minikube's is
   alive (Tailscale enforces uniqueness). Staging uses
   tls.hosts: [photos-ringtail], with the rename to "photos" baked
   into immich-cutover-and-decommission step 2 + step 5.

Card dependency graph unchanged; no new cards.
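
A minimal sketch of the staging ingress described in item 2, assuming
the Tailscale operator's `tailscale` ingress class. The Ingress name
and the backend service and port are taken from the minikube manifests
and may not match the ringtail copy exactly:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: immich-tailscale
  namespace: immich
spec:
  ingressClassName: tailscale
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: immich-server
                port:
                  number: 2283
  tls:
    - hosts:
        # Staging device name; renamed to "photos" once the minikube
        # ingress is deleted and the old device has deregistered.
        - photos-ringtail
```

The first entry in `tls.hosts` is what the operator uses as the
tailnet device name, so the cutover rename is a one-line change.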

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:21:57 -07:00
4623733695 C2(migrate-immich-to-ringtail): plan introduce mikado chain
Goal: move immich (server, ML, valkey, postgres) off minikube-indri
onto k3s-ringtail. Immich is the largest single tenant on minikube
(~1.5 GiB resident) and minikube is memory-saturated.

Prerequisite cards:
- cnpg-on-ringtail
- immich-pg-on-ringtail (requires cnpg-on-ringtail)
- immich-pg-data-migration (requires immich-pg-on-ringtail)
- sifaka-nfs-from-ringtail
- immich-app-on-ringtail (requires immich-pg-on-ringtail, sifaka-nfs-from-ringtail)
- immich-cutover-and-decommission (requires immich-pg-data-migration, immich-app-on-ringtail)

Data loss is a critical failure; downtime is acceptable. The cutover
plan favors a CNPG externalCluster basebackup (Option A) with pg_dump
as the documented fallback (Option B).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 11:05:40 -07:00
32 changed files with 820 additions and 265 deletions

@ -0,0 +1,27 @@
# CloudNativePG Operator for ringtail k3s cluster
# Deploys the operator only; PostgreSQL clusters are created separately
#
# Sibling of cloudnative-pg.yaml (minikube). Same mirror, same release,
# different destination. Both apps will coexist during the immich
# migration; the minikube one is removed at the end of the broader
# indri-k8s decommission.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cloudnative-pg-ringtail
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ssh://forgejo@forge.ops.eblu.me:2222/mirrors/cloudnative-pg.git
    targetRevision: v1.27.1
    path: releases
    directory:
      include: 'cnpg-1.27.1.yaml'
  destination:
    server: https://ringtail.tail8d86e.ts.net:6443
    namespace: cnpg-system
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true  # Required for large CRDs that exceed annotation size limit

@ -0,0 +1,26 @@
# Databases on ringtail k3s.
#
# Today: only immich-pg (CNPG Cluster) + its borgmatic ExternalSecret.
# More databases may move here as the indri-k8s decommission proceeds.
#
# Prerequisites:
# - cloudnative-pg-ringtail (operator must exist before the Cluster CR)
# - external-secrets-ringtail + 1password-connect-ringtail (for the
# immich-pg-borgmatic ExternalSecret to sync)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: databases-ringtail
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/databases-ringtail
  destination:
    server: https://ringtail.tail8d86e.ts.net:6443
    namespace: databases
  syncPolicy:
    syncOptions:
      - CreateNamespace=true

@ -0,0 +1,31 @@
# Immich on ringtail k3s.
#
# Staging deployment; the minikube `immich` app remains in parallel
# until cutover. See [[immich-cutover-and-decommission]] for the
# routing flip + minikube cleanup.
#
# Prerequisites:
# - cnpg-on-ringtail + databases-ringtail (postgres)
# - 1password-connect-ringtail + external-secrets-ringtail (not used
# by this app today — immich-db Secret is created manually,
# matching the minikube pattern)
# - The immich-db Secret in the immich namespace, holding the
# password for the `immich` postgres role (copied from the source
# immich-pg-app Secret at migration time).
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: immich-ringtail
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/immich-ringtail
  destination:
    server: https://ringtail.tail8d86e.ts.net:6443
    namespace: immich
  syncPolicy:
    syncOptions:
      - CreateNamespace=true

@ -1,30 +0,0 @@
# Immich - Self-hosted photo and video management
# High-performance Google Photos/iCloud alternative with AI features
#
# Kustomize manifests in argocd/manifests/immich/
# Components: server, machine-learning, valkey (Redis)
#
# Prerequisites:
# 1. Create immich namespace and secrets:
# kubectl create namespace immich
# kubectl --context=minikube-indri create secret generic immich-db -n immich \
# --from-literal=password="$(kubectl --context=minikube-indri -n databases get secret immich-pg-app -o jsonpath='{.data.password}' | base64 -d)"
# 2. Create immich-pg database and user (see immich-pg app)
# 3. NFS share on sifaka at /volume1/photos with read/write for indri
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: immich
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/immich
  destination:
    server: https://kubernetes.default.svc
    namespace: immich
  syncPolicy:
    syncOptions:
      - CreateNamespace=true

@ -1,9 +1,12 @@
# ExternalSecret for borgmatic backup user password on immich-pg cluster
# (ringtail k3s).
#
# Mirror of argocd/manifests/databases/external-secret-immich-borgmatic.yaml.
# The onepassword-blumeops ClusterSecretStore exists on ringtail via the
# external-secrets-ringtail app.
#
# Reuses the same 1Password item as blumeops-pg-borgmatic.
# 1Password item: "borgmatic" in blumeops vault
# Field: "db-password"
#
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
@ -23,7 +26,7 @@ spec:
username: borgmatic
password: "{{ .password }}"
data:
- secretKey: password
remoteRef:
key: borgmatic
property: db-password
- secretKey: password
remoteRef:
key: borgmatic
property: db-password

@ -0,0 +1,53 @@
# PostgreSQL Cluster for Immich on ringtail k3s.
#
# Initially bootstrapped via CNPG pg_basebackup from the minikube
# immich-pg cluster on 2026-05-13, then promoted to primary. The
# externalClusters + bootstrap.pg_basebackup blocks have been pruned
# from this manifest now that the migration is complete — leaving
# them around is a footgun (re-enabling replica.enabled=true would
# try to demote this cluster against a stale source). See
# [[immich-pg-data-migration]] for the procedure used.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: immich-pg
  namespace: databases
spec:
  instances: 1
  imageName: ghcr.io/tensorchord/cloudnative-vectorchord:17-0.5.0
  storage:
    size: 10Gi
    storageClass: local-path
  # Managed roles
  managed:
    roles:
      - name: borgmatic
        login: true
        connectionLimit: -1
        ensure: present
        inherit: true
        inRoles:
          - pg_read_all_data
        passwordSecret:
          name: immich-pg-borgmatic
  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "1Gi"
      cpu: "500m"
  postgresql:
    shared_preload_libraries:
      - "vchord.so"
    parameters:
      max_connections: "50"
      shared_buffers: "128MB"
      password_encryption: "scram-sha-256"
    pg_hba:
      - host all all 0.0.0.0/0 scram-sha-256
      - host all all ::/0 scram-sha-256

@ -0,0 +1,9 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: databases
resources:
- immich-pg.yaml
- external-secret-immich-borgmatic.yaml
- service-immich-pg-tailscale.yaml

@ -1,6 +1,8 @@
-# Tailscale LoadBalancer for immich-pg PostgreSQL access
-# Canonical hostname: immich-pg.tail8d86e.ts.net
-# Caddy L4 proxies pg.ops.eblu.me:5433 → this service for borgmatic backups
+# Tailscale LoadBalancer for immich-pg PostgreSQL access on ringtail.
+# Canonical hostname: immich-pg.tail8d86e.ts.net (claimed from the
+# minikube side after the minikube service was removed during the
+# immich-to-ringtail migration). Borgmatic on indri uses this
+# hostname for nightly backups.
 apiVersion: v1
 kind: Service
 metadata:

@ -1,69 +0,0 @@
# PostgreSQL Cluster for Immich
# Uses VectorChord (successor to pgvecto.rs) for AI-powered vector search
# See: https://github.com/immich-app/immich/discussions/9060
# Managed by CloudNativePG operator
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: immich-pg
  namespace: databases
spec:
  instances: 1
  # VectorChord image for PostgreSQL 17 with VectorChord 0.5.0
  # Immich v2.4.1 requires VectorChord >=0.3 <0.6
  # See: https://github.com/tensorchord/VectorChord
  imageName: ghcr.io/tensorchord/cloudnative-vectorchord:17-0.5.0
  storage:
    size: 10Gi
    storageClass: standard
  # Bootstrap creates initial database and owner
  bootstrap:
    initdb:
      database: immich
      owner: immich
      postInitSQL:
        # Extensions required by Immich
        - CREATE EXTENSION IF NOT EXISTS vector;
        - CREATE EXTENSION IF NOT EXISTS vchord CASCADE;
        - CREATE EXTENSION IF NOT EXISTS cube CASCADE;
        - CREATE EXTENSION IF NOT EXISTS earthdistance CASCADE;
  # Managed roles
  # Note: connectionLimit, ensure, inherit are CNPG defaults added to prevent ArgoCD drift
  managed:
    roles:
      # borgmatic read-only user for backups
      - name: borgmatic
        login: true
        connectionLimit: -1
        ensure: present
        inherit: true
        inRoles:
          - pg_read_all_data
        passwordSecret:
          name: immich-pg-borgmatic
  # Resource limits for minikube environment
  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "1Gi"
      cpu: "500m"
  # PostgreSQL configuration
  postgresql:
    # VectorChord requires vchord.so in shared_preload_libraries
    shared_preload_libraries:
      - "vchord.so"
    parameters:
      max_connections: "50"
      shared_buffers: "128MB"
      password_encryption: "scram-sha-256"
    pg_hba:
      # Allow connections from k8s pods
      - host all all 0.0.0.0/0 scram-sha-256
      - host all all ::/0 scram-sha-256

@ -5,13 +5,10 @@ namespace: databases
 resources:
   - blumeops-pg.yaml
-  - immich-pg.yaml
   - service-tailscale.yaml
-  - service-immich-pg-tailscale.yaml
   - service-metrics-tailscale.yaml
   - external-secret-eblume.yaml
   - external-secret-borgmatic.yaml
-  - external-secret-immich-borgmatic.yaml
   - external-secret-teslamate.yaml
   - external-secret-authentik.yaml
   - external-secret-paperless.yaml

@ -16,11 +16,16 @@ spec:
         app: immich
         component: machine-learning
     spec:
+      runtimeClassName: nvidia
       securityContext:
         seccompProfile:
           type: RuntimeDefault
       containers:
         - name: machine-learning
+          # ringtail uses the -cuda tag (set in kustomization.yaml)
+          # to take advantage of the RTX 4080 via the nvidia
+          # device plugin. Time-slicing is configured for 4 replicas
+          # so frigate + ollama + this pod can share.
           image: ghcr.io/immich-app/immich-machine-learning:kustomized
           ports:
             - name: http
@ -57,6 +62,7 @ spec:
               cpu: "100m"
             limits:
               memory: "4Gi"
+              nvidia.com/gpu: "1"
       volumes:
         - name: cache
           persistentVolumeClaim:

@ -1,6 +1,9 @@
-# Tailscale Ingress for Immich
-# Exposes Immich at photos.tail8d86e.ts.net
-# Caddy will proxy photos.ops.eblu.me to this endpoint
+# Tailscale ProxyGroup Ingress for Immich on ringtail.
+#
+# Production hostname: photos.tail8d86e.ts.net
+# (during the cutover window this was photos-ringtail; the minikube
+# ingress was torn down before this was renamed to photos to avoid
+# the Tailscale device-name collision.)
 apiVersion: networking.k8s.io/v1
 kind: Ingress
 metadata:
@ -16,12 +19,6 @ metadata:
     gethomepage.dev/description: "Photo management"
     gethomepage.dev/href: "https://photos.ops.eblu.me"
     gethomepage.dev/pod-selector: "app=immich,component=server"
-    # TODO: Add Immich widget - requires API key from Account Settings > API Keys
-    # See: https://gethomepage.dev/widgets/services/immich/
-    # gethomepage.dev/widget.type: "immich"
-    # gethomepage.dev/widget.url: "https://photos.ops.eblu.me"
-    # gethomepage.dev/widget.key: "{{HOMEPAGE_VAR_IMMICH_API_KEY}}"
-    # gethomepage.dev/widget.version: "2"
 spec:
   ingressClassName: tailscale
   rules:

@ -1,7 +1,8 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: immich
resources:
- deployment-server.yaml
- deployment-ml.yaml
@ -13,11 +14,15 @@ resources:
   - pv-nfs.yaml
   - pvc.yaml
   - ingress-tailscale.yaml
 images:
   - name: ghcr.io/immich-app/immich-server
     newTag: v2.6.3
   - name: ghcr.io/immich-app/immich-machine-learning
-    newTag: v2.6.3
+    # CUDA variant of the same release; ringtail has an RTX 4080
+    newTag: v2.6.3-cuda
+  # Using upstream multi-arch valkey image directly; the
+  # registry.ops.eblu.me/blumeops/valkey mirror is arm64-only (built
+  # on indri) and would crashloop on ringtail.
   - name: docker.io/valkey/valkey
-    newName: registry.ops.eblu.me/blumeops/valkey
-    newTag: v8.1.6-r0-fabca04
+    newTag: "8.1.6"

@ -0,0 +1,29 @@
# NFS PersistentVolume for Immich photo library on ringtail k3s.
#
# Mirror of argocd/manifests/immich/pv-nfs.yaml (minikube) but with
# a distinct name (minikube and ringtail are separate clusters, so PV
# names don't collide cluster-side, but using the same name in two
# manifests is confusing).
#
# The sifaka NFS export for /volume1/photos already permits
# 192.168.1.0/24 + 100.64.0.0/10. Ringtail's wired IP (192.168.1.21)
# falls in the first CIDR, so no DSM rule changes are needed.
#
# Verified 2026-05-13: ringtail pod can read existing dirs, write
# new files, and delete them. DNS resolves sifaka to 192.168.1.203
# (LAN), so NFS traffic stays off the tailnet — avoids the known
# sifaka-tailscale-userspace bite.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: immich-library-nfs-pv-ringtail
spec:
  capacity:
    storage: 2Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  nfs:
    server: sifaka
    path: /volume1/photos

@ -1,5 +1,5 @@
-# PersistentVolumeClaim for Immich photo library
-# Binds to the NFS PV for sifaka:/volume1/photos
+# PersistentVolumeClaim for Immich photo library on ringtail.
+# Binds to immich-library-nfs-pv-ringtail (sifaka:/volume1/photos).
 apiVersion: v1
 kind: PersistentVolumeClaim
 metadata:
@ -9,7 +9,7 @ spec:
   accessModes:
     - ReadWriteMany
   storageClassName: ""
-  volumeName: immich-library-nfs-pv
+  volumeName: immich-library-nfs-pv-ringtail
   resources:
     requests:
       storage: 2Ti

@ -1,115 +0,0 @@
# Immich
Self-hosted photo and video management solution with AI-powered search and face recognition.
## Prerequisites
1. **NFS Share**: Create `/volume1/photos` on sifaka with NFS permissions for indri
2. **PostgreSQL**: The `immich-pg` cluster (with pgvecto.rs) must be healthy
3. **Secrets**: Create the database password secret
## Deployment Order
1. Sync `blumeops-pg` (to get CloudNativePG operator if not already running)
2. Wait for `immich-pg` cluster to be healthy
3. Create secrets (see below)
4. Sync `immich` (deploys all resources: storage, services, deployments)
5. Run `mise run provision-indri -- --tags caddy` to update Caddy config
## Components
| Component | Deployment | Service | Port |
|-----------|------------|---------|------|
| Server (web/API) | `immich-server` | `immich-server` | 2283 |
| Machine Learning | `immich-machine-learning` | `immich-machine-learning` | 3003 |
| Valkey (Redis) | `immich-valkey` | `immich-valkey` | 6379 |
## Secret Setup
The `immich-db` secret contains the database password, which is auto-generated by CloudNativePG
in the `immich-pg-app` secret. To create or regenerate the secret:
```bash
# Create namespace if needed
kubectl --context=minikube-indri create namespace immich
# Copy password from CNPG secret to immich namespace
kubectl --context=minikube-indri create secret generic immich-db -n immich \
--from-literal=password="$(kubectl --context=minikube-indri -n databases get secret immich-pg-app -o jsonpath='{.data.password}' | base64 -d)"
```
Note: This secret is not managed by ExternalSecrets since the source of truth is the CNPG-generated secret.
## Access
- **URL**: https://photos.ops.eblu.me (after Caddy is updated)
- **Tailscale**: https://photos.tail8d86e.ts.net (direct)
## First-Time Setup
1. Navigate to https://photos.ops.eblu.me
2. Create an admin account
3. Configure external library (optional - for importing existing photos)
## External Library (iCloud Photos)
To import existing photos from iCloud sync on indri:
1. In Immich Admin > External Libraries, create a new library
2. Set the import path to the location where iCloud photos sync
3. Configure scan schedule or trigger manual scan
## Architecture
```
┌─────────────────┐     ┌─────────────────┐
│  immich-server  │────▶│    immich-pg    │
│    (web/api)    │     │   (PostgreSQL   │
└────────┬────────┘     │  + pgvecto.rs)  │
         │              └─────────────────┘
┌────────▼────────┐     ┌─────────────────┐
│    immich-ml    │     │     valkey      │
│  (ML inference) │     │  (Redis cache)  │
└─────────────────┘     └─────────────────┘
┌────────▼────────┐
│   sifaka NFS    │
│ /volume1/photos │
└─────────────────┘
```
## Version Management
Image versions are controlled via `kustomization.yaml`:
```yaml
images:
- name: ghcr.io/immich-app/immich-server
newTag: v2.6.3
- name: ghcr.io/immich-app/immich-machine-learning
newTag: v2.6.3
- name: docker.io/valkey/valkey
newTag: "8.1-alpine"
```
To upgrade, update `newTag` values and sync via ArgoCD.
## Troubleshooting
```bash
# Check pods
kubectl --context=minikube-indri -n immich get pods
# Check immich-pg cluster
kubectl --context=minikube-indri -n databases get cluster immich-pg
# View server logs
kubectl --context=minikube-indri -n immich logs -l app=immich,component=server
# View ML logs
kubectl --context=minikube-indri -n immich logs -l app=immich,component=machine-learning
# Check PVC binding
kubectl --context=minikube-indri -n immich get pvc
```

@ -1,22 +0,0 @@
# NFS PersistentVolume for Immich photo library
# Requires: NFS share on sifaka at /volume1/photos with NFS permissions for indri
#
# To create on Synology:
# 1. Control Panel > Shared Folder > Create
# 2. Name: photos, Location: Volume 1
# 3. Control Panel > File Services > NFS > NFS Rules
# 4. Add rule for "photos" share: Hostname=indri, Privilege=Read/Write, Squash=No mapping
apiVersion: v1
kind: PersistentVolume
metadata:
  name: immich-library-nfs-pv
spec:
  capacity:
    storage: 2Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  nfs:
    server: sifaka
    path: /volume1/photos


@ -11,4 +11,4 @@ data:
timeSlicing:
resources:
- name: nvidia.com/gpu
- replicas: 2
+ replicas: 4


@ -0,0 +1,13 @@
Move the entire Immich stack — server, machine-learning, valkey,
and the PostgreSQL+VectorChord cluster — off `minikube-indri` and
onto `k3s-ringtail`. Postgres data migrated zero-loss via CNPG
`pg_basebackup` (replica catch-up then promote); row counts on
`asset`, `user`, `album`, `smart_search`, `activity`, `asset_face`
verified equal between source and replica before cutover. The ML
pod now uses ringtail's RTX 4080 via the nvidia-device-plugin
(time-slicing bumped 2 → 4 to share with frigate + ollama). Caddy
routing at `photos.ops.eblu.me` is unchanged (still
`photos.tail8d86e.ts.net`, the device just lives on ringtail now).
Borgmatic backups continue against the same `immich-pg` tailnet
hostname. First concrete chain in the broader indri-k8s
decommission effort.


@ -0,0 +1,52 @@
---
title: CNPG Operator on Ringtail
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
- how-to
- operations
- postgres
- ringtail
---
# CNPG Operator on Ringtail
Bring up the `cloudnative-pg` operator on `k3s-ringtail`. Today the
operator only exists on `minikube-indri` (see
`argocd/apps/cloudnative-pg.yaml`, destination `kubernetes.default.svc`).
Prerequisite of [[migrate-immich-to-ringtail]]; consumed by
[[immich-pg-on-ringtail]].
## What to do
- Add a sibling `argocd/apps/cloudnative-pg-ringtail.yaml` pointing
at the same mirror (`mirrors/cloudnative-pg`, tag `v1.27.1`),
destination `https://ringtail.tail8d86e.ts.net:6443`,
namespace `cnpg-system`.
- Mirror the `ServerSideApply=true` and `CreateNamespace=true` sync
options (the CRDs exceed the annotation size limit).
- Sync `apps` then `cloudnative-pg-ringtail`. Verify the operator
pod is running on ringtail.
## Verification
```fish
kubectl --context=k3s-ringtail -n cnpg-system get pods
kubectl --context=k3s-ringtail get crd clusters.postgresql.cnpg.io
```
## Why a separate app
Each ArgoCD app targets a single cluster via `destination.server`.
We could parameterize with ApplicationSets, but blumeops' convention
is to duplicate the manifest with a `-ringtail` suffix (see
`alloy-ringtail`, `external-secrets-ringtail`, etc.). Keep the
convention.
## Out of scope
- Postgres clusters themselves (`immich-pg`, etc.) — those come from
[[immich-pg-on-ringtail]].
- Removing the minikube cnpg operator. That happens at the very end
of the indri-k8s decommission, not in this chain.


@ -0,0 +1,91 @@
---
title: Immich App on Ringtail
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
- how-to
- operations
- immich
---
# Immich App on Ringtail
Bring up `immich-server`, `immich-machine-learning`, and
`immich-valkey` on ringtail. This card stands the stack up against
the *new* pg cluster — it does not move user traffic. Cutover lives
in [[immich-cutover-and-decommission]].
## What to do
- New manifest dir `argocd/manifests/immich-ringtail/` (the suffix
matches the `-ringtail` convention used by other apps). Port from
`argocd/manifests/immich/`:
- `deployment-server.yaml` — point `DB_HOSTNAME` at the ringtail
pg service.
- `deployment-ml.yaml` — use `runtimeClassName: nvidia` + a
`resources.limits` for `nvidia.com/gpu: 1`. Use the `-cuda` tag
of the immich-ml image (set in kustomization). Ringtail is
single-node, so no node selector needed. See
`argocd/manifests/frigate/` for the existing GPU pod pattern.
**GPU contention discovery:** ringtail's `nvidia-device-plugin`
is configured with `timeSlicing.replicas: 2`. Frigate + Ollama
already consume both virtual slices. Adding immich-ml requires
bumping the count to >= 3. Edit
`argocd/manifests/nvidia-device-plugin/configmap.yaml` (or
wherever the device-plugin config lives) and re-sync the
`nvidia-device-plugin` ArgoCD app. The plugin pod restarts and
the new advertised count appears as the node's
`nvidia.com/gpu` allocatable.
- `deployment-valkey.yaml` — straight port, BUT use the upstream
multi-arch `docker.io/valkey/valkey:<version>` image — do NOT
use the `registry.ops.eblu.me/blumeops/valkey` rewrite in the
kustomization. That mirror was built on indri (arm64) and is
single-arch; pulling it on ringtail (amd64) gets `exec format
error` in CrashLoopBackOff. The mirror should eventually carry
a multi-arch tag, at which point the rewrite can return.
- `service*.yaml` — straight port.
- `pvc-ml-cache.yaml` — straight port (empty `local-path` PVC).
- `pv-nfs.yaml` + `pvc.yaml` — already covered by
[[sifaka-nfs-from-ringtail]] (may live in this dir or theirs).
- `ingress-tailscale.yaml` — ProxyGroup ingress, **must not** set
an explicit `host:` (or use `host: *`) per the lesson on
ProxyGroup VIP routing.
**Hostname collision warning:** the minikube ingress claims the
Tailscale device name `photos` (`tls.hosts: [photos]`). Two
devices on the tailnet cannot share that name. While the
ringtail deployment is being staged it must use a *different*
`tls.hosts` value (e.g. `photos-ringtail`) so it can coexist
with the running minikube one. The flip to `photos` happens at
cutover time, *after* the minikube ingress has been removed.
See [[immich-cutover-and-decommission#Cutover sequence]].
- `kustomization.yaml` — same `images:` block (server, ML, valkey).
- New ArgoCD app `argocd/apps/immich-ringtail.yaml` targeting
ringtail, namespace `immich`. **Manual sync only** until the
cutover.
- Existing `argocd/apps/immich.yaml` (minikube) stays untouched
during this card — both apps exist briefly.
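
The GPU contention note above can be checked before and after the device-plugin re-sync. This is a minimal sketch, assuming the ringtail node is the first (and only) item returned by `kubectl get nodes`; the helper names `gpu_allocatable` and `gpu_slices_at_least` are illustrative and do not exist in the repo. Allocatable is the total advertised count, not what remains after Frigate and Ollama have claimed their slices.

```bash
# Print the advertised nvidia.com/gpu count for the first node on ringtail.
# Assumption: ringtail is single-node, so items[0] is the right node.
gpu_allocatable() {
  kubectl --context=k3s-ringtail get nodes \
    -o jsonpath='{.items[0].status.allocatable.nvidia\.com/gpu}'
}

# Succeed only if at least $1 GPU slices are advertised.
gpu_slices_at_least() {
  local need="$1" have
  have="$(gpu_allocatable)"
  # An empty value means the device plugin is not advertising any GPU.
  [ -n "$have" ] && [ "$have" -ge "$need" ]
}

# Usage:
#   gpu_slices_at_least 3 && echo "enough slices advertised for immich-ml"
```

If the check fails right after the re-sync, the plugin pod may simply not have restarted yet; re-run it once the pod is Ready.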
## Bring it up against a copy of the DB
Use the throwaway/test path from [[immich-pg-data-migration#Dry run
before real cutover]]: point the ringtail immich at the *test* pg
cluster first, verify the pod boots, the web UI loads (via
`kubectl port-forward`), assets list, ML embeddings query. Then
tear it down.
## Verification
- All three pods Ready.
- ML pod has a GPU attached: `nvidia-smi` inside the container shows
the 4080.
- `immich-server` connects to pg and valkey (no `ECONNREFUSED` in
logs).
- A `kubectl port-forward` to the server service shows the Immich
web UI.
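
The log and GPU checks in this list could be scripted roughly as follows. This is a sketch only: `logs_clean` is a hypothetical helper, and the Deployment names in the usage lines are assumed to match the minikube ones.

```bash
# Read log text on stdin; succeed only if no connection refusals appear.
logs_clean() {
  ! grep -q 'ECONNREFUSED'
}

# Usage (Deployment names are assumptions):
#   kubectl --context=k3s-ringtail -n immich logs deploy/immich-server | logs_clean \
#     && echo "no ECONNREFUSED in server logs"
#   kubectl --context=k3s-ringtail -n immich exec deploy/immich-machine-learning -- nvidia-smi -L
```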
## Out of scope
- Public/tailnet routing flip. Caddy still points at the minikube
Tailscale ingress until [[immich-cutover-and-decommission]].
- Removing the minikube immich. Same.


@ -0,0 +1,103 @@
---
title: Immich Cutover and Decommission
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
- how-to
- operations
- immich
- migration
---
# Immich Cutover and Decommission
The user-visible flip. By the time this card opens, the ringtail
stack has been proven against a copy of the data. This card does the
real cutover.
## Pre-cutover checklist
- [[immich-pg-data-migration]] dry-run succeeded; method is chosen.
- Ringtail immich stack has been brought up against the test pg,
pods healthy, UI loaded ([[immich-app-on-ringtail#Verification]]).
- Borgmatic just ran successfully (a fresh nightly archive is a
belt-and-suspenders fallback, on top of the live source pg).
- User has been told to stop uploading from the iOS app for the
cutover window.
## Cutover sequence
1. **Quiesce source.** `kubectl --context=minikube-indri -n immich
scale deploy/immich-server --replicas=0` and same for ML. Leave
valkey + pg running. Confirm no client traffic on the source pg
via `pg_stat_activity`.
2. **Tear down the minikube Tailscale ingress.** The `photos`
Tailscale device name must be freed before ringtail's ingress can
claim it (Tailscale enforces uniqueness across the tailnet).
`kubectl --context=minikube-indri -n immich delete ingress
immich-tailscale` and wait for the corresponding `tailscale`-LB
StatefulSet pod to terminate. Verify the `photos` device is gone:
`tailscale status | grep -i photos` from any tailnet host.
3. **Final sync.** Per chosen method in
[[immich-pg-data-migration]]:
- Option A: promote the ringtail replica.
- Option B: take a final `pg_dump` and restore it to ringtail
`immich-pg`.
4. **Verify.** Run the row-count and schema-diff checks from
[[immich-pg-data-migration#Verification on the real run]].
5. **Flip the ringtail ingress to `photos`.** Update
`argocd/manifests/immich-ringtail/ingress-tailscale.yaml`:
`tls.hosts: [photos]` (was `[photos-ringtail]` during staging per
[[immich-app-on-ringtail]]). Commit, `argocd app sync
immich-ringtail`. Wait for the `photos` device to register on the
tailnet again.
6. **Bring up ringtail immich** against the now-promoted pg
(`argocd app sync immich-ringtail`). Wait for Ready.
7. **Flip routing.** Update Caddy on indri
(`ansible/roles/caddy/defaults/main.yml`): `photos.ops.eblu.me`
upstream changes to the ringtail Tailscale ingress hostname
(`photos` — same MagicDNS name, now pointing to the ringtail
proxy). `mise run provision-indri -- --tags caddy`.
8. **Smoke test.** Open `photos.ops.eblu.me` in a browser. Sign in.
Scroll the timeline. Open an album. Trigger an ML search.
9. **Update borgmatic.** If the Tailscale hostname for pg changed,
update `borgmatic.cfg` on indri to point at the ringtail
`immich-pg-tailscale` service. Run a manual backup to verify.
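
One caveat for step 2: while the staging device `photos-ringtail` is registered, a plain `grep -i photos` still matches even after the old `photos` device is gone. A minimal sketch of an exact-name check follows, assuming the hostname is the second whitespace-separated column of `tailscale status` output; `tailnet_has_device` is a hypothetical helper name.

```bash
# Read `tailscale status` output on stdin; succeed only if some device's
# hostname column (assumed to be column 2) equals $1 exactly.
tailnet_has_device() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Usage: block until the old device has deregistered.
#   while tailscale status | tailnet_has_device photos; do sleep 5; done
```

The same helper works in the other direction for step 5, polling until `photos` reappears on the tailnet.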
## After cutover
- `argocd app set immich --revision <branch>` is no longer relevant;
the minikube `immich` app gets deleted entirely.
- Delete `argocd/apps/immich.yaml`, `argocd/manifests/immich/`, and
the minikube `argocd/manifests/databases/immich-pg.yaml` +
`external-secret-immich-borgmatic.yaml` +
`service-immich-pg-tailscale.yaml`.
- Rename `immich-ringtail` back to `immich` (the `-ringtail` suffix
was scaffolding for the dual-cluster window; once minikube is
empty of immich, the unsuffixed name is clean).
- Confirm the minikube `immich-pg` PVC is no longer used, then
delete it (the PV with `Retain` policy will persist — clean that
up too).
## Verification (definition of done)
- `photos.ops.eblu.me` works for a real session, including ML search.
- Source minikube has no `immich` pods, no `immich-pg`, no PVCs.
- Memory pressure on minikube has dropped (≥1.5 GiB reclaimed). Check
`docker stats minikube` on indri.
- Nightly borgmatic run after the cutover completes successfully,
with the immich-pg archive showing the new source.
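
A scripted check can complement the browser session but does not replace it. The sketch below uses the `/api/server/ping` endpoint named in the cutover commit message; `http_status` and `smoke_ok` are hypothetical helper names.

```bash
# Print only the HTTP status code for a URL.
http_status() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Succeed only if the Immich ping endpoint under base URL $1 answers 200.
smoke_ok() {
  [ "$(http_status "$1/api/server/ping")" = "200" ]
}

# Usage:
#   smoke_ok https://photos.ops.eblu.me && echo "ping ok"
```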
## Rollback (within the cutover window)
If smoke test fails: flip Caddy back, scale ringtail immich to 0,
scale source immich back up. Source pg was never destroyed. File a
plan reset on the relevant prerequisite card and try again next
session.
## Out of scope
- Decommissioning all of minikube. This chain just removes immich.
Other tenants migrate in their own chains as part of the broader
indri-k8s decommission. See [[migrate-immich-to-ringtail]] for
context.


@ -0,0 +1,79 @@
---
title: Immich Postgres Data Migration
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
- how-to
- operations
- postgres
- immich
- critical
---
# Immich Postgres Data Migration
**This is the data-loss surface of the migration.** Pick a method,
prove it on a throwaway copy first, then run the real cutover.
## Decision: pick one
### Option A — CNPG `externalCluster` bootstrap (preferred)
Stand the ringtail cluster up as a streaming replica of the minikube
cluster via `bootstrap.pg_basebackup.source`. Replica catches up
online; when ready, promote it and point Immich at it. This is
CNPG's documented PG-to-PG migration path and gives near-zero data
loss (the WAL position at promote == the position at app stop).
Requires: network path from ringtail to minikube's pg over the
tailnet (the existing `immich-pg-tailscale` Service works), and a
superuser secret minikube-side exposed to ringtail's basebackup.
Pitfall to plan around: the ringtail Cluster CR will need its
`bootstrap` block rewritten *after* promotion (CNPG doesn't
gracefully drop the externalCluster reference). Account for this in
[[immich-pg-on-ringtail]] — it may force a reset of that card.
### Option B — pg_dump / pg_restore
Stop immich, `pg_dump -Fc` from minikube, scp to ringtail, restore.
Simpler but full downtime for the whole dump+restore window
(measure on a copy first — VectorChord indexes are slow to rebuild).
Smaller blast radius; no streaming-replication moving parts.
Use this if Option A hits any blocker. Data loss should still be
zero if the source is stopped first.
### Option C — leave pg on minikube
Rejected. See goal card [[migrate-immich-to-ringtail#Why postgres on
ringtail (not cross-cluster)]].
## Dry run before real cutover
Whichever option wins:
1. Snapshot the minikube `immich-pg` PVC or take a fresh `pg_dump`
into a scratch location.
2. Restore into a *separate* ringtail CNPG cluster (different name,
e.g. `immich-pg-test`) and point a scratch immich-server pod at
it.
3. Verify: pod boots, can list assets, ML embeddings query without
error, face thumbnails render. VectorChord-backed queries should
not error.
4. Tear the scratch cluster down before doing the real one.
## Verification on the real run
- Row counts match for `asset`, `album`, `user`, `asset_face`,
`smart_search` (the embedding table), and `activity`; script this.
- `pg_dump --schema-only --no-owner` diff between source and dest
should be empty modulo CNPG-managed roles.
- Immich `/api/server/version` and `/api/server/statistics`
return sane numbers.
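
The row-count check could be scripted along these lines. This is a sketch under stated assumptions: the table list is taken from the changelog fragment, `row_counts` is a hypothetical helper, and the pod and container names in the usage lines are guesses (CNPG names instances `<cluster>-N`, and the primary is not necessarily `-1`).

```bash
# Table names as listed in the changelog fragment; adjust if the schema differs.
TABLES="asset user album smart_search activity asset_face"

# row_counts RUNNER...: run one count(*) per table through the given psql
# invocation and print "table<TAB>count" lines in a stable order.
row_counts() {
  local t
  for t in $TABLES; do
    # Quote the identifier: "user" is a reserved word in PostgreSQL.
    printf '%s\t%s\n' "$t" "$("$@" -At -c "SELECT count(*) FROM \"$t\"")"
  done
}

# Usage (pod and container names are assumptions):
#   src() { kubectl --context=minikube-indri -n databases exec immich-pg-1 -c postgres -- psql -d immich "$@"; }
#   dst() { kubectl --context=k3s-ringtail -n databases exec immich-pg-1 -c postgres -- psql -d immich "$@"; }
#   diff <(row_counts src) <(row_counts dst) && echo "row counts match"
```

Passing the psql invocation as arguments keeps the comparison logic identical for both clusters, so any difference in the output comes from the data rather than from the script.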
## Rollback
If the cutover fails verification: stop the ringtail immich, repoint
ArgoCD `immich.destination` back to minikube, re-sync. Source pg was
never deleted. Document what failed and reset the chain.


@ -0,0 +1,69 @@
---
title: Immich Postgres Cluster on Ringtail
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
- how-to
- operations
- postgres
- immich
---
# Immich Postgres Cluster on Ringtail
Stand up a fresh `immich-pg` CNPG Cluster on ringtail, ready to receive
data. **No data import yet** — that's [[immich-pg-data-migration]].
## What to do
- Create `argocd/manifests/databases-ringtail/` (or pick another
namespace name — verify what other ringtail pg clusters will use;
if none yet, `databases` is fine).
- Port these from the minikube side:
- `immich-pg.yaml` — CNPG Cluster CR. Same image
(`ghcr.io/tensorchord/cloudnative-vectorchord:17-0.5.0`), same
extensions, same managed `borgmatic` role. Bump `storage.size` if
the minikube 10 GiB looks tight (check actual usage first).
`storageClass: local-path` on ringtail (default).
- `external-secret-immich-borgmatic.yaml` — same 1Password item,
same field, but referencing the ringtail `ClusterSecretStore`
(`onepassword-blumeops` already exists per the
`external-secrets-ringtail` app).
- Service for in-cluster access (the operator creates `immich-pg-rw`
etc. automatically; verify the app deployment uses those names).
- A Tailscale Service if we want backups to keep working via the
same hostname during the transition — see "Borgmatic" below.
- New ArgoCD app `argocd/apps/databases-ringtail.yaml` pointing at
the new path, destination ringtail.
## Verification
- Cluster reaches `Ready`.
- `borgmatic` role exists, `rolcanlogin=t`, and is a member of
`pg_read_all_data` (via `managed.roles[].inRoles`).
- ExternalSecret `immich-pg-borgmatic` syncs from 1Password
(`Ready: True`) and the rendered Secret has `username=borgmatic`.
- The `vchord`, `vector`, `cube`, `earthdistance` extensions show
installed in the `postgres` database (`\dx` from
`psql -U postgres`). They are NOT installed in the `immich`
database at this point — `postInitSQL` in CNPG's `initdb` block
runs against the `postgres` superuser database. The Immich app
itself creates the extensions in its own `immich` database at
startup; do not be alarmed by their absence pre-immich-deploy.
The `vchord.so` library is preloaded via
`shared_preload_libraries` regardless, so `CREATE EXTENSION` at
app startup just registers it in the right database.
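
The role check from this list can be expressed as a single query. A minimal sketch, assuming a psql invocation is supplied by the caller; `borgmatic_role_ok` is a hypothetical helper, and the pod and container names in the usage lines are assumptions.

```bash
# borgmatic_role_ok RUNNER...: succeed only if the borgmatic role exists,
# can log in, and is a member of pg_read_all_data.
borgmatic_role_ok() {
  local out
  out="$("$@" -At -c "SELECT rolcanlogin, pg_has_role('borgmatic', 'pg_read_all_data', 'member') FROM pg_roles WHERE rolname = 'borgmatic'")"
  # psql -At prints booleans as t/f separated by "|"; no row means no role.
  [ "$out" = "t|t" ]
}

# Usage (pod and container names are assumptions):
#   pg() { kubectl --context=k3s-ringtail -n databases exec immich-pg-1 -c postgres -- psql -d postgres "$@"; }
#   borgmatic_role_ok pg && echo "borgmatic role looks right"
#   pg -c '\dx'    # list extensions installed in the postgres database
```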
## Borgmatic implications
`borgmatic.cfg` on indri targets `immich-pg-tailscale` over the
tailnet. During migration both clusters will exist briefly. Decide
upfront: back up the *source* pg until cutover, then flip borgmatic
to the ringtail Tailscale service. Document the flip in
[[immich-cutover-and-decommission]].
## Out of scope
- Importing data. That is [[immich-pg-data-migration]], which may
drive a reset on this card if the migration approach (e.g. CNPG
`externalCluster` bootstrap) requires changes to this Cluster CR.


@ -0,0 +1,132 @@
---
title: Migrate Immich to Ringtail
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
- how-to
- operations
- immich
- migration
---
# Migrate Immich to Ringtail
Move the entire Immich stack (server, ML, valkey, postgres) off
`minikube-indri` and onto `k3s-ringtail`. This is the first concrete
chain in the broader indri-k8s decommission: minikube is
memory-saturated (97% RAM, swapping), and Immich is the single
largest tenant (~1.5 GiB resident).
## End state
- Immich `server`, `machine-learning`, and `valkey` Deployments run on
ringtail k3s in the `immich` namespace.
- The `immich-machine-learning` pod uses ringtail's RTX 4080 via the
`nvidia-device-plugin` (performance win — currently CPU-only on
minikube).
- A CNPG `immich-pg` Cluster (PostgreSQL 17 + VectorChord) runs in a
`databases` namespace on ringtail, owned by the `cnpg-system`
operator on ringtail.
- The photo library still lives on [[sifaka]] at `/volume1/photos`,
mounted via NFS from ringtail pods (RWX).
- Routing: `photos.ops.eblu.me` (Caddy on indri) proxies to a
Tailscale ProxyGroup ingress on ringtail. No public surface today.
- The ArgoCD `immich` app's `destination.server` points at
`https://ringtail.tail8d86e.ts.net:6443`. The old minikube
manifests are removed.
## Non-goals
- Public exposure via Fly. Immich stays tailnet-only.
- Changing the immich version or runtime configuration. This is a
lift-and-shift; bumps come later.
- Backing up to a different target. [[borgmatic]] keeps running on
indri (it pulls via Tailscale and uses sifaka SMB for the library).
## Critical constraint: no data loss
Downtime is acceptable (Immich is a single-user system; we can take
it offline for the cutover). **Data loss is not.** Two surfaces matter:
1. **Postgres** — face data, ML embeddings (vectors), album state,
sharing, etc. Re-derivable in theory; weeks of recompute in
practice. See [[immich-pg-data-migration]].
2. **Library files**: `/volume1/photos`. Not moving, but the NFS
path must be verified accessible from ringtail before cutover.
See [[sifaka-nfs-from-ringtail]].
[[borgmatic]] backs both up to sifaka + BorgBase nightly; restore is
possible but slow. Treat it as a fallback, not a plan.
## Why postgres on ringtail (not cross-cluster)
`immich-pg` already has a Tailscale Service we could point ringtail
at, leaving the DB on minikube. We're not doing that because:
- The whole goal is to retire minikube — keeping pg there blocks it.
- Immich is chatty against pg; tailnet round-trips would hurt.
- CNPG is the same operator on both sides — a Cluster CR on ringtail
is mechanically equivalent.
## Approach
This is a C2 Mikado chain. The prerequisite cards each represent a
distinct surface that has to work before cutover. See
[[agent-change-process#C2 — Mikado Chain]] for the discipline.
## Workflow note: registering new ArgoCD apps during the chain
This chain adds three new ArgoCD `Application` definitions in
`argocd/apps/`: `cloudnative-pg-ringtail`, `databases-ringtail`,
and (later) `immich-ringtail`. The usual C1/C2 pattern of
`argocd app set <app> --revision <branch> && argocd app sync <app>`
does NOT work for the app-of-apps `apps` Application itself, because
`apps` self-manages: it re-reads `apps.yaml` (which declares
`targetRevision: main`) on every sync and reverts the override. As a
result, new app definitions added on a feature branch are never
visible to the cluster via `apps`.
**Use `kubectl apply` to register each new Application directly:**
```fish
kubectl --context=minikube-indri apply -f argocd/apps/<new-app>.yaml
```
This creates the Application resource out-of-band, bypassing `apps`.
For apps whose source lives in **this** repo (e.g.
`databases-ringtail`, `immich-ringtail` — manifest paths exist only
on the branch until merge), follow the apply with a branch override:
```fish
argocd app set <new-app> --revision mikado/migrate-immich-to-ringtail
argocd app sync <new-app>
```
For apps whose source is an **external** repo at a pinned tag (e.g.
`cloudnative-pg-ringtail` → `mirrors/cloudnative-pg` `v1.27.1`), no
override is needed — the source revision is independent of this PR.
After PR merge:
```fish
argocd app set <new-app> --revision main
argocd app sync <new-app>
```
`apps` itself, on its next sync from `main`, will discover the new
Application definitions in `argocd/apps/` and adopt the already-running
resources without disruption — provided their in-cluster spec matches
the on-disk definitions (which it does because we applied the same
file).
## Related
- [[shower-on-ringtail]] — a previous migration to ringtail (simpler:
no upstream cluster, SQLite, no GPU)
- [[connect-to-postgres]] — getting a psql session against CNPG
- [[ringtail]] — the target cluster
- [[cnpg-on-ringtail]], [[immich-pg-on-ringtail]],
[[immich-pg-data-migration]], [[sifaka-nfs-from-ringtail]],
[[immich-app-on-ringtail]], [[immich-cutover-and-decommission]] —
the prerequisite cards


@ -0,0 +1,67 @@
---
title: Sifaka NFS Photos from Ringtail
modified: 2026-05-13
last-reviewed: 2026-05-13
tags:
- how-to
- operations
- storage
- nfs
- sifaka
---
# Sifaka NFS Photos from Ringtail
The Immich library lives at `sifaka:/volume1/photos` and is mounted
into the pod via an NFS PV (see `argocd/manifests/immich/pv-nfs.yaml`).
That PV is currently scoped to indri. We need ringtail to mount the
same path with the same RWX semantics, without breaking the existing
indri mount during the transition.
## What to verify / do
- Check `sifaka` DSM NFS rules for the `photos` share. Per
[[shower-on-ringtail#NFS + SMB share on sifaka]] convention, rules
use `192.168.1.0/24` + `100.64.0.0/10` with
`all_squash`/`Map all users to admin`. The existing rule may
already cover ringtail (it's on `192.168.1.21` per the recent
static-IP pin). If so this card is a verification card.
- If the rule is locked to indri's IP: add an entry for ringtail
(192.168.1.21) or widen to the subnet pattern above.
- Test mount from a ringtail debug pod (busybox or alpine with
nfs-utils) against the `photos` share. Read a file. Write a temp
file. Delete it.
- Watch for the known sifaka NFS-over-Tailscale gotcha: sifaka's
Tailscale must be in TUN mode (not userspace) for NFS to work
reliably over the tailnet. The NFS path here goes over the LAN
(not tailnet), so this shouldn't bite, but worth confirming the
NFS traffic is on `192.168.1.x` not `100.x`.
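
The list, write, and delete test from the debug pod can be wrapped in one function. This is a sketch, assuming the share is already mounted inside the pod (for example at `/mnt/photos`); `nfs_rw_probe` and the probe file name are hypothetical. It reads back its own temp file rather than an existing photo, so it never touches library content.

```bash
# nfs_rw_probe DIR: list the directory, write a temp file, read it back,
# then delete it. Succeeds only if every step works.
nfs_rw_probe() {
  local dir="$1" f
  f="$dir/.ringtail-nfs-probe.$$"
  ls "$dir" >/dev/null || return 1            # directory is listable
  echo probe > "$f" || return 1               # write succeeds
  [ "$(cat "$f")" = "probe" ] || return 1     # read-back matches
  rm -f "$f"                                  # clean up after ourselves
}

# Usage, inside the debug pod:
#   nfs_rw_probe /mnt/photos && echo "NFS read/write OK"
```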
## PV + PVC on ringtail
- New `pv-nfs.yaml` mirroring the minikube one. PVs are cluster-scoped,
but each cluster has its own set of objects, so the same name can be
reused; just duplicate the manifest. Same `server: sifaka`, same path, same
`accessModes: [ReadWriteMany]`, `persistentVolumeReclaimPolicy:
Retain`.
- New `pvc.yaml` in the ringtail `immich` namespace bound to it.
- The minikube PVC stays bound and active until cutover — both
clusters can have the share NFS-mounted simultaneously (NFS RWX
permits this). Immich itself must not be running on both sides
at once.
## Verification
- A pod on ringtail can `ls /mnt/photos/` and see the same files
as the indri pod.
- File written from ringtail pod is visible from indri pod and
vice versa (proves there's no caching surprise).
## Out of scope
- Migrating photo files. Nothing moves; this is just adding a second
NFS client.
- The `pvc-ml-cache.yaml` PVC (a separate ML model cache). That's
not on NFS — it's a regular PVC. Recreated empty on ringtail in
[[immich-app-on-ringtail]]; the first ML pod boot will repopulate
it.