blumeops/plans/k8s-migration/P5.1_qemu2_migration.md
Erich Blume 70357d247b Update P5.1 plan to complete status
- ArgoCD deployed and all apps synced
- Document remaining steps (secrets, post-merge reset)
- Simplified and reorganized documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-21 14:49:04 -08:00


Phase 5.1: Migrate Minikube from QEMU2 to Docker Driver

Goal: Replace the qemu2 driver with docker to fix remote API access and simplify volume mounts

Status: Complete (2026-01-21) - Cluster running, ArgoCD deployed, apps synced

Prerequisites: Phase 5 complete


Background

Original Problem (Podman → QEMU2)

During Phase 6 (Kiwix/Transmission migration), we discovered that the podman driver has fundamental limitations that prevent mounting external volumes:

  1. SMB CSI driver fails with "Operation not permitted" - the rootless container lacks kernel-level mount capabilities
  2. minikube mount fails - 9p mount gets "permission denied" inside the podman VM
  3. hostPath volumes only work for paths inside the minikube container, not the macOS host

We migrated to QEMU2 to get a full VM with kernel capabilities.

New Problem (QEMU2 → Docker)

The QEMU2 driver introduced a new problem: the Kubernetes API server is inside the VM at 192.168.105.2:6443, and Tailscale's TCP proxy cannot forward to it properly:

  • TCP connections succeed (nc -zv works)
  • TLS handshake times out
  • Root cause unknown, but likely related to Tailscale serve's handling of non-localhost upstreams
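The two symptoms above (TCP connects, TLS never completes) can be told apart with a small probe. This is a minimal sketch, not the procedure used during the original debugging: probe_api is a hypothetical helper name, and the 5s/10s timeouts are arbitrary choices.

```shell
#!/usr/bin/env bash
# Hypothetical helper: distinguish "TCP connects" from "TLS completes".
# Usage: probe_api HOST PORT
probe_api() {
  local host="$1" port="$2"

  # Step 1: plain TCP connect (the same check as `nc -zv`, without the noise).
  if ! nc -z "$host" "$port" >/dev/null 2>&1; then
    echo "tcp: failed"
    return 1
  fi
  echo "tcp: ok"

  # Step 2: HTTPS request with a hard deadline. -k skips certificate
  # verification because only the handshake matters here.
  curl -sk --connect-timeout 5 --max-time 10 -o /dev/null \
    "https://${host}:${port}/version"
  local rc=$?
  if [ "$rc" -eq 28 ]; then
    # curl exit code 28 means the operation timed out.
    echo "tls: timed out"
    return 2
  elif [ "$rc" -ne 0 ]; then
    echo "tls: failed (curl exit $rc)"
    return 3
  fi
  echo "tls: ok"
}
```

For example, probe_api 192.168.105.2 6443 on indri checks the VM directly, while probe_api k8s.tail8d86e.ts.net 443 from another tailnet machine checks the path through Tailscale serve.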

Additionally, the volume mount solution with QEMU2 was complex:

  • Required NFS mount from sifaka → indri
  • Then minikube mount to pass through to VM
  • Two LaunchAgents/LaunchDaemons for persistence
  • macOS GUI approval required for network access

Why Docker?

The docker driver solves both problems:

  1. API server on localhost: Docker Desktop automatically forwards the container's API server port to a port on the host's localhost, so tailscale serve --tcp=443 tcp://localhost:PORT works

  2. Simpler volume mounts: Docker Desktop has built-in macOS file sharing. Paths shared with Docker are accessible inside containers.

  3. Official Tailscale recommendation: Tailscale's own Kubernetes guide uses minikube with the docker driver.


Implementation Summary

Infrastructure Changes

  1. Docker Desktop installed (manually, via brew install --cask docker)

    • Configured with 12GB memory in Docker Desktop settings
    • Kubernetes option disabled (using minikube instead)
  2. Docker minikube cluster created:

    minikube start \
      --driver=docker \
      --container-runtime=docker \
      --cpus=6 \
      --memory=11264 \
      --disk-size=200g \
      --apiserver-names=k8s.tail8d86e.ts.net,indri \
      --apiserver-port=6443 \
      --listen-address=0.0.0.0
    
  3. Tailscale serve configured for k8s API:

    • API server on localhost (port is dynamic with docker driver)
    • tailscale serve --service=svc:k8s --tcp=443 tcp://localhost:<PORT>
  4. Remote kubectl access working from gilbert:

    • Created mise-tasks/ensure-minikube-indri-kubectl-config script
    • Fetches certs from indri and sets up ~/.kube/minikube-indri/config.yml
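The contents of ensure-minikube-indri-kubectl-config are not reproduced in this plan. As a rough sketch of what such a script might do, the following copies the CA and client credentials from indri and writes a dedicated kubeconfig. Everything here is an assumption: the remote paths follow minikube's default layout for a profile named "minikube", the server URL assumes the svc:k8s Tailscale service resolves to k8s.tail8d86e.ts.net on port 443, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: the real mise task may fetch and lay out files differently.
KUBE_DIR="${KUBE_DIR:-$HOME/.kube/minikube-indri}"
SERVER_URL="${SERVER_URL:-https://k8s.tail8d86e.ts.net}"

# Remote paths (relative to the home directory on indri), assuming
# minikube's default layout for the "minikube" profile.
remote_cert_paths() {
  printf '%s\n' \
    ".minikube/ca.crt" \
    ".minikube/profiles/minikube/client.crt" \
    ".minikube/profiles/minikube/client.key"
}

setup_kubeconfig() {
  local path
  mkdir -p "$KUBE_DIR" || return 1

  # Copy the CA certificate and client credentials from indri.
  for path in $(remote_cert_paths); do
    scp "indri:${path}" "$KUBE_DIR/" || return 1
  done

  # Write cluster, user, and context entries into a dedicated kubeconfig.
  export KUBECONFIG="$KUBE_DIR/config.yml"
  kubectl config set-cluster minikube-indri \
    --server="$SERVER_URL" --certificate-authority="$KUBE_DIR/ca.crt" &&
  kubectl config set-credentials minikube-indri \
    --client-certificate="$KUBE_DIR/client.crt" --client-key="$KUBE_DIR/client.key" &&
  kubectl config set-context minikube-indri \
    --cluster=minikube-indri --user=minikube-indri
}
```

After setup_kubeconfig, kubectl --context=minikube-indri get nodes (with KUBECONFIG pointing at the new file) is a quick way to confirm access from gilbert.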

Ansible Roles Updated

  • ansible/roles/minikube/ - docker driver, removed qemu2/NFS/socket_vmnet
  • ansible/roles/tailscale_serve/ - removed svc:k8s (minikube role handles dynamic port)
  • Containerd registry mirrors configured for zot pull-through cache

ArgoCD Bootstrap

All apps deployed and synced from feature/p5.1-qemu2-migration branch:

| App | Status | Notes |
|---|---|---|
| tailscale-operator | Healthy | Manages Tailscale ingresses |
| argocd | Healthy | Self-managed |
| cloudnative-pg | Healthy | PostgreSQL operator |
| blumeops-pg | Progressing | PostgreSQL cluster starting |
| grafana | Progressing | Needs grafana-admin secret |
| grafana-config | Healthy | Dashboards and ingress |
| miniflux | Progressing | Needs miniflux-config secret |
| devpi | Progressing | Starting up |

Secrets Still Needed

After PR merge, apply these secrets manually:

    # Grafana admin password
    op inject -i argocd/manifests/grafana-config/secret-admin.yaml.tpl | kubectl --context=minikube-indri apply -f -

    # Miniflux config
    op inject -i argocd/manifests/miniflux/secret.yaml.tpl | kubectl --context=minikube-indri apply -f -

Technical Notes

API Server Port

With the docker driver, the API server's host port is dynamic: Docker maps a random host port to 6443 inside the container (the port set by --apiserver-port).

The minikube ansible role queries the port after cluster start and configures tailscale serve accordingly.
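The role's actual tasks are not shown here. As an illustration of the lookup, the sketch below reads the mapping with docker port and prints the matching tailscale serve command for review. It assumes the node container is named "minikube" (the default profile name); parse_host_port and print_serve_command are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Extract the host port from `docker port` output such as "0.0.0.0:32771"
# or "127.0.0.1:32771". Only the first mapping line is used.
parse_host_port() {
  local first_line
  first_line="$(printf '%s\n' "$1" | head -n 1)"
  # Strip everything up to and including the last colon.
  printf '%s\n' "${first_line##*:}"
}

# Look up the host port mapped to 6443/tcp in the "minikube" container
# (assumed name) and print the tailscale serve command without running it.
print_serve_command() {
  local mapping port
  mapping="$(docker port minikube 6443/tcp)" || return 1
  port="$(parse_host_port "$mapping")"
  echo "tailscale serve --service=svc:k8s --tcp=443 tcp://localhost:${port}"
}
```

Printing the command rather than executing it keeps the sketch side-effect free; the port changes whenever the container is recreated, so the lookup has to be repeated after each cluster start.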

Registry Mirror Configuration

Containerd uses /etc/containerd/certs.d/<registry>/hosts.toml files. The ansible role configures mirrors for:

  • registry.tail8d86e.ts.net (private images)
  • docker.io
  • ghcr.io
  • quay.io
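For reference, a hosts.toml file names the upstream server and one or more mirror hosts to try first. The helper below is a sketch that prints such a file; render_hosts_toml is a hypothetical name, and the exact mirror URL (including any path prefix the zot cache expects) is not specified in this plan, so it is passed in as a parameter.

```shell
#!/usr/bin/env bash
# Print a containerd hosts.toml that tries a mirror first and falls back
# to the upstream registry. Both URLs are supplied by the caller.
render_hosts_toml() {
  local upstream_url="$1" mirror_url="$2"
  cat <<EOF
server = "${upstream_url}"

[host."${mirror_url}"]
  capabilities = ["pull", "resolve"]
EOF
}
```

For example, render_hosts_toml https://ghcr.io https://registry.tail8d86e.ts.net prints a file suitable for /etc/containerd/certs.d/ghcr.io/hosts.toml, assuming the mirror serves ghcr.io content at its root.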

ProxyClass Renamed

Renamed the ProxyClass from crio-compat to default. The old name was misleading since we're no longer using CRI-O.

Volume Mounts for P6 (Kiwix/Transmission)

Two options available:

Option A (recommended): hostPath via Docker Desktop File Sharing

  1. Mount sifaka NFS share on indri macOS: /Volumes/torrents
  2. Docker Desktop file sharing exposes /Volumes into the Docker VM
  3. Pods use hostPath to access /Volumes/torrents
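A hostPath volume is resolved on the Kubernetes node, which with the docker driver is the minikube container, so it is worth confirming that the share is actually visible there before wiring up real workloads. The sketch below prints a throwaway pod manifest that lists the directory; the pod name, container name, and busybox image are illustrative choices, not part of this plan.

```shell
#!/usr/bin/env bash
# Print a one-shot pod manifest that mounts HOST_PATH read-only and lists it.
# Usage: render_hostpath_pod /Volumes/torrents
render_hostpath_pod() {
  local host_path="$1"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-check
spec:
  restartPolicy: Never
  containers:
    - name: ls
      image: busybox
      command: ["ls", "-la", "/data"]
      volumeMounts:
        - name: torrents
          mountPath: /data
          readOnly: true
  volumes:
    - name: torrents
      hostPath:
        path: ${host_path}
        type: Directory
EOF
}
```

Piping render_hostpath_pod /Volumes/torrents into kubectl --context=minikube-indri apply -f - and then reading the pod's logs shows whether the path is reachable. With type: Directory the pod fails to start if the path does not exist on the node, which makes a missing mount obvious.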

Option B: Update sifaka NFS exports for Docker network

  1. Add 192.168.49.0/24 to sifaka's NFS exports
  2. Pods mount NFS directly (network connectivity works after macOS approval)

Verification Checklist

  • Docker Desktop installed and running on indri
  • QEMU2 minikube deleted
  • Docker minikube running (6 CPUs, 11GB RAM)
  • API server accessible on localhost
  • Tailscale serve configured for svc:k8s
  • Remote kubectl access working from gilbert
  • Ansible roles updated for docker driver
  • socket_vmnet stopped
  • ArgoCD deployed and synced
  • All apps synced to feature branch
  • Apply app secrets (grafana-admin, miniflux-config)
  • Verify all apps healthy after secrets applied
  • Merge PR and reset apps to main branch
  • mise run indri-services-check passes

Post-Merge Steps

After PR is merged:

    # Reset all blumeops apps to main branch
    argocd app set apps --revision main
    argocd app set argocd --revision main
    argocd app set blumeops-pg --revision main
    argocd app set devpi --revision main
    argocd app set grafana-config --revision main
    argocd app set miniflux --revision main
    argocd app set tailscale-operator --revision main

    # Sync all apps
    argocd app sync apps
    argocd app sync argocd
    argocd app sync tailscale-operator
    argocd app sync blumeops-pg
    argocd app sync grafana-config
    argocd app sync miniflux
    argocd app sync devpi
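The same steps can be expressed as a loop with a dry-run switch, which makes it easy to review the commands before they touch the cluster. This is an optional sketch: the run and reset_apps_to_main helpers and the DRY_RUN variable are hypothetical, and the app list is copied from the commands above.

```shell
#!/usr/bin/env bash
# Apps listed in the post-merge steps of this plan, in sync order.
APPS="apps argocd tailscale-operator blumeops-pg grafana-config miniflux devpi"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Point every app at main first, then sync them one by one.
reset_apps_to_main() {
  local app
  for app in $APPS; do
    run argocd app set "$app" --revision main || return 1
  done
  for app in $APPS; do
    run argocd app sync "$app" || return 1
  done
}
```

Running DRY_RUN=1 reset_apps_to_main prints the fourteen argocd commands without executing them; dropping DRY_RUN executes them and stops at the first failure.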

Rollback Plan

If the docker driver doesn't work:

  1. Delete Docker minikube: minikube delete
  2. Recreate QEMU2 cluster (restore old ansible config from git)
  3. Accept the Tailscale TCP forwarding limitation and use SSH tunnel for remote kubectl
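For step 3, the tunnel could look like the sketch below, which forwards a local port to the QEMU2 API server address noted in the background section. tunnel_command is a hypothetical helper that only prints the ssh invocation; it assumes "indri" is a reachable SSH host alias and that the VM keeps the address 192.168.105.2:6443 after being recreated.

```shell
#!/usr/bin/env bash
# Print an ssh command that forwards LOCAL_PORT on this machine to the
# API server inside the QEMU2 VM, hopping through SSH_HOST.
# Usage: tunnel_command [SSH_HOST] [VM_ADDR] [LOCAL_PORT]
tunnel_command() {
  local ssh_host="${1:-indri}"
  local vm_addr="${2:-192.168.105.2:6443}"
  local local_port="${3:-6443}"
  # -N: no remote command, forwarding only. -L: local port forward.
  echo "ssh -N -L ${local_port}:${vm_addr} ${ssh_host}"
}
```

With the tunnel running, kubectl --server=https://127.0.0.1:6443 get nodes should reach the cluster, provided the API server certificate is valid for 127.0.0.1; that was not verified as part of this plan.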