blumeops/plans/k8s-migration/P5.1_qemu2_migration.md
Erich Blume 75f945385c Update P5.1 plan with completion status and P6 storage options
- Document completed steps (docker driver working, kubectl access, ansible updated)
- Add detailed analysis of volume mount options for P6
- Recommend hostPath via Docker Desktop file sharing as simplest approach
- Document why direct NFS won't work (Docker network isolation)
- Include sample LaunchDaemon for persistent NFS mount

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-21 14:05:26 -08:00

Phase 5.1: Migrate Minikube from QEMU2 to Docker Driver

Goal: Replace minikube's qemu2 driver with the docker driver to fix remote API access and simplify volume mounts.

Status: In Progress (2026-01-21) - Ansible roles updated, cluster running, awaiting ArgoCD redeploy

Prerequisites: Phase 5 complete


Background

Original Problem (Podman → QEMU2)

During Phase 6 (Kiwix/Transmission migration), we discovered that the podman driver has fundamental limitations that prevent mounting external volumes:

  1. SMB CSI driver fails with "Operation not permitted" - the rootless container lacks kernel-level mount capabilities
  2. minikube mount fails - 9p mount gets "permission denied" inside the podman VM
  3. hostPath volumes only work for paths inside the minikube container, not the macOS host

We migrated to QEMU2 to get a full VM with kernel capabilities.

New Problem (QEMU2 → Docker)

The QEMU2 driver introduced a new problem: the Kubernetes API server is inside the VM at 192.168.105.2:6443, and Tailscale's TCP proxy cannot forward to it properly:

  • TCP connections succeed (nc -zv works)
  • TLS handshake times out
  • Root cause unknown, but likely related to Tailscale serve's handling of non-localhost upstreams

Additionally, the volume mount solution with QEMU2 was complex:

  • Required NFS mount from sifaka → indri
  • Then minikube mount to pass through to VM
  • Two LaunchAgents/LaunchDaemons for persistence
  • macOS GUI approval required for network access

Why Docker?

The docker driver solves both problems:

  1. API Server on localhost: Docker Desktop handles port forwarding from container to localhost automatically, so tailscale serve --tcp=443 tcp://localhost:PORT works

  2. Simpler volume mounts: Docker Desktop has built-in macOS file sharing. Paths shared with Docker are accessible inside containers.

  3. Official Tailscale recommendation: Tailscale's own Kubernetes guide uses minikube with the docker driver.


Implementation Progress

Completed

  1. Docker Desktop installed (manually, via brew install --cask docker)

    • Configured with 12GB memory in Docker Desktop settings
    • Kubernetes option disabled (using minikube instead)
  2. QEMU2 minikube deleted (minikube stop && minikube delete)

  3. Docker minikube cluster created:

    minikube start \
      --driver=docker \
      --container-runtime=docker \
      --cpus=6 \
      --memory=11264 \
      --disk-size=200g \
      --apiserver-names=k8s.tail8d86e.ts.net,indri \
      --apiserver-port=6443 \
      --listen-address=0.0.0.0
    

    Note: Memory set to 11264MB (11GB) to leave headroom for Docker Desktop overhead.

  4. Tailscale serve configured for k8s API:

    • API server on localhost:50820 (port is dynamic with docker driver)
    • tailscale serve --service=svc:k8s --tcp=443 tcp://localhost:50820
  5. Remote kubectl access working from gilbert:

    • Created mise-tasks/ensure-minikube-indri-kubectl-config script
    • Fetches certs from indri and sets up ~/.kube/minikube-indri/config.yml
    • kubectl --context=minikube-indri get nodes works
  6. Ansible roles updated:

    • ansible/roles/minikube/ - docker driver, removed qemu2/NFS/socket_vmnet
    • ansible/roles/tailscale_serve/ - removed svc:k8s (minikube role handles dynamic port)
    • Containerd registry mirrors configured for zot pull-through cache
  7. QEMU2 artifacts cleaned up:

    • Stopped socket_vmnet service
    • Removed NFS LaunchDaemon
    • Removed minikube mount LaunchAgent
    • kubectl still works after cleanup
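The helper script from step 5 is not reproduced in this plan. As a rough sketch of the kind of kubeconfig it might write, the function below renders a config for the minikube-indri context. The function name render_kubeconfig and the certificate file names (ca.crt, client.crt, client.key) are assumptions for illustration, not the actual contents of mise-tasks/ensure-minikube-indri-kubectl-config.

```shell
#!/bin/sh
# Hypothetical sketch: print a kubeconfig for the remote minikube cluster.
#   $1 = API server URL (e.g. the Tailscale service address)
#   $2 = local directory holding ca.crt, client.crt and client.key
#        (file names are an assumption; they mirror minikube's own names)
render_kubeconfig() {
  server=$1
  cert_dir=$2
  cat <<EOF
apiVersion: v1
kind: Config
current-context: minikube-indri
clusters:
- name: minikube-indri
  cluster:
    server: $server
    certificate-authority: $cert_dir/ca.crt
contexts:
- name: minikube-indri
  context:
    cluster: minikube-indri
    user: minikube-indri
users:
- name: minikube-indri
  user:
    client-certificate: $cert_dir/client.crt
    client-key: $cert_dir/client.key
EOF
}
```

Assuming the certs have already been copied from indri into ~/.kube/minikube-indri/, usage could look like `render_kubeconfig https://k8s.tail8d86e.ts.net "$HOME/.kube/minikube-indri" > "$HOME/.kube/minikube-indri/config.yml"`.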

Remaining 📋

  1. Redeploy ArgoCD and apps - bootstrap the cluster with:

    # On indri - apply secrets first
    op inject -i argocd/manifests/tailscale-operator/secret.yaml.tpl | kubectl apply -f -
    
    # Create repo secret for ArgoCD
    PRIV_KEY=$(op read "op://vg6xf6vvfmoh5hqjjhlhbeoaie/csjncynh6htjvnh2l2da65y32q/private key?ssh-format=openssh")$'\n'
    kubectl create namespace argocd
    kubectl create secret generic repo-forge -n argocd \
      --from-literal=type=git \
      --from-literal=url='ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/blumeops.git' \
      --from-literal=insecure=true \
      --from-literal=sshPrivateKey="$PRIV_KEY"
    kubectl label secret repo-forge -n argocd argocd.argoproj.io/secret-type=repository
    
    # Bootstrap operators
    kubectl create namespace tailscale
    kubectl apply -k argocd/manifests/tailscale-operator/
    kubectl apply -k argocd/manifests/argocd/
    
    # Wait for ArgoCD
    kubectl wait --for=condition=available deployment/argocd-server -n argocd --timeout=300s
    
    # Login and sync apps
    argocd login argocd.tail8d86e.ts.net --username admin --grpc-web
    argocd app sync apps
    argocd app sync tailscale-operator
    argocd app sync cloudnative-pg
    argocd app sync blumeops-pg
    argocd app sync grafana
    argocd app sync grafana-config
    argocd app sync miniflux
    argocd app sync devpi
    
  2. Verify all services with mise run indri-services-check

  3. Apply the containerd registry mirrors to the running cluster (the ansible role is already updated; this happens on the next provision)


Technical Notes

API Server Port

With the docker driver, the API server port is dynamic: Docker maps a random host port to 6443 inside the container. The port at the time of writing is 50820.

The minikube ansible role queries the port after cluster start and configures tailscale serve accordingly.
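The role's actual lookup is not shown here. One way to do it, sketched under the assumption that the minikube container is named minikube (the default profile name), is to parse the output of `docker port`. The function names extract_port and apiserver_port are illustrative only.

```shell
#!/bin/sh
# Extract the host port from one line of `docker port` output,
# e.g. "0.0.0.0:50820" or "[::]:50820".
extract_port() {
  # Drop everything up to and including the last colon.
  printf '%s\n' "${1##*:}"
}

# Print the host port currently mapped to the API server (6443/tcp).
#   $1 = container name (assumed to default to "minikube")
apiserver_port() {
  mapping=$(docker port "${1:-minikube}" 6443/tcp) || return 1
  # docker may print one line per address family; the port is the same.
  mapping=$(printf '%s\n' "$mapping" | head -n 1)
  extract_port "$mapping"
}
```

With that in place, the serve command from the completed steps could be written as `tailscale serve --service=svc:k8s --tcp=443 "tcp://localhost:$(apiserver_port)"`, so it does not need editing when the port changes.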

Registry Mirror Configuration

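Containerd uses /etc/containerd/certs.d/<registry>/hosts.toml files. These files only affect image pulls made through containerd itself; because the cluster above was started with --container-runtime=docker, it is worth confirming that pulls actually hit the mirror.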

# /etc/containerd/certs.d/docker.io/hosts.toml
server = "https://registry-1.docker.io"

[host."http://host.minikube.internal:5050"]
  capabilities = ["pull", "resolve"]
  skip_verify = true

The ansible role configures mirrors for:

  • registry.tail8d86e.ts.net (private images)
  • docker.io
  • ghcr.io
  • quay.io
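Since every mirror entry follows the same shape, the files can be generated rather than written by hand. The sketch below is not the ansible role's template; render_hosts_toml is a hypothetical helper, and it assumes every registry is proxied through the same mirror address shown in the docker.io example.

```shell
#!/bin/sh
# Hypothetical helper: print a hosts.toml for one upstream registry.
#   $1 = upstream server URL (e.g. https://ghcr.io)
#   $2 = mirror URL (e.g. http://host.minikube.internal:5050)
render_hosts_toml() {
  upstream=$1
  mirror=$2
  cat <<EOF
server = "$upstream"

[host."$mirror"]
  capabilities = ["pull", "resolve"]
  skip_verify = true
EOF
}
```

For example, `render_hosts_toml https://quay.io http://host.minikube.internal:5050` prints the contents intended for /etc/containerd/certs.d/quay.io/hosts.toml inside the minikube node.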

Volume Mounts for P6 (Kiwix/Transmission)

With the docker driver, volume mounts work differently than podman or qemu2. Here's the analysis:

Current Network State:

  • Minikube container is on Docker network 192.168.49.0/24
  • Sifaka NFS exports /volume1/torrents to:
    • 192.168.105.0/24 (old qemu2 VM network - no longer used)
    • 100.64.0.0/10 (Tailscale CGNAT range)
  • Minikube can resolve sifaka (192.168.1.203) but can't reach it (100% packet loss due to Docker network isolation)

Option A: hostPath via Docker Desktop File Sharing RECOMMENDED

  1. Mount the sifaka NFS share on indri (macOS): mount -t nfs -o resvport,rw sifaka:/volume1/torrents /Volumes/torrents
  2. Docker Desktop file sharing exposes /Volumes into the Docker VM
  3. Pods use hostPath to access /Volumes/torrents

Pros:

  • Simplest approach, uses native Docker file sharing
  • No network reconfiguration needed on sifaka
  • Path is stable and predictable

Cons:

  • Requires persistent NFS mount on indri (LaunchDaemon)
  • File sharing performance may be slower than direct NFS

Implementation:

# Manual mount test
ssh indri 'sudo mkdir -p /Volumes/torrents && sudo mount -t nfs -o resvport,rw sifaka:/volume1/torrents /Volumes/torrents'

# Verify Docker can see it
ssh indri 'docker run --rm -v /Volumes/torrents:/data alpine ls /data'

# Pod manifest uses hostPath:
# volumes:
#   - name: torrents
#     hostPath:
#       path: /Volumes/torrents
#       type: Directory
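Because a hostPath volume will silently expose an empty directory if the NFS share is not mounted, a guard before mounting or deploying may be useful. The sketch below assumes macOS `mount` prints lines of the form `server:/export on /mountpoint (nfs, ...)`; nfs_mounted_at is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `mount` output on stdin and succeed only if
# an NFS filesystem is mounted at the given mount point.
#   $1 = mount point (e.g. /Volumes/torrents)
nfs_mounted_at() {
  # -F treats the pattern as a fixed string, so dots and parentheses
  # in the path are matched literally.
  grep -Fq -- " on $1 (nfs"
}
```

On indri this could be used as `mount | nfs_mounted_at /Volumes/torrents || sudo mount -t nfs -o resvport,rw sifaka:/volume1/torrents /Volumes/torrents`, which makes the mount step safe to re-run.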

Option B: Update sifaka NFS exports for Docker network

  1. Add 192.168.49.0/24 to sifaka's NFS exports
  2. Pods mount NFS directly using kubernetes NFS volume type

Cons:

  • Docker network might change (though 192.168.49.x seems stable for minikube)
  • Requires sifaka configuration change
  • NFS mount from inside container may have permission issues
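To see why Option B needs an export change, it helps to check whether the minikube node's address falls inside either range sifaka currently exports to. The sketch below is a plain CIDR membership test in POSIX shell arithmetic. The address 192.168.49.2 used in the usage note is an assumption (the usual first address on minikube's Docker network), and the check covers export ranges only, not routing or Docker's network isolation.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  # Split the address into four positional parameters, one per octet.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 lies inside CIDR block $2 (e.g. 192.168.49.0/24).
in_cidr() {
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Build the netmask: the top $bits bits set, truncated to 32 bits.
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

For instance, `in_cidr 192.168.49.2 192.168.105.0/24` and `in_cidr 192.168.49.2 100.64.0.0/10` both fail, while `in_cidr 192.168.49.2 192.168.49.0/24` succeeds, which matches the export that Option B proposes adding.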

Option C: Tailscale sidecar for NFS access

  1. Pods include a Tailscale sidecar that joins the tailnet
  2. Mount NFS via Tailscale IP (sifaka is at 100.x.x.x)

Cons:

  • Complex setup with sidecar containers
  • Each pod needs Tailscale auth
  • Overkill for this use case

Recommendation for P6: Use Option A (hostPath via Docker Desktop file sharing). It's the simplest and most reliable approach. We'll need a LaunchDaemon for the NFS mount, but it's straightforward:

<?xml version="1.0" encoding="UTF-8"?>
<!-- /Library/LaunchDaemons/com.blumeops.nfs-torrents.plist -->
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.blumeops.nfs-torrents</string>
    <key>ProgramArguments</key>
    <array>
        <string>/sbin/mount</string>
        <string>-t</string>
        <string>nfs</string>
        <string>-o</string>
        <string>resvport,rw</string>
        <string>sifaka:/volume1/torrents</string>
        <string>/Volumes/torrents</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>

This is simpler than the qemu2 approach because there's no intermediate minikube mount step - Docker Desktop handles the path passthrough automatically.


Verification Checklist

  • Docker Desktop installed and running on indri
  • QEMU2 minikube deleted
  • Docker minikube running (6 CPUs, 11GB RAM)
  • API server accessible on localhost:50820
  • Tailscale serve configured for svc:k8s → localhost:50820
  • Remote kubectl access working from gilbert
  • Ansible roles updated for docker driver
  • socket_vmnet stopped
  • ArgoCD redeployed and synced
  • All existing apps healthy (grafana, miniflux, devpi, etc.)
  • PostgreSQL cluster healthy
  • Containerd registry mirrors configured
  • mise run indri-services-check passes

Rollback Plan

If the docker driver doesn't work:

  1. Delete Docker minikube: minikube delete
  2. Recreate QEMU2 cluster (restore old ansible config from git)
  3. Accept the Tailscale TCP forwarding limitation and use SSH tunnel for remote kubectl