blumeops/plans/k8s-migration/P5.1_qemu2_migration.md
Erich Blume 7b60cca31e Document P6 blocker and add P5.1 QEMU2 migration plan (#37)
## Summary
- Document P6 (Kiwix/Transmission) blocker: podman driver cannot mount external volumes
- Add P5.1 plan to migrate minikube from podman to QEMU2 driver
- Update overview with corrected phase statuses and driver information

## Background

P6 implementation (`feature/p6-kiwix-transmission`) was completed but blocked because **all volume mount approaches failed** with the podman driver:

| Approach | Result |
|----------|--------|
| NFS volume | Failed - CAP_SYS_ADMIN required |
| SMB CSI driver | Failed - EPERM in rootless container |
| `minikube mount` (9p) | Failed - permission denied |
| hostPath | Failed - path doesn't exist in container |

Root cause: Podman driver runs minikube in a rootless container lacking kernel capabilities for filesystem mounts.

## What's Next

1. Merge this documentation PR
2. Execute P5.1 (QEMU2 migration) in a fresh session
3. Retry P6 with the QEMU2 driver

## Deployment and Testing
- [x] No deployment needed - documentation only
- [x] ArgoCD apps reset to main
- [x] Cluster healthy (except kiwix/transmission intentionally offline)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/37
2026-01-20 20:49:48 -08:00


Phase 5.1: Migrate Minikube from Podman to QEMU2 Driver

Goal: Replace the podman driver with qemu2 to enable proper volume mounts (hostPath, NFS, SMB CSI)

Status: Planning

Prerequisites: Phase 5 complete


Background

During Phase 6 (Kiwix/Transmission migration), we discovered that the podman driver has fundamental limitations that prevent mounting external volumes:

  1. SMB CSI driver fails with "Operation not permitted" - the rootless container lacks kernel-level mount capabilities
  2. minikube mount fails - 9p mount gets "permission denied" inside the podman VM
  3. hostPath volumes only work for paths inside the minikube container, not the macOS host

These are documented limitations of the podman driver, which is labeled "experimental" in the minikube documentation.

Failed P6 Attempt

Branch feature/p6-kiwix-transmission contains the P6 implementation that was blocked by these issues. The manifests are complete and were tested, but the pods could not mount the torrents volume.

What was tried:

  • NFS volume mounts - failed due to missing CAP_SYS_ADMIN in podman container
  • SMB CSI driver (v1.17.0) - mount fails with EPERM (same root cause)
  • minikube mount /Volumes/torrents:/Volumes/torrents - 9p mount permission denied
  • hostPath PV pointing to /Volumes/torrents - path doesn't exist inside minikube container
  • Installing cifs-utils in minikube VM - still fails at kernel level

All of these failures trace back to the same root cause: the podman driver runs minikube in a rootless container that lacks the kernel capabilities required for filesystem mounts.
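The root cause can be checked directly by inspecting the effective capability mask of a process inside the minikube node. The following is a minimal sketch, assuming a Linux environment with /proc available; the function names are hypothetical and not part of the repository. CAP_SYS_ADMIN is capability number 21 in the Linux capability list.

```shell
# CAP_SYS_ADMIN is capability number 21 in linux/capability.h.
CAP_SYS_ADMIN_BIT=21

# has_cap_sys_admin MASK (hypothetical helper)
# MASK is a hex capability mask without the 0x prefix, as printed on the
# CapEff line of /proc/<pid>/status. Returns 0 if bit 21 is set.
has_cap_sys_admin() {
    [ $(( (0x$1 >> CAP_SYS_ADMIN_BIT) & 1 )) -eq 1 ]
}

# effective_caps (hypothetical helper)
# Print the CapEff mask of the current process (Linux only).
effective_caps() {
    awk '/^CapEff:/ { print $2 }' /proc/self/status
}

# Example, run inside the node or container that attempts the mount:
#   has_cap_sys_admin "$(effective_caps)" && echo "CAP_SYS_ADMIN present" \
#       || echo "CAP_SYS_ADMIN missing"
```

A missing bit explains the EPERM failures above. A set bit is necessary but not sufficient: inside a rootless user namespace the capability applies only to that namespace, so NFS and CIFS mounts can still be refused by the kernel.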

Why QEMU2?

QEMU2 is commonly recommended as the minikube driver for Apple Silicon Macs. For example:

"Qemu emulator is the best option to run a Kubernetes Cluster using minikube on MAC arm64-based systems without any issues." — DevOpsCube

QEMU2 creates an actual VM (not a container), which has:

  • Full kernel capabilities for mounts
  • Proper 9p/virtio filesystem support
  • Native NFS client support

Plan

1. Export Current State

Before destroying the cluster, capture the current state:

# List all ArgoCD apps and their sync status
argocd app list

# Backup any runtime state that matters (should be minimal - everything is in git)
# Note: "get all" omits ConfigMaps, Secrets, PVCs, and custom resources
kubectl --context=minikube-indri get all --all-namespaces -o yaml > /tmp/k8s-backup.yaml

2. Stop and Delete Podman Minikube

# Stop the cluster
minikube stop

# Delete the cluster and all data
minikube delete

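# Verify no minikube container remains inside the podman machine
# (minikube delete removes the container, not the podman machine itself)
podman ps -a --filter name=minikube
podman machine list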

3. Update Ansible Roles for QEMU2

The installation must be orchestrated via Ansible, following the existing patterns in the podman and minikube roles.

Changes needed:

  1. Update ansible/roles/minikube/ role:

    • Change driver from podman to qemu2
    • Add QEMU as a dependency (via Brewfile or role)
    • Optionally add socket_vmnet for full networking support
    • Update any driver-specific configuration
  2. Update Brewfile:

    brew "qemu"
    # Optional: brew "socket_vmnet"
    
  3. Update minikube start command in role:

    minikube start \
      --driver=qemu2 \
      --cpus=4 \
      --memory=8192 \
      --disk-size=50g \
      --container-runtime=containerd \
      --kubernetes-version=stable
    
  4. Remove or update podman role (may still be useful for container builds)
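One way the role could keep the start command configurable is to assemble the arguments from variables with defaults. This is only a sketch: the MINIKUBE_* variable names and the function name are hypothetical, and the defaults simply mirror the values listed above. The --network=socket_vmnet flag is added only when socket_vmnet is explicitly enabled.

```shell
# minikube_start_args (hypothetical helper)
# Print the arguments for `minikube start`, one per line.
# The MINIKUBE_* variables are assumptions made for this sketch.
minikube_start_args() {
    printf '%s\n' \
        "--driver=qemu2" \
        "--cpus=${MINIKUBE_CPUS:-4}" \
        "--memory=${MINIKUBE_MEMORY:-8192}" \
        "--disk-size=${MINIKUBE_DISK:-50g}" \
        "--container-runtime=containerd" \
        "--kubernetes-version=${MINIKUBE_K8S_VERSION:-stable}"
    # socket_vmnet is optional; request it only when explicitly enabled.
    if [ "${MINIKUBE_SOCKET_VMNET:-no}" = "yes" ]; then
        printf '%s\n' "--network=socket_vmnet"
    fi
}

# Example (none of the arguments contain spaces, so word splitting is safe):
#   minikube start $(minikube_start_args)
```

In an Ansible role the same idea would more likely be expressed as role defaults feeding a command task; the shell form is used here only to stay consistent with the other snippets in this plan.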

4. Run Ansible to Create QEMU2 Cluster

# Run the updated minikube role
mise run provision-indri -- --tags minikube

# Verify cluster is running
minikube status
kubectl get nodes

5. Configure Host Path Access

With QEMU2, the host path must be exposed to the VM using one of two options:

Option A: Use minikube mount (9p)

# Start persistent mount (run in background or via launchd)
minikube mount /Volumes/torrents:/Volumes/torrents &

Option B: Use NFS export from macOS

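# Add NFS export on macOS (adjust -network/-mask to match the subnet the QEMU2 VM uses)
echo "/Volumes/torrents -alldirs -mapall=$(id -u):$(id -g) -network 192.168.0.0 -mask 255.255.0.0" | sudo tee -a /etc/exports
# Validate the exports file, then restart nfsd to apply it
sudo nfsd checkexports
sudo nfsd restart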

# In k8s, use NFS volume type directly
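
Appending to /etc/exports with tee -a adds a duplicate entry every time the step is re-run. A minimal sketch of an idempotent alternative follows; the helper names are hypothetical, and the export options are copied from Option B above.

```shell
# ensure_line FILE LINE (hypothetical helper)
# Append LINE to FILE only if an identical line is not already present.
ensure_line() {
    if ! grep -qxF -- "$2" "$1" 2>/dev/null; then
        printf '%s\n' "$2" >> "$1"
    fi
}

# torrents_export_line (hypothetical helper)
# Build the export entry from Option B for the invoking user.
# Call this as the normal user, not under sudo, so id -u/id -g are correct.
torrents_export_line() {
    printf '%s' "/Volumes/torrents -alldirs -mapall=$(id -u):$(id -g) -network 192.168.0.0 -mask 255.255.0.0"
}

# Example: edit a staging copy, then install it with sudo.
#   cp /etc/exports /tmp/exports.new 2>/dev/null || : > /tmp/exports.new
#   ensure_line /tmp/exports.new "$(torrents_export_line)"
#   sudo cp /tmp/exports.new /etc/exports && sudo nfsd checkexports && sudo nfsd restart
```

Working on a staging copy keeps the helper free of sudo and makes repeated provisioning runs safe.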

6. Test Volume Mount with Test Pod

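Create a test pod that mounts the torrents volume. The hostPath below assumes Option A, where /Volumes/torrents is mounted inside the minikube VM; with Option B, use an nfs volume instead: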

apiVersion: v1
kind: Pod
metadata:
  name: volume-test
  namespace: default
spec:
  containers:
    - name: test
      image: busybox
      command: ["sh", "-c", "ls -la /data && sleep 3600"]
      volumeMounts:
        - name: torrents
          mountPath: /data
  volumes:
    - name: torrents
      hostPath:
        path: /Volumes/torrents
        type: Directory

Verify:

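kubectl apply -f volume-test.yaml
# Wait for the pod to start before reading its logs
kubectl wait --for=condition=Ready pod/volume-test --timeout=120s
kubectl logs volume-test
kubectl exec volume-test -- ls -la /data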

7. Redeploy ArgoCD and Existing Apps

# Re-add ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Wait for ArgoCD to be ready
kubectl wait --for=condition=available deployment/argocd-server -n argocd --timeout=300s

# Re-configure ArgoCD (repo credentials, etc.)
# ... follow P1 setup steps ...

# Sync all apps
argocd app sync apps

8. Verify All Services

# Run health check
mise run indri-services-check

# Verify each k8s service
argocd app list
kubectl get pods --all-namespaces
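
Scanning the full pod list by eye is error-prone once every namespace is included. The sketch below filters the listing down to pods that need attention; the function name is hypothetical, and it assumes the default column layout of kubectl get pods --all-namespaces --no-headers (namespace, name, ready, status, ...).

```shell
# unhealthy_pods (hypothetical helper)
# Read `kubectl get pods --all-namespaces --no-headers` output on stdin and
# print "namespace/name ready status" for every pod that is neither
# Completed nor Running with all of its containers ready.
unhealthy_pods() {
    awk '
        NF < 4 { next }                  # skip blank or malformed lines
        $4 == "Completed" { next }       # finished jobs are fine
        {
            split($3, ready, "/")        # READY column looks like "1/1"
            if ($4 != "Running" || ready[1] != ready[2])
                print $1 "/" $2, $3, $4
        }
    '
}

# Example:
#   kubectl get pods --all-namespaces --no-headers | unhealthy_pods
```

Empty output means every pod is either Completed or fully ready. Services that are intentionally offline will still be listed and need to be ignored manually.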

9. Clean Up Test Pod

kubectl delete pod volume-test

Verification Checklist

  • Podman minikube deleted
  • QEMU2 minikube running
  • minikube mount or NFS working
  • Test pod can read /Volumes/torrents
  • ArgoCD redeployed and synced
  • All existing apps healthy (grafana, miniflux, devpi, etc.)
  • PostgreSQL cluster healthy
  • Test pod deleted
  • mise run indri-services-check passes (except intentionally offline services)

Rollback Plan

If QEMU2 doesn't work:

  1. Delete QEMU2 cluster: minikube delete
  2. Recreate podman cluster following P0/P1 steps
  3. Redeploy apps from git

All state is in git, so cluster recreation is straightforward.


Notes

  • The QEMU2 VM will likely use more resources than the podman setup (a dedicated VM sized at 4 CPUs and 8192 MB of memory, rather than a container)
  • First boot may be slower due to VM initialization
  • socket_vmnet provides better networking but requires sudo setup
  • Consider creating a LaunchAgent for minikube mount if using that approach
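
For the LaunchAgent approach, a minimal sketch of generating the property list is shown below. The label com.example.minikube-mount, the function name, and the minikube binary path in the example are all assumptions; Label, ProgramArguments, RunAtLoad, and KeepAlive are standard launchd keys.

```shell
# write_mount_agent PLIST MINIKUBE_BIN SRC DST (hypothetical helper)
# Write a launchd agent definition that runs `minikube mount SRC:DST`.
# The label below is a placeholder chosen for this sketch.
write_mount_agent() {
    cat > "$1" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.minikube-mount</string>
    <key>ProgramArguments</key>
    <array>
        <string>$2</string>
        <string>mount</string>
        <string>$3:$4</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
</dict>
</plist>
EOF
}

# Example (the Homebrew path is an assumption; check with `command -v minikube`):
#   plist=~/Library/LaunchAgents/com.example.minikube-mount.plist
#   write_mount_agent "$plist" /opt/homebrew/bin/minikube /Volumes/torrents /Volumes/torrents
#   launchctl bootstrap "gui/$(id -u)" "$plist"
```

KeepAlive makes launchd restart the mount if it exits, which also means it will keep retrying while the cluster is stopped. Whether that is acceptable depends on how often the cluster is expected to be down.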