## Summary

- Migrate minikube from podman driver to qemu2 driver for proper NFS/SMB volume mount support
- Update ansible minikube role with qemu installation and containerd runtime
- Remove podman role dependency from indri.yml
- Add synology user creation steps and post-migration zot reconfiguration notes

## Why

Phase 6 (Kiwix/Transmission migration) was blocked because the podman driver lacks kernel capabilities for filesystem mounts. QEMU2 creates an actual VM with full mount support.

## Deployment and Testing

- [ ] Create k8s-storage user on Synology DSM
- [ ] Store credentials in 1Password (synology-k8s-storage)
- [ ] Export current k8s state
- [ ] Stop and delete podman-based minikube cluster
- [ ] Run ansible to create QEMU2 cluster
- [ ] Test NFS volume mount with test pod
- [ ] Redeploy ArgoCD and all apps
- [ ] Verify all services healthy
- [ ] Reconfigure zot registry mirrors for containerd (post-migration)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/38
# Phase 5.1: Migrate Minikube from QEMU2 to Docker Driver

**Goal**: Replace the qemu2 driver with docker to fix remote API access and simplify volume mounts

**Status**: Complete (2026-01-21) - Cluster running, ArgoCD deployed, apps synced

**Prerequisites**: [Phase 5](P5_devpi.complete.md) complete

---

## Background

### Original Problem (Podman → QEMU2)

During Phase 6 (Kiwix/Transmission migration), we discovered that the **podman driver has fundamental limitations** that prevent mounting external volumes:

1. **SMB CSI driver fails** with "Operation not permitted" - the rootless container lacks kernel-level mount capabilities
2. **`minikube mount` fails** - 9p mount gets "permission denied" inside the podman VM
3. **hostPath volumes** only work for paths inside the minikube container, not the macOS host

We migrated to QEMU2 to get a full VM with kernel capabilities.

### New Problem (QEMU2 → Docker)

The QEMU2 driver introduced a **new problem**: the Kubernetes API server is inside the VM at `192.168.105.2:6443`, and Tailscale's TCP proxy cannot forward to it properly:

- TCP connections succeed (`nc -zv` works)
- TLS handshake times out
- Root cause unknown, but likely related to Tailscale serve's handling of non-localhost upstreams

Additionally, the volume mount solution with QEMU2 was complex:
- Required NFS mount from sifaka → indri
- Then `minikube mount` to pass through to VM
- Two LaunchAgents/LaunchDaemons for persistence
- macOS GUI approval required for network access

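The TCP-versus-TLS symptom described above can be reproduced with a small probe. This is a minimal sketch, not the procedure used during the migration: the function names `classify_probe` and `probe_endpoint` are hypothetical, and the `/version` path is only assumed to be a convenient URL to request. Any HTTP response at all proves the handshake completed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: tell "TCP connect fails" apart from "TLS handshake stalls".

# Map the exit codes of the two probes to a one-line diagnosis.
# curl exit code 28 means the operation timed out.
classify_probe() {
  local tcp_rc=$1 tls_rc=$2
  if [ "$tcp_rc" -ne 0 ]; then
    echo "tcp connect failed"
  elif [ "$tls_rc" -eq 0 ]; then
    echo "tcp ok, tls ok"
  elif [ "$tls_rc" -eq 28 ]; then
    echo "tcp ok, tls timed out"
  else
    echo "tcp ok, tls failed (curl exit $tls_rc)"
  fi
}

# Probe host:port with a bare TCP connect, then with a full HTTPS request.
probe_endpoint() {
  local host=$1 port=$2 tcp_rc=0 tls_rc=0
  nc -z -w 5 "$host" "$port" >/dev/null 2>&1 || tcp_rc=$?
  # -k skips certificate verification: only the handshake matters here.
  curl -sk -o /dev/null --connect-timeout 5 --max-time 10 \
    "https://$host:$port/version" || tls_rc=$?
  classify_probe "$tcp_rc" "$tls_rc"
}

# Example (hostname as used elsewhere in this document):
#   probe_endpoint k8s.tail8d86e.ts.net 443
```

A result of `tcp ok, tls timed out` matches the behaviour reported for the QEMU2 setup; it narrows the fault to the proxy path but does not identify the root cause.
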
### Why Docker?

The **docker driver** solves both problems:

1. **API Server on localhost**: Docker Desktop handles port forwarding from container to localhost automatically, so `tailscale serve --tcp=443 tcp://localhost:PORT` works

2. **Simpler volume mounts**: Docker Desktop has built-in macOS file sharing. Paths shared with Docker are accessible inside containers.

3. **Official Tailscale recommendation**: Tailscale's own [Kubernetes guide](https://tailscale.com/learn/managing-access-to-kubernetes-with-tailscale) uses minikube with the docker driver.

---

## Implementation Summary

### Infrastructure Changes

1. **Docker Desktop installed** (manual via `brew install --cask docker`)
   - Configured with 12GB memory in Docker Desktop settings
   - Kubernetes option disabled (using minikube instead)

2. **Docker minikube cluster created**:
   ```bash
   minikube start \
     --driver=docker \
     --container-runtime=docker \
     --cpus=6 \
     --memory=11264 \
     --disk-size=200g \
     --apiserver-names=k8s.tail8d86e.ts.net,indri \
     --apiserver-port=6443 \
     --listen-address=0.0.0.0
   ```

3. **Tailscale serve configured** for k8s API:
   - API server on localhost (port is dynamic with docker driver)
   - `tailscale serve --service=svc:k8s --tcp=443 tcp://localhost:<PORT>`

4. **Remote kubectl access working** from gilbert:
   - Created `mise-tasks/ensure-minikube-indri-kubectl-config` script
   - Fetches certs from indri and sets up `~/.kube/minikube-indri/config.yml`

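The remote kubectl setup in step 4 could look roughly like the sketch below. It is not the contents of `mise-tasks/ensure-minikube-indri-kubectl-config`, which this document does not show. It assumes minikube's default certificate locations for a profile named `minikube`, assumes the API is reachable at `https://k8s.tail8d86e.ts.net` (inferred from `--apiserver-names` and the `--tcp=443` serve rule), and uses hypothetical helper names `run` and `setup_kubeconfig`.

```shell
#!/usr/bin/env bash
# Sketch: build a dedicated kubeconfig on the workstation from certs copied
# off the minikube host. Set DRY_RUN=1 to print the commands instead of
# running them.
set -euo pipefail

REMOTE_HOST="${REMOTE_HOST:-indri}"                  # minikube host
API_URL="${API_URL:-https://k8s.tail8d86e.ts.net}"   # assumed endpoint
KUBE_DIR="${KUBE_DIR:-$HOME/.kube/minikube-indri}"   # directory named in step 4
CONTEXT="${CONTEXT:-minikube-indri}"                 # context name used in this document

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

setup_kubeconfig() {
  local cfg="$KUBE_DIR/config.yml"
  run mkdir -p "$KUBE_DIR"
  # Assumed: minikube's default cert paths on the host (profile "minikube").
  run scp "$REMOTE_HOST:.minikube/ca.crt" "$KUBE_DIR/ca.crt"
  run scp "$REMOTE_HOST:.minikube/profiles/minikube/client.crt" "$KUBE_DIR/client.crt"
  run scp "$REMOTE_HOST:.minikube/profiles/minikube/client.key" "$KUBE_DIR/client.key"
  # Write cluster, user, and context entries with the certs embedded.
  run kubectl config --kubeconfig="$cfg" set-cluster "$CONTEXT" \
    --server="$API_URL" --certificate-authority="$KUBE_DIR/ca.crt" --embed-certs=true
  run kubectl config --kubeconfig="$cfg" set-credentials "$CONTEXT" \
    --client-certificate="$KUBE_DIR/client.crt" --client-key="$KUBE_DIR/client.key" \
    --embed-certs=true
  run kubectl config --kubeconfig="$cfg" set-context "$CONTEXT" \
    --cluster="$CONTEXT" --user="$CONTEXT"
}

# Usage:
#   DRY_RUN=1 setup_kubeconfig    # preview
#   setup_kubeconfig              # apply
```

Keeping the config in its own file rather than merging into `~/.kube/config` means the whole setup can be discarded by deleting one directory.
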
### Ansible Roles Updated

- `ansible/roles/minikube/` - docker driver, removed qemu2/NFS/socket_vmnet
- `ansible/roles/tailscale_serve/` - removed svc:k8s (minikube role handles dynamic port)
- Containerd registry mirrors configured for zot pull-through cache

### ArgoCD Bootstrap

All apps deployed and synced from `feature/p5.1-qemu2-migration` branch:

| App | Status | Notes |
|-----|--------|-------|
| tailscale-operator | Healthy | Manages Tailscale ingresses |
| argocd | Healthy | Self-managed |
| cloudnative-pg | Healthy | PostgreSQL operator |
| blumeops-pg | Progressing | PostgreSQL cluster starting |
| grafana | Progressing | Needs grafana-admin secret |
| grafana-config | Healthy | Dashboards and ingress |
| miniflux | Progressing | Needs miniflux-config secret |
| devpi | Progressing | Starting up |

### Secrets Still Needed

After PR merge, apply these secrets manually:

```bash
# Grafana admin password
op inject -i argocd/manifests/grafana-config/secret-admin.yaml.tpl | kubectl --context=minikube-indri apply -f -

# Miniflux config
op inject -i argocd/manifests/miniflux/secret.yaml.tpl | kubectl --context=minikube-indri apply -f -
```

---

## Technical Notes

### API Server Port

With the docker driver, the API server port is **dynamic** - Docker maps a random host port to 6443 inside the container.

The minikube ansible role queries the port after cluster start and configures tailscale serve accordingly.

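One way to discover the dynamic port is to read it back from the kubeconfig that minikube writes. The sketch below is an assumption about how that could be done, not the ansible role's actual implementation: the helper names are hypothetical, the cluster entry is assumed to be named `minikube` (the default profile), and the server URL is assumed to carry an explicit port.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the commands they wrap are standard CLI calls.

# Extract the port from an API server URL such as https://127.0.0.1:32771.
port_from_server_url() {
  local url=$1
  local hostport=${url#*://}   # drop the scheme
  hostport=${hostport%%/*}     # drop any path
  echo "${hostport##*:}"       # keep what follows the last colon
}

# Read the server URL recorded in the kubeconfig for the given cluster name.
current_api_port() {
  local cluster=${1:-minikube}
  local url
  url=$(kubectl config view \
    -o jsonpath="{.clusters[?(@.name==\"$cluster\")].cluster.server}")
  port_from_server_url "$url"
}

# Print (rather than run) the serve command from this document with the port filled in.
serve_command() {
  echo "tailscale serve --service=svc:k8s --tcp=443 tcp://localhost:$1"
}

# Usage on the minikube host:
#   serve_command "$(current_api_port)"
```

Printing the command instead of executing it keeps the sketch safe to run repeatedly; because the host port can change when the cluster is recreated, the serve rule has to be refreshed after every `minikube start` on a new cluster.
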
### Registry Mirror Configuration

Containerd uses `/etc/containerd/certs.d/<registry>/hosts.toml` files. The ansible role configures mirrors for:
- `registry.tail8d86e.ts.net` (private images)
- `docker.io`
- `ghcr.io`
- `quay.io`

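For orientation, a `hosts.toml` for one upstream registry can be rendered as in the sketch below. The helper `render_hosts_toml` is hypothetical, and using `https://registry.tail8d86e.ts.net` as the mirror URL is an assumption: this document does not show the role's templates, and whether zot serves mirrored images at the root or under a path prefix depends on its sync configuration.

```shell
#!/usr/bin/env bash
# Hypothetical renderer for a containerd hosts.toml; the mirror URL is an assumption.

# Usage: render_hosts_toml <upstream-url> <mirror-url>
render_hosts_toml() {
  local upstream=$1 mirror=$2
  cat <<EOF
server = "$upstream"

[host."$mirror"]
  capabilities = ["pull", "resolve"]
EOF
}

# Example: write the file for docker.io (the upstream is Docker Hub's registry endpoint).
#   mkdir -p /etc/containerd/certs.d/docker.io
#   render_hosts_toml https://registry-1.docker.io https://registry.tail8d86e.ts.net \
#     > /etc/containerd/certs.d/docker.io/hosts.toml
```

With this layout containerd tries the mirror first and falls back to the `server` entry if the mirror cannot serve the image.
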
### ProxyClass Renamed

Renamed the ProxyClass from `crio-compat` to `default` - the old name was misleading since we're no longer using CRI-O.

### Volume Mounts for P6 (Kiwix/Transmission)

**Solution: Direct NFS from pods to sifaka** ✅ TESTED AND WORKING

Docker NATs outbound traffic through indri's LAN IP (192.168.1.50), so sifaka's NFS exports need to allow `192.168.1.0/24`.

Sifaka NFS exports configured:
- `192.168.1.0/24` - Docker containers via indri NAT
- `100.64.0.0/10` - Tailscale clients

Pods can mount NFS directly:
```yaml
volumes:
  - name: torrents
    nfs:
      server: sifaka
      path: /volume1/torrents
```
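
The volume definition above can be exercised with a throwaway pod before any real workload depends on it. This is a minimal sketch under stated assumptions: the helper `nfs_test_pod_manifest`, the pod name `nfs-mount-test`, and the `busybox` image are illustrative choices, not part of the original test.

```shell
#!/usr/bin/env bash
# Hypothetical smoke test: print a one-shot pod manifest that mounts the NFS export.

# Usage: nfs_test_pod_manifest <nfs-server> <export-path>
nfs_test_pod_manifest() {
  local server=$1 path=$2
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: nfs-mount-test
spec:
  restartPolicy: Never
  containers:
    - name: ls
      image: busybox
      command: ["ls", "-la", "/mnt/nfs"]
      volumeMounts:
        - name: export
          mountPath: /mnt/nfs
          readOnly: true
  volumes:
    - name: export
      nfs:
        server: $server
        path: $path
        readOnly: true
EOF
}

# Example, using the context name from elsewhere in this document:
#   nfs_test_pod_manifest sifaka /volume1/torrents \
#     | kubectl --context=minikube-indri apply -f -
#   kubectl --context=minikube-indri logs nfs-mount-test
#   kubectl --context=minikube-indri delete pod nfs-mount-test
```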

No LaunchAgents, no `minikube mount`, no SMB CSI driver needed.

---

## Verification Checklist

- [x] Docker Desktop installed and running on indri
- [x] QEMU2 minikube deleted
- [x] Docker minikube running (6 CPUs, 11GB RAM)
- [x] API server accessible on localhost
- [x] Tailscale serve configured for svc:k8s
- [x] Remote kubectl access working from gilbert
- [x] Ansible roles updated for docker driver
- [x] socket_vmnet stopped
- [x] ArgoCD deployed and synced
- [x] All apps synced to feature branch
- [x] Apply app secrets (grafana-admin, miniflux-db, devpi-root, eblume, borgmatic)
- [x] Verify all apps healthy after secrets applied
- [x] Miniflux database restored from borgmatic backup
- [ ] Merge PR and reset apps to main branch
- [ ] `mise run indri-services-check` passes

---

## Post-Merge Steps

After PR is merged:

```bash
# Reset all blumeops apps to main branch
argocd app set apps --revision main
argocd app set argocd --revision main
argocd app set blumeops-pg --revision main
argocd app set devpi --revision main
argocd app set grafana-config --revision main
argocd app set miniflux --revision main
argocd app set tailscale-operator --revision main

# Sync all apps
argocd app sync apps
argocd app sync argocd
argocd app sync tailscale-operator
argocd app sync blumeops-pg
argocd app sync grafana-config
argocd app sync miniflux
argocd app sync devpi
```

---

## Rollback Plan

If the Docker driver doesn't work:

1. Delete Docker minikube: `minikube delete`
2. Recreate QEMU2 cluster (restore old ansible config from git)
3. Accept the Tailscale TCP forwarding limitation and use an SSH tunnel for remote kubectl
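
The SSH tunnel in step 3 could be set up along the lines of the sketch below. The helper `tunnel_command` is hypothetical, and the VM address is the one quoted in the Background section; whether the API server certificate is valid for `127.0.0.1` depends on how the QEMU2 cluster is recreated, so treat the `kubectl` line as an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical fallback helper: build the ssh port-forward for remote kubectl.

# Usage: tunnel_command <ssh-host> <vm-ip> [local-port]
tunnel_command() {
  local ssh_host=$1 vm_ip=$2 local_port=${3:-6443}
  # -N: run no remote command; -L: forward local_port to the API server in the VM.
  echo "ssh -N -L ${local_port}:${vm_ip}:6443 ${ssh_host}"
}

# Example with the QEMU2 VM address from the Background section:
#   $(tunnel_command indri 192.168.105.2) &
#   kubectl --server=https://127.0.0.1:6443 get nodes
```

The function only prints the command, so it can be inspected or wrapped in a LaunchAgent before anything is forwarded.
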