Document P6 blocker and add P5.1 QEMU2 migration plan

P6 (Kiwix/Transmission) is blocked because podman driver cannot mount external volumes (NFS, SMB, hostPath all fail due to missing CAP_SYS_ADMIN).

Changes:
- Add P5.1 plan to migrate minikube from podman to qemu2 driver
- Update P6 with blocker documentation and reference to feature branch
- Update overview with new phase and corrected driver info
- Mark P1-P5 as complete in overview

Implementation branch: feature/p6-kiwix-transmission (manifests complete, untested)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

parent b97d461a5a
commit 816443bfb5
3 changed files with 283 additions and 22 deletions

@@ -7,12 +7,13 @@ This plan details a phased migration of blumeops services from direct hosting on
 | Phase | Name | Status | Description |
 |-------|------|--------|-------------|
 | 0 | [Foundation](P0_foundation.complete.md) | Complete | Container registry + minikube cluster |
-| 1 | [K8s Infrastructure](P1_k8s_infrastructure.md) | In Progress | Tailscale operator, ArgoCD, CloudNativePG, PostgreSQL cluster |
-| 2 | [Grafana](P2_grafana.md) | Pending | Migrate Grafana (pilot) via ArgoCD |
-| 3 | [PostgreSQL](P3_postgresql.md) | Pending | Data migration to k8s PostgreSQL |
-| 4 | [Miniflux](P4_miniflux.md) | Pending | Migrate Miniflux via ArgoCD |
-| 5 | [devpi](P5_devpi.md) | Pending | Migrate devpi via ArgoCD |
-| 6 | [Kiwix](P6_kiwix.md) | Pending | Migrate Kiwix via ArgoCD |
+| 1 | [K8s Infrastructure](P1_k8s_infrastructure.md) | Complete | Tailscale operator, ArgoCD, CloudNativePG, PostgreSQL cluster |
+| 2 | [Grafana](P2_grafana.complete.md) | Complete | Migrate Grafana (pilot) via ArgoCD |
+| 3 | [PostgreSQL](P3_postgresql.complete.md) | Complete | Data migration to k8s PostgreSQL |
+| 4 | [Miniflux](P4_miniflux.complete.md) | Complete | Migrate Miniflux via ArgoCD |
+| 5 | [devpi](P5_devpi.complete.md) | Complete | Migrate devpi via ArgoCD |
+| 5.1 | [QEMU2 Migration](P5.1_qemu2_migration.md) | Pending | Switch minikube from podman to qemu2 driver |
+| 6 | [Kiwix](P6_kiwix.md) | Blocked | Migrate Kiwix + Transmission via ArgoCD (blocked on P5.1) |
 | 7 | [Forgejo](P7_forgejo.md) | Pending | Migrate Forgejo (highest risk) via ArgoCD |
 | 8 | [Woodpecker](P8_woodpecker.md) | Pending | Deploy CI/CD via ArgoCD |
 | 9 | [Cleanup](P9_cleanup.md) | Pending | Remove deprecated services |

@@ -28,13 +29,13 @@ This plan details a phased migration of blumeops services from direct hosting on
 | **Borgmatic** | Backup system |
 | **Grafana-alloy** | Metrics/logs collector on host |
 | **Plex** | Until Jellyfin replacement |
-| **Transmission** | Downloads for kiwix ZIM files |

 ### Services Moving to K8s
 | Service | Complexity | Dependencies |
 |---------|------------|--------------|
 | Grafana | LOW | Phase 1 |
-| Kiwix | LOW | Phase 1 |
+| Kiwix | MEDIUM | Phase 5.1 (QEMU2), shared storage |
+| Transmission | MEDIUM | Phase 5.1 (QEMU2), shared storage |
 | Miniflux | MEDIUM | PostgreSQL |
 | devpi | MEDIUM | Registry |
 | PostgreSQL | HIGH | Phase 1 |

@@ -51,11 +52,12 @@ This plan details a phased migration of blumeops services from direct hosting on
 - Config: `~/.config/zot/config.json`
 - Data: `~/zot/`

-### Minikube Driver: Podman
-- Rootless containers for better security
-- Lighter than full VM (QEMU)
-- Uses existing container ecosystem
-- `minikube start --driver=podman --container-runtime=cri-o`
+### Minikube Driver: QEMU2 (migrating from Podman)
+- **Original choice (Podman)** proved unable to mount external volumes (NFS, SMB, hostPath)
+- Podman's rootless containers lack CAP_SYS_ADMIN for filesystem mounts
+- **QEMU2** creates an actual VM with full kernel capabilities
+- Phase 5.1 handles the migration from podman to qemu2
+- `minikube start --driver=qemu2 --container-runtime=containerd`

 ### PostgreSQL: CloudNativePG Operator
 - Production-grade operator

235
plans/k8s-migration/P5.1_qemu2_migration.md
Normal file
@@ -0,0 +1,235 @@
# Phase 5.1: Migrate Minikube from Podman to QEMU2 Driver

**Goal**: Replace the podman driver with qemu2 to enable proper volume mounts (hostPath, NFS, SMB CSI)

**Status**: Planning

**Prerequisites**: [Phase 5](P5_devpi.complete.md) complete

---

## Background

During Phase 6 (Kiwix/Transmission migration), we discovered that the **podman driver has fundamental limitations** that prevent mounting external volumes:

1. **SMB CSI driver fails** with "Operation not permitted" - the rootless container lacks kernel-level mount capabilities
2. **`minikube mount` fails** - 9p mount gets "permission denied" inside the podman VM
3. **hostPath volumes** only work for paths inside the minikube container, not the macOS host

These are documented limitations of the podman driver, which is labeled "experimental" in the [minikube documentation](https://minikube.sigs.k8s.io/docs/drivers/podman/).

### Failed P6 Attempt

Branch `feature/p6-kiwix-transmission` contains the P6 implementation that was blocked by these issues. The manifests are complete, but they could not be validated end to end because the torrents volume could not be mounted.

**What was tried:**
- NFS volume mounts - failed due to missing CAP_SYS_ADMIN in podman container
- SMB CSI driver (v1.17.0) - mount fails with EPERM (same root cause)
- `minikube mount /Volumes/torrents:/Volumes/torrents` - 9p mount permission denied
- hostPath PV pointing to `/Volumes/torrents` - path doesn't exist inside minikube container
- Installing cifs-utils in the minikube container - still fails at kernel level

All of these failures trace back to the same root cause: the podman driver runs minikube in a rootless container that lacks the kernel capabilities required for filesystem mounts.

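
The shared root cause can be checked directly. The sketch below decodes the effective-capability bitmask that Linux exposes as `CapEff` in `/proc/self/status` and tests bit 21, which is `CAP_SYS_ADMIN`. The helper names `has_cap_sys_admin` and `current_capeff` are hypothetical and not part of the P6 branch; the idea is to run them inside the minikube node (for example after `minikube ssh`).

```bash
#!/usr/bin/env bash
# Sketch: report whether a capability bitmask includes CAP_SYS_ADMIN.
# has_cap_sys_admin and current_capeff are illustrative helpers, not part of the repo.

# CAP_SYS_ADMIN is capability number 21 in linux/capability.h.
CAP_SYS_ADMIN_BIT=21

# Usage: has_cap_sys_admin <hex mask, e.g. a CapEff value>
# Returns 0 if the CAP_SYS_ADMIN bit is set, 1 otherwise.
has_cap_sys_admin() {
  local mask_hex="$1"
  (( (16#${mask_hex} >> CAP_SYS_ADMIN_BIT) & 1 ))
}

# Print the effective capability mask of the current process (Linux only).
current_capeff() {
  awk '/^CapEff:/ {print $2}' /proc/self/status
}
```

Inside the node, `has_cap_sys_admin "$(current_capeff)"` succeeding would suggest mounts are at least permitted by capabilities; a failure matches the EPERM behaviour described above.
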
### Why QEMU2?

Multiple sources recommend QEMU2 as the best driver for Apple Silicon Macs:

> "Qemu emulator is the best option to run a Kubernetes Cluster using minikube on MAC arm64-based systems without any issues."
> Source: [DevOpsCube](https://devopscube.com/minikube-mac/)

QEMU2 creates an actual VM (not a container), which has:
- Full kernel capabilities for mounts
- Proper 9p/virtio filesystem support
- Native NFS client support

---

## Plan

### 1. Export Current State

Before destroying the cluster, capture the current state:

```bash
# List all ArgoCD apps and their sync status
argocd app list

# Backup any runtime state that matters (should be minimal - everything is in git)
# Note: `get all` omits Secrets, ConfigMaps, PVCs, and custom resources,
# and it does not capture data stored in persistent volumes
kubectl --context=minikube-indri get all --all-namespaces -o yaml > /tmp/k8s-backup.yaml
```

### 2. Stop and Delete Podman Minikube

```bash
# Stop the cluster
minikube stop

# Delete the cluster and all data
minikube delete

# Verify podman VM is cleaned up
podman machine list
```

### 3. Update Ansible Roles for QEMU2

The installation must be orchestrated via ansible, following the existing patterns for `podman` and `minikube` roles.

**Changes needed:**

1. **Update `ansible/roles/minikube/` role:**
   - Change driver from `podman` to `qemu2`
   - Add QEMU as a dependency (via Brewfile or role)
   - Optionally add socket_vmnet for full networking support
   - Update any driver-specific configuration

2. **Update `Brewfile`:**
   ```ruby
   brew "qemu"
   # Optional: brew "socket_vmnet"
   ```

3. **Update minikube start command in role:**
   ```bash
   minikube start \
     --driver=qemu2 \
     --cpus=4 \
     --memory=8192 \
     --disk-size=50g \
     --container-runtime=containerd \
     --kubernetes-version=stable
   ```

4. **Remove or update podman role** (may still be useful for container builds)

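
If the role ends up wrapping the start command in a script, the flags can be assembled from overridable defaults. The following is a minimal sketch under that assumption; the environment variable names `MINIKUBE_CPUS`, `MINIKUBE_MEMORY_MB`, and `MINIKUBE_DISK` and the function `minikube_start_args` are hypothetical and do not exist in the current role.

```bash
#!/usr/bin/env bash
# Sketch: assemble the `minikube start` flags from overridable defaults.
# MINIKUBE_CPUS, MINIKUBE_MEMORY_MB and MINIKUBE_DISK are hypothetical variable names.

# Print one flag per line so callers can read them into an array.
minikube_start_args() {
  local cpus="${MINIKUBE_CPUS:-4}"
  local memory_mb="${MINIKUBE_MEMORY_MB:-8192}"
  local disk="${MINIKUBE_DISK:-50g}"
  printf '%s\n' \
    "--driver=qemu2" \
    "--cpus=${cpus}" \
    "--memory=${memory_mb}" \
    "--disk-size=${disk}" \
    "--container-runtime=containerd" \
    "--kubernetes-version=stable"
}

# Start the cluster only when explicitly requested, so sourcing this file has no side effects.
if [[ "${1:-}" == "start" ]]; then
  args=()
  # A read loop is used instead of mapfile so this also works with the bash 3.2 shipped on macOS.
  while IFS= read -r flag; do
    args+=("$flag")
  done < <(minikube_start_args)
  minikube start "${args[@]}"
fi
```

Keeping the defaults identical to the command above means the script behaves the same as the plain invocation unless a variable is set.
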
### 4. Run Ansible to Create QEMU2 Cluster

```bash
# Run the updated minikube role
mise run provision-indri -- --tags minikube

# Verify cluster is running
minikube status
kubectl get nodes
```

### 5. Configure Host Path Access

With QEMU2, expose the host directory to the cluster using one of the following options:

**Option A: Use `minikube mount` (9p)**
```bash
# Start persistent mount (run in background or via launchd)
minikube mount /Volumes/torrents:/Volumes/torrents &
```

**Option B: Use NFS export from macOS**
```bash
# Add NFS export on macOS
echo "/Volumes/torrents -alldirs -mapall=$(id -u):$(id -g) -network 192.168.0.0 -mask 255.255.0.0" | sudo tee -a /etc/exports
sudo nfsd restart

# In k8s, use NFS volume type directly
```

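
Because Option B appends to `/etc/exports` with `tee -a`, running it twice would add the same export twice. A minimal sketch of an idempotent variant follows; `torrents_export_line` and `ensure_line` are hypothetical helper names, and the export options are copied from the command above.

```bash
#!/usr/bin/env bash
# Sketch: add an NFS export line only if it is not already present.
# torrents_export_line and ensure_line are illustrative helper names.

# Print the export entry used in Option B for the given uid and gid.
torrents_export_line() {
  local uid="$1" gid="$2"
  printf '%s\n' "/Volumes/torrents -alldirs -mapall=${uid}:${gid} -network 192.168.0.0 -mask 255.255.0.0"
}

# Append <line> to <file> unless an identical line already exists.
ensure_line() {
  local file="$1" line="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

On the host this would be run with root privileges, for example `ensure_line /etc/exports "$(torrents_export_line "$(id -u)" "$(id -g)")"` from a root shell, followed by `sudo nfsd restart` as in Option B.
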
### 6. Test Volume Mount with Test Pod

Create a test pod that mounts the torrents volume:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
  namespace: default
spec:
  containers:
    - name: test
      image: busybox
      command: ["sh", "-c", "ls -la /data && sleep 3600"]
      volumeMounts:
        - name: torrents
          mountPath: /data
  volumes:
    - name: torrents
      hostPath:
        path: /Volumes/torrents
        type: Directory
```

Verify:
```bash
kubectl apply -f volume-test.yaml
kubectl logs volume-test
kubectl exec volume-test -- ls -la /data
```

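
The verification commands above can race with pod startup, since `kubectl logs` and `kubectl exec` fail while the container is still being created. A minimal sketch that waits for readiness first is shown below; `verify_volume_pod` and the `KUBECTL` override are hypothetical names introduced here so the command can be stubbed or pointed at a specific context.

```bash
#!/usr/bin/env bash
# Sketch: apply the test pod, wait until it is Ready, then list the mount.
# verify_volume_pod and the KUBECTL override are illustrative, not part of the repo.

verify_volume_pod() {
  local manifest="${1:-volume-test.yaml}"
  local kubectl_cmd="${KUBECTL:-kubectl}"

  "$kubectl_cmd" apply -f "$manifest" || return 1
  # Block until the pod reports Ready, or fail after two minutes.
  "$kubectl_cmd" wait --for=condition=Ready pod/volume-test --timeout=120s || return 1
  # List the mounted directory from inside the pod.
  "$kubectl_cmd" exec volume-test -- ls -la /data
}
```

With `type: Directory`, a missing `/Volumes/torrents` inside the VM should keep the pod from starting, so a timeout in the wait step is itself a useful signal that the host path is not visible.
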
### 7. Redeploy ArgoCD and Existing Apps

```bash
# Re-add ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Wait for ArgoCD to be ready
kubectl wait --for=condition=available deployment/argocd-server -n argocd --timeout=300s

# Re-configure ArgoCD (repo credentials, etc.)
# ... follow P1 setup steps ...

# Sync all apps
argocd app sync apps
```

### 8. Verify All Services

```bash
# Run health check
mise run indri-services-check

# Verify each k8s service
argocd app list
kubectl get pods --all-namespaces
```

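
Scanning the full pod list by eye is error-prone after a rebuild. As a rough sketch, the listing can be filtered down to pods that need attention; `unhealthy_pods` is a hypothetical helper that reads the output of `kubectl get pods --all-namespaces --no-headers` on stdin.

```bash
#!/usr/bin/env bash
# Sketch: list pods that are neither Running nor Completed.
# unhealthy_pods is an illustrative helper; it expects
# `kubectl get pods --all-namespaces --no-headers` output on stdin.

unhealthy_pods() {
  # Columns with --all-namespaces: NAMESPACE NAME READY STATUS RESTARTS AGE
  awk '$4 != "Running" && $4 != "Completed" {print $1 "/" $2 " " $4}'
}
```

A typical invocation would be `kubectl get pods --all-namespaces --no-headers | unhealthy_pods`. Empty output only means every pod is Running or Completed; it does not check the READY column, so the ArgoCD health status remains the more reliable signal.
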
### 9. Clean Up Test Pod

```bash
kubectl delete pod volume-test
```

---

## Verification Checklist

- [ ] Podman minikube deleted
- [ ] QEMU2 minikube running
- [ ] `minikube mount` or NFS working
- [ ] Test pod can read `/Volumes/torrents`
- [ ] ArgoCD redeployed and synced
- [ ] All existing apps healthy (grafana, miniflux, devpi, etc.)
- [ ] PostgreSQL cluster healthy
- [ ] Test pod deleted
- [ ] `mise run indri-services-check` passes (except intentionally offline services)

---

## Rollback Plan

If QEMU2 doesn't work:

1. Delete QEMU2 cluster: `minikube delete`
2. Recreate podman cluster following P0/P1 steps
3. Redeploy apps from git

All cluster configuration is in git, so cluster recreation is straightforward. Data held in persistent volumes (such as the PostgreSQL cluster) is not in git and must be restored separately.

---

## Notes

- The QEMU2 VM will use more resources than podman (actual VM vs container)
- First boot may be slower due to VM initialization
- socket_vmnet provides better networking but requires sudo setup
- Consider creating a LaunchAgent for `minikube mount` if using that approach

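
For the LaunchAgent idea in the notes, a minimal sketch of generating the plist from a shell function follows. The label `local.blumeops.minikube-mount`, the default binary path `/opt/homebrew/bin/minikube`, and the function name `render_mount_plist` are all assumptions made for illustration; adjust them to the actual installation.

```bash
#!/usr/bin/env bash
# Sketch: render a LaunchAgent plist that keeps `minikube mount` running.
# The label, default binary path, and function name are assumptions for illustration.

# Usage: render_mount_plist [minikube binary] [host:vm mount spec]
render_mount_plist() {
  local minikube_bin="${1:-/opt/homebrew/bin/minikube}"
  local mount_spec="${2:-/Volumes/torrents:/Volumes/torrents}"
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>local.blumeops.minikube-mount</string>
  <key>ProgramArguments</key>
  <array>
    <string>${minikube_bin}</string>
    <string>mount</string>
    <string>${mount_spec}</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
</dict>
</plist>
EOF
}
```

The output could be written to a file under `~/Library/LaunchAgents/` and loaded with `launchctl`. `KeepAlive` restarts the mount process if it exits, which suits a long-running foreground command like `minikube mount`, though whether the mount recovers cleanly after a VM restart would need to be tested.
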
@@ -1,10 +1,34 @@
 # Phase 6: Kiwix and Transmission Migration

-**Goal**: Migrate kiwix-serve and transmission torrent daemon to k8s with SMB storage on sifaka
+**Goal**: Migrate kiwix-serve and transmission torrent daemon to k8s with shared storage

-**Status**: Planning
+**Status**: BLOCKED - waiting for [Phase 5.1](P5.1_qemu2_migration.md) (QEMU2 migration)

-**Prerequisites**: [Phase 5](P5_devpi.complete.md) complete
+**Prerequisites**: [Phase 5.1](P5.1_qemu2_migration.md) complete (minikube on QEMU2 driver)

 ---

+## Blocker: Podman Driver Volume Mount Limitations
+
+**First attempt branch:** `feature/p6-kiwix-transmission`
+
+The initial implementation was completed, but **all volume mount approaches failed** due to the podman driver's rootless container limitations, so it could not be validated end to end:
+
+| Approach | Result |
+|----------|--------|
+| NFS volume | Failed - CAP_SYS_ADMIN required for NFS mounts |
+| SMB CSI driver | Failed - `mount.cifs` returns EPERM inside rootless container |
+| `minikube mount` (9p) | Failed - permission denied mounting into podman VM |
+| hostPath | Failed - path doesn't exist inside minikube container |
+
+**Root cause:** The podman driver runs minikube in a rootless container that lacks kernel capabilities for filesystem mounts. This is a [documented limitation](https://minikube.sigs.k8s.io/docs/drivers/podman/) of the experimental podman driver.
+
+**Solution:** Phase 5.1 migrates minikube from podman to QEMU2 driver, which creates an actual VM with full kernel capabilities.
+
+**What's preserved:**
+- All k8s manifests in `feature/p6-kiwix-transmission` are complete (not yet validated end to end)
+- Prerequisites (SMB share, k8s-smb user, data rsync) are done
+- Can retry P6 immediately after P5.1 completes
+
+---
+

@@ -38,14 +62,14 @@ New architecture in k8s:

 ## Architecture Decisions

-### Storage: SMB on Sifaka
+### Storage: SMB on Sifaka (or NFS after QEMU2 migration)

-**Why SMB instead of NFS:**
-- Minikube with podman driver lacks CAP_SYS_ADMIN required for NFS mounts
-- SMB already works reliably with Synology (used for other shares)
-- SMB CSI driver ([csi-driver-smb](https://github.com/kubernetes-csi/csi-driver-smb)) is well-maintained
-- Supports ReadWriteMany access mode for concurrent pod access
+**Note:** The original plan chose SMB over NFS, but both failed with the podman driver. After the QEMU2 migration, either should work. SMB is still preferred for:
+- Native Synology SMB support with good macOS compatibility
+- ReadWriteMany access mode for concurrent pod access
+- SMB CSI driver already mirrored to forge
+
+**Alternative after QEMU2:** A hostPath volume backed by `minikube mount` (9p) or a direct NFS volume type may be simpler.

 **Storage path:** `/volume1/torrents/` on sifaka (SMB share name: `torrents`)
 - General-purpose torrent download directory
