Update P5.1 plan with completion status and P6 storage options
- Document completed steps (docker driver working, kubectl access, ansible updated)
- Add detailed analysis of volume mount options for P6
- Recommend hostPath via Docker Desktop file sharing as simplest approach
- Document why direct NFS won't work (Docker network isolation)
- Include sample LaunchDaemon for persistent NFS mount

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
parent 9fac4439b1
commit 75f945385c
1 changed files with 178 additions and 241 deletions
|
|
@@ -2,7 +2,7 @@
|
|||
|
||||
**Goal**: Replace the qemu2 driver with docker to fix remote API access and simplify volume mounts
|
||||
|
||||
**Status**: In Progress (2026-01-21)
|
||||
**Status**: In Progress (2026-01-21) - Ansible roles updated, cluster running, awaiting ArgoCD redeploy
|
||||
|
||||
**Prerequisites**: [Phase 5](P5_devpi.complete.md) complete
|
||||
|
||||
|
|
@@ -38,269 +38,232 @@ Additionally, the volume mount solution with QEMU2 was complex:
|
|||
|
||||
The **docker driver** solves both problems:
|
||||
|
||||
1. **API Server on localhost**: Docker Desktop handles port forwarding from container to localhost automatically, so `tailscale serve --tcp=443 tcp://localhost:PORT` will work (like podman did)
|
||||
1. **API Server on localhost**: Docker Desktop handles port forwarding from container to localhost automatically, so `tailscale serve --tcp=443 tcp://localhost:PORT` works
|
||||
|
||||
2. **Simpler volume mounts**: Docker Desktop has built-in macOS file sharing. Paths shared with Docker are accessible inside containers, and minikube (running in Docker) can use those paths via hostPath.
|
||||
2. **Simpler volume mounts**: Docker Desktop has built-in macOS file sharing. Paths shared with Docker are accessible inside containers.
|
||||
|
||||
3. **Official Tailscale recommendation**: Tailscale's own [Kubernetes guide](https://tailscale.com/learn/managing-access-to-kubernetes-with-tailscale) uses minikube with the docker driver.
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
## Implementation Progress
|
||||
|
||||
### 1. Install Docker Desktop (Manual - Before Ansible)
|
||||
### Completed ✅
|
||||
|
||||
Docker Desktop requires GUI setup, so install manually first:
|
||||
1. **Docker Desktop installed** (manual via `brew install --cask docker`)
|
||||
- Configured with 12GB memory in Docker Desktop settings
|
||||
- Kubernetes option disabled (using minikube instead)
|
||||
|
||||
```bash
|
||||
# On indri:
|
||||
brew install --cask docker-desktop
|
||||
2. **QEMU2 minikube deleted** (`minikube stop && minikube delete`)
|
||||
|
||||
# Then launch Docker Desktop from /Applications
|
||||
# Complete the setup wizard (accept license, skip tutorial)
|
||||
# Wait for Docker to be "Running" (green icon in menu bar)
|
||||
|
||||
# Verify:
|
||||
docker version
|
||||
docker run hello-world
|
||||
```
|
||||
|
||||
**File Sharing Configuration** (in Docker Desktop → Settings → Resources → File sharing):
|
||||
- Ensure `/Volumes` is shared (for future NFS mounts from sifaka)
|
||||
- Or add specific paths as needed for P6
|
||||
|
||||
### 2. Stop Current QEMU2 Minikube
|
||||
|
||||
```bash
|
||||
# On indri:
|
||||
minikube stop
|
||||
minikube delete
|
||||
|
||||
# Verify QEMU resources are cleaned up
|
||||
ps aux | grep qemu
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Plan
|
||||
|
||||
### 1. Update Ansible Role for Docker Driver
|
||||
|
||||
**Changes to `ansible/roles/minikube/defaults/main.yml`:**
|
||||
|
||||
```yaml
|
||||
# Change from:
|
||||
minikube_driver: qemu2
|
||||
minikube_network: socket_vmnet
|
||||
minikube_container_runtime: containerd
|
||||
|
||||
# To:
|
||||
minikube_driver: docker
|
||||
minikube_container_runtime: docker # or containerd, both work
|
||||
```
|
||||
|
||||
**Remove from defaults:**
|
||||
- `minikube_network` (not needed for docker driver)
|
||||
|
||||
**Changes to `ansible/roles/minikube/tasks/main.yml`:**
|
||||
- Remove qemu installation
|
||||
- Remove socket_vmnet installation and service management
|
||||
- Remove NFS mount point creation
|
||||
- Remove NFS LaunchDaemon installation
|
||||
- Remove minikube mount LaunchAgent installation
|
||||
- Keep containerd registry mirror config (adapting for docker if needed)
|
||||
|
||||
**Remove files from `ansible/roles/minikube/files/`:**
|
||||
- `com.blumeops.nfs-torrents.plist`
|
||||
- `com.blumeops.minikube-mount.plist`
|
||||
|
||||
**Changes to `ansible/roles/minikube/handlers/main.yml`:**
|
||||
- Remove `Load NFS mount LaunchDaemon`
|
||||
- Remove `Load minikube mount LaunchAgent`
|
||||
|
||||
**Add to Brewfile:**
|
||||
```ruby
|
||||
cask "docker" # Docker Desktop
|
||||
```
|
||||
|
||||
### 2. Update Tailscale Serve Configuration
|
||||
|
||||
**Changes to `ansible/roles/tailscale_serve/defaults/main.yml`:**
|
||||
|
||||
```yaml
|
||||
# Change svc:k8s upstream from VM IP back to localhost:
|
||||
- name: svc:k8s
|
||||
tcp:
|
||||
port: 443
|
||||
upstream: tcp://localhost:PORT # PORT will be dynamic, see below
|
||||
```
|
||||
|
||||
**Note on API server port**: With the docker driver, the API server port is dynamic (assigned by minikube). We need to either:
|
||||
- Use `--apiserver-port=6443` to fix it
|
||||
- Or query and update the config after cluster creation
|
||||
|
||||
### 3. Create Docker Minikube Cluster
|
||||
|
||||
```bash
|
||||
# On indri (after Docker Desktop is running):
|
||||
minikube start \
|
||||
--driver=docker \
|
||||
--cpus=6 \
|
||||
--memory=12288 \
|
||||
--disk-size=200g \
|
||||
--apiserver-names=k8s.tail8d86e.ts.net,indri \
|
||||
--apiserver-port=6443 \
|
||||
--listen-address=0.0.0.0
|
||||
|
||||
# Verify cluster
|
||||
minikube status
|
||||
kubectl get nodes
|
||||
```
|
||||
|
||||
### 4. Verify API Server is on Localhost
|
||||
|
||||
```bash
|
||||
# Check what port the API server is on
|
||||
kubectl config view --minify -o jsonpath="{.clusters[0].cluster.server}"
|
||||
# Should show https://127.0.0.1:PORT or similar
|
||||
|
||||
# Verify local access works
|
||||
curl -k https://localhost:6443/healthz
|
||||
# Should return "ok"
|
||||
```
|
||||
|
||||
### 5. Update 1Password Credentials
|
||||
|
||||
After cluster recreation, update the credentials in 1Password:
|
||||
|
||||
```bash
|
||||
# On indri, get the new certificates:
|
||||
cat ~/.minikube/profiles/minikube/client.crt
|
||||
cat ~/.minikube/profiles/minikube/client.key
|
||||
cat ~/.minikube/ca.crt
|
||||
```
|
||||
|
||||
Update in 1Password (vault: `vg6xf6vvfmoh5hqjjhlhbeoaie`, item: `3jo4f2hnzvwfmamudfsbbbec7e`).
|
||||
|
||||
### 6. Update Kubeconfig on Gilbert
|
||||
|
||||
```bash
|
||||
# Fetch new CA cert from 1Password
|
||||
op --vault vg6xf6vvfmoh5hqjjhlhbeoaie item get 3jo4f2hnzvwfmamudfsbbbec7e --fields ca-cert | sed 's/^"//; s/"$//' > ~/.kube/minikube-indri/ca.crt
|
||||
```
|
||||
|
||||
### 7. Configure Tailscale Serve for K8s
|
||||
|
||||
```bash
|
||||
# On indri:
|
||||
tailscale serve --service="svc:k8s" --tcp=443 tcp://localhost:6443
|
||||
```
|
||||
|
||||
### 8. Verify Remote Access
|
||||
|
||||
```bash
|
||||
# From gilbert:
|
||||
curl -k --connect-timeout 5 https://k8s.tail8d86e.ts.net/healthz
|
||||
# Should return "ok"
|
||||
|
||||
kubectl --context=minikube-indri get nodes
|
||||
# Should show the minikube node
|
||||
```
|
||||
|
||||
### 9. Redeploy ArgoCD and Apps
|
||||
|
||||
Since this is a cluster recreation, we need to re-bootstrap:
|
||||
|
||||
```bash
|
||||
# On indri - apply secrets first
|
||||
op inject -i argocd/manifests/tailscale-operator/secret.yaml.tpl | kubectl apply -f -
|
||||
|
||||
# Create repo secret for ArgoCD
|
||||
PRIV_KEY=$(op read "op://vg6xf6vvfmoh5hqjjhlhbeoaie/csjncynh6htjvnh2l2da65y32q/private key?ssh-format=openssh")$'\n'
|
||||
kubectl create namespace argocd
|
||||
kubectl create secret generic repo-forge -n argocd \
|
||||
--from-literal=type=git \
|
||||
--from-literal=url='ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/blumeops.git' \
|
||||
--from-literal=insecure=true \
|
||||
--from-literal=sshPrivateKey="$PRIV_KEY"
|
||||
kubectl label secret repo-forge -n argocd argocd.argoproj.io/secret-type=repository
|
||||
|
||||
# Bootstrap operators
|
||||
kubectl create namespace tailscale
|
||||
kubectl apply -k argocd/manifests/tailscale-operator/
|
||||
kubectl apply -k argocd/manifests/argocd/
|
||||
|
||||
# Wait for ArgoCD
|
||||
kubectl wait --for=condition=available deployment/argocd-server -n argocd --timeout=300s
|
||||
|
||||
# Login and sync apps
|
||||
argocd login argocd.tail8d86e.ts.net --username admin --grpc-web
|
||||
argocd app sync apps
|
||||
argocd app sync tailscale-operator
|
||||
argocd app sync cloudnative-pg
|
||||
argocd app sync blumeops-pg
|
||||
argocd app sync grafana
|
||||
argocd app sync grafana-config
|
||||
argocd app sync miniflux
|
||||
argocd app sync devpi
|
||||
```
|
||||
|
||||
### 10. Verify All Services
|
||||
|
||||
```bash
|
||||
mise run indri-services-check
|
||||
argocd app list
|
||||
kubectl get pods --all-namespaces
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Volume Mounts for P6 (Kiwix/Transmission)
|
||||
|
||||
With the docker driver, volume mounts work differently than QEMU2:
|
||||
|
||||
**Option A: Docker Desktop File Sharing + hostPath**
|
||||
1. Mount sifaka NFS share on indri: `/Volumes/torrents`
|
||||
2. Add `/Volumes/torrents` to Docker Desktop file sharing
|
||||
3. Pods use hostPath pointing to that path
|
||||
|
||||
**Option B: NFS directly from pods**
|
||||
- Docker containers can make NFS mounts (unlike podman's rootless containers)
|
||||
- May need to test if sifaka allows connections from the Docker network
|
||||
|
||||
This will be fully tested in Phase 6.
|
||||
|
||||
---
|
||||
|
||||
## Cleanup
|
||||
|
||||
After successful migration:
|
||||
|
||||
1. **Remove QEMU2 artifacts:**
|
||||
3. **Docker minikube cluster created**:
|
||||
```bash
|
||||
brew uninstall qemu socket_vmnet
|
||||
minikube start \
|
||||
--driver=docker \
|
||||
--container-runtime=docker \
|
||||
--cpus=6 \
|
||||
--memory=11264 \
|
||||
--disk-size=200g \
|
||||
--apiserver-names=k8s.tail8d86e.ts.net,indri \
|
||||
--apiserver-port=6443 \
|
||||
--listen-address=0.0.0.0
|
||||
```
|
||||
Note: Memory set to 11264MB (11GB) to leave headroom for Docker Desktop overhead.
|
||||
|
||||
4. **Tailscale serve configured** for k8s API:
|
||||
- API server on localhost:50820 (port is dynamic with docker driver)
|
||||
- `tailscale serve --service=svc:k8s --tcp=443 tcp://localhost:50820`
|
||||
|
||||
5. **Remote kubectl access working** from gilbert:
|
||||
- Created `mise-tasks/ensure-minikube-indri-kubectl-config` script
|
||||
- Fetches certs from indri and sets up `~/.kube/minikube-indri/config.yml`
|
||||
- `kubectl --context=minikube-indri get nodes` works
|
||||
|
||||
6. **Ansible roles updated**:
|
||||
- `ansible/roles/minikube/` - docker driver, removed qemu2/NFS/socket_vmnet
|
||||
- `ansible/roles/tailscale_serve/` - removed svc:k8s (minikube role handles dynamic port)
|
||||
- Containerd registry mirrors configured for zot pull-through cache
|
||||
|
||||
7. **QEMU2 artifacts cleaned up**:
|
||||
- Stopped socket_vmnet service
|
||||
- Removed NFS LaunchDaemon
|
||||
- Removed minikube mount LaunchAgent
|
||||
- kubectl still works after cleanup
|
||||
|
||||
### Remaining 📋
|
||||
|
||||
1. **Redeploy ArgoCD and apps** - bootstrap the cluster with:
|
||||
```bash
|
||||
# On indri - apply secrets first
|
||||
op inject -i argocd/manifests/tailscale-operator/secret.yaml.tpl | kubectl apply -f -
|
||||
|
||||
# Create repo secret for ArgoCD
|
||||
PRIV_KEY=$(op read "op://vg6xf6vvfmoh5hqjjhlhbeoaie/csjncynh6htjvnh2l2da65y32q/private key?ssh-format=openssh")$'\n'
|
||||
kubectl create namespace argocd
|
||||
kubectl create secret generic repo-forge -n argocd \
|
||||
--from-literal=type=git \
|
||||
--from-literal=url='ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/blumeops.git' \
|
||||
--from-literal=insecure=true \
|
||||
--from-literal=sshPrivateKey="$PRIV_KEY"
|
||||
kubectl label secret repo-forge -n argocd argocd.argoproj.io/secret-type=repository
|
||||
|
||||
# Bootstrap operators
|
||||
kubectl create namespace tailscale
|
||||
kubectl apply -k argocd/manifests/tailscale-operator/
|
||||
kubectl apply -k argocd/manifests/argocd/
|
||||
|
||||
# Wait for ArgoCD
|
||||
kubectl wait --for=condition=available deployment/argocd-server -n argocd --timeout=300s
|
||||
|
||||
# Login and sync apps
|
||||
argocd login argocd.tail8d86e.ts.net --username admin --grpc-web
|
||||
argocd app sync apps
|
||||
argocd app sync tailscale-operator
|
||||
argocd app sync cloudnative-pg
|
||||
argocd app sync blumeops-pg
|
||||
argocd app sync grafana
|
||||
argocd app sync grafana-config
|
||||
argocd app sync miniflux
|
||||
argocd app sync devpi
|
||||
```
|
||||
|
||||
2. **Remove podman if no longer needed:**
|
||||
```bash
|
||||
podman machine stop
|
||||
podman machine rm
|
||||
brew uninstall podman
|
||||
```
|
||||
2. **Verify all services** with `mise run indri-services-check`
|
||||
|
||||
3. **Configure containerd registry mirrors** (will be done by ansible on next provision)
|
||||
|
||||
---
|
||||
|
||||
## Technical Notes
|
||||
|
||||
### API Server Port
|
||||
|
||||
With the docker driver, the API server's host port is **dynamic**: Docker maps a random host port to 6443 inside the container, so `--apiserver-port=6443` fixes only the in-container port. Current host port: 50820.
|
||||
|
||||
The minikube ansible role queries the port after cluster start and configures tailscale serve accordingly.
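A minimal sketch of that lookup, assuming the host port is read from the kubeconfig server URL with the `kubectl config view` query. The helper name `api_port_from_url` is hypothetical, and the role's actual implementation may differ.

```bash
# Hypothetical helper: extract the port from a server URL such as
# https://127.0.0.1:50820 (text after the last colon, minus any path).
api_port_from_url() {
  port=${1##*:}
  printf '%s\n' "${port%%/*}"
}

# Only attempt the live lookup where kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  server=$(kubectl config view --minify \
    -o jsonpath='{.clusters[0].cluster.server}' 2>/dev/null || true)
  if [ -n "$server" ]; then
    # Print the serve command rather than running it, so it can be reviewed.
    echo "tailscale serve --service=svc:k8s --tcp=443 tcp://localhost:$(api_port_from_url "$server")"
  fi
fi
```

Printing the command instead of executing it keeps the sketch safe to run on any machine; on indri the printed line can be run as-is once the port looks right.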
|
||||
|
||||
### Registry Mirror Configuration
|
||||
|
||||
Containerd uses `/etc/containerd/certs.d/<registry>/hosts.toml` files:
|
||||
|
||||
```toml
|
||||
# /etc/containerd/certs.d/docker.io/hosts.toml
|
||||
server = "https://registry-1.docker.io"
|
||||
|
||||
[host."http://host.minikube.internal:5050"]
|
||||
capabilities = ["pull", "resolve"]
|
||||
skip_verify = true
|
||||
```
|
||||
|
||||
The ansible role configures mirrors for:
|
||||
- `registry.tail8d86e.ts.net` (private images)
|
||||
- `docker.io`
|
||||
- `ghcr.io`
|
||||
- `quay.io`
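As a sketch of what the role might render per registry, the function below prints a `hosts.toml` in the format shown above. `render_hosts_toml` is a hypothetical name, the upstream URLs for `ghcr.io` and `quay.io` are assumptions, and the mirror endpoint is copied from the docker.io example. The private registry is left out because its upstream is not stated here.

```bash
# Hypothetical helper: print a containerd hosts.toml for one registry.
#   $1 = upstream server URL, $2 = mirror URL
render_hosts_toml() {
  cat <<EOF
server = "$1"

[host."$2"]
  capabilities = ["pull", "resolve"]
  skip_verify = true
EOF
}

# Mirror endpoint taken from the docker.io example; upstreams are assumed.
mirror="http://host.minikube.internal:5050"
for entry in \
  "docker.io=https://registry-1.docker.io" \
  "ghcr.io=https://ghcr.io" \
  "quay.io=https://quay.io"; do
  registry=${entry%%=*}
  upstream=${entry#*=}
  echo "# /etc/containerd/certs.d/${registry}/hosts.toml"
  render_hosts_toml "$upstream" "$mirror"
done
```

The loop only prints the files; writing them into the node is left to the ansible role.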
|
||||
|
||||
### Volume Mounts for P6 (Kiwix/Transmission)
|
||||
|
||||
With the docker driver, volume mounts work differently than podman or qemu2. Here's the analysis:
|
||||
|
||||
**Current Network State:**
|
||||
- Minikube container is on Docker network `192.168.49.0/24`
|
||||
- Sifaka NFS exports `/volume1/torrents` to:
|
||||
- `192.168.105.0/24` (old qemu2 VM network - no longer used)
|
||||
- `100.64.0.0/10` (Tailscale CGNAT range)
|
||||
- Minikube can resolve `sifaka` (192.168.1.203) but can't reach it (100% packet loss due to Docker network isolation)
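Separately from the routing problem, the export list itself does not cover the Docker network. The sketch below checks an address against the two exported ranges. The node address `192.168.49.2` is an assumption (any address in `192.168.49.0/24` gives the same result), and the helper names are hypothetical.

```bash
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address on "." into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 falls inside CIDR range $2 (e.g. 10.0.0.0/8).
ip_in_cidr() {
  ip_int=$(ip_to_int "$1")
  net_int=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( ip_int & mask )) -eq $(( net_int & mask )) ]
}

node_ip="192.168.49.2"   # assumed minikube node address
for cidr in 192.168.105.0/24 100.64.0.0/10; do
  if ip_in_cidr "$node_ip" "$cidr"; then
    echo "$node_ip is covered by export $cidr"
  else
    echo "$node_ip is NOT covered by export $cidr"
  fi
done
```

Both exported ranges report "NOT covered" for this address, which is why Option B below needs an export change on sifaka.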
|
||||
|
||||
**Option A: hostPath via Docker Desktop File Sharing** ⭐ RECOMMENDED
|
||||
1. Mount the sifaka NFS share on indri (macOS): `sudo mount -t nfs -o resvport,rw sifaka:/volume1/torrents /Volumes/torrents`
|
||||
2. Docker Desktop file sharing exposes `/Volumes` into the Docker VM
|
||||
3. Pods use hostPath to access `/Volumes/torrents`
|
||||
|
||||
Pros:
|
||||
- Simplest approach, uses native Docker file sharing
|
||||
- No network reconfiguration needed on sifaka
|
||||
- Path is stable and predictable
|
||||
|
||||
Cons:
|
||||
- Requires persistent NFS mount on indri (LaunchDaemon)
|
||||
- File sharing performance may be slower than direct NFS
|
||||
|
||||
Implementation:
|
||||
```bash
|
||||
# Manual mount test
|
||||
ssh indri 'sudo mkdir -p /Volumes/torrents && sudo mount -t nfs -o resvport,rw sifaka:/volume1/torrents /Volumes/torrents'
|
||||
|
||||
# Verify Docker can see it
|
||||
ssh indri 'docker run --rm -v /Volumes/torrents:/data alpine ls /data'
|
||||
|
||||
# Pod manifest uses hostPath:
|
||||
# volumes:
|
||||
# - name: torrents
|
||||
# hostPath:
|
||||
# path: /Volumes/torrents
|
||||
# type: Directory
|
||||
```
|
||||
|
||||
**Option B: Update sifaka NFS exports for Docker network**
|
||||
1. Add `192.168.49.0/24` to sifaka's NFS exports
|
||||
2. Pods mount NFS directly using kubernetes NFS volume type
|
||||
|
||||
Cons:
|
||||
- Docker network might change (though `192.168.49.x` seems stable for minikube)
|
||||
- Requires sifaka configuration change
|
||||
- NFS mount from inside container may have permission issues
|
||||
|
||||
**Option C: Tailscale sidecar for NFS access**
|
||||
1. Pods include a Tailscale sidecar that joins the tailnet
|
||||
2. Mount NFS via Tailscale IP (sifaka is at 100.x.x.x)
|
||||
|
||||
Cons:
|
||||
- Complex setup with sidecar containers
|
||||
- Each pod needs Tailscale auth
|
||||
- Overkill for this use case
|
||||
|
||||
**Recommendation for P6:**
|
||||
Use **Option A** (hostPath via Docker Desktop file sharing). It's the simplest and most reliable approach. We'll need a LaunchDaemon for the NFS mount, but it's straightforward:
|
||||
|
||||
```xml
|
||||
<?xml version="1.0" encoding="UTF-8"?>
<!-- /Library/LaunchDaemons/com.blumeops.nfs-torrents.plist -->
|
||||
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
|
||||
<plist version="1.0">
|
||||
<dict>
|
||||
<key>Label</key>
|
||||
<string>com.blumeops.nfs-torrents</string>
|
||||
<key>ProgramArguments</key>
|
||||
<array>
|
||||
<string>/sbin/mount</string>
|
||||
<string>-t</string>
|
||||
<string>nfs</string>
|
||||
<string>-o</string>
|
||||
<string>resvport,rw</string>
|
||||
<string>sifaka:/volume1/torrents</string>
|
||||
<string>/Volumes/torrents</string>
|
||||
</array>
|
||||
<key>RunAtLoad</key>
|
||||
<true/>
|
||||
</dict>
|
||||
</plist>
|
||||
```
|
||||
|
||||
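This should be simpler than the qemu2 approach because there's no separate `minikube mount` process to keep alive. One point to confirm in P6: hostPath resolves on the minikube node (itself a Docker container), so `/Volumes/torrents` has to be visible inside that container too, not only to containers started directly with `docker run -v`.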
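The checks below sketch one way to confirm each hop of the Option A path: the NFS mount on indri, visibility through Docker Desktop file sharing, and visibility inside the minikube node. `is_nfs_mounted` is a hypothetical helper that assumes the macOS `mount` line format `<source> on <mountpoint> (nfs, ...)`. The live commands are commented out because they need the real hosts.

```bash
# Hypothetical helper: succeed if `mount` output on stdin lists an NFS
# filesystem at the given mount point ($1).
is_nfs_mounted() {
  grep -qF " on $1 (nfs"
}

# On indri:
#   mount | is_nfs_mounted /Volumes/torrents && echo "host mount OK"
#   docker run --rm -v /Volumes/torrents:/data alpine ls /data   # Docker file sharing
#   minikube ssh -- ls /Volumes/torrents                         # inside the node
```

If the last command fails while the first two succeed, the path reaches Docker but not the node that hostPath volumes resolve against.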
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
- [ ] Docker Desktop installed and running on indri
|
||||
- [ ] QEMU2 minikube deleted
|
||||
- [ ] Docker minikube running
|
||||
- [ ] API server accessible on localhost:6443
|
||||
- [ ] Tailscale serve configured for svc:k8s → localhost:6443
|
||||
- [ ] Remote kubectl access working from gilbert
|
||||
- [x] Docker Desktop installed and running on indri
|
||||
- [x] QEMU2 minikube deleted
|
||||
- [x] Docker minikube running (6 CPUs, 11GB RAM)
|
||||
- [x] API server accessible on localhost:50820
|
||||
- [x] Tailscale serve configured for svc:k8s → localhost:50820
|
||||
- [x] Remote kubectl access working from gilbert
|
||||
- [x] Ansible roles updated for docker driver
|
||||
- [x] socket_vmnet stopped
|
||||
- [ ] ArgoCD redeployed and synced
|
||||
- [ ] All existing apps healthy (grafana, miniflux, devpi, etc.)
|
||||
- [ ] PostgreSQL cluster healthy
|
||||
- [ ] Containerd registry mirrors configured
|
||||
- [ ] `mise run indri-services-check` passes
|
||||
|
||||
---
|
||||
|
|
@@ -312,29 +275,3 @@ If Docker driver doesn't work:
|
|||
1. Delete Docker minikube: `minikube delete`
|
||||
2. Recreate QEMU2 cluster (restore old ansible config from git)
|
||||
3. Accept the Tailscale TCP forwarding limitation and use SSH tunnel for remote kubectl
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- Docker Desktop has resource overhead but provides better macOS integration
|
||||
- The docker driver is more widely used and tested than qemu2
|
||||
- File sharing permissions may need adjustment in Docker Desktop settings
|
||||
- First cluster start may be slow as Docker pulls the minikube base image
|
||||
|
||||
## Implementation Notes (2026-01-21)
|
||||
|
||||
### QEMU2 Cleanup Done
|
||||
|
||||
Removed from indri:
|
||||
- `/Library/LaunchDaemons/com.blumeops.nfs-torrents.plist` - NFS mount daemon
|
||||
- `~/Library/LaunchAgents/com.blumeops.minikube-mount.plist` - minikube mount agent
|
||||
- Unmounted `/Volumes/torrents-nfs` NFS mount
|
||||
- Removed `/Volumes/torrents-nfs` mount point
|
||||
|
||||
### Previous QEMU2 Issues
|
||||
|
||||
The QEMU2 migration partially worked but had a critical issue:
|
||||
- Volume mounts worked via NFS → indri → minikube mount chain
|
||||
- But Tailscale TCP proxy to VM IP (192.168.105.2:6443) failed with TLS timeout
|
||||
- Root cause unknown - TCP connected but TLS handshake never completed
|
||||
|
|
|
|||