P5.1: Migrate minikube from podman to QEMU2 driver (#38)
## Summary

- Migrate minikube from podman driver to qemu2 driver for proper NFS/SMB volume mount support
- Update ansible minikube role with qemu installation and containerd runtime
- Remove podman role dependency from indri.yml
- Add synology user creation steps and post-migration zot reconfiguration notes

## Why

Phase 6 (Kiwix/Transmission migration) was blocked because the podman driver lacks kernel capabilities for filesystem mounts. QEMU2 creates an actual VM with full mount support.

## Deployment and Testing

- [ ] Create k8s-storage user on Synology DSM
- [ ] Store credentials in 1Password (synology-k8s-storage)
- [ ] Export current k8s state
- [ ] Stop and delete podman-based minikube cluster
- [ ] Run ansible to create QEMU2 cluster
- [ ] Test NFS volume mount with test pod
- [ ] Redeploy ArgoCD and all apps
- [ ] Verify all services healthy
- [ ] Reconfigure zot registry mirrors for containerd (post-migration)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/38
Parent: 7b60cca31e
Commit: 21848a7919
20 changed files with 490 additions and 542 deletions
|
|
@ -47,8 +47,6 @@
|
|||
tags: zot
|
||||
- role: zot_metrics
|
||||
tags: zot_metrics
|
||||
- role: podman
|
||||
tags: podman
|
||||
- role: minikube
|
||||
tags: minikube
|
||||
- role: minikube_metrics
|
||||
|
|
|
|||
|
|
@ -1,20 +1,18 @@
|
|||
---
|
||||
# Minikube cluster configuration
|
||||
minikube_cpus: 4
|
||||
# Note: Must be less than podman machine memory (8192MB) to account for overhead
|
||||
minikube_memory: 7800
|
||||
# Uses docker driver - requires Docker Desktop to be installed and running
|
||||
# with at least 12GB memory allocated in Docker Desktop settings
|
||||
minikube_cpus: 6
|
||||
minikube_memory: 11264 # Leave ~1GB headroom for Docker Desktop overhead
|
||||
minikube_disk_size: "200g"
|
||||
minikube_driver: podman
|
||||
minikube_container_runtime: cri-o
|
||||
minikube_driver: docker
|
||||
minikube_container_runtime: docker
|
||||
|
||||
# Remote access configuration
|
||||
# These allow kubectl from other machines (e.g., gilbert) to connect
|
||||
# k8s.tail8d86e.ts.net is exposed via Tailscale service (TCP passthrough)
|
||||
# k8s.tail8d86e.ts.net is exposed via Tailscale service (TCP passthrough to localhost)
|
||||
minikube_apiserver_names:
|
||||
- k8s.tail8d86e.ts.net
|
||||
- indri
|
||||
# Note: apiserver_port is the INTERNAL container port; with podman driver,
|
||||
# the host port is dynamically assigned. Check actual port with:
|
||||
# kubectl config view --minify -o jsonpath="{.clusters[0].cluster.server}"
|
||||
minikube_apiserver_port: 6443
|
||||
minikube_listen_address: "0.0.0.0"
|
||||
|
|
|
|||
|
|
@@ -1,43 +0,0 @@
# Zot pull-through cache on indri
# Uses host.containers.internal which is stable across restarts
# Applied by ansible minikube role

# Direct access to Zot for private images (blumeops/*)
[[registry]]
prefix = "host.containers.internal:5050"
location = "host.containers.internal:5050"
insecure = true

# Tailscale hostname for Zot - redirects to local access
# Allows manifests to use registry.tail8d86e.ts.net which is cleaner
[[registry]]
prefix = "registry.tail8d86e.ts.net"
location = "registry.tail8d86e.ts.net"

[[registry.mirror]]
location = "host.containers.internal:5050"
insecure = true

[[registry]]
prefix = "docker.io"
location = "docker.io"

[[registry.mirror]]
location = "host.containers.internal:5050/docker.io"
insecure = true

[[registry]]
prefix = "ghcr.io"
location = "ghcr.io"

[[registry.mirror]]
location = "host.containers.internal:5050/ghcr.io"
insecure = true

[[registry]]
prefix = "quay.io"
location = "quay.io"

[[registry.mirror]]
location = "host.containers.internal:5050/quay.io"
insecure = true
|
|
@ -8,7 +8,7 @@
|
|||
minikube start
|
||||
changed_when: true
|
||||
|
||||
- name: Restart CRI-O in minikube
|
||||
- name: Restart containerd in minikube
|
||||
ansible.builtin.command:
|
||||
cmd: minikube ssh --native-ssh=false "sudo systemctl restart crio"
|
||||
cmd: minikube ssh --native-ssh=false "sudo systemctl restart containerd"
|
||||
changed_when: true
|
||||
|
|
|
|||
|
|
@ -1,11 +1,17 @@
|
|||
---
|
||||
# Minikube installation and cluster setup for indri
|
||||
# Requires podman machine to be running (see podman role)
|
||||
# Uses docker driver - requires Docker Desktop to be installed manually first
|
||||
# (Docker Desktop requires GUI setup, so it's not automated in this role)
|
||||
#
|
||||
# NOTE: Similar to podman, minikube start may have issues when run via SSH.
|
||||
# Prerequisites:
|
||||
# 1. Install Docker Desktop: brew install --cask docker
|
||||
# 2. Launch Docker Desktop and complete setup wizard
|
||||
# 3. Configure Docker Desktop with at least 12GB memory
|
||||
#
|
||||
# NOTE: minikube start may have issues when run via SSH.
|
||||
# If cluster fails to start, manually run on indri:
|
||||
# minikube start --driver=podman --container-runtime=cri-o \
|
||||
# --cpus=4 --memory=7800 --disk-size=200g \
|
||||
# minikube start --driver=docker --container-runtime=docker \
|
||||
# --cpus=6 --memory=11264 --disk-size=200g \
|
||||
# --apiserver-names=k8s.tail8d86e.ts.net --apiserver-names=indri \
|
||||
# --apiserver-port=6443 --listen-address=0.0.0.0
|
||||
|
||||
|
|
@ -19,6 +25,18 @@
|
|||
name: kubectl
|
||||
state: present
|
||||
|
||||
- name: Check if Docker is running
|
||||
ansible.builtin.command:
|
||||
cmd: docker info
|
||||
register: minikube_docker_status
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Warn if Docker is not running
|
||||
ansible.builtin.debug:
|
||||
msg: "WARNING: Docker does not appear to be running. Please start Docker Desktop manually."
|
||||
when: minikube_docker_status.rc != 0
|
||||
|
||||
- name: Check if minikube cluster exists
|
||||
ansible.builtin.command:
|
||||
cmd: minikube status --format={% raw %}'{{.Host}}'{% endraw %}
|
||||
|
|
@ -42,8 +60,10 @@
|
|||
--listen-address={{ minikube_listen_address }}
|
||||
register: minikube_start
|
||||
changed_when: minikube_start.rc == 0
|
||||
failed_when: false # Don't fail - may need manual intervention like podman
|
||||
when: minikube_status.rc != 0 or 'Running' not in minikube_status.stdout
|
||||
failed_when: false # Don't fail - may need manual intervention
|
||||
when:
|
||||
- minikube_docker_status.rc == 0
|
||||
- minikube_status.rc != 0 or 'Running' not in minikube_status.stdout
|
||||
|
||||
- name: Check minikube status after start attempt
|
||||
ansible.builtin.command:
|
||||
|
|
@ -57,54 +77,146 @@
|
|||
msg: "WARNING: minikube may not have started properly. Run 'minikube start' manually on indri if needed. Status: {{ minikube_final_status.stdout | default('unknown') }}"
|
||||
when: minikube_final_status.rc != 0 or 'Running' not in minikube_final_status.stdout
|
||||
|
||||
# Configure CRI-O to use zot as pull-through cache
|
||||
- name: Read desired zot mirror config
|
||||
ansible.builtin.slurp:
|
||||
src: "{{ role_path }}/files/zot-mirror.conf"
|
||||
register: minikube_desired_zot_config
|
||||
delegate_to: localhost
|
||||
when: minikube_final_status.rc == 0 and 'Running' in minikube_final_status.stdout
|
||||
# Configure containerd to use zot registry as pull-through cache
|
||||
# With docker driver, use host.minikube.internal to reach the host
|
||||
# Zot runs on indri:5050 and caches images from docker.io, ghcr.io, quay.io
|
||||
|
||||
- name: Check current zot mirror config in minikube
|
||||
- name: Create containerd registry mirror directories
|
||||
ansible.builtin.command:
|
||||
cmd: minikube ssh --native-ssh=false "cat /etc/containers/registries.conf.d/zot-mirror.conf 2>/dev/null || echo ''"
|
||||
register: minikube_existing_zot_config
|
||||
cmd: minikube ssh --native-ssh=false "sudo mkdir -p /etc/containerd/certs.d/{{ item }}"
|
||||
loop:
|
||||
- registry.tail8d86e.ts.net
|
||||
- docker.io
|
||||
- ghcr.io
|
||||
- quay.io
|
||||
changed_when: false
|
||||
when: minikube_final_status.rc == 0 and 'Running' in minikube_final_status.stdout
|
||||
|
||||
- name: Determine if zot mirror config needs update
|
||||
ansible.builtin.set_fact:
|
||||
minikube_zot_config_changed: "{{ (minikube_existing_zot_config.stdout | trim) != (minikube_desired_zot_config.content | b64decode | trim) }}"
|
||||
# Private registry (registry.tail8d86e.ts.net) - direct to zot
|
||||
- name: Check registry.tail8d86e.ts.net config
|
||||
ansible.builtin.command:
|
||||
cmd: minikube ssh --native-ssh=false "cat /etc/containerd/certs.d/registry.tail8d86e.ts.net/hosts.toml 2>/dev/null || echo ''"
|
||||
register: minikube_registry_config
|
||||
changed_when: false
|
||||
when: minikube_final_status.rc == 0 and 'Running' in minikube_final_status.stdout
|
||||
|
||||
- name: Copy zot mirror config to temp location
|
||||
ansible.builtin.copy:
|
||||
src: zot-mirror.conf
|
||||
dest: /tmp/zot-mirror.conf
|
||||
mode: "0644"
|
||||
when:
|
||||
- minikube_final_status.rc == 0
|
||||
- "'Running' in minikube_final_status.stdout"
|
||||
- minikube_zot_config_changed | default(false)
|
||||
|
||||
- name: Apply zot mirror config to minikube
|
||||
ansible.builtin.shell:
|
||||
- name: Configure registry.tail8d86e.ts.net mirror
|
||||
ansible.builtin.command:
|
||||
cmd: |
|
||||
set -o pipefail
|
||||
cat /tmp/zot-mirror.conf | minikube ssh --native-ssh=false "sudo tee /etc/containers/registries.conf.d/zot-mirror.conf > /dev/null"
|
||||
executable: /bin/bash
|
||||
changed_when: true # Task only runs when config needs updating
|
||||
when:
|
||||
- minikube_final_status.rc == 0
|
||||
- "'Running' in minikube_final_status.stdout"
|
||||
- minikube_zot_config_changed | default(false)
|
||||
notify: Restart CRI-O in minikube
|
||||
minikube ssh --native-ssh=false 'echo "server = \"http://host.minikube.internal:5050\"
|
||||
|
||||
- name: Clean up temp config file
|
||||
ansible.builtin.file:
|
||||
path: /tmp/zot-mirror.conf
|
||||
state: absent
|
||||
[host.\"http://host.minikube.internal:5050\"]
|
||||
capabilities = [\"pull\", \"resolve\", \"push\"]
|
||||
skip_verify = true" | sudo tee /etc/containerd/certs.d/registry.tail8d86e.ts.net/hosts.toml'
|
||||
changed_when: true
|
||||
when:
|
||||
- minikube_final_status.rc == 0
|
||||
- "'Running' in minikube_final_status.stdout"
|
||||
- minikube_zot_config_changed | default(false)
|
||||
- minikube_final_status.rc == 0 and 'Running' in minikube_final_status.stdout
|
||||
- "'host.minikube.internal:5050' not in minikube_registry_config.stdout"
|
||||
notify: Restart containerd in minikube
|
||||
|
||||
# Docker Hub (docker.io) - zot pull-through cache
|
||||
- name: Check docker.io config
|
||||
ansible.builtin.command:
|
||||
cmd: minikube ssh --native-ssh=false "cat /etc/containerd/certs.d/docker.io/hosts.toml 2>/dev/null || echo ''"
|
||||
register: minikube_dockerio_config
|
||||
changed_when: false
|
||||
when: minikube_final_status.rc == 0 and 'Running' in minikube_final_status.stdout
|
||||
|
||||
- name: Configure docker.io mirror through zot
|
||||
ansible.builtin.command:
|
||||
cmd: |
|
||||
minikube ssh --native-ssh=false 'echo "server = \"https://registry-1.docker.io\"
|
||||
|
||||
[host.\"http://host.minikube.internal:5050\"]
|
||||
capabilities = [\"pull\", \"resolve\"]
|
||||
skip_verify = true" | sudo tee /etc/containerd/certs.d/docker.io/hosts.toml'
|
||||
changed_when: true
|
||||
when:
|
||||
- minikube_final_status.rc == 0 and 'Running' in minikube_final_status.stdout
|
||||
- "'host.minikube.internal:5050' not in minikube_dockerio_config.stdout"
|
||||
notify: Restart containerd in minikube
|
||||
|
||||
# GitHub Container Registry (ghcr.io) - zot pull-through cache
|
||||
- name: Check ghcr.io config
|
||||
ansible.builtin.command:
|
||||
cmd: minikube ssh --native-ssh=false "cat /etc/containerd/certs.d/ghcr.io/hosts.toml 2>/dev/null || echo ''"
|
||||
register: minikube_ghcr_config
|
||||
changed_when: false
|
||||
when: minikube_final_status.rc == 0 and 'Running' in minikube_final_status.stdout
|
||||
|
||||
- name: Configure ghcr.io mirror through zot
|
||||
ansible.builtin.command:
|
||||
cmd: |
|
||||
minikube ssh --native-ssh=false 'echo "server = \"https://ghcr.io\"
|
||||
|
||||
[host.\"http://host.minikube.internal:5050\"]
|
||||
capabilities = [\"pull\", \"resolve\"]
|
||||
skip_verify = true" | sudo tee /etc/containerd/certs.d/ghcr.io/hosts.toml'
|
||||
changed_when: true
|
||||
when:
|
||||
- minikube_final_status.rc == 0 and 'Running' in minikube_final_status.stdout
|
||||
- "'host.minikube.internal:5050' not in minikube_ghcr_config.stdout"
|
||||
notify: Restart containerd in minikube
|
||||
|
||||
# Quay.io - zot pull-through cache
|
||||
- name: Check quay.io config
|
||||
ansible.builtin.command:
|
||||
cmd: minikube ssh --native-ssh=false "cat /etc/containerd/certs.d/quay.io/hosts.toml 2>/dev/null || echo ''"
|
||||
register: minikube_quay_config
|
||||
changed_when: false
|
||||
when: minikube_final_status.rc == 0 and 'Running' in minikube_final_status.stdout
|
||||
|
||||
- name: Configure quay.io mirror through zot
|
||||
ansible.builtin.command:
|
||||
cmd: |
|
||||
minikube ssh --native-ssh=false 'echo "server = \"https://quay.io\"
|
||||
|
||||
[host.\"http://host.minikube.internal:5050\"]
|
||||
capabilities = [\"pull\", \"resolve\"]
|
||||
skip_verify = true" | sudo tee /etc/containerd/certs.d/quay.io/hosts.toml'
|
||||
changed_when: true
|
||||
when:
|
||||
- minikube_final_status.rc == 0 and 'Running' in minikube_final_status.stdout
|
||||
- "'host.minikube.internal:5050' not in minikube_quay_config.stdout"
|
||||
notify: Restart containerd in minikube
|
||||
|
||||
# Configure Tailscale serve for k8s API access
|
||||
# With docker driver, the API server port is dynamic (not fixed at 6443)
|
||||
# We query the current port and configure tailscale serve accordingly
|
||||
- name: Get minikube API server URL
|
||||
ansible.builtin.command:
|
||||
cmd: kubectl config view --minify -o jsonpath="{.clusters[0].cluster.server}"
|
||||
register: minikube_api_url
|
||||
changed_when: false
|
||||
when: minikube_final_status.rc == 0 and 'Running' in minikube_final_status.stdout
|
||||
|
||||
- name: Extract API server port from URL
|
||||
ansible.builtin.set_fact:
|
||||
minikube_api_port: "{{ minikube_api_url.stdout | regex_search(':([0-9]+)$', '\\1') | first }}"
|
||||
when:
|
||||
- minikube_final_status.rc == 0 and 'Running' in minikube_final_status.stdout
|
||||
- minikube_api_url.stdout is defined
|
||||
|
||||
- name: Check current tailscale serve config for k8s
|
||||
ansible.builtin.command:
|
||||
cmd: tailscale serve status --json
|
||||
register: minikube_tailscale_serve_status
|
||||
changed_when: false
|
||||
when: minikube_api_port is defined
|
||||
|
||||
- name: Parse tailscale serve k8s config
|
||||
ansible.builtin.set_fact:
|
||||
minikube_tailscale_k8s_tcp: "{{ ((minikube_tailscale_serve_status.stdout | from_json).Services['svc:k8s'].TCP['443'].TCPForward | default('')) }}"
|
||||
when:
|
||||
- minikube_api_port is defined
|
||||
- minikube_tailscale_serve_status.stdout is defined
|
||||
- "'svc:k8s' in (minikube_tailscale_serve_status.stdout | from_json).Services | default({})"
|
||||
failed_when: false
|
||||
|
||||
- name: Configure tailscale serve for k8s API
|
||||
ansible.builtin.command:
|
||||
cmd: tailscale serve --service="svc:k8s" --tcp=443 tcp://localhost:{{ minikube_api_port }}
|
||||
when:
|
||||
- minikube_api_port is defined
|
||||
- minikube_tailscale_k8s_tcp is not defined or minikube_tailscale_k8s_tcp != 'localhost:' + minikube_api_port
|
||||
changed_when: true
|
||||
|
|
|
|||
|
|
@ -4,6 +4,7 @@
|
|||
|
||||
tailscale_serve_services:
|
||||
# NOTE: svc:grafana, svc:pg, svc:feed, svc:pypi removed - now hosted in k8s
|
||||
# NOTE: svc:k8s is configured by the minikube role (port is dynamic with docker driver)
|
||||
|
||||
- name: svc:forge
|
||||
https:
|
||||
|
|
@ -22,11 +23,3 @@ tailscale_serve_services:
|
|||
https:
|
||||
port: 443
|
||||
upstream: http://localhost:5050
|
||||
|
||||
# Kubernetes API server (TCP passthrough for mTLS)
|
||||
# NOTE: Port is dynamic with podman driver - check with:
|
||||
# ssh indri "kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'"
|
||||
- name: svc:k8s
|
||||
tcp:
|
||||
port: 443
|
||||
upstream: tcp://localhost:44491
|
||||
|
|
|
|||
|
|
@ -10,7 +10,7 @@ metadata:
|
|||
name: argocd-server-tailscale
|
||||
namespace: argocd
|
||||
annotations:
|
||||
tailscale.com/proxy-class: "crio-compat"
|
||||
tailscale.com/proxy-class: "default"
|
||||
spec:
|
||||
ingressClassName: tailscale
|
||||
defaultBackend:
|
||||
|
|
|
|||
|
|
@ -7,7 +7,7 @@ metadata:
|
|||
namespace: databases
|
||||
annotations:
|
||||
tailscale.com/hostname: "pg"
|
||||
tailscale.com/proxy-class: "crio-compat"
|
||||
tailscale.com/proxy-class: "default"
|
||||
spec:
|
||||
type: LoadBalancer
|
||||
loadBalancerClass: tailscale
|
||||
|
|
|
|||
|
|
@ -4,7 +4,7 @@ metadata:
|
|||
name: devpi-tailscale
|
||||
namespace: devpi
|
||||
annotations:
|
||||
tailscale.com/proxy-class: "crio-compat"
|
||||
tailscale.com/proxy-class: "default"
|
||||
spec:
|
||||
ingressClassName: tailscale
|
||||
defaultBackend:
|
||||
|
|
|
|||
|
|
@ -8,7 +8,7 @@ metadata:
|
|||
name: grafana-tailscale
|
||||
namespace: monitoring
|
||||
annotations:
|
||||
tailscale.com/proxy-class: "crio-compat"
|
||||
tailscale.com/proxy-class: "default"
|
||||
spec:
|
||||
ingressClassName: tailscale
|
||||
defaultBackend:
|
||||
|
|
|
|||
|
|
@ -60,3 +60,13 @@ Connects to PostgreSQL via internal k8s DNS:
|
|||
|
||||
The database is also accessible externally via Tailscale at:
|
||||
`pg.tail8d86e.ts.net:5432`
|
||||
|
||||
## Restore from Backup
|
||||
|
||||
If the database needs to be restored from a borgmatic backup:
|
||||
|
||||
1. List archives: `borgmatic list`
|
||||
2. Extract dump from archive using `borg extract` to `/tmp/restore`
|
||||
3. Restore with `pg_restore --clean --if-exists --no-owner --no-acl`
|
||||
4. Fix ownership - ensure user `miniflux` owns all tables, sequences, and types in the `public` schema (restore runs as `eblume`)
|
||||
5. Restart miniflux deployment
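
Step 4 is the part most likely to be skipped, so here is a minimal sketch of one way to script it. It is not taken from this repository: it assumes the restored database is named `miniflux`, that `psql` connects as a role allowed to change ownership, and that the only custom types in `public` are enums. The helper name `owner_fix_sql` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print psql input that hands every table, sequence,
# and enum type in the public schema to the given role.
# Each SELECT builds ALTER statements; psql then executes each generated row.
owner_fix_sql() {
  local role="$1"
  cat <<EOF
SELECT format('ALTER TABLE public.%I OWNER TO %I;', tablename, '${role}')
  FROM pg_tables WHERE schemaname = 'public' \gexec
SELECT format('ALTER SEQUENCE public.%I OWNER TO %I;', sequencename, '${role}')
  FROM pg_sequences WHERE schemaname = 'public' \gexec
SELECT format('ALTER TYPE public.%I OWNER TO %I;', t.typname, '${role}')
  FROM pg_type t JOIN pg_namespace n ON n.oid = t.typnamespace
  WHERE n.nspname = 'public' AND t.typtype = 'e' \gexec
EOF
}

# Example (assumed database name):
#   owner_fix_sql miniflux | psql -d miniflux
```

Printing the SQL instead of running it directly makes it easy to review the statements before piping them into `psql`.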
|
||||
|
|
|
|||
|
|
@ -4,7 +4,7 @@ metadata:
|
|||
name: miniflux-tailscale
|
||||
namespace: miniflux
|
||||
annotations:
|
||||
tailscale.com/proxy-class: "crio-compat"
|
||||
tailscale.com/proxy-class: "default"
|
||||
spec:
|
||||
ingressClassName: tailscale
|
||||
defaultBackend:
|
||||
|
|
|
|||
|
|
@ -6,7 +6,7 @@ Manifests for the Tailscale Kubernetes Operator, managed via ArgoCD.
|
|||
|
||||
- `operator.yaml` - Static manifest from https://github.com/tailscale/tailscale/tree/main/cmd/k8s-operator/deploy/manifests
|
||||
- Secret block removed from `operator.yaml` - managed separately via `secret.yaml.tpl`
|
||||
- Image reference changed to fully-qualified `docker.io/tailscale/k8s-operator:stable` for CRI-O compatibility
|
||||
- Image reference changed to fully-qualified `docker.io/tailscale/k8s-operator:stable`
|
||||
|
||||
## Prerequisites
|
||||
|
||||
|
|
@ -71,7 +71,7 @@ kubectl logs -n tailscale -l app.kubernetes.io/name=operator
|
|||
|------|-------------|
|
||||
| `kustomization.yaml` | Kustomize configuration for all manifests |
|
||||
| `operator.yaml` | Operator deployment, CRDs, RBAC (secret removed) |
|
||||
| `proxyclass.yaml` | ProxyClass with fully-qualified images for CRI-O |
|
||||
| `proxyclass.yaml` | ProxyClass with fully-qualified images |
|
||||
| `dnsconfig.yaml` | DNSConfig for cluster-to-tailnet name resolution |
|
||||
| `egress-forge.yaml` | Egress proxy for accessing forge on indri |
|
||||
| `secret.yaml.tpl` | 1Password template for OAuth credentials (manual) |
|
||||
|
|
@ -81,10 +81,10 @@ kubectl logs -n tailscale -l app.kubernetes.io/name=operator
|
|||
|
||||
- **TODO:** The OAuth secret (`operator-oauth`) is not managed by ArgoCD and must be applied
|
||||
manually. Future improvement: integrate with a secrets operator (e.g., External Secrets).
|
||||
- Services using the Tailscale LoadBalancer must reference the ProxyClass:
|
||||
- Services using the Tailscale LoadBalancer should reference the ProxyClass:
|
||||
```yaml
|
||||
annotations:
|
||||
tailscale.com/proxy-class: "crio-compat"
|
||||
tailscale.com/proxy-class: "default"
|
||||
```
|
||||
- The egress proxy for forge targets `indri.tail8d86e.ts.net` directly (not `forge.tail8d86e.ts.net`)
|
||||
because Tailscale Serve hostnames are virtual and only work via the Tailscale client.
|
||||
|
|
|
|||
|
|
@ -11,7 +11,7 @@ metadata:
|
|||
namespace: tailscale
|
||||
annotations:
|
||||
tailscale.com/tailnet-fqdn: indri.tail8d86e.ts.net
|
||||
tailscale.com/proxy-class: "crio-compat"
|
||||
tailscale.com/proxy-class: "default"
|
||||
spec:
|
||||
type: ExternalName
|
||||
externalName: placeholder
|
||||
|
|
|
|||
|
|
@ -1,17 +1,11 @@
|
|||
# ProxyClass: crio-compat
|
||||
# ProxyClass: default
|
||||
#
|
||||
# Why this exists:
|
||||
# CRI-O (the container runtime used by minikube) cannot resolve short image
|
||||
# names like "tailscale/tailscale:stable". It requires fully-qualified names
|
||||
# with an explicit registry prefix (e.g., "docker.io/tailscale/tailscale:stable").
|
||||
#
|
||||
# The Tailscale operator creates proxy pods (StatefulSets) for each LoadBalancer
|
||||
# Service or Ingress. By default, these pods use short image names which fail
|
||||
# on CRI-O with "ImageInspectError".
|
||||
# Specifies fully-qualified image names for Tailscale proxy pods.
|
||||
# This ensures consistent behavior across different container runtimes.
|
||||
#
|
||||
# Usage:
|
||||
# Add this annotation to any Tailscale Service or Ingress:
|
||||
# tailscale.com/proxy-class: "crio-compat"
|
||||
# tailscale.com/proxy-class: "default"
|
||||
#
|
||||
# This tells the operator to use the fully-qualified image names defined below
|
||||
# when creating the proxy pod for that resource.
|
||||
|
|
@ -19,7 +13,7 @@
|
|||
apiVersion: tailscale.com/v1alpha1
|
||||
kind: ProxyClass
|
||||
metadata:
|
||||
name: crio-compat
|
||||
name: default
|
||||
spec:
|
||||
statefulSet:
|
||||
pod:
|
||||
|
|
|
|||
|
|
@@ -1,31 +0,0 @@
#!/bin/bash
# kubectl exec credential plugin for 1Password
# Usage: kubectl-credential-1password <vault-id> <item-id> <cert-field> <key-field>
#
# Fetches client certificate and key from 1Password and outputs
# ExecCredential JSON for kubectl authentication.

set -euo pipefail

VAULT_ID="$1"
ITEM_ID="$2"
CERT_FIELD="$3"
KEY_FIELD="$4"

# Fetch credentials from 1Password (strips surrounding quotes from text fields)
CLIENT_CERT=$(op --vault "$VAULT_ID" item get "$ITEM_ID" --fields "$CERT_FIELD" | sed 's/^"//; s/"$//')
CLIENT_KEY=$(op --vault "$VAULT_ID" item get "$ITEM_ID" --fields "$KEY_FIELD" | sed 's/^"//; s/"$//')

# Output ExecCredential JSON
# Note: jq is used to properly escape the PEM data for JSON
jq -n \
  --arg cert "$CLIENT_CERT" \
  --arg key "$CLIENT_KEY" \
  '{
    "apiVersion": "client.authentication.k8s.io/v1beta1",
    "kind": "ExecCredential",
    "status": {
      "clientCertificateData": $cert,
      "clientKeyData": $key
    }
  }'
59
mise-tasks/ensure-minikube-indri-kubectl-config
Executable file
|
|
@@ -0,0 +1,59 @@
#!/usr/bin/env bash
#MISE description="Ensure kubectl config for minikube-indri is set up on this workstation"

set -euo pipefail

CONFIG_DIR="$HOME/.kube/minikube-indri"
CONFIG_FILE="$CONFIG_DIR/config.yml"

echo "Ensuring minikube-indri kubectl config..."

# Create directory if needed
mkdir -p "$CONFIG_DIR"

# Fetch certificates from indri
echo "Fetching certificates from indri..."
CA_CERT=$(ssh indri 'cat ~/.minikube/ca.crt')
CLIENT_CERT=$(ssh indri 'cat ~/.minikube/profiles/minikube/client.crt')
CLIENT_KEY=$(ssh indri 'cat ~/.minikube/profiles/minikube/client.key')

# Write certificate files
echo "$CA_CERT" > "$CONFIG_DIR/ca.crt"
echo "$CLIENT_CERT" > "$CONFIG_DIR/client.crt"
echo "$CLIENT_KEY" > "$CONFIG_DIR/client.key"
chmod 600 "$CONFIG_DIR/client.key"

# Write kubeconfig
cat > "$CONFIG_FILE" << EOF
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority: $CONFIG_DIR/ca.crt
    server: https://k8s.tail8d86e.ts.net
  name: minikube-indri
contexts:
- context:
    cluster: minikube-indri
    user: minikube-indri
  name: minikube-indri
current-context: minikube-indri
users:
- name: minikube-indri
  user:
    client-certificate: $CONFIG_DIR/client.crt
    client-key: $CONFIG_DIR/client.key
EOF

echo "Config written to $CONFIG_FILE"

# Warn if KUBECONFIG doesn't include this file
if [[ -z "${KUBECONFIG:-}" ]] || [[ ":$KUBECONFIG:" != *":$CONFIG_FILE:"* ]]; then
    echo ""
    echo "WARNING: KUBECONFIG does not include $CONFIG_FILE"
    echo "Add this to your shell config:"
    echo "  export KUBECONFIG=\"\$KUBECONFIG:$CONFIG_FILE\""
fi

echo ""
echo "Test with: kubectl --context=minikube-indri get nodes"
208
plans/k8s-migration/P5.1_docker_migration.md
Normal file
|
|
@@ -0,0 +1,208 @@
# Phase 5.1: Migrate Minikube from QEMU2 to Docker Driver

**Goal**: Replace the qemu2 driver with docker to fix remote API access and simplify volume mounts

**Status**: Complete (2026-01-21) - Cluster running, ArgoCD deployed, apps synced

**Prerequisites**: [Phase 5](P5_devpi.complete.md) complete

---

## Background

### Original Problem (Podman → QEMU2)

During Phase 6 (Kiwix/Transmission migration), we discovered that the **podman driver has fundamental limitations** that prevent mounting external volumes:

1. **SMB CSI driver fails** with "Operation not permitted" - the rootless container lacks kernel-level mount capabilities
2. **`minikube mount` fails** - 9p mount gets "permission denied" inside the podman VM
3. **hostPath volumes** only work for paths inside the minikube container, not the macOS host

We migrated to QEMU2 to get a full VM with kernel capabilities.

### New Problem (QEMU2 → Docker)

The QEMU2 driver introduced a **new problem**: the Kubernetes API server is inside the VM at `192.168.105.2:6443`, and Tailscale's TCP proxy cannot forward to it properly:

- TCP connections succeed (nc -zv works)
- TLS handshake times out
- Root cause unknown, but likely related to Tailscale serve's handling of non-localhost upstreams

Additionally, the volume mount solution with QEMU2 was complex:
- Required NFS mount from sifaka → indri
- Then `minikube mount` to pass through to VM
- Two LaunchAgents/LaunchDaemons for persistence
- macOS GUI approval required for network access

### Why Docker?

The **docker driver** solves both problems:

1. **API Server on localhost**: Docker Desktop handles port forwarding from container to localhost automatically, so `tailscale serve --tcp=443 tcp://localhost:PORT` works

2. **Simpler volume mounts**: Docker Desktop has built-in macOS file sharing. Paths shared with Docker are accessible inside containers.

3. **Official Tailscale recommendation**: Tailscale's own [Kubernetes guide](https://tailscale.com/learn/managing-access-to-kubernetes-with-tailscale) uses minikube with the docker driver.

---

## Implementation Summary

### Infrastructure Changes

1. **Docker Desktop installed** (manual via `brew install --cask docker`)
   - Configured with 12GB memory in Docker Desktop settings
   - Kubernetes option disabled (using minikube instead)

2. **Docker minikube cluster created**:
   ```bash
   minikube start \
     --driver=docker \
     --container-runtime=docker \
     --cpus=6 \
     --memory=11264 \
     --disk-size=200g \
     --apiserver-names=k8s.tail8d86e.ts.net,indri \
     --apiserver-port=6443 \
     --listen-address=0.0.0.0
   ```

3. **Tailscale serve configured** for k8s API:
   - API server on localhost (port is dynamic with docker driver)
   - `tailscale serve --service=svc:k8s --tcp=443 tcp://localhost:<PORT>`

4. **Remote kubectl access working** from gilbert:
   - Created `mise-tasks/ensure-minikube-indri-kubectl-config` script
   - Fetches certs from indri and sets up `~/.kube/minikube-indri/config.yml`

### Ansible Roles Updated

- `ansible/roles/minikube/` - docker driver, removed qemu2/NFS/socket_vmnet
- `ansible/roles/tailscale_serve/` - removed svc:k8s (minikube role handles dynamic port)
- Containerd registry mirrors configured for zot pull-through cache

### ArgoCD Bootstrap

All apps deployed and synced from `feature/p5.1-qemu2-migration` branch:

| App | Status | Notes |
|-----|--------|-------|
| tailscale-operator | Healthy | Manages Tailscale ingresses |
| argocd | Healthy | Self-managed |
| cloudnative-pg | Healthy | PostgreSQL operator |
| blumeops-pg | Progressing | PostgreSQL cluster starting |
| grafana | Progressing | Needs grafana-admin secret |
| grafana-config | Healthy | Dashboards and ingress |
| miniflux | Progressing | Needs miniflux-config secret |
| devpi | Progressing | Starting up |

### Secrets Still Needed

After PR merge, apply these secrets manually:

```bash
# Grafana admin password
op inject -i argocd/manifests/grafana-config/secret-admin.yaml.tpl | kubectl --context=minikube-indri apply -f -

# Miniflux config
op inject -i argocd/manifests/miniflux/secret.yaml.tpl | kubectl --context=minikube-indri apply -f -
```

---

## Technical Notes

### API Server Port

With docker driver, the API server port is **dynamic** - Docker maps a random host port to 6443 inside the container.

The minikube ansible role queries the port after cluster start and configures tailscale serve accordingly.
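
To make the port lookup concrete, here is a minimal bash sketch of the same idea. The role itself does this with Ansible tasks; the function names `extract_port` and `serve_cmd` and the sample port `32771` are illustrative assumptions, not part of the repository.

```bash
#!/usr/bin/env bash
# Extract the trailing port from an API server URL such as https://127.0.0.1:32771.
# Prints nothing and returns 1 if the URL does not end in :<digits>.
extract_port() {
  local url="$1"
  if [[ "$url" =~ :([0-9]+)$ ]]; then
    echo "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

# Build (but do not run) the tailscale serve command for a given host port.
serve_cmd() {
  echo "tailscale serve --service=svc:k8s --tcp=443 tcp://localhost:$1"
}

# Example, run on the cluster host:
#   url=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
#   port=$(extract_port "$url") && serve_cmd "$port"
```

Because the host port can change whenever the cluster container is recreated, the lookup has to be repeated after each fresh `minikube start` rather than hard-coded.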

### Registry Mirror Configuration

Containerd uses `/etc/containerd/certs.d/<registry>/hosts.toml` files. The ansible role configures mirrors for:
- `registry.tail8d86e.ts.net` (private images)
- `docker.io`
- `ghcr.io`
- `quay.io`
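
As an illustration of what one of these files contains, the sketch below renders a `hosts.toml` body with the same fields the role writes (upstream `server`, a `[host]` entry for zot, `capabilities`, `skip_verify`). The helper name `render_hosts_toml` is hypothetical; the values in the example call are the ones the role uses for `ghcr.io`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a containerd hosts.toml for one registry.
#   $1 = upstream registry URL (fallback when the mirror cannot serve an image)
#   $2 = mirror URL (the zot pull-through cache)
#   $3 = comma-separated, quoted capability list
render_hosts_toml() {
  local upstream="$1" mirror="$2" caps="$3"
  cat <<EOF
server = "$upstream"

[host."$mirror"]
  capabilities = [$caps]
  skip_verify = true
EOF
}

# Example: the ghcr.io mirror entry
#   render_hosts_toml "https://ghcr.io" \
#     "http://host.minikube.internal:5050" '"pull", "resolve"'
```

The private registry entry differs only in that it also lists `"push"` and points `server` at zot itself.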
|
||||
|
||||
### ProxyClass Renamed
|
||||
|
||||
Changed from `crio-compat` to `default` - the old name was misleading since we're no longer using CRI-O.
|
||||
|
||||
### Volume Mounts for P6 (Kiwix/Transmission)
|
||||
|
||||
**Solution: Direct NFS from pods to sifaka** ✅ TESTED AND WORKING
|
||||
|
||||
Docker NATs outbound traffic through indri's LAN IP (192.168.1.50), so sifaka's NFS exports need to allow `192.168.1.0/24`.
|
||||
|
||||
Sifaka NFS exports configured:
|
||||
- `192.168.1.0/24` - Docker containers via indri NAT
|
||||
- `100.64.0.0/10` - Tailscale clients
|
||||
|
||||
Pods can mount NFS directly:
|
||||
```yaml
volumes:
  - name: torrents
    nfs:
      server: sifaka
      path: /volume1/torrents
```
|
||||
|
||||
No LaunchAgents, no `minikube mount`, no SMB CSI driver needed.
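
When debugging an NFS "access denied" error, it can help to confirm that the client address really falls inside one of the exported ranges listed above. The following is a minimal sketch in plain shell arithmetic; the helper names `ip_to_int` and `ip_in_cidr` are hypothetical and handle IPv4 only.

```bash
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
# Runs in a subshell so the IFS change does not leak to the caller.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Hypothetical helper: succeed if address $1 is inside CIDR range $2.
ip_in_cidr() (
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
)

# indri's LAN IP against the LAN export range
if ip_in_cidr 192.168.1.50 192.168.1.0/24; then
  echo "192.168.1.50 is covered by 192.168.1.0/24"
fi
```

The same check applies to Tailscale clients against `100.64.0.0/10`.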
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
- [x] Docker Desktop installed and running on indri
- [x] QEMU2 minikube deleted
- [x] Docker minikube running (6 CPUs, 11GB RAM)
- [x] API server accessible on localhost
- [x] Tailscale serve configured for svc:k8s
- [x] Remote kubectl access working from gilbert
- [x] Ansible roles updated for docker driver
- [x] socket_vmnet stopped
- [x] ArgoCD deployed and synced
- [x] All apps synced to feature branch
- [x] Apply app secrets (grafana-admin, miniflux-db, devpi-root, eblume, borgmatic)
- [x] Verify all apps healthy after secrets applied
- [x] Miniflux database restored from borgmatic backup
- [ ] Merge PR and reset apps to main branch
- [ ] `mise run indri-services-check` passes
|
||||
|
||||
---
|
||||
|
||||
## Post-Merge Steps
|
||||
|
||||
After PR is merged:
|
||||
|
||||
```bash
# Reset all blumeops apps to main branch
argocd app set apps --revision main
argocd app set argocd --revision main
argocd app set blumeops-pg --revision main
argocd app set devpi --revision main
argocd app set grafana-config --revision main
argocd app set miniflux --revision main
argocd app set tailscale-operator --revision main

# Sync all apps
argocd app sync apps
argocd app sync argocd
argocd app sync tailscale-operator
argocd app sync blumeops-pg
argocd app sync grafana-config
argocd app sync miniflux
argocd app sync devpi
```
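
The commands above follow one pattern per app, so they could also be generated from a single list. This is only a sketch: the function name `print_reset_and_sync` is hypothetical, and it prints the commands rather than running them so the output can be reviewed first.

```bash
# App names in the sync order used above.
APPS="apps argocd tailscale-operator blumeops-pg grafana-config miniflux devpi"

# Hypothetical helper: print the reset and sync commands for every app.
print_reset_and_sync() {
  for app in $APPS; do
    printf 'argocd app set %s --revision main\n' "$app"
  done
  for app in $APPS; do
    printf 'argocd app sync %s\n' "$app"
  done
}

# Review the generated commands; pipe to `sh` to execute them.
print_reset_and_sync
```

Printing first keeps the step auditable; `print_reset_and_sync | sh` would run the same sequence once the output looks right.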
|
||||
|
||||
---
|
||||
|
||||
## Rollback Plan
|
||||
|
||||
If the Docker driver doesn't work:
|
||||
|
||||
1. Delete Docker minikube: `minikube delete`
|
||||
2. Recreate QEMU2 cluster (restore old ansible config from git)
|
||||
3. Accept the Tailscale TCP forwarding limitation and use SSH tunnel for remote kubectl
|
||||
|
|
@ -1,235 +0,0 @@
|
|||
# Phase 5.1: Migrate Minikube from Podman to QEMU2 Driver
|
||||
|
||||
**Goal**: Replace the podman driver with qemu2 to enable proper volume mounts (hostPath, NFS, SMB CSI)
|
||||
|
||||
**Status**: Planning
|
||||
|
||||
**Prerequisites**: [Phase 5](P5_devpi.complete.md) complete
|
||||
|
||||
---
|
||||
|
||||
## Background
|
||||
|
||||
During Phase 6 (Kiwix/Transmission migration), we discovered that the **podman driver has fundamental limitations** that prevent mounting external volumes:
|
||||
|
||||
1. **SMB CSI driver fails** with "Operation not permitted" - the rootless container lacks kernel-level mount capabilities
|
||||
2. **`minikube mount` fails** - 9p mount gets "permission denied" inside the podman VM
|
||||
3. **hostPath volumes** only work for paths inside the minikube container, not the macOS host
|
||||
|
||||
These are documented limitations of the podman driver, which is labeled "experimental" in the [minikube documentation](https://minikube.sigs.k8s.io/docs/drivers/podman/).
|
||||
|
||||
### Failed P6 Attempt
|
||||
|
||||
Branch `feature/p6-kiwix-transmission` contains the P6 implementation that was blocked by these issues. The manifests are complete and tested, but couldn't mount the torrents volume.
|
||||
|
||||
**What was tried:**
|
||||
- NFS volume mounts - failed due to missing CAP_SYS_ADMIN in podman container
|
||||
- SMB CSI driver (v1.17.0) - mount fails with EPERM (same root cause)
|
||||
- `minikube mount /Volumes/torrents:/Volumes/torrents` - 9p mount permission denied
|
||||
- hostPath PV pointing to `/Volumes/torrents` - path doesn't exist inside minikube container
|
||||
- Installing cifs-utils in minikube VM - still fails at kernel level
|
||||
|
||||
All of these failures trace back to the same root cause: the podman driver runs minikube in a rootless container that lacks the kernel capabilities required for filesystem mounts.
|
||||
|
||||
### Why QEMU2?
|
||||
|
||||
Multiple sources recommend QEMU2 as the best driver for Apple Silicon Macs:
|
||||
|
||||
> "Qemu emulator is the best option to run a Kubernetes Cluster using minikube on MAC arm64-based systems without any issues."
|
||||
> — [DevOpsCube](https://devopscube.com/minikube-mac/)
|
||||
|
||||
QEMU2 creates an actual VM (not a container), which has:
|
||||
- Full kernel capabilities for mounts
|
||||
- Proper 9p/virtio filesystem support
|
||||
- Native NFS client support
|
||||
|
||||
---
|
||||
|
||||
## Plan
|
||||
|
||||
### 1. Export Current State
|
||||
|
||||
Before destroying the cluster, capture the current state:
|
||||
|
||||
```bash
# List all ArgoCD apps and their sync status
argocd app list

# Backup any runtime state that matters (should be minimal - everything is in git)
kubectl --context=minikube-indri get all --all-namespaces -o yaml > /tmp/k8s-backup.yaml
```
|
||||
|
||||
### 2. Stop and Delete Podman Minikube
|
||||
|
||||
```bash
# Stop the cluster
minikube stop

# Delete the cluster and all data
minikube delete

# Verify podman VM is cleaned up
podman machine list
```
|
||||
|
||||
### 3. Update Ansible Roles for QEMU2
|
||||
|
||||
The installation must be orchestrated via ansible, following the existing patterns for `podman` and `minikube` roles.
|
||||
|
||||
**Changes needed:**
|
||||
|
||||
1. **Update `ansible/roles/minikube/` role:**
   - Change driver from `podman` to `qemu2`
   - Add QEMU as a dependency (via Brewfile or role)
   - Optionally add socket_vmnet for full networking support
   - Update any driver-specific configuration

2. **Update `Brewfile`:**
   ```ruby
   brew "qemu"
   # Optional: brew "socket_vmnet"
   ```

3. **Update minikube start command in role:**
   ```bash
   minikube start \
     --driver=qemu2 \
     --cpus=4 \
     --memory=8192 \
     --disk-size=50g \
     --container-runtime=containerd \
     --kubernetes-version=stable
   ```

4. **Remove or update podman role** (may still be useful for container builds)
|
||||
|
||||
### 4. Run Ansible to Create QEMU2 Cluster
|
||||
|
||||
```bash
# Run the updated minikube role
mise run provision-indri -- --tags minikube

# Verify cluster is running
minikube status
kubectl get nodes
```
|
||||
|
||||
### 5. Configure Host Path Access
|
||||
|
||||
With QEMU2, host paths must be exposed to the VM using one of the following options:

**Option A: Use `minikube mount` (9p)**
```bash
# Start persistent mount (run in background or via launchd)
minikube mount /Volumes/torrents:/Volumes/torrents &
```

**Option B: Use NFS export from macOS**
```bash
# Add NFS export on macOS
echo "/Volumes/torrents -alldirs -mapall=$(id -u):$(id -g) -network 192.168.0.0 -mask 255.255.0.0" | sudo tee -a /etc/exports
sudo nfsd restart

# In k8s, use NFS volume type directly
```
|
||||
|
||||
### 6. Test Volume Mount with Test Pod
|
||||
|
||||
Create a test pod that mounts the torrents volume:
|
||||
|
||||
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
  namespace: default
spec:
  containers:
    - name: test
      image: busybox
      command: ["sh", "-c", "ls -la /data && sleep 3600"]
      volumeMounts:
        - name: torrents
          mountPath: /data
  volumes:
    - name: torrents
      hostPath:
        path: /Volumes/torrents
        type: Directory
```
|
||||
|
||||
Verify:
|
||||
```bash
kubectl apply -f volume-test.yaml
kubectl logs volume-test
kubectl exec volume-test -- ls -la /data
```
|
||||
|
||||
### 7. Redeploy ArgoCD and Existing Apps
|
||||
|
||||
```bash
# Re-add ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Wait for ArgoCD to be ready
kubectl wait --for=condition=available deployment/argocd-server -n argocd --timeout=300s

# Re-configure ArgoCD (repo credentials, etc.)
# ... follow P1 setup steps ...

# Sync all apps
argocd app sync apps
```
|
||||
|
||||
### 8. Verify All Services
|
||||
|
||||
```bash
# Run health check
mise run indri-services-check

# Verify each k8s service
argocd app list
kubectl get pods --all-namespaces
```
|
||||
|
||||
### 9. Clean Up Test Pod
|
||||
|
||||
```bash
|
||||
kubectl delete pod volume-test
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
- [ ] Podman minikube deleted
|
||||
- [ ] QEMU2 minikube running
|
||||
- [ ] `minikube mount` or NFS working
|
||||
- [ ] Test pod can read `/Volumes/torrents`
|
||||
- [ ] ArgoCD redeployed and synced
|
||||
- [ ] All existing apps healthy (grafana, miniflux, devpi, etc.)
|
||||
- [ ] PostgreSQL cluster healthy
|
||||
- [ ] Test pod deleted
|
||||
- [ ] `mise run indri-services-check` passes (except intentionally offline services)
|
||||
|
||||
---
|
||||
|
||||
## Rollback Plan
|
||||
|
||||
If QEMU2 doesn't work:
|
||||
|
||||
1. Delete QEMU2 cluster: `minikube delete`
|
||||
2. Recreate podman cluster following P0/P1 steps
|
||||
3. Redeploy apps from git
|
||||
|
||||
All state is in git, so cluster recreation is straightforward.
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- The QEMU2 VM will use more resources than podman (actual VM vs container)
|
||||
- First boot may be slower due to VM initialization
|
||||
- socket_vmnet provides better networking but requires sudo setup
|
||||
- Consider creating a LaunchAgent for `minikube mount` if using that approach
|
||||
|
|
@ -2,9 +2,9 @@
|
|||
|
||||
**Goal**: Migrate kiwix-serve and transmission torrent daemon to k8s with shared storage
|
||||
|
||||
**Status**: BLOCKED - waiting for [Phase 5.1](P5.1_qemu2_migration.md) (QEMU2 migration)
|
||||
**Status**: Ready to implement
|
||||
|
||||
**Prerequisites**: [Phase 5.1](P5.1_qemu2_migration.md) complete (minikube on QEMU2 driver)
|
||||
**Prerequisites**: [Phase 5.1](P5.1_docker_migration.md) complete (minikube on docker driver)
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -62,19 +62,18 @@ New architecture in k8s:
|
|||
|
||||
## Architecture Decisions
|
||||
|
||||
### Storage: SMB on Sifaka (or NFS after QEMU2 migration)
|
||||
### Storage: Direct NFS to Sifaka ✅ TESTED
|
||||
|
||||
**Note:** The original plan chose SMB over NFS, but both failed with podman driver. After QEMU2 migration, either should work. SMB is still preferred for:
|
||||
- Native Synology SMB support with good macOS compatibility
|
||||
- ReadWriteMany access mode for concurrent pod access
|
||||
- SMB CSI driver already mirrored to forge
|
||||
**Solution:** Direct NFS volume mounts from pods to sifaka. No SMB CSI driver or `minikube mount` needed.
|
||||
|
||||
**Alternative after QEMU2:** NFS may be simpler with `minikube mount` or direct NFS volume type.
|
||||
With the docker driver, minikube containers NAT outbound traffic through indri's LAN IP (192.168.1.50). Sifaka's NFS exports are configured to allow:
|
||||
- `192.168.1.0/24` - Docker containers via indri NAT
|
||||
- `100.64.0.0/10` - Tailscale clients
|
||||
|
||||
**Storage path:** `/volume1/torrents/` on sifaka (SMB share name: `torrents`)
|
||||
**Storage path:** `/volume1/torrents/` on sifaka (NFS export)
|
||||
- General-purpose torrent download directory
|
||||
- Contains ZIM files, Linux ISOs, and whatever else users download
|
||||
- Accessed via SMB credentials stored in k8s Secret
|
||||
- Accessed via native k8s NFS volume (no credentials needed - IP-based access)
|
||||
|
||||
**No backup needed:**
|
||||
- Sifaka is RAID 5/6, already the backup target
|
||||
|
|
@ -142,49 +141,19 @@ This allows adding new ZIM archives by:
|
|||
|
||||
## Prerequisites (Manual Steps)
|
||||
|
||||
### 1. Configure SMB Share on Sifaka
|
||||
### 1. Configure NFS Export on Sifaka
|
||||
|
||||
**Status: DONE** - The `torrents` shared folder has been created at `/volume1/torrents`.
|
||||
**Status: DONE** - The `torrents` shared folder exists at `/volume1/torrents` with NFS exports allowing:
|
||||
- `192.168.1.0/24` - Docker containers via indri NAT
|
||||
- `100.64.0.0/10` - Tailscale clients
|
||||
|
||||
### 2. Create Dedicated Synology User for Kubernetes (USER ACTION REQUIRED)
|
||||
|
||||
Create a dedicated Synology user for k8s SMB access (do not use personal account):
|
||||
|
||||
On Synology DSM (Control Panel → User & Group):
|
||||
1. Create new user: `k8s-smb` (or similar)
|
||||
- Set a strong password
|
||||
- No admin privileges needed
|
||||
- Deny access to all applications (only needs file services)
|
||||
2. Set permissions on the `torrents` share:
|
||||
- Give `k8s-smb` user Read/Write access
|
||||
- Remove or limit other user access as appropriate
|
||||
3. Store credentials in 1Password:
|
||||
- Vault: `vg6xf6vvfmoh5hqjjhlhbeoaie` (blumeops vault)
|
||||
- Item name: `synology-smb-k8s`
|
||||
- Fields: `username` (k8s-smb), `password`
|
||||
|
||||
### 3. Mirror SMB CSI Driver Helm Chart to Forge (USER ACTION REQUIRED)
|
||||
|
||||
Mirror the SMB CSI driver chart to forge for GitOps deployment:
|
||||
|
||||
```bash
|
||||
# Clone the upstream chart repo
|
||||
cd ~/code/3rd
|
||||
git clone https://github.com/kubernetes-csi/csi-driver-smb.git
|
||||
cd csi-driver-smb
|
||||
|
||||
# Push to forge mirror
|
||||
git remote add forge ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/csi-driver-smb.git
|
||||
git push forge --all --tags
|
||||
```
|
||||
|
||||
### 4. Copy Existing Downloads to Sifaka
|
||||
### 2. Copy Existing Downloads to Sifaka
|
||||
|
||||
Before migration, copy existing downloads to avoid re-downloading ~138GB:
|
||||
|
||||
```bash
|
||||
# From indri - mount the SMB share via Finder or command line
|
||||
open smb://sifaka/torrents
|
||||
# From indri - mount the NFS share
|
||||
sudo mount -t nfs sifaka:/volume1/torrents /Volumes/torrents
|
||||
|
||||
# Then rsync (adjust mount path as needed)
|
||||
rsync -avP ~/transmission/ /Volumes/torrents/
|
||||
|
|
@ -193,69 +162,21 @@ rsync -avP ~/transmission/ /Volumes/torrents/
|
|||
ls -la /Volumes/torrents/*.zim
|
||||
```
|
||||
|
||||
### 5. Store SMB Credentials in 1Password
|
||||
|
||||
**Note:** This is covered in step 2 above. The 1Password item should be:
|
||||
- Vault: `vg6xf6vvfmoh5hqjjhlhbeoaie` (blumeops vault)
|
||||
- Item name: `synology-smb-k8s`
|
||||
- Fields: `username` (k8s-smb), `password`
|
||||
|
||||
---
|
||||
|
||||
## Steps
|
||||
|
||||
### 1. Deploy SMB CSI Driver via ArgoCD
|
||||
### 1. Create Shared NFS PersistentVolume
|
||||
|
||||
**File:** `argocd/manifests/smb-csi/values.yaml`
|
||||
This PV is shared between transmission and kiwix namespaces. Uses direct NFS - no CSI driver needed.
|
||||
|
||||
```yaml
|
||||
# Minimal values - defaults are generally fine
|
||||
controller:
|
||||
replicas: 1
|
||||
```
|
||||
|
||||
**File:** `argocd/apps/smb-csi.yaml`
|
||||
|
||||
```yaml
|
||||
apiVersion: argoproj.io/v1alpha1
|
||||
kind: Application
|
||||
metadata:
|
||||
name: smb-csi
|
||||
namespace: argocd
|
||||
spec:
|
||||
project: default
|
||||
sources:
|
||||
# Helm chart from forge mirror
|
||||
- repoURL: ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/csi-driver-smb.git
|
||||
targetRevision: v1.17.0
|
||||
path: charts/csi-driver-smb
|
||||
helm:
|
||||
releaseName: csi-driver-smb
|
||||
valueFiles:
|
||||
- $values/argocd/manifests/smb-csi/values.yaml
|
||||
# Values from our git repo
|
||||
- repoURL: ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/blumeops.git
|
||||
targetRevision: main
|
||||
ref: values
|
||||
destination:
|
||||
server: https://kubernetes.default.svc
|
||||
namespace: kube-system
|
||||
syncPolicy:
|
||||
syncOptions:
|
||||
- CreateNamespace=true
|
||||
```
|
||||
|
||||
### 2. Create Shared SMB PersistentVolume
|
||||
|
||||
This PV is shared between transmission and kiwix namespaces.
|
||||
|
||||
**File:** `argocd/manifests/torrent/pv-smb.yaml`
|
||||
**File:** `argocd/manifests/torrent/pv-nfs.yaml`
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: PersistentVolume
|
||||
metadata:
|
||||
name: torrents-smb-pv
|
||||
name: torrents-nfs-pv
|
||||
spec:
|
||||
capacity:
|
||||
storage: 1Ti
|
||||
|
|
@ -263,43 +184,12 @@ spec:
|
|||
- ReadWriteMany
|
||||
persistentVolumeReclaimPolicy: Retain
|
||||
storageClassName: ""
|
||||
mountOptions:
|
||||
- dir_mode=0777
|
||||
- file_mode=0777
|
||||
- uid=1000
|
||||
- gid=1000
|
||||
- noperm
|
||||
- mfsymlinks
|
||||
- cache=strict
|
||||
- noserverino # Required to prevent data corruption
|
||||
csi:
|
||||
driver: smb.csi.k8s.io
|
||||
volumeHandle: torrents-smb-pv
|
||||
volumeAttributes:
|
||||
source: //sifaka/torrents
|
||||
nodeStageSecretRef:
|
||||
name: smbcreds
|
||||
namespace: torrent
|
||||
nfs:
|
||||
server: sifaka
|
||||
path: /volume1/torrents
|
||||
```
|
||||
|
||||
**File:** `argocd/manifests/torrent/secret-smb.yaml.tpl`
|
||||
|
||||
```yaml
|
||||
# Template - apply manually with credentials from 1Password
|
||||
# kubectl --context=minikube create secret generic smbcreds \
|
||||
# --namespace torrent \
|
||||
# --from-literal=username=$(op read "op://vg6xf6vvfmoh5hqjjhlhbeoaie/synology-smb-k8s/username") \
|
||||
# --from-literal=password=$(op read "op://vg6xf6vvfmoh5hqjjhlhbeoaie/synology-smb-k8s/password")
|
||||
apiVersion: v1
|
||||
kind: Secret
|
||||
metadata:
|
||||
name: smbcreds
|
||||
namespace: torrent
|
||||
type: Opaque
|
||||
stringData:
|
||||
username: "{{ op://vg6xf6vvfmoh5hqjjhlhbeoaie/synology-smb-k8s/username }}"
|
||||
password: "{{ op://vg6xf6vvfmoh5hqjjhlhbeoaie/synology-smb-k8s/password }}"
|
||||
```
|
||||
No secrets needed - NFS uses IP-based access control configured on sifaka.
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -319,7 +209,7 @@ spec:
|
|||
accessModes:
|
||||
- ReadWriteMany
|
||||
storageClassName: ""
|
||||
volumeName: torrents-smb-pv
|
||||
volumeName: torrents-nfs-pv
|
||||
resources:
|
||||
requests:
|
||||
storage: 1Ti
|
||||
|
|
@ -439,8 +329,7 @@ apiVersion: kustomize.config.k8s.io/v1beta1
|
|||
kind: Kustomization
|
||||
namespace: torrent
|
||||
resources:
|
||||
- pv-smb.yaml
|
||||
- secret-smb.yaml.tpl
|
||||
- pv-nfs.yaml
|
||||
- pvc.yaml
|
||||
- deployment.yaml
|
||||
- service.yaml
|
||||
|
|
@ -473,7 +362,7 @@ spec:
|
|||
|
||||
## Kiwix Service
|
||||
|
||||
### 3. Create Kiwix PVC (References Same PV)
|
||||
### 2. Create Kiwix PVC (References Same PV)
|
||||
|
||||
**File:** `argocd/manifests/kiwix/pvc.yaml`
|
||||
|
||||
|
|
@ -487,7 +376,7 @@ spec:
|
|||
accessModes:
|
||||
- ReadWriteMany # Need write for the sync sidecar to work
|
||||
storageClassName: ""
|
||||
volumeName: torrents-smb-pv
|
||||
volumeName: torrents-nfs-pv
|
||||
resources:
|
||||
requests:
|
||||
storage: 1Ti
|
||||
|
|
@ -1096,10 +985,7 @@ If migration fails:
|
|||
|------|---------|
|
||||
| **Transmission (torrent namespace)** | |
|
||||
| `argocd/apps/torrent.yaml` | ArgoCD Application for transmission |
|
||||
| `argocd/apps/smb-csi.yaml` | ArgoCD Application for SMB CSI driver |
|
||||
| `argocd/manifests/smb-csi/values.yaml` | SMB CSI driver Helm values |
|
||||
| `argocd/manifests/torrent/pv-smb.yaml` | Shared SMB PersistentVolume |
|
||||
| `argocd/manifests/torrent/secret-smb.yaml.tpl` | SMB credentials secret template |
|
||||
| `argocd/manifests/torrent/pv-nfs.yaml` | Shared NFS PersistentVolume |
|
||||
| `argocd/manifests/torrent/pvc.yaml` | Transmission PVC |
|
||||
| `argocd/manifests/torrent/deployment.yaml` | Transmission deployment |
|
||||
| `argocd/manifests/torrent/service.yaml` | Transmission service |
|
||||
|
|
@ -1134,11 +1020,10 @@ If migration fails:
|
|||
|
||||
## Verification Checklist
|
||||
|
||||
- [x] SMB share configured on sifaka (`/volume1/torrents`)
|
||||
- [ ] Dedicated Synology user (`k8s-smb`) created for k8s access
|
||||
- [ ] SMB CSI driver deployed to k8s
|
||||
- [x] NFS export configured on sifaka (`/volume1/torrents`)
|
||||
- [x] NFS exports allow 192.168.1.0/24 and 100.64.0.0/10
|
||||
- [x] Direct NFS mount from pod tested and working
|
||||
- [ ] Existing downloads copied to sifaka
|
||||
- [ ] SMB credentials secret created in k8s (using `k8s-smb` user)
|
||||
- [ ] Transmission pod running in k8s (`torrent` namespace)
|
||||
- [ ] https://torrent.tail8d86e.ts.net accessible (web UI)
|
||||
- [ ] Can add torrents manually via web UI
|
||||
|
|