Phase 0: Foundation (Complete)
Goal: Container registry + minikube cluster without disrupting existing services
Status: Complete
Important: Tailscale Service Creation Order
WARNING: You MUST create services in the Tailscale admin console BEFORE running tailscale serve commands via ansible. If you run tailscale serve --service svc:foo before the service exists in the admin console, the local config will be in a bad state.
To fix a misconfigured service:
tailscale serve --service svc:foo reset
Then create the service in the admin console and try again.
Step 0.1: Update Pulumi ACLs (BEFORE Tailscale serve)
Files to modify:
pulumi/policy.hujson
Changes:
- Add new tag to the tagOwners section (around line 104, after "tag:feed"):
"tag:registry": ["autogroup:admin", "tag:blumeops"],
- Add test cases to the tests section:
  - Update Erich's accept list (around line 111) to include registry:
    "accept": ["tag:grafana:443", "tag:kiwix:443", "tag:feed:443", "tag:loki:3100", "tag:pg:5432", "tag:homelab:22", "tag:registry:443"],
  - Update Allison's deny list (around line 117) to deny registry:
    "deny": ["tag:grafana:443", "tag:loki:3100", "tag:nas:445", "tag:registry:443"],
Note:
- No member grant needed - admins have full access via wildcard, members don't need registry
- tag:k8s is added later in Phase 1 when the Tailscale Kubernetes Operator is deployed
- Zot supports htpasswd auth if we later need finer-grained control
Testing:
mise run tailnet-preview # Review changes - should show new tag
mise run tailnet-up # Apply changes
Implementation Details:
- Also need to add "tag:registry" to indri's tags in pulumi/__main__.py (the DeviceTags resource), not just define it in policy.hujson. The policy file defines the tag ownership rules, but the device tags are managed separately in the Python code.
Step 0.2: Create Tailscale Services in Admin Console (MANUAL)
CRITICAL: Do this BEFORE running any ansible that calls tailscale serve
- Go to https://login.tailscale.com/admin/services
- Create service registry with:
  - Port: 443 (HTTPS)
  - Host: indri
Implementation Details:
- Tag is applied to indri via Pulumi in Step 0.1, not manually in admin console.
Verification:
# Service should appear (even if not yet serving)
tailscale status | grep registry
Step 0.3: Create Zot Registry Ansible Role
Note: Zot is NOT in homebrew (no formula or tap). Clone to ~/code/3rd/ on indri and build from source (requires Go).
Prerequisites on indri (ALREADY COMPLETED):
# Clone zot from forge mirror (use localhost:3001 - hairpinning doesn't work on indri)
ssh indri 'git clone http://localhost:3001/eblume/zot.git ~/code/3rd/zot'
# Set up Go via mise (creates mise.toml in repo directory)
ssh indri 'cd ~/code/3rd/zot && mise use go@1.25'
# Build (creates bin/zot-darwin-arm64, ~183MB)
ssh indri 'cd ~/code/3rd/zot && mise x -- make binary'
# Verify binary exists
ssh indri 'ls -la ~/code/3rd/zot/bin/zot-darwin-arm64'
Build verified: Binary at ~/code/3rd/zot/bin/zot-darwin-arm64 (183MB, ARM64 native).
New files:
ansible/roles/zot/
├── defaults/main.yml
├── tasks/main.yml
├── templates/
│ ├── config.json.j2
│ └── zot.plist.j2
└── handlers/main.yml
Key configuration (defaults/main.yml):
zot_repo_dir: "/Users/erichblume/code/3rd/zot"
zot_binary: "{{ zot_repo_dir }}/bin/zot-darwin-arm64"
zot_data_dir: "/Users/erichblume/zot"
zot_config_dir: "/Users/erichblume/.config/zot"
zot_port: 5000
zot_log_dir: "/Users/erichblume/Library/Logs"
# Pull-through cache registries (on-demand sync)
zot_sync_registries:
  - name: docker.io
    url: https://registry-1.docker.io
  - name: ghcr.io
    url: https://ghcr.io
  - name: quay.io
    url: https://quay.io
Zot config.json template (key sections):
{
"storage": {
"rootDirectory": "/Users/erichblume/zot"
},
"http": {
"address": "0.0.0.0",
"port": "5000"
},
"extensions": {
"sync": {
"enable": true,
"registries": [
{
"urls": ["https://registry-1.docker.io"],
"content": [{"prefix": "**"}],
"onDemand": true,
"tlsVerify": true
},
{
"urls": ["https://ghcr.io"],
"content": [{"prefix": "**"}],
"onDemand": true,
"tlsVerify": true
},
{
"urls": ["https://quay.io"],
"content": [{"prefix": "**"}],
"onDemand": true,
"tlsVerify": true
}
]
}
}
}
Two modes of operation:
1. Pull-through cache (automatic): When you pull registry.tail8d86e.ts.net/docker.io/library/nginx:latest, Zot fetches from Docker Hub and caches locally. Subsequent pulls are local.
2. Private images (manual push): Push your own images to any path NOT matching a sync prefix:
# From gilbert (after building)
podman push myapp:v1 registry.tail8d86e.ts.net/blumeops/myapp:v1
Namespace convention:
- registry.tail8d86e.ts.net/docker.io/* → cached from Docker Hub
- registry.tail8d86e.ts.net/ghcr.io/* → cached from GHCR
- registry.tail8d86e.ts.net/quay.io/* → cached from Quay
- registry.tail8d86e.ts.net/blumeops/* → private images (built by you/Woodpecker)
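The namespace convention can be captured in a small helper that rewrites an upstream image reference to its cache path. This is a minimal sketch: the function name to_cache_ref is hypothetical, and it assumes references already carry their upstream registry prefix (it does not expand Docker Hub shorthand such as a bare "nginx").

```bash
#!/usr/bin/env bash
# Sketch only: to_cache_ref is a hypothetical helper, not part of the repo.
REGISTRY_HOST="registry.tail8d86e.ts.net"

to_cache_ref() {
  local ref="$1"
  case "$ref" in
    docker.io/*|ghcr.io/*|quay.io/*)
      # Cached upstreams keep their full upstream path under the registry host.
      printf '%s/%s\n' "$REGISTRY_HOST" "$ref"
      ;;
    *)
      # Anything else is not covered by a sync rule in this plan.
      echo "unsupported upstream: $ref" >&2
      return 1
      ;;
  esac
}

to_cache_ref docker.io/library/nginx:latest
# prints registry.tail8d86e.ts.net/docker.io/library/nginx:latest
```

Private images under blumeops/ are pushed directly and never pass through this mapping.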
LaunchAgent template (zot.plist.j2):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "...">
<plist version="1.0">
<dict>
<key>Label</key>
<string>mcquack.eblume.zot</string>
<key>ProgramArguments</key>
<array>
<!-- ABSOLUTE PATH to built binary in ~/code/3rd/zot -->
<string>{{ zot_binary }}</string>
<string>serve</string>
<string>{{ zot_config_dir }}/config.json</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>{{ zot_log_dir }}/mcquack.zot.out.log</string>
<key>StandardErrorPath</key>
<string>{{ zot_log_dir }}/mcquack.zot.err.log</string>
</dict>
</plist>
Handlers (handlers/main.yml):
- name: Restart zot
  ansible.builtin.shell: |
    launchctl unload ~/Library/LaunchAgents/mcquack.eblume.zot.plist 2>/dev/null || true
    launchctl load ~/Library/LaunchAgents/mcquack.eblume.zot.plist
  changed_when: true
Tasks should notify handler on config change:
- name: Deploy zot config
  ansible.builtin.template:
    src: config.json.j2
    dest: "{{ zot_config_dir }}/config.json"
  notify: Restart zot
Testing (after deploying role):
# Check LaunchAgent is running
ssh indri 'launchctl list | grep zot'
# Check zot is responding
ssh indri 'curl -s http://localhost:5000/v2/_catalog'
# Expected: {"repositories":[]}
# Check logs for errors
ssh indri 'tail -20 ~/Library/Logs/mcquack.zot.err.log'
# Test pull-through cache via curl (podman not installed until Step 0.8)
ssh indri 'curl -s http://localhost:5000/v2/docker.io/library/alpine/manifests/latest -H "Accept: application/vnd.oci.image.manifest.v1+json"'
# Should return manifest JSON (triggers cache fetch from Docker Hub)
ssh indri 'curl -s http://localhost:5000/v2/_catalog'
# Expected: {"repositories":["docker.io/library/alpine"]}
Implementation Details:
- Changed port from 5000 to 5050 because macOS ControlCenter (AirPlay Receiver) uses port 5000 by default.
- Fixed sync config: use "content": [{"prefix": "**", "destination": "/{{ registry.name }}"}] instead of "prefix": "{{ registry.name }}/**". The destination rewrites the local path, while prefix ** matches all upstream repos.
Step 0.4: Add Zot to Tailscale Serve
Files to modify:
ansible/roles/tailscale_serve/defaults/main.yml
Changes:
# Add to tailscale_serve_services list
- name: svc:registry
  https:
    port: 443
    upstream: http://localhost:5000
Testing:
# Deploy tailscale serve config
mise run provision-indri -- --tags tailscale-serve
# Verify from gilbert (not indri - hairpinning doesn't work)
curl -s https://registry.tail8d86e.ts.net/v2/_catalog
# Expected: {"repositories":["docker.io/library/alpine"]} (from Step 0.3 test)
# Test private image push from gilbert
podman pull alpine:latest
podman tag alpine:latest registry.tail8d86e.ts.net/blumeops/test:v1
podman push registry.tail8d86e.ts.net/blumeops/test:v1
curl -s https://registry.tail8d86e.ts.net/v2/_catalog
# Expected: {"repositories":["blumeops/test","docker.io/library/alpine"]}
Implementation Details:
- Changed upstream port from 5000 to 5050 (see Step 0.3 implementation details).
- After running tailscale serve, the service must be approved in Tailscale admin console at https://login.tailscale.com/admin/services before it becomes accessible.
- Podman needed on gilbert for testing - added to Brewfile. Requires podman machine init && podman machine start after install.
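The catalog checks in the Testing steps for this step compare JSON by eye. A minimal sketch of automating them follows; catalog_has_repo is a hypothetical helper that reads the /v2/_catalog response on stdin and matches the repository name as a complete JSON string, so no JSON parser is required.

```bash
#!/usr/bin/env bash
# Sketch only: catalog_has_repo is a hypothetical helper name.
# Reads a /v2/_catalog JSON response on stdin; succeeds if REPO is listed.
catalog_has_repo() {
  local repo="$1"
  # Include the surrounding quotes so "blumeops" does not match "blumeops/test".
  grep -qF "\"${repo}\""
}

sample='{"repositories":["blumeops/test","docker.io/library/alpine"]}'
if printf '%s\n' "$sample" | catalog_has_repo blumeops/test; then
  echo "found"
fi
# prints found
```

Against the live registry the input would come from curl -s https://registry.tail8d86e.ts.net/v2/_catalog instead of the sample string.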
Step 0.5: Create Zot Metrics Role
New files:
ansible/roles/zot_metrics/
├── defaults/main.yml
├── tasks/main.yml
├── templates/
│ ├── zot-metrics.sh.j2
│ └── zot-metrics.plist.j2
└── handlers/main.yml
Metrics script pattern (zot-metrics.sh.j2):
#!/bin/bash
# Collect Zot registry metrics for Prometheus textfile collector
set -euo pipefail
METRICS_FILE="/opt/homebrew/var/node_exporter/textfile/zot.prom"
TEMP_FILE="${METRICS_FILE}.tmp"
# Check if zot is up
if curl -sf http://localhost:5000/v2/_catalog > /dev/null 2>&1; then
echo "zot_up 1" > "$TEMP_FILE"
else
echo "zot_up 0" > "$TEMP_FILE"
fi
mv "$TEMP_FILE" "$METRICS_FILE"
Note: Start with just zot_up for now. Additional metrics (storage usage, cache stats) can be added later after reviewing zot's metrics endpoint.
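As a sketch of where the script could go next, the version below adds Prometheus HELP/TYPE lines and one extra gauge. The metric name zot_storage_bytes and the function write_zot_metrics are assumptions for illustration; the probe command is passed in so the function can be exercised without a running registry.

```bash
#!/usr/bin/env bash
# Sketch only: write_zot_metrics and zot_storage_bytes are hypothetical names.
# Usage: write_zot_metrics PROBE_CMD DATA_DIR OUT_FILE
write_zot_metrics() {
  local probe_cmd="$1" data_dir="$2" out_file="$3"
  local tmp_file up=0 size_kb
  tmp_file="${out_file}.tmp"

  # The probe decides zot_up; any non-zero exit counts as down.
  if bash -c "$probe_cmd" >/dev/null 2>&1; then
    up=1
  fi

  # du -sk reports kilobytes in the first tab-separated column.
  size_kb="$(du -sk "$data_dir" | cut -f1)"

  {
    echo "# HELP zot_up Whether the Zot catalog endpoint answered (1) or not (0)."
    echo "# TYPE zot_up gauge"
    echo "zot_up ${up}"
    echo "# HELP zot_storage_bytes Disk space used by the Zot data directory."
    echo "# TYPE zot_storage_bytes gauge"
    echo "zot_storage_bytes $((size_kb * 1024))"
  } > "$tmp_file"

  # Rename into place so the textfile collector never reads a partial file.
  mv "$tmp_file" "$out_file"
}

# Example invocation on indri (paths taken from the role defaults):
# write_zot_metrics "curl -sf http://localhost:5000/v2/_catalog" \
#   /Users/erichblume/zot /opt/homebrew/var/node_exporter/textfile/zot.prom
```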
Testing:
# Deploy metrics role
mise run provision-indri -- --tags zot_metrics
# Check metrics file exists and is updated
ssh indri 'cat /opt/homebrew/var/node_exporter/textfile/zot.prom'
# Expected: zot_up 1
# Verify metrics appear in Prometheus (after a scrape cycle)
curl -s "http://indri:9090/api/v1/query?query=zot_up" | jq '.data.result[0].value[1]'
# Expected: "1"
Step 0.6: Add Zot Log Collection to Alloy
Files to modify:
ansible/roles/alloy/defaults/main.yml
Changes:
Add to the alloy_mcquack_logs list:
- path: /Users/erichblume/Library/Logs/mcquack.zot.out.log
  service: zot
  stream: stdout
- path: /Users/erichblume/Library/Logs/mcquack.zot.err.log
  service: zot
  stream: stderr
Testing:
# Deploy alloy config (handler restarts alloy automatically if config changed)
mise run provision-indri -- --tags alloy
# Wait a minute, then check Loki for zot logs
# In Grafana Explore, query: {service="zot"}
Step 0.7: Update indri-services-check Script
Files to modify:
mise-tasks/indri-services-check
Changes to add:
# Add after existing service checks (around line 55)
check_service "zot" "ssh indri 'launchctl list | grep zot | grep -v \"^-\"'"
check_service "zot-metrics" "ssh indri 'launchctl list | grep zot-metrics | grep -v \"^-\"'"
# Add to HTTP endpoints section (around line 65)
check_http "Zot Registry" "http://indri:5000/v2/_catalog"
# Add metrics file check
check_service "Zot metrics" "ssh indri 'test -f /opt/homebrew/var/node_exporter/textfile/zot.prom'"
Testing:
# Run the health check
mise run indri-services-check
# Expected output includes:
# zot... OK
# zot-metrics... OK
# Zot Registry... OK
# Zot metrics... OK
Implementation Details:
- Used Tailscale service URL (https://registry.tail8d86e.ts.net/v2/_catalog) instead of internal endpoint to verify full path works.
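The check_service and check_http calls above rely on helpers defined elsewhere in indri-services-check, which this plan does not show. The following is a guess at their shape, inferred only from the call sites and the expected "name... OK" output; the real definitions may differ.

```bash
#!/usr/bin/env bash
# Sketch only: assumed shape of the helpers, not copied from the real script.

# Run CMD in a subshell and print "NAME... OK" or "NAME... FAILED".
check_service() {
  local name="$1" cmd="$2"
  printf '%s... ' "$name"
  if bash -c "$cmd" >/dev/null 2>&1; then
    echo "OK"
  else
    echo "FAILED"
    return 1
  fi
}

# HTTP variant: succeed only on a 2xx/3xx response within 5 seconds.
check_http() {
  local name="$1" url="$2"
  check_service "$name" "curl -sf --max-time 5 '${url}'"
}
```

Returning non-zero on failure lets the calling script count failures or exit early, which is a design choice here rather than something the plan specifies.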
Step 0.8: Install and Configure Podman on Indri
New files:
ansible/roles/podman/
├── tasks/main.yml
└── handlers/main.yml
Tasks (tasks/main.yml):
- name: Install podman via homebrew
  community.general.homebrew:
    name: podman
    state: present
- name: Initialize podman machine (if not exists)
  ansible.builtin.command:
    cmd: podman machine init --cpus 4 --memory 8192 --disk-size 220
  register: podman_init
  changed_when: podman_init.rc == 0
  failed_when: podman_init.rc not in [0, 125]  # 125 = already exists
- name: Start podman machine
  ansible.builtin.command:
    cmd: podman machine start
  register: podman_start
  changed_when: "'started successfully' in podman_start.stdout"
  failed_when: podman_start.rc not in [0, 125]  # 125 = already running
Testing:
# Deploy podman role
mise run provision-indri -- --tags podman
# Verify podman is working
ssh indri 'podman info'
ssh indri 'podman run --rm hello-world'
Implementation Details:
- KNOWN ISSUE: podman machine init and podman machine start have reliability issues when run via Ansible/SSH. The machine sometimes gets stuck in "Starting" state due to a race condition (see https://github.com/containers/podman/issues/16945). Apple Hypervisor may also require GUI session context.
- WORKAROUND: If the machine fails to start via Ansible, manually run on indri:
podman machine rm -f podman-machine-default
podman machine init --cpus 4 --memory 8192 --disk-size 220
podman machine start
- LaunchAgent approach was attempted but didn't resolve the issue reliably.
- TODO: Investigate proper automation solution for reliable podman machine management.
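One possible direction for the TODO is to poll the machine state with a timeout so a stuck start is detected instead of hanging. This is a sketch, not a fix for the underlying race: wait_until and machine_running are hypothetical names, and the use of podman machine inspect --format '{{.State}}' returning "running" is an assumption to confirm against the installed podman version.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or TIMEOUT_SECONDS have elapsed.
# Both durations must be whole seconds.
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local waited=0
  until "$@" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Assumption: the default machine reports its state as "running" once usable.
machine_running() {
  [ "$(podman machine inspect --format '{{.State}}' 2>/dev/null)" = "running" ]
}

# Example (on indri):
# podman machine start || true
# wait_until 120 5 machine_running || echo "podman machine did not reach running state" >&2
```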
Step 0.9: Install and Configure Minikube
New files:
ansible/roles/minikube/
├── defaults/main.yml
├── tasks/main.yml
└── handlers/main.yml
Defaults:
minikube_cpus: 4
minikube_memory: 8192
minikube_disk_size: "200g"
minikube_driver: podman
minikube_container_runtime: cri-o
Note on storage: The disk-size is for node-local storage only (container images, emptyDir, local PVs). Pods can also mount external storage:
- hostPath - indri filesystem (e.g., ~/transmission/ for kiwix ZIM files)
- NFS - sifaka volumes (Synology supports NFS natively, easiest for k8s)
- SMB/CIFS - requires csi-driver-smb; sifaka currently uses SMB for desktop mounts
Tasks:
- name: Install minikube via homebrew
  community.general.homebrew:
    name: minikube
    state: present
- name: Check if minikube cluster exists
  ansible.builtin.command:
    # raw/endraw keeps Jinja2 from trying to render minikube's Go template
    cmd: minikube status --format='{% raw %}{{.Host}}{% endraw %}'
  register: minikube_status
  changed_when: false
  failed_when: false
- name: Start minikube cluster
  ansible.builtin.command:
    cmd: >
      minikube start
      --driver={{ minikube_driver }}
      --container-runtime={{ minikube_container_runtime }}
      --cpus={{ minikube_cpus }}
      --memory={{ minikube_memory }}
      --disk-size={{ minikube_disk_size }}
  when: minikube_status.rc != 0 or 'Running' not in minikube_status.stdout
Testing:
# Deploy minikube role
mise run provision-indri -- --tags minikube
# Verify cluster is running
ssh indri 'minikube status'
# Expected: host: Running, kubelet: Running, apiserver: Running
# Test kubectl access from indri
ssh indri 'kubectl get nodes'
# Expected: minikube Ready control-plane ...
Implementation Details:
- Changed minikube_memory from 8192 to 7800 because podman machine reports slightly less available memory (7908MB) due to VM overhead. Minikube rejects memory requests exceeding what podman reports.
- Deployed with Kubernetes v1.34.0 and CRI-O 1.24.6.
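The 8192 to 7800 adjustment can be reproduced with a little arithmetic. The sketch below assumes a rule of "leave a 100MB margin, then round down to the nearest 100MB"; that rule is an illustration that happens to yield 7800 from 7908, not something the plan states. fit_memory is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch only: fit_memory and its margin/rounding rule are assumptions.
# Usage: fit_memory REQUESTED_MB AVAILABLE_MB [MARGIN_MB]
fit_memory() {
  local requested="$1" available="$2" margin="${3:-100}"
  # Largest multiple of 100 that still leaves MARGIN_MB free.
  local ceiling=$(( (available - margin) / 100 * 100 ))
  if [ "$requested" -le "$ceiling" ]; then
    echo "$requested"
  else
    echo "$ceiling"
  fi
}

fit_memory 8192 7908
# prints 7800
```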
Step 0.10: Configure Kubeconfig on Gilbert
Goal: Enable kubectl and k9s on gilbert to connect to the minikube cluster running on indri.
Considerations:
- Minikube runs inside a podman VM on indri, so the API server isn't directly exposed on indri's network interface
- Admin users have full Tailscale access to indri via autogroup:admin → * → *
- Be careful not to overwrite existing work kubeconfigs
Possible approaches:
1. SSH tunneling to forward the API server port
2. minikube tunnel running on indri (exposes LoadBalancer services)
3. Configure minikube with --apiserver-names=indri at cluster creation time
4. Use kubectl via SSH wrapper: ssh indri kubectl ...
Verification:
# From gilbert, these should work:
kubectl get nodes
kubectl get namespaces
k9s # Should show the minikube cluster
The exact approach will be determined during implementation based on what works best with the podman driver.
Implementation Details:
Chose Option 3: Recreate cluster with --apiserver-names after researching alternatives:
- SSH tunneling - Requires keeping a tunnel running or complex on-demand setup
- SOCKS5 proxy with kubeconfig proxy-url - Kubeconfig supports proxy-url: socks5://localhost:1080 per-context, but still requires managing the proxy
- --apiserver-names + --listen-address - Native minikube support, cleanest solution
Cluster Setup: Recreated the minikube cluster with additional flags:
minikube delete
minikube start \
--driver=podman \
--container-runtime=cri-o \
--cpus=4 --memory=7800 --disk-size=200g \
--apiserver-names=indri \
--listen-address=0.0.0.0
- --apiserver-names=indri adds "indri" to the API server certificate SAN
- --listen-address=0.0.0.0 tells podman to expose the API port on all interfaces
- API server port is dynamic (check with kubectl config view --minify -o jsonpath="{.clusters[0].cluster.server}" on indri)
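Because the port is dynamic at this stage, it is handy to pull it out of the server URL rather than read it by eye. A minimal sketch follows; server_port is a hypothetical helper, and falling back to 443 when no port is present assumes an https URL.

```bash
#!/usr/bin/env bash
# Sketch only: server_port is a hypothetical helper name.
# Prints the port of an API server URL such as https://127.0.0.1:39535.
server_port() {
  local url="$1"
  local hostport="${url#*://}"   # drop the scheme
  hostport="${hostport%%/*}"     # drop any path
  case "$hostport" in
    *:*) echo "${hostport##*:}" ;;
    *)   echo 443 ;;             # assumption: https default when no port is given
  esac
}

server_port "https://127.0.0.1:39535"
# prints 39535

# On indri, the URL would come from the kubeconfig:
# server_port "$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')"
```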
Credential Management with 1Password:
Rather than copying private keys between machines, credentials are stored in 1Password and fetched on-demand using kubectl's exec credential plugin. This mirrors the 1Password SSH agent pattern for biometric-protected key access.
1. Store credentials in 1Password (vault: vg6xf6vvfmoh5hqjjhlhbeoaie, item: 3jo4f2hnzvwfmamudfsbbbec7e):
   - client-cert - Contents of ~/.minikube/profiles/minikube/client.crt (text field)
   - client-key - Contents of ~/.minikube/profiles/minikube/client.key (text field)
   - ca-cert - Contents of ~/.minikube/ca.crt (text field, not secret but stored for convenience)
2. Created credential helper script at bin/kubectl-credential-1password:
#!/bin/bash
# Fetches client cert/key from 1Password, outputs ExecCredential JSON
# Usage: kubectl-credential-1password <vault-id> <item-id> <cert-field> <key-field>
Symlinked to ~/.local/bin/kubectl-credential-1password
3. Kubeconfig setup on gilbert:
# Store CA cert locally (not secret - public key for server verification)
mkdir -p ~/.kube/minikube-indri
op --vault <vault> item get <item> --fields ca-cert | sed 's/^"//; s/"$//' > ~/.kube/minikube-indri/ca.crt
# Configure cluster
kubectl config set-cluster minikube-indri \
  --server=https://indri:<port> \
  --certificate-authority=/Users/eblume/.kube/minikube-indri/ca.crt
# Configure credentials with exec plugin
kubectl config set-credentials minikube-indri \
  --exec-api-version=client.authentication.k8s.io/v1beta1 \
  --exec-command=kubectl-credential-1password \
  --exec-arg=<vault-id> \
  --exec-arg=<item-id> \
  --exec-arg=client-cert \
  --exec-arg=client-key
# Create context
kubectl config set-context minikube-indri \
  --cluster=minikube-indri \
  --user=minikube-indri
4. Usage:
kubectl --context=minikube-indri get nodes
# or
kubectl config use-context minikube-indri
kubectl get nodes
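The body of the credential helper is not reproduced in this plan, only its header comments. The following is a sketch of what such a helper could look like, not the actual bin/kubectl-credential-1password. It assumes the 1Password CLI's op read with op:// secret references (the plan's own commands use op item get instead), and it assumes PEM content, which needs only its newlines escaped to be valid inside a JSON string.

```bash
#!/usr/bin/env bash
# Sketch only: an assumed implementation, not the script from the repo.

# Print an ExecCredential document carrying a PEM client cert and key.
exec_credential_json() {
  local cert="$1" key="$2" nl=$'\n'
  # PEM text is base64 plus dashes and spaces, so escaping newlines is
  # sufficient for JSON string encoding here.
  cert=${cert//$nl/\\n}
  key=${key//$nl/\\n}
  printf '{"apiVersion":"client.authentication.k8s.io/v1beta1","kind":"ExecCredential","status":{"clientCertificateData":"%s","clientKeyData":"%s"}}\n' "$cert" "$key"
}

# Fetch both fields from 1Password and emit the credential.
# Assumption: op read prints the raw field value.
fetch_and_print() {
  local vault="$1" item="$2" cert_field="$3" key_field="$4"
  local cert key
  cert="$(op read "op://${vault}/${item}/${cert_field}")"
  key="$(op read "op://${vault}/${item}/${key_field}")"
  exec_credential_json "$cert" "$key"
}

# Only contact 1Password when called with the four documented arguments.
if [ "$#" -eq 4 ]; then
  fetch_and_print "$@"
fi
```

kubectl runs the helper on every command that needs credentials and reads the JSON from stdout, so the script must print nothing else there.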
Security Notes:
- Client private key never stored on disk - fetched from 1Password on each kubectl command
- CA cert stored on disk (not secret - it's a public key for server verification)
- 1Password biometric/password prompt required for credential access
- Surrounding quotes on op text-field output are stripped with sed 's/^"//; s/"$//'
References:
- minikube start options
- Using kubectl via SSH Tunnel
- SOCKS5 Proxy Access to K8s API
- kubectl-tokensshtunnel
- Securing kubectl config with 1Password
Step 0.11: Add Minikube to indri-services-check
Files to modify:
mise-tasks/indri-services-check
Changes:
# Add new section for Kubernetes
echo ""
echo "Kubernetes cluster:"
check_service "minikube" "ssh indri 'minikube status --format={{.Host}} | grep -q Running'"
check_service "k8s-apiserver" "ssh indri 'kubectl get --raw /healthz'"
Testing:
mise run indri-services-check
# Expected output includes:
# Kubernetes cluster:
# minikube... OK
# k8s-apiserver... OK
Implementation Notes:
- Added a third check k8s-apiserver (remote) that verifies kubectl access from gilbert, not just via SSH to indri. This ensures the 1Password credential flow and remote API server access are working.
- The remote check uses both --kubeconfig and --context flags explicitly since the script runs in bash (not fish) and doesn't inherit the KUBECONFIG environment variable from fish config.
Step 0.12: Create Zettelkasten Documentation
New files:
- ~/code/personal/zk/zot.md
- ~/code/personal/zk/minikube.md
Files to update:
- ~/code/personal/zk/1767747119-YCPO.md (main blumeops card)
Updates to main blumeops card:
1. Add to Device Tags table:
| tag:registry | indri | Container registry access |
2. Add to Services table:
| Registry | https://registry.tail8d86e.ts.net | OCI container registry (Zot) | zot |
| Kubernetes | https://indri:<port> | Minikube cluster | minikube |
3. Add to Port Map (Indri) table:
| 5050 | Zot | HTTP | localhost | Container registry |
| | K8s API | HTTPS | 0.0.0.0 | Minikube API server |
4. Add new section Remote Kubernetes Access:
## Remote Kubernetes Access (from Gilbert)
The minikube cluster on indri is accessible from gilbert via direct connection. Cluster was created with `--apiserver-names=indri --listen-address=0.0.0.0`.
**Fish abbreviations** (in `~/.config/fish/config.fish`):
- `ki` → `kubectl --context=minikube-indri`
- `k9i` → `k9s --context=minikube-indri`
- `k9` → `k9s`
```bash
# Quick access via abbreviations
ki get nodes
k9i
# Or explicitly set context
kubectl config use-context minikube-indri
kubectl get nodes
```
Template for zot.md:
---
id: zot
aliases:
- zot
- container-registry
tags:
- blumeops
---
# Zot Registry Management Log
Zot is an OCI-native container registry running on Indri, providing:
1. Pull-through cache for Docker Hub, GHCR, Quay (avoids rate limits)
2. Private image storage for custom-built containers
## Service Details
- URL: https://registry.tail8d86e.ts.net
- Local port: 5050
- Data directory: ~/zot
- Config: ~/.config/zot/config.json
- Managed via: mcquack LaunchAgent
## Namespace Convention
| Path | Source |
|------|--------|
| `registry.../docker.io/*` | Cached from Docker Hub |
| `registry.../ghcr.io/*` | Cached from GHCR |
| `registry.../quay.io/*` | Cached from Quay |
| `registry.../blumeops/*` | Private images (yours) |
## Useful Commands
\`\`\`bash
# List all images
curl -s http://localhost:5050/v2/_catalog | jq
# Pull via cache (from indri or k8s)
podman pull localhost:5050/docker.io/library/nginx:latest
# Build and push private image (from gilbert)
podman build -t registry.tail8d86e.ts.net/blumeops/myapp:v1 .
podman push registry.tail8d86e.ts.net/blumeops/myapp:v1
# Check service status
launchctl list | grep zot
# View logs
tail -f ~/Library/Logs/mcquack.zot.err.log
\`\`\`
## Log
### [DATE]
- Initial setup for k8s migration Phase 0
Template for minikube.md:
---
id: minikube
aliases:
- minikube
- kubernetes
- k8s
tags:
- blumeops
---
# Minikube Management Log
Minikube provides a single-node Kubernetes cluster on Indri for running containerized services.
## Cluster Details
- Driver: podman (rootless)
- Container runtime: CRI-O
- Kubernetes version: v1.34.0
- Resources: 4 CPUs, 7800MB RAM, 200GB disk
- API server: https://indri:<port> (accessible from gilbert via Tailscale)
## Remote Access from Gilbert
Cluster was created with `--apiserver-names=indri --listen-address=0.0.0.0` to allow remote kubectl access.
\`\`\`bash
# Switch context
kubectl config use-context minikube-indri
# Verify
kubectl get nodes
kubectl get namespaces
# Use k9s
k9s --context minikube-indri
\`\`\`
## Useful Commands (on indri)
\`\`\`bash
# Cluster status
minikube status
# Start/stop cluster
minikube start
minikube stop
# Access dashboard
minikube dashboard
# SSH into node
minikube ssh
# View logs
minikube logs
\`\`\`
## Podman Machine (prerequisite)
Minikube uses podman as the container runtime. The podman machine must be running:
\`\`\`bash
# Check podman machine
podman machine list
# Start if needed
podman machine start
\`\`\`
## Log
### [DATE]
- Initial cluster setup for k8s migration Phase 0
- Configured for remote access with --apiserver-names=indri
Implementation Notes:
- Created zot.md and minikube.md in ~/code/personal/zk/
- Updated 1767747119-YCPO.md (main blumeops card) with all specified changes
- Added 1Password credential plugin reference to minikube docs
- K8s API port is 39535 (dynamically assigned by minikube, may change on cluster recreation)
Step 0.13: Update Main Playbook
Files to modify:
ansible/playbooks/indri.yml
Changes:
# Add new roles to the roles list
- role: podman
  tags: podman
- role: zot
  tags: zot
- role: zot_metrics
  tags: zot_metrics
- role: minikube
  tags: minikube
Implementation Notes:
- Roles were added incrementally during Steps 0.3, 0.5, 0.8, and 0.9
- All four roles (zot, zot_metrics, podman, minikube) confirmed present in indri.yml
Step 0.14: Expose K8s API as Tailscale Service (Added Post-Completion)
Note: This step was added after Phase 0 was otherwise complete, to provide a stable, named endpoint for the Kubernetes API server.
Goal: Expose the minikube API server as k8s.tail8d86e.ts.net instead of using indri:<dynamic-port>.
Current state:
- Minikube API server on port 39535 (dynamic, could change on cluster recreation)
- Accessed via https://indri:39535
- Certificate SANs include "indri"
Target state:
- Stable Tailscale service at k8s.tail8d86e.ts.net:443
- Fixed API server port (6443, the k8s standard)
- Certificate SANs include both hostnames for compatibility
Step 0.14.1: Update Pulumi ACLs
Files to modify:
- pulumi/policy.hujson
- pulumi/__main__.py
Changes to policy.hujson:
- Add tag to tagOwners:
"tag:k8s-api": ["autogroup:admin", "tag:blumeops"],
- Update Erich's test case accept list to include k8s-api:
"accept": ["tag:grafana:443", "tag:kiwix:443", "tag:feed:443", "tag:loki:3100", "tag:pg:5432", "tag:homelab:22", "tag:registry:443", "tag:k8s-api:443"],
- Update Allison's deny list:
"deny": ["tag:grafana:443", "tag:loki:3100", "tag:nas:445", "tag:registry:443", "tag:k8s-api:443"],
Changes to __main__.py:
- Add "tag:k8s-api" to indri's DeviceTags
Testing:
mise run tailnet-preview # Review changes
mise run tailnet-up # Apply changes
Step 0.14.2: Create Tailscale Service in Admin Console (MANUAL)
CRITICAL: Do this BEFORE running ansible that calls tailscale serve
- Go to https://login.tailscale.com/admin/services
- Create service k8s with:
  - Port: 443 (TCP)
  - Host: indri
Step 0.14.3: Recreate Minikube Cluster
The cluster needs to be recreated to:
- Add k8s.tail8d86e.ts.net to the API server certificate SANs
- Fix the API server port to 6443 (standard k8s port)
On indri:
# Stop and delete existing cluster
minikube stop
minikube delete
# Recreate with new settings
minikube start \
--driver=podman \
--container-runtime=cri-o \
--cpus=4 --memory=7800 --disk-size=200g \
--apiserver-names=k8s.tail8d86e.ts.net,indri \
--apiserver-port=6443 \
--listen-address=0.0.0.0
# Check the API server URL and port recorded in the kubeconfig
kubectl config view --minify -o jsonpath="{.clusters[0].cluster.server}"
# Expected: https://127.0.0.1:6443 or similar
# Verify cluster is running
minikube status
kubectl get nodes
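The kubectl config view command above reports the server URL, not the certificate contents. To confirm the SANs themselves, the certificate text can be inspected. The sketch below is one way to do that: san_includes is a hypothetical helper that reads openssl x509 -text output on stdin, and the commented pipeline assumes the API server is reachable on 127.0.0.1:6443 from indri.

```bash
#!/usr/bin/env bash
# Sketch only: san_includes is a hypothetical helper name.
# Reads certificate text on stdin; succeeds if NAME appears as a DNS SAN.
san_includes() {
  local name="$1"
  # SAN entries are comma-separated; split them, trim, and match exactly.
  tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -qxF "DNS:${name}"
}

# Example (on indri, assuming the API server listens on 127.0.0.1:6443):
# openssl s_client -connect 127.0.0.1:6443 </dev/null 2>/dev/null \
#   | openssl x509 -noout -text \
#   | san_includes k8s.tail8d86e.ts.net && echo "SAN present"
```

Exact-line matching avoids false positives from names that merely share a prefix.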
Update ansible role defaults (ansible/roles/minikube/defaults/main.yml):
minikube_apiserver_names:
  - k8s.tail8d86e.ts.net
  - indri
minikube_apiserver_port: 6443
Step 0.14.4: Add K8s Service to Tailscale Serve
Files to modify:
ansible/roles/tailscale_serve/defaults/main.yml
Add to services list:
- name: svc:k8s
  tcp:
    port: 443
    upstream: tcp://localhost:6443
Note: Using TCP passthrough (not HTTPS termination) because k8s uses mTLS authentication.
Deploy:
mise run provision-indri -- --tags tailscale-serve
Step 0.14.5: Update 1Password Credentials
After cluster recreation, the client certificates have changed.
On indri, get the new credentials:
# Display new certificates (copy to 1Password)
cat ~/.minikube/profiles/minikube/client.crt
cat ~/.minikube/profiles/minikube/client.key
cat ~/.minikube/ca.crt
In 1Password (vault: vg6xf6vvfmoh5hqjjhlhbeoaie, item: 3jo4f2hnzvwfmamudfsbbbec7e):
- Update client-cert field with new certificate
- Update client-key field with new key
- Update ca-cert field with new CA certificate
Step 0.14.6: Update Kubeconfig on Gilbert
Update CA certificate:
# Fetch new CA cert from 1Password
op --vault vg6xf6vvfmoh5hqjjhlhbeoaie item get 3jo4f2hnzvwfmamudfsbbbec7e --fields ca-cert | sed 's/^"//; s/"$//' > ~/.kube/minikube-indri/ca.crt
Update kubeconfig (~/.kube/minikube-indri/config.yml):
clusters:
  - cluster:
      certificate-authority: /Users/eblume/.kube/minikube-indri/ca.crt
      server: https://k8s.tail8d86e.ts.net  # Changed from https://indri:39535
    name: minikube-indri
Verification:
# Test connection via new hostname
kubectl --context=minikube-indri get nodes
# Test via abbreviation
ki get nodes
Step 0.14.7: Update Documentation
Files to update:
- ~/code/personal/zk/minikube.md - Update API server URL and port info
- ~/code/personal/zk/1767747119-YCPO.md - Update Services table and Port Map
Changes to blumeops card:
1. Update Services table:
| Kubernetes | https://k8s.tail8d86e.ts.net | Minikube cluster | minikube |
2. Update Port Map:
| 6443 | K8s API | HTTPS/TCP | 0.0.0.0 | Minikube API server (via Tailscale) |
3. Add tag:k8s-api to Device Tags table
Step 0.14.8: Update indri-services-check
Files to modify:
mise-tasks/indri-services-check
Changes:
# Update remote k8s check to use new URL
check_service "k8s-apiserver (remote)" "kubectl --kubeconfig=$HOME/.kube/minikube-indri/config.yml --context=minikube-indri get --raw /healthz"
# (No change needed - uses kubeconfig which now points to k8s.tail8d86e.ts.net)
Step 0.14 Verification
# 1. Service health check
mise run indri-services-check
# All services should be OK
# 2. Test k8s access via Tailscale hostname
curl -k https://k8s.tail8d86e.ts.net/healthz
# Expected: ok (or a 401/403 response if anonymous access is disabled - that's fine)
# 3. kubectl via Tailscale
ki get nodes
ki get namespaces
# 4. k9s via Tailscale
k9i
Phase 0 Verification Checklist
Run after completing all steps:
# 1. Full service health check
mise run indri-services-check
# All services should show OK, including new ones
# 2. Registry functionality - pull-through cache
ssh indri 'podman pull localhost:5000/docker.io/library/alpine:latest'
curl -s https://registry.tail8d86e.ts.net/v2/_catalog
# Expected: {"repositories":["docker.io/library/alpine"]}
# 3. Registry functionality - private image push (from gilbert)
podman pull alpine:latest
podman tag alpine:latest registry.tail8d86e.ts.net/blumeops/test:v1
podman push registry.tail8d86e.ts.net/blumeops/test:v1
curl -s https://registry.tail8d86e.ts.net/v2/_catalog
# Expected: {"repositories":["blumeops/test","docker.io/library/alpine"]}
# 4. Kubernetes cluster
ssh indri 'minikube status'
ssh indri 'kubectl get nodes'
kubectl get nodes # from gilbert
# 5. Metrics in Prometheus
curl -s "http://indri:9090/api/v1/query?query=zot_up"
# Expected: value = 1
# 6. Logs in Loki
# In Grafana Explore: {service="zot"}
# Should see zot log entries
# 7. k9s from gilbert
k9s
# Should connect and show minikube cluster
Phase 0 Rollback
If something goes wrong:
# Stop and remove minikube
ssh indri 'minikube stop && minikube delete'
# Stop and remove zot
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.zot.plist'
ssh indri 'rm ~/Library/LaunchAgents/mcquack.eblume.zot.plist'
# Remove podman machine
ssh indri 'podman machine stop && podman machine rm -f'  # -f skips the confirmation prompt over SSH
# Remove from tailscale serve
ssh indri 'tailscale serve --service svc:registry reset'
# Remove tags from Pulumi (revert policy.hujson changes)
mise run tailnet-up
# Revert ansible playbook changes
git checkout ansible/playbooks/indri.yml
git checkout ansible/roles/tailscale_serve/defaults/main.yml
git checkout ansible/roles/alloy/templates/config.alloy.j2
# Remove new roles
rm -rf ansible/roles/{zot,zot_metrics,podman,minikube}
# Remove zk cards
rm ~/code/personal/zk/{zot,minikube}.md
Phase 0 Follow-up: Grafana Dashboards
After Phase 0 is running and stable, create monitoring dashboards:
Zot Dashboard (ansible/roles/grafana/files/dashboards/zot.json):
- Check what metrics zot exposes:
ssh indri 'curl -s http://localhost:5000/metrics'
- Review community dashboards for inspiration (copy permitted if license allows)
- Create dashboard with available metrics (at minimum: zot_up)
Minikube Dashboard (ansible/roles/grafana/files/dashboards/minikube.json):
- Deploy kube-state-metrics if needed for additional cluster metrics
- Review what Prometheus can scrape from the cluster
- Review community dashboards for inspiration (copy permitted if license allows)
- Create dashboard with relevant panels (node usage, pod counts, etc.)
New Files Summary
| File | Purpose |
|---|---|
| ansible/roles/zot/ | Zot registry deployment |
| ansible/roles/zot_metrics/ | Metrics collection for Zot |
| ansible/roles/podman/ | Podman installation and setup |
| ansible/roles/minikube/ | Minikube cluster setup |
| ~/code/personal/zk/zot.md | Zot management documentation |
| ~/code/personal/zk/minikube.md | Minikube management documentation |
Modified Files Summary
| File | Changes |
|---|---|
| pulumi/policy.hujson | Add tag:registry |
| ansible/playbooks/indri.yml | Add new roles |
| ansible/roles/tailscale_serve/defaults/main.yml | Add svc:registry |
| ansible/roles/alloy/templates/config.alloy.j2 | Add zot log collection |
| mise-tasks/indri-services-check | Add zot and k8s checks |