Commit graph

9 commits

Author SHA1 Message Date
6cab5091ea Add storage-provisioner health check to minikube Ansible role
The storage-provisioner is a bare Pod with no controller. If the node
restarts via Docker Desktop (rather than `minikube start`), kubelet
restores static pods but bare pods are lost. Detect this and re-run
`minikube start` to restore addons.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 12:04:25 -07:00
d76d675b29 Fix minikube role skipping start when kubelet/apiserver are stopped (#137)
## Summary
- After a power loss, minikube's Docker container (host) restarts but kubelet/apiserver remain stopped
- The ansible role's status check used `--format='{{.Host}}'` which only examined the host VM state
- When host=Running but kubelet/apiserver=Stopped, the role skipped `minikube start`
- Fixed to use full `minikube status` exit code (returns non-zero when any component is unhealthy)
- Simplified all downstream conditions to use exit code instead of string matching

## Test plan
- [x] Verified the fix correctly skips `minikube start` when cluster is already fully running
- [x] Pre-commit hooks pass (ansible-lint, yamllint, etc.)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/137
2026-02-09 23:03:01 -08:00
d6e6b48f6a Migrate registry to Caddy (registry.ops.eblu.me) (#58)
## Summary
- Update all references from `registry.tail8d86e.ts.net` to `registry.ops.eblu.me`
- Remove `tailscale_serve` ansible role (no longer needed - all services migrated to Caddy)
- Update minikube containerd config for new registry URL
- Update devpi manifest, CI actions, and mise tasks

## Deployment and Testing
- [ ] Run `mise run provision-indri -- --check --diff` (dry run)
- [ ] Run `mise run provision-indri -- --tags minikube` to update containerd config
- [ ] Sync devpi ArgoCD app: `argocd app sync devpi`
- [ ] Manually remove old Tailscale serve entry: `ssh indri 'tailscale serve --service=svc:registry off'`
- [ ] Test registry access: `curl https://registry.ops.eblu.me/v2/_catalog`
- [ ] Run `mise run indri-services-check` to verify all services healthy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/58
2026-01-25 12:06:15 -08:00
21848a7919 P5.1: Migrate minikube from podman to QEMU2 driver (#38)
## Summary
- Migrate minikube from podman driver to qemu2 driver for proper NFS/SMB volume mount support
- Update ansible minikube role with qemu installation and containerd runtime
- Remove podman role dependency from indri.yml
- Add synology user creation steps and post-migration zot reconfiguration notes

## Why
Phase 6 (Kiwix/Transmission migration) was blocked because the podman driver lacks kernel capabilities for filesystem mounts. QEMU2 creates an actual VM with full mount support.

## Deployment and Testing
- [ ] Create k8s-storage user on Synology DSM
- [ ] Store credentials in 1Password (synology-k8s-storage)
- [ ] Export current k8s state
- [ ] Stop and delete podman-based minikube cluster
- [ ] Run ansible to create QEMU2 cluster
- [ ] Test NFS volume mount with test pod
- [ ] Redeploy ArgoCD and all apps
- [ ] Verify all services healthy
- [ ] Reconfigure zot registry mirrors for containerd (post-migration)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/38
2026-01-21 16:03:37 -08:00
f2541c3f77 Fix minikube role idempotency for zot mirror config (#31)
## Summary
- Fixed trailing newline mismatch in config comparison (ansible command module strips whitespace, slurp preserves it)
- Only copy temp file when config actually needs updating (avoids spurious changes)
- Task now properly skips when config is already correct

## Deployment and Testing
- [x] Verified idempotency: `changed=0` on repeated runs
- [x] Verified change detection: corrupted config triggers proper update
- [x] ansible-lint passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/31
2026-01-19 16:19:52 -08:00
130c044523 Fix hanging minikube provision 2026-01-19 15:49:11 -08:00
a8f4d00294 K8s Migration Phase 1: Infrastructure Setup (#29)
## Summary
- Split k8s migration plan into phases folder for easier navigation
- Added `tag:k8s` to Pulumi ACLs for Kubernetes workloads
- Phase 1 work in progress

## Phase 1 Goals
- Tailscale Kubernetes Operator
- CloudNativePG Operator
- PostgreSQL cluster for future app migrations

## Deployment and Testing
- [ ] Review Phase 1 plan
- [ ] `mise run tailnet-preview` to verify ACL changes
- [ ] `mise run tailnet-up` to apply ACL changes
- [ ] Create Tailscale OAuth client (manual)
- [ ] Deploy operators and PostgreSQL cluster

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/29
2026-01-19 09:49:52 -08:00
3679124ebd Expose Kubernetes API as Tailscale service (Step 0.14) (#27)
## Summary
- Add `tag:k8s-api` to Pulumi ACLs and indri device tags
- Configure Tailscale serve with TCP passthrough for k8s API at `k8s.tail8d86e.ts.net`
- Update minikube role to include `k8s.tail8d86e.ts.net` in certificate SANs
- Add `apiserver_port` config option (internal port 6443, dynamic host port with podman driver)
- Document Step 0.14 in k8s-migration plan (added post-Phase 0 completion)

The Kubernetes API is now accessible at `https://k8s.tail8d86e.ts.net` using TCP passthrough to preserve mTLS authentication.

## Deployment and Testing
- [x] Pulumi ACLs applied
- [x] Tailscale service created and approved in admin console
- [x] Minikube cluster recreated with new cert SANs
- [x] tailscale serve configured with TCP passthrough
- [x] 1Password credentials updated with new certs
- [x] Kubeconfig updated on gilbert
- [x] `mise run indri-services-check` passes
- [x] `kubectl --context=minikube-indri get nodes` works via Tailscale

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/27
2026-01-18 12:49:20 -08:00
19a82373d5 K8s Migration Phase 0: Foundation Infrastructure (#26)
## Summary
- Step 0.1: Update Pulumi ACLs with tag:registry
- Step 0.3: Create Zot registry ansible role with mcquack LaunchAgent
- Step 0.4: Add Zot to Tailscale Serve configuration
- Step 0.5: Create Zot metrics role for Prometheus scraping
- Step 0.6: Add Zot log collection to Alloy
- Step 0.7: Update indri-services-check with zot checks
- Step 0.8: Add podman role for container runtime
- Step 0.9: Add minikube role for Kubernetes cluster
- Step 0.10: Configure remote kubectl access with 1Password credentials

## Remaining Steps
- [ ] Step 0.11: Add minikube to indri-services-check
- [ ] Step 0.12: Create zettelkasten documentation
- [ ] Step 0.13: Verify main playbook (already done - roles added)

## Deployment and Testing
- [x] Zot registry deployed and accessible at https://registry.tail8d86e.ts.net
- [x] Podman machine running on indri
- [x] Minikube cluster running on indri
- [x] kubectl access from gilbert working with 1Password credentials
- [ ] indri-services-check passes all checks

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/26
2026-01-18 12:06:28 -08:00