Commit graph

17 commits

Author SHA1 Message Date
6a140107c6 P7 forgejo plan updated 2026-01-21 20:04:18 -08:00
4dd74dfff8 complete P6 2026-01-21 19:16:04 -08:00
89eff26301 complete P5.1 2026-01-21 16:32:01 -08:00
21848a7919 P5.1: Migrate minikube from podman to QEMU2 driver (#38)
## Summary
- Migrate minikube from podman driver to qemu2 driver for proper NFS/SMB volume mount support
- Update ansible minikube role with qemu installation and containerd runtime
- Remove podman role dependency from indri.yml
- Add synology user creation steps and post-migration zot reconfiguration notes

## Why
Phase 6 (Kiwix/Transmission migration) was blocked because the podman driver lacks kernel capabilities for filesystem mounts. QEMU2 creates an actual VM with full mount support.

## Deployment and Testing
- [ ] Create k8s-storage user on Synology DSM
- [ ] Store credentials in 1Password (synology-k8s-storage)
- [ ] Export current k8s state
- [ ] Stop and delete podman-based minikube cluster
- [ ] Run ansible to create QEMU2 cluster
- [ ] Test NFS volume mount with test pod
- [ ] Redeploy ArgoCD and all apps
- [ ] Verify all services healthy
- [ ] Reconfigure zot registry mirrors for containerd (post-migration)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/38
2026-01-21 16:03:37 -08:00
7b60cca31e Document P6 blocker and add P5.1 QEMU2 migration plan (#37)
## Summary
- Document P6 (Kiwix/Transmission) blocker: podman driver cannot mount external volumes
- Add P5.1 plan to migrate minikube from podman to QEMU2 driver
- Update overview with corrected phase statuses and driver information

## Background

P6 implementation (`feature/p6-kiwix-transmission`) was completed but blocked because **all volume mount approaches failed** with the podman driver:

| Approach | Result |
|----------|--------|
| NFS volume | Failed - CAP_SYS_ADMIN required |
| SMB CSI driver | Failed - EPERM in rootless container |
| `minikube mount` (9p) | Failed - permission denied |
| hostPath | Failed - path doesn't exist in container |

Root cause: Podman driver runs minikube in a rootless container lacking kernel capabilities for filesystem mounts.

## What's Next

1. Merge this documentation PR
2. Execute P5.1 (QEMU2 migration) in a fresh session
3. Retry P6 with the QEMU2 driver

## Deployment and Testing
- [x] No deployment needed - documentation only
- [x] ArgoCD apps reset to main
- [x] Cluster healthy (except kiwix/transmission intentionally offline)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/37
2026-01-20 20:49:48 -08:00
b97d461a5a P6: Kiwix and Transmission migration planning (#35)
## Summary
- Detailed planning document for Phase 6 of k8s migration
- Transmission as standalone general-purpose torrent service with web UI at torrent.tail8d86e.ts.net
- NFS storage on sifaka (/volume1/torrents) shared between both services
- Declarative ZIM torrent list in kiwix's ConfigMap, synced to transmission via sidecar
- ZIM watcher CronJob for automatic kiwix restart when new archives complete
- Supports both GitOps (declarative) and interactive (web UI) torrent management

## Architecture Highlights
- **torrent namespace**: Standalone transmission with Tailscale ingress
- **kiwix namespace**: kiwix-serve with torrent-sync sidecar
- **Shared NFS PV**: Single PV referenced by PVCs in both namespaces
- **No backup needed**: Sifaka is RAID 5/6 and already the backup target

## Deployment and Testing
- [ ] Review plan document
- [ ] Verify NFS export on sifaka is feasible
- [ ] Approve architecture decisions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/35
2026-01-20 18:42:11 -08:00
f98103a58d P5 done 2026-01-20 15:04:46 -08:00
0439fbb704 P5: Migrate devpi to Kubernetes (#34)
## Summary
- Migrate devpi PyPI caching proxy from indri LaunchAgent to Kubernetes
- Custom container image with devpi-server + devpi-web + auto-init
- StatefulSet with 50Gi PVC, Tailscale Ingress at pypi.tail8d86e.ts.net
- Remove devpi from ansible playbooks and update CLAUDE.md with k8s workflow

## Key Changes
- Add CRI-O registry mirror config for registry.tail8d86e.ts.net
- Change ArgoCD apps to manual sync (was auto-sync causing issues)
- 2Gi memory limit for Whoosh indexer (reclaimed after startup)

## Deployment and Testing
- [x] devpi pod healthy in k8s
- [x] pip install through proxy works
- [x] mcquack 1.0.0 uploaded and installable
- [x] Old devpi stopped on indri

## Post-Merge
Reset ArgoCD to main:
```
argocd app set apps --revision main && argocd app sync apps
argocd app set devpi --revision main && argocd app sync devpi
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/34
2026-01-20 14:55:37 -08:00
b2307412fc Add P4 implementation notes and mark complete 2026-01-20 09:10:23 -08:00
735b643429 P4: Miniflux migration + PostgreSQL consolidation (#33)
## Summary
- Deploy miniflux in k8s via ArgoCD
- Expose via Tailscale Ingress at feed.tail8d86e.ts.net
- Retire brew PostgreSQL (no longer needed)
- Rename k8s-pg to pg (canonical hostname)
- Remove ansible miniflux and postgresql roles
- Update borgmatic to backup pg.tail8d86e.ts.net
- Update all zk documentation

## Deployment and Testing
- [x] Miniflux pod running in k8s
- [x] User login works at https://feed.tail8d86e.ts.net
- [x] Feeds and entries visible
- [x] brew miniflux and postgresql stopped
- [x] Tailscale services migrated (feed, pg)
- [x] zk documentation updated
- [x] Run ansible to apply role removals
- [ ] Verify borgmatic backup with new pg hostname

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/33
2026-01-20 09:04:47 -08:00
463f476374 P3 done
Updated P3_postgresql.complete.md with full implementation notes including:
- borgmatic borg path fix
- Disaster recovery testing
- CloudNativePG managed roles for borgmatic user
- Dual database backup configuration
- ACL grant for homelab → k8s
- ArgoCD selfHeal disabled for feature branch workflow
- CNPG default values to prevent drift

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 18:19:33 -08:00
e69a3df2d4 P3 done 2026-01-19 18:03:48 -08:00
f0c28a3cdd Rename P2 plan to .complete.md 2026-01-19 15:06:27 -08:00
45dfefa8df Mark P2 complete with implementation notes
Documents lessons learned:
- SSH credential template for all forge repos
- Kustomize patches must omit namespace for matching
- Tailscale hostname cutover requires manual admin console deletion
- ArgoCD workflow: all apps target main, manual sync for control

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 15:06:14 -08:00
7e6742ad24 K8s Migration Phase 2: Grafana to Kubernetes (#30)
## Summary
- Migrate Grafana from Homebrew/Ansible to Kubernetes deployment
- Switch CloudNativePG to use forge-mirrored Helm chart (HTTPS, no auth needed)
- Add Grafana Helm chart deployment via ArgoCD with multi-source pattern
- Add Grafana config (Tailscale Ingress, 9 dashboard ConfigMaps)
- Update Loki to bind 0.0.0.0 for k8s pod access via `host.containers.internal`

## Key Changes
- `argocd/apps/grafana.yaml` - Grafana Helm chart Application
- `argocd/apps/grafana-config.yaml` - Ingress + dashboard ConfigMaps
- `argocd/apps/cloudnative-pg.yaml` - Now uses forge mirror instead of external Helm repo
- `ansible/roles/loki/templates/loki-config.yaml.j2` - Bind 0.0.0.0

## Deployment and Testing
- [x] Deploy Loki config change: `mise run provision-indri -- --tags loki`
- [x] Create namespace: `ki create namespace monitoring`
- [x] Create secret: `op inject -i argocd/manifests/grafana-config/secret-admin.yaml.tpl | ki apply -f -`
- [x] Sync ArgoCD apps (grafana, grafana-config)
- [x] Verify Grafana works at https://grafana.tail8d86e.ts.net
- [x] Remove svc:grafana from ansible tailscale_serve
- [x] Stop brew grafana: `ssh indri 'brew services stop grafana'`
- [x] Delete ansible grafana role

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/30
2026-01-19 14:40:25 -08:00
680ad1095b Rename P1 to complete 2026-01-19 10:03:52 -08:00
a8f4d00294 K8s Migration Phase 1: Infrastructure Setup (#29)
## Summary
- Split k8s migration plan into phases folder for easier navigation
- Added `tag:k8s` to Pulumi ACLs for Kubernetes workloads
- Phase 1 work in progress

## Phase 1 Goals
- Tailscale Kubernetes Operator
- CloudNativePG Operator
- PostgreSQL cluster for future app migrations

## Deployment and Testing
- [ ] Review Phase 1 plan
- [ ] `mise run tailnet-preview` to verify ACL changes
- [ ] `mise run tailnet-up` to apply ACL changes
- [ ] Create Tailscale OAuth client (manual)
- [ ] Deploy operators and PostgreSQL cluster

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/29
2026-01-19 09:49:52 -08:00