Remove plans, they don't seem to work

2026-01-24 16:21:49 -08:00 · 2026-01-24 16:21:49 -08:00 · ceba6b3c2c
commit ceba6b3c2c
parent 8ca8798121
18 changed files with 0 additions and 6816 deletions
--- a/plans/ci-cd-bootstrap/00_overview.md
+++ b/plans/ci-cd-bootstrap/00_overview.md
@@ -1,179 +0,0 @@
 # Forgejo Actions CI/CD Bootstrap Plan
 This plan details the setup of Forgejo Actions as the CI/CD system for blumeops, starting with the bootstrapping problem: using Forgejo to build and deploy Forgejo itself.
 ## Goals
 1. **Forgejo Actions** as the primary CI system (replaces Woodpecker from original plan)
 2. **Self-hosted Forgejo** built from source, deployed as mcquack LaunchAgent on indri
 3. **Container builds** for ArgoCD manifests (devpi, etc.)
 4. **Cron-scheduled tasks** via k8s CronJobs (not Actions)
 5. **Local development** parity using `act` for workflow testing
 ## Why Forgejo Actions over Woodpecker?
 - Native integration with Forgejo (no OAuth setup, automatic repo detection)
 - GitHub Actions compatible syntax (huge ecosystem of reusable actions)
 - `act` tool for local testing on gilbert
 - Single system to maintain instead of two
 ## Architecture Overview
 ```
 ┌─────────────────────────────────────────────────────────────────┐
 │                           INDRI                                  │
 │  ┌─────────────────────┐                                        │
 │  │     Forgejo         │ ← Built from source                    │
 │  │   (mcquack agent)   │ ← Deploys itself via CI                │
 │  │                     │                                        │
 │  │  - Web UI (3001)    │                                        │
 │  │  - SSH (2200)       │                                        │
 │  │  - Actions enabled  │                                        │
 │  └─────────────────────┘                                        │
 └─────────────────────────────────────────────────────────────────┘
         │
         │ SSH deploy
         ▼
 ┌─────────────────────────────────────────────────────────────────┐
 │                      KUBERNETES (minikube)                       │
 │  ┌─────────────────────┐     ┌─────────────────────┐           │
 │  │   Forgejo Runner    │     │    Other Services   │           │
 │  │   (host mode)       │     │    (via ArgoCD)     │           │
 │  │                     │     │                     │           │
 │  │  - Custom image     │     │                     │           │
 │  │  - Node.js + tools  │     │                     │           │
 │  │  - Docker builds    │     │                     │           │
 │  └─────────────────────┘     └─────────────────────┘           │
 └─────────────────────────────────────────────────────────────────┘
 ```
 ## Phases
 | Phase | Name | Description | Status |
 |-------|------|-------------|--------|
 | 1 | [Enable Actions](P1_enable_actions.md) | Configure Forgejo for Actions, deploy runner in host mode | ✅ Complete |
 | 2 | [Custom Runner Image](P2_mirror_and_build.md) | Build custom runner with Node.js/tools, enable standard Actions | ✅ Complete |
 | 3 | [Mirror Forgejo & Build](P3_mirror_forgejo.md) | Mirror upstream Forgejo, create build workflow | Planning |
 | 4 | [Self-Deploy](P4_self_deploy.md) | Forgejo deploys itself, transition to mcquack | Planning |
 | 5 | [Container Builds](P5_container_builds.md) | Build custom container images (devpi, etc.) | Planning |
 ## The Bootstrap Problem
 **Chicken-and-egg**: We need Forgejo Actions to build Forgejo, but Forgejo must be running first.
 **Additional complication**: The stock runner image lacks Node.js, so standard GitHub Actions don't work.
 **Solution**:
 1. Keep current brew-based Forgejo running during setup ✅
 2. Enable Actions, deploy runner in host mode ✅
 3. **Build custom runner image** with Node.js and tools (bootstrap manually, then automate)
 4. Mirror upstream Forgejo, create build workflow
 5. Address cross-compilation challenge (Linux runner → macOS target)
 6. First CI build creates the binary
 7. CI deploys binary to indri as mcquack service
 8. `brew services stop forgejo` and uninstall
 9. Future builds: Forgejo builds and deploys itself
 **Cross-compilation challenge**:
 The runner runs in Linux containers (k8s), but Forgejo needs to run on indri (macOS ARM64). Options:
 - Cross-compile with CGO_ENABLED=1 (complex, needs OSX toolchain)
 - Cross-compile with CGO_ENABLED=0 (breaks Tailscale DNS resolution)
 - Build on gilbert manually, use CI only for deploy
 - Run a native macOS runner on indri (outside k8s)
 This will be addressed in Phase 3.
 **Risk mitigation**: If self-deployment breaks Forgejo:
 - blumeops is mirrored to GitHub
 - Manual recovery: build on gilbert, scp to indri, restart service
 - See Disaster Recovery section in P4
 ## Host Mode Runner
 The runner uses **host mode** (`ubuntu-latest:host`), meaning:
 - Jobs run directly in the runner container (no Docker/k8s pods spawned)
 - Tools must be pre-installed in the runner image
 - Stock image lacks Node.js, so `actions/checkout@v4` doesn't work
 - Solution: Build custom runner image with necessary tools (Phase 2)
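The difference between a host-mode label and a container-backed label can be sketched with a small shell helper. This is only an illustration: `label_mode` is a hypothetical function, not part of `forgejo-runner`, and it assumes labels follow the two shapes that appear in this plan (`ubuntu-latest:host` and `ubuntu-latest:docker://<image>`).

```bash
# Hypothetical helper (not a forgejo-runner command): classify a runner
# label of the form "<name>:<backend>" by how jobs would be executed.
label_mode() {
  local label="$1"
  local backend="${label#*:}"   # text after the first ':'
  if [ "$backend" = "$label" ]; then
    echo "unknown"              # no ':' in the label at all
    return 0
  fi
  case "$backend" in
    host)       echo "host" ;;    # jobs run directly in the runner container
    docker://*) echo "docker" ;;  # jobs run in a separately started container
    *)          echo "unknown" ;;
  esac
}
```

For example, `label_mode "ubuntu-latest:host"` reports `host`, which is the case where every tool a workflow needs has to be baked into the runner image.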
 ## Ansible Role Strategy
 The forgejo ansible role will follow the zot/alloy pattern:
 1. **Check binary exists** at expected path
 2. **If missing**: Fail with message pointing to CI trigger instructions
 3. **If present**: Deploy config, ensure LaunchAgent loaded
 Ansible does NOT:
 - Build the binary (that's CI's job)
 - Deploy new versions (that's CI's job)
 Ansible DOES:
 - Manage app.ini configuration (via template with secrets from 1Password)
 - Manage mcquack LaunchAgent plist
 - Ensure service is running
 - Collect logs via Alloy
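The "check binary, fail with a pointer to CI" step could look roughly like the following. This is a shell sketch of the idea only, not the Ansible role itself; the function name `check_forgejo_binary` and the message wording are assumptions.

```bash
# Hypothetical pre-flight check mirroring the role's first step:
# succeed if the binary is present and executable, otherwise fail
# with a message that points at the CI build/deploy.
check_forgejo_binary() {
  local bin="$1"
  if [ ! -x "$bin" ]; then
    echo "forgejo binary not found at $bin; trigger the CI build/deploy first" >&2
    return 1
  fi
  echo "forgejo binary present at $bin; continuing with config and LaunchAgent"
}
```

A call such as `check_forgejo_binary "$HOME/.local/bin/forgejo"` returns non-zero when the binary is missing, which is the point at which the role would stop rather than try to build anything.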
 ## Files Summary
 ### New Files
 | Path | Purpose |
 |------|---------|
 | `argocd/apps/forgejo-runner.yaml` | ArgoCD Application for runner ✅ |
 | `argocd/manifests/forgejo-runner/` | Runner k8s manifests ✅ |
 | `argocd/manifests/forgejo-runner/Dockerfile` | Custom runner image (P2) |
 | `.forgejo/workflows/build-runner.yml` | Auto-rebuild runner image (P2) |
 | `.forgejo/workflows/test.yml` | Test workflow ✅ |
 | (on forge) `eblume/forgejo/.forgejo/workflows/` | Build workflow in forgejo mirror (P3) |
 ### Modified Files
 | Path | Change |
 |------|--------|
 | `ansible/roles/forgejo/` | Complete rewrite for mcquack pattern (P4) |
 | `ansible/roles/alloy/defaults/main.yml` | Update forgejo log paths (P4) |
 | zk cards | Update forgejo, argocd, blumeops cards |
 ### Credentials Needed
 | Item | Purpose | Storage |
 |------|---------|---------|
 | Runner registration token | Runner auth to Forgejo | 1Password ✅ |
 | SSH deploy key | Runner SSH to indri (for Forgejo deploy) | 1Password + k8s secret (P3) |
 ## Related Plans
 - [P7_forgejo.md](../k8s-migration/P7_forgejo.md) - Original k8s migration plan (superseded for Forgejo itself, but SSH hostname split info still relevant)
 - [P8_woodpecker.md](../k8s-migration/P8_woodpecker.md) - Original Woodpecker plan (superseded by Forgejo Actions)
 ## Decision Log
 ### 2026-01-23: Custom runner image as Phase 2
 **Decision**: Move custom runner image work from P4 to P2
 **Rationale**:
 - Stock runner lacks Node.js, can't run `actions/checkout@v4`
 - Need working GitHub Actions before building Forgejo
 - Bootstrap manually (podman build on gilbert), then automate
 ### 2026-01-23: Forgejo Actions over Woodpecker
 **Decision**: Use Forgejo Actions instead of Woodpecker CI
 **Rationale**:
 - Native Forgejo integration (Actions is built-in)
 - GitHub Actions compatible (reuse existing actions)
 - `act` for local testing
 - One less system to deploy and maintain
 ### 2026-01-23: Keep Forgejo on indri (not k8s)
 **Decision**: Forgejo stays on indri as mcquack service, not migrated to k8s
 **Rationale**:
 - Avoid circular dependency (ArgoCD needs Forgejo to deploy Forgejo)
 - Simpler SSH handling (direct port, no k8s networking complexity)
 - Forgejo is critical infrastructure, benefits from isolation
 - Can still use Tailscale serve for external access
--- a/plans/ci-cd-bootstrap/P1_enable_actions.md
+++ b/plans/ci-cd-bootstrap/P1_enable_actions.md
@@ -1,322 +0,0 @@
 # Phase 1: Enable Forgejo Actions
 **Goal**: Configure Forgejo to support Actions workflows and deploy a runner in k8s
 **Status**: Completed (2026-01-23)
 **Prerequisites**: None (uses existing brew-based Forgejo)
 ---
 ## Current State
 - Forgejo runs via `brew services` on indri
 - Config at `/opt/homebrew/var/forgejo/custom/conf/app.ini`
 - Actions not enabled
 - No runners deployed
 ---
 ## Step 1: Enable Actions in Forgejo
 ### 1.1 Update app.ini
 SSH to indri and edit the Forgejo config:
 ```bash
 ssh indri 'vim /opt/homebrew/var/forgejo/custom/conf/app.ini'
 ```
 Add the following sections:
 ```ini
 [actions]
 ENABLED = true
 DEFAULT_ACTIONS_URL = https://code.forgejo.org
 [repository]
 ; Allow workflows to be stored in .forgejo/workflows
 DEFAULT_REPO_UNITS = repo.code,repo.issues,repo.pulls,repo.releases,repo.wiki,repo.projects,repo.packages,repo.actions
 ```
 ### 1.2 Restart Forgejo
 ```bash
 ssh indri 'brew services restart forgejo'
 ```
 ### 1.3 Verify Actions Enabled
 1. Go to https://forge.tail8d86e.ts.net
 2. Navigate to any repo → Settings → Actions
 3. Should see "Enable Repository Actions" option
 ---
 ## Step 2: Create Runner Registration Token
 ### 2.1 Generate Token in Forgejo UI
 1. Go to https://forge.tail8d86e.ts.net/admin/actions/runners
 2. Click "Create new Runner"
 3. Copy the registration token
 4. Store in 1Password (blumeops vault) as "Forgejo Runner Token"
 ### 2.2 Create k8s Secret Template
 Create `argocd/manifests/forgejo-runner/secret-token.yaml.tpl`:
 ```yaml
 # Template for op inject
 apiVersion: v1
 kind: Secret
 metadata:
  name: forgejo-runner-token
  namespace: forgejo-runner
 type: Opaque
 stringData:
  token: "op://blumeops/<runner-token-item>/token"
 ```
 ---
 ## Step 3: Deploy Runner to Kubernetes
 ### 3.1 Create ArgoCD Application
 Create `argocd/apps/forgejo-runner.yaml`:
 ```yaml
 apiVersion: argoproj.io/v1alpha1
 kind: Application
 metadata:
  name: forgejo-runner
  namespace: argocd
 spec:
  project: default
  source:
    repoURL: ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/forgejo-runner
  destination:
    server: https://kubernetes.default.svc
    namespace: forgejo-runner
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
 ```
 ### 3.2 Create Runner Manifests
 Create directory `argocd/manifests/forgejo-runner/` with:
 **kustomization.yaml**:
 ```yaml
 apiVersion: kustomize.config.k8s.io/v1beta1
 kind: Kustomization
 namespace: forgejo-runner
 resources:
  - namespace.yaml
  - deployment.yaml
  - serviceaccount.yaml
  - secret-token.yaml
 ```
 **namespace.yaml**:
 ```yaml
 apiVersion: v1
 kind: Namespace
 metadata:
  name: forgejo-runner
 ```
 **serviceaccount.yaml**:
 ```yaml
 apiVersion: v1
 kind: ServiceAccount
 metadata:
  name: forgejo-runner
  namespace: forgejo-runner
 ```
 **deployment.yaml**:
 ```yaml
 apiVersion: apps/v1
 kind: Deployment
 metadata:
  name: forgejo-runner
  namespace: forgejo-runner
 spec:
  replicas: 1
  selector:
    matchLabels:
      app: forgejo-runner
  template:
    metadata:
      labels:
        app: forgejo-runner
    spec:
      serviceAccountName: forgejo-runner
      containers:
        - name: runner
          image: code.forgejo.org/forgejo/runner:3.5.1
          env:
            - name: FORGEJO_INSTANCE_URL
              value: "https://forge.tail8d86e.ts.net"
            - name: RUNNER_NAME
              value: "k8s-runner-1"
            - name: RUNNER_TOKEN
              valueFrom:
                secretKeyRef:
                  name: forgejo-runner-token
                  key: token
          command:
            - /bin/sh
            - -c
            - |
              # Register runner if not already registered
              if [ ! -f /data/.runner ]; then
                forgejo-runner register \
                  --instance "$FORGEJO_INSTANCE_URL" \
                  --token "$RUNNER_TOKEN" \
                  --name "$RUNNER_NAME" \
                  --labels "ubuntu-latest:docker://node:20-bookworm,ubuntu-22.04:docker://ubuntu:22.04" \
                  --no-interactive
              fi
              # Start the runner daemon
              forgejo-runner daemon
          volumeMounts:
            - name: runner-data
              mountPath: /data
            - name: docker-sock
              mountPath: /var/run/docker.sock
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
      volumes:
        - name: runner-data
          emptyDir: {}
        - name: docker-sock
          hostPath:
            path: /var/run/docker.sock
            type: Socket
 ```
 **Note**: The runner needs access to Docker to run workflow jobs in containers. In minikube with docker driver, `/var/run/docker.sock` is available.
 ---
 ## Step 4: Deploy and Verify
 ### 4.1 Inject Secrets and Deploy
 ```bash
 # Inject secrets
 op inject -i argocd/manifests/forgejo-runner/secret-token.yaml.tpl \
  -o argocd/manifests/forgejo-runner/secret-token.yaml
 # Sync apps
 argocd app sync apps
 argocd app sync forgejo-runner
 ```
 ### 4.2 Verify Runner Registration
 ```bash
 # Check runner pod
 kubectl --context=minikube-indri -n forgejo-runner get pods
 # Check runner logs
 kubectl --context=minikube-indri -n forgejo-runner logs -f deployment/forgejo-runner
 # Verify in Forgejo UI
 # Go to https://forge.tail8d86e.ts.net/admin/actions/runners
 # Should see "k8s-runner-1" as online
 ```
 ---
 ## Step 5: Test with Simple Workflow
 ### 5.1 Create Test Workflow
 In the blumeops repo, create `.forgejo/workflows/test.yml`:
 ```yaml
 name: Test CI
 on:
  push:
    branches: [main]
  pull_request:
  workflow_dispatch:
 jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Hello World
        run: |
          echo "Hello from Forgejo Actions!"
          echo "Runner: ${{ runner.name }}"
          echo "Repo: ${{ github.repository }}"
 ```
 ### 5.2 Push and Verify
 ```bash
 git add .forgejo/
 git commit -m "Add test workflow for Forgejo Actions"
 git push
 ```
 Check https://forge.tail8d86e.ts.net/eblume/blumeops/actions for the workflow run.
 ---
 ## Verification Checklist
 - [x] Actions enabled in app.ini
 - [x] Forgejo restarted successfully
 - [x] Runner token stored in 1Password
 - [x] Runner deployment created in ArgoCD
 - [x] Runner pod running in k8s
 - [x] Runner shows as online in Forgejo admin
 - [x] Test workflow runs successfully
 ---
 ## Troubleshooting
 ### Runner Can't Connect to Forgejo
 The runner needs to reach `forge.tail8d86e.ts.net` from inside k8s. This should work via Tailscale operator egress (already configured for ArgoCD).
 If not working:
 ```bash
 # Test from inside k8s
 kubectl --context=minikube-indri run -it --rm curl --image=curlimages/curl -- \
  curl -v https://forge.tail8d86e.ts.net/api/v1/version
 ```
 ### Docker Socket Permission Denied
 The runner container needs to access the Docker socket. In minikube with docker driver, this should work. If permission denied:
 ```bash
 # Check socket permissions
 kubectl --context=minikube-indri -n forgejo-runner exec deployment/forgejo-runner -- ls -la /var/run/docker.sock
 ```
 May need to run runner as root or adjust security context.
 ---
 ## Next Phase
 Once the runner is working, proceed to [Phase 2: Custom Runner Image](P2_mirror_and_build.md).
--- a/plans/ci-cd-bootstrap/P2_mirror_and_build.md
+++ b/plans/ci-cd-bootstrap/P2_mirror_and_build.md
@@ -1,347 +0,0 @@
 # Phase 2: Custom Runner Image
 **Goal**: Build a custom forgejo-runner image with necessary tools, enabling standard GitHub Actions
 **Status**: Complete (2026-01-23)
 **Prerequisites**: [Phase 1](P1_enable_actions.md) complete (Actions enabled, runner deployed in host mode)
 ---
 ## Problem Statement
 The stock `code.forgejo.org/forgejo/runner:3.5.1` image lacks tools needed for standard GitHub Actions:
 - **Node.js** - Required by most actions (checkout, setup-*, etc.)
 - **Git** - For repository operations (present but minimal)
 - **Common build tools** - make, gcc, curl, jq, etc.
 In host mode, jobs run directly in the runner container, so these tools must be pre-installed.
 ### Chicken-and-Egg Problem
 We can't use `actions/checkout@v4` to build the custom runner because that action requires Node.js, which we don't have yet. Solution: Bootstrap manually, then automate.
 ---
 ## Step 1: Create Dockerfile for Custom Runner
 Create `argocd/manifests/forgejo-runner/Dockerfile`:
 ```dockerfile
 FROM code.forgejo.org/forgejo/runner:3.5.1
 # The base image is Alpine-based, so packages are installed with apk
 # Assumption: the base image may default to a non-root user; apk needs root
 USER root
 # Install tools needed for GitHub Actions and builds
 RUN apk add --no-cache \
    # Required for actions/checkout and other Node-based actions
    nodejs \
    npm \
    # Build essentials
    git \
    curl \
    wget \
    jq \
    make \
    gcc \
    g++ \
    # For container builds (if we add Docker-in-Docker later)
    ca-certificates
 # Verify Node.js is available
 RUN node --version && npm --version
 ```
 ---
 ## Step 2: Bootstrap - Build Image Manually
 Since we can't use CI yet, build the image manually on gilbert and push to zot.
 ### 2.1 Build with Podman
 ```bash
 cd ~/code/personal/blumeops/argocd/manifests/forgejo-runner
 # Build for linux/arm64 (minikube on M1 Mac)
 podman build --platform linux/arm64 -t registry.tail8d86e.ts.net/blumeops/forgejo-runner:latest .
 # Push to zot (no auth required)
 podman push registry.tail8d86e.ts.net/blumeops/forgejo-runner:latest
 ```
 ### 2.2 Verify Image in Registry
 ```bash
 curl -s https://registry.tail8d86e.ts.net/v2/blumeops/forgejo-runner/tags/list | jq .
 ```
 ---
 ## Step 3: Update Runner Deployment
 ### 3.1 Update deployment.yaml
 Change the image from stock to custom:
 ```yaml
 # Before
 image: code.forgejo.org/forgejo/runner:3.5.1
 # After
 image: registry.tail8d86e.ts.net/blumeops/forgejo-runner:latest
 ```
 ### 3.2 Update kustomization.yaml
 No change to `resources` is needed. The Dockerfile is not a Kubernetes resource, so it must not be listed in `kustomization.yaml`; a comment there is enough:
 ```yaml
 # Note: Dockerfile is for building, not k8s deployment
 # It lives here for co-location with the runner manifests
 ```
 ### 3.3 Sync Deployment
 ```bash
 argocd app sync forgejo-runner
 # Verify new image is running
 kubectl --context=minikube-indri -n forgejo-runner get pods -o jsonpath='{.items[*].spec.containers[*].image}'
 ```
 ---
 ## Step 4: Test with Real GitHub Action
 Now that we have Node.js, test with `actions/checkout@v4`.
 ### 4.1 Update Test Workflow
 Update `.forgejo/workflows/test.yml`:
 ```yaml
 name: Test CI
 on:
  push:
    branches: [main]
  pull_request:
  workflow_dispatch:
 jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Verify tools
        run: |
          echo "Node.js: $(node --version)"
          echo "npm: $(npm --version)"
          echo "Git: $(git --version)"
          echo "Make: $(make --version | head -1)"
      - name: Show repo info
        run: |
          echo "Repository: ${{ github.repository }}"
          echo "Branch: ${{ github.ref_name }}"
          ls -la
 ```
 ### 4.2 Push and Verify
 ```bash
 git add .forgejo/workflows/test.yml
 git commit -m "Test checkout action with custom runner"
 git push
 ```
 Check https://forge.tail8d86e.ts.net/eblume/blumeops/actions - should see successful run with `actions/checkout@v4`.
 ---
 ## Step 5: Create Auto-Build Workflow for Runner
 Now that Actions work properly, create a workflow to rebuild the runner image automatically.
 ### 5.1 Create Build Workflow
 Create `.forgejo/workflows/build-runner.yml`:
 ```yaml
 name: Build Runner Image
 on:
  push:
    paths:
      - 'argocd/manifests/forgejo-runner/Dockerfile'
      - '.forgejo/workflows/build-runner.yml'
  workflow_dispatch:
 env:
  REGISTRY: registry.tail8d86e.ts.net
  IMAGE_NAME: blumeops/forgejo-runner
 jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Build image
        run: |
          cd argocd/manifests/forgejo-runner
          # Use docker build (available in runner container)
          # Note: This builds for the runner's native arch
          docker build -t ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }} .
          docker tag ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }} \
                     ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
      - name: Push to registry
        run: |
          # Zot has no auth, just push
          docker push ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          docker push ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
      - name: Verify push
        run: |
          curl -sf "https://${{ env.REGISTRY }}/v2/${{ env.IMAGE_NAME }}/tags/list" | jq .
          echo "Image pushed: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}"
 ```
 ### 5.2 Note on Docker-in-Docker
 The runner runs in host mode, so we need Docker CLI available. Options:
 1. **Add Docker CLI to the custom image** (see Dockerfile update below)
 2. **Mount Docker socket from minikube** (requires deployment change)
 3. **Use Podman instead** (rootless, no socket needed)
 For now, we'll add Docker CLI to the image and mount the socket.
 ### 5.3 Update Dockerfile for Docker Builds
 ```dockerfile
 FROM code.forgejo.org/forgejo/runner:3.5.1
 # Assumption: the Alpine-based base image may default to a non-root user
 USER root
 RUN apk add --no-cache \
    nodejs \
    npm \
    git \
    curl \
    wget \
    jq \
    make \
    gcc \
    g++ \
    ca-certificates \
    # Docker CLI for building container images
    docker-cli
 RUN node --version && npm --version && docker --version
 ```
 ### 5.4 Update Deployment for Docker Socket
 Add Docker socket mount to `deployment.yaml`:
 ```yaml
 volumeMounts:
  - name: runner-data
    mountPath: /data
  - name: runner-config
    mountPath: /config
  - name: docker-sock
    mountPath: /var/run/docker.sock
 volumes:
  - name: runner-data
    emptyDir: {}
  - name: runner-config
    configMap:
      name: forgejo-runner-config
  - name: docker-sock
    hostPath:
      path: /var/run/docker.sock
      type: Socket
 ```
 ---
 ## Step 6: Verification
 ### 6.1 Manual Image Build Works
 ```bash
 # On gilbert
 podman build --platform linux/arm64 -t registry.tail8d86e.ts.net/blumeops/forgejo-runner:test .
 podman push registry.tail8d86e.ts.net/blumeops/forgejo-runner:test
 ```
 ### 6.2 Runner Uses Custom Image
 ```bash
 kubectl --context=minikube-indri -n forgejo-runner get pods -o jsonpath='{.items[*].spec.containers[*].image}'
 # Should show: registry.tail8d86e.ts.net/blumeops/forgejo-runner:latest
 ```
 ### 6.3 GitHub Actions Work
 - `actions/checkout@v4` succeeds
 - Test workflow shows Node.js, npm, git versions
 ### 6.4 Auto-Build Workflow Works
 Push a change to the Dockerfile and verify:
 1. Workflow triggers
 2. Image builds successfully
 3. Image pushed to zot
 ---
 ## Verification Checklist
 - [x] Dockerfile created for custom runner (Alpine-based with apk)
 - [x] Image built manually on gilbert (podman build)
 - [x] Image pushed to zot registry
 - [x] Runner deployment updated to use custom image
 - [x] Runner pod running with new image
 - [x] `actions/checkout@v4` works in test workflow
 - [ ] Auto-build workflow created (deferred - needs Docker socket)
 - [ ] Docker socket mounted (for container builds)
 - [ ] Auto-build workflow successfully rebuilds runner
 ---
 ## Troubleshooting
 ### Image Pull Fails in Minikube
 Minikube needs to be able to pull from zot. Check registry mirror config:
 ```bash
 ssh indri 'minikube ssh -- cat /etc/containerd/certs.d/registry.tail8d86e.ts.net/hosts.toml'
 ```
 ### Docker Build Fails in Workflow
 If Docker socket mount doesn't work:
 1. Check socket exists in minikube: `minikube ssh -- ls -la /var/run/docker.sock`
 2. Check permissions: runner may need to be in docker group
 3. Alternative: Use `podman` (rootless) instead of Docker
 ### Node.js Actions Still Fail
 Ensure the runner pod restarted after image update:
 ```bash
 kubectl --context=minikube-indri -n forgejo-runner rollout restart deployment/forgejo-runner
 kubectl --context=minikube-indri -n forgejo-runner logs -f deployment/forgejo-runner
 ```
 ---
 ## Next Phase
 Once the custom runner is working with auto-build, proceed to [Phase 3: Mirror Forgejo & Build](P3_mirror_forgejo.md) to set up Forgejo source builds.
--- a/plans/ci-cd-bootstrap/P3_mirror_forgejo.md
+++ b/plans/ci-cd-bootstrap/P3_mirror_forgejo.md
@@ -1,349 +0,0 @@
 # Phase 3: Mirror Forgejo & Build from Source
 **Goal**: Mirror upstream Forgejo to forge and create a workflow that builds it for macOS ARM64
 **Status**: Planning
 **Prerequisites**: [Phase 2](P2_mirror_and_build.md) complete (custom runner image with Node.js/tools)
 ---
 ## Problem Statement
 We want to build Forgejo from source to:
 1. Have full control over the binary running on indri
 2. Enable self-deployment via CI
 3. Ensure proper macOS DNS resolution (requires CGO_ENABLED=1)
 ### The Cross-Compilation Challenge
 The runner runs in a Linux container (k8s on indri), but the target is macOS ARM64 (indri itself).
 **Options**:
 | Option | Pros | Cons |
 |--------|------|------|
 | A. Cross-compile CGO_ENABLED=0 | Simple, no special toolchain | Breaks Tailscale MagicDNS resolution |
 | B. Cross-compile CGO_ENABLED=1 | Proper DNS | Needs OSX cross-compiler (osxcross), complex |
 | C. Build on gilbert manually | Works now, simple | Not automated, manual step |
 | D. Native macOS runner on indri | Full native build | Runner outside k8s, different architecture |
 | E. Hybrid: build on gilbert, deploy via CI | Uses existing tools | Partial automation |
 **Recommendation**: Start with Option C/E (manual build on gilbert, CI just deploys), then consider Option D if we want full automation.
 ---
 ## Step 1: Mirror Upstream Forgejo
 ### 1.1 User Action: Create Mirror on Forge
 **Manual step** (hairpinning doesn't work from indri):
 1. Go to https://forge.tail8d86e.ts.net
 2. Click "+" → "New Migration"
 3. Select "Gitea" as clone source
 4. URL: `https://codeberg.org/forgejo/forgejo.git`
 5. Repository name: `forgejo`
 6. Check "This repository will be a mirror"
 7. Click "Migrate Repository"
 ### 1.2 Clone Mirror Locally
 ```bash
 git clone ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/forgejo.git ~/code/3rd/forgejo
 cd ~/code/3rd/forgejo
 ```
 ---
 ## Step 2: Understand Forgejo Build Process
 ### 2.1 Build Requirements
 From Forgejo's `Makefile` and docs:
 - **Go**: 1.23+ (check `go.mod` for exact version)
 - **Node.js**: 20+ (for frontend)
 - **Make**: GNU Make
 - **Git**: For version embedding
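Since the exact Go version comes from `go.mod`, it can be read from the file rather than guessed. The helper below is a minimal sketch; `go_mod_version` is a hypothetical name, and it assumes the file contains a standard `go <version>` directive on its own line.

```bash
# Hypothetical helper: print the version from the "go" directive of a
# go.mod file. Lines such as "toolchain go1.x.y" are ignored because
# their first field is not exactly "go".
go_mod_version() {
  awk '$1 == "go" { print $2; exit }' "$1"
}
```

One possible use is `mise use "go@$(go_mod_version go.mod)"`, so the local toolchain tracks whatever the mirrored source currently requires.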
 ### 2.2 Build Commands
 ```bash
 # Install frontend dependencies and build
 make deps-frontend
 make frontend
 # Build backend (with CGO for proper DNS on macOS)
 CGO_ENABLED=1 TAGS="bindata sqlite sqlite_unlock_notify" make backend
 # Or all-in-one
 CGO_ENABLED=1 TAGS="bindata sqlite sqlite_unlock_notify" make build
 ```
 ### 2.3 Output
 Binary at `gitea` (yes, the binary is still named `gitea` for compatibility).
 ---
 ## Step 3: Build on Gilbert (Manual Bootstrap)
 For the initial bootstrap, build on gilbert (macOS ARM64 native).
 ### 3.1 Setup Build Environment
 ```bash
 cd ~/code/3rd/forgejo
 mise use go@1.23 node@20
 # Verify tools
 go version
 node --version
 make --version
 ```
 ### 3.2 Build
 ```bash
 # Clean build
 make clean
 # Build frontend
 make deps-frontend
 make frontend
 # Build backend with CGO (important for macOS DNS!)
 CGO_ENABLED=1 TAGS="bindata sqlite sqlite_unlock_notify" make backend
 # Verify binary
 ./gitea --version
 file gitea  # Should show: Mach-O 64-bit executable arm64
 ```
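The manual `file gitea` check can be turned into a scriptable guard. This is a sketch under the assumption that `file` describes the binary with the words `Mach-O` and `arm64`, as in the comment above; `is_macos_arm64` is a hypothetical helper name.

```bash
# Hypothetical guard: succeed only if the given `file` output describes
# a Mach-O arm64 binary (for example, not a Linux ELF build).
is_macos_arm64() {
  case "$1" in
    *Mach-O*arm64*) return 0 ;;
    *)              return 1 ;;
  esac
}
```

Usage might look like `is_macos_arm64 "$(file gitea)" || echo "wrong target platform" >&2` before copying anything to indri.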
 ### 3.3 Deploy to Indri
 ```bash
 # Copy binary
 scp gitea indri:~/.local/bin/forgejo-new
 # Verify on indri
 ssh indri '~/.local/bin/forgejo-new --version'
 ```
 ---
 ## Step 4: Create Deploy Workflow (Option E)
 Since cross-compilation is complex, use a hybrid approach:
 1. Build on gilbert (manual trigger or pre-built)
 2. CI workflow fetches and deploys
 ### 4.1 SSH Deploy Key for Runner
 The runner needs SSH access to indri to deploy the binary.
 **Generate key on gilbert**:
 ```bash
 ssh-keygen -t ed25519 -C "forgejo-runner-deploy" -f ~/.ssh/forgejo-runner-deploy -N ""
 ```
 **Add public key to indri's authorized_keys**:
 ```bash
 cat ~/.ssh/forgejo-runner-deploy.pub | ssh indri 'cat >> ~/.ssh/authorized_keys'
 ```
 **Store private key in 1Password** (blumeops vault) as "Forgejo Runner Deploy Key"
 ### 4.2 Create k8s Secret
 Create `argocd/manifests/forgejo-runner/secret-ssh.yaml.tpl`:
 ```yaml
 apiVersion: v1
 kind: Secret
 metadata:
  name: forgejo-runner-ssh
  namespace: forgejo-runner
 type: Opaque
 stringData:
  id_ed25519: |
    op://blumeops/<deploy-key-item>/private-key
  known_hosts: |
    # Get with: ssh-keyscan indri.tail8d86e.ts.net 2>/dev/null | grep ed25519
    indri.tail8d86e.ts.net ssh-ed25519 AAAAC3...
 ```
 ### 4.3 Update Deployment for SSH
 Add SSH secret mount to `deployment.yaml`:
 ```yaml
 volumeMounts:
  - name: ssh-key
    mountPath: /root/.ssh
    readOnly: true
 volumes:
  - name: ssh-key
    secret:
      secretName: forgejo-runner-ssh
      defaultMode: 0600
 ```
 ### 4.4 Create Deploy-Only Workflow
 Create `.forgejo/workflows/deploy-forgejo.yml` in blumeops:
 ```yaml
 name: Deploy Forgejo
 on:
  workflow_dispatch:
    inputs:
      version:
        description: 'Version to deploy (tag or commit)'
        required: true
        default: 'v10.0.0'
 jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Deploy to indri
        env:
          VERSION: ${{ github.event.inputs.version }}
        run: |
          # SSH config
          mkdir -p ~/.ssh
          cp /root/.ssh/id_ed25519 ~/.ssh/
          cp /root/.ssh/known_hosts ~/.ssh/
          chmod 600 ~/.ssh/id_ed25519
          # Deploy script
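          # Pass VERSION explicitly: the quoted heredoc ('EOF') is not expanded locally
          ssh erichblume@indri.tail8d86e.ts.net "VERSION='${VERSION}' bash -s" << 'EOF'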
            set -e
            cd ~/.local/bin
            # Verify the new binary exists and runs
            if [ ! -f forgejo-new ]; then
              echo "ERROR: forgejo-new not found. Build on gilbert first:"
              echo "  cd ~/code/3rd/forgejo && git checkout $VERSION"
              echo "  CGO_ENABLED=1 TAGS='bindata sqlite sqlite_unlock_notify' make build"
              echo "  scp gitea indri:~/.local/bin/forgejo-new"
              exit 1
            fi
            ./forgejo-new --version
            # Stop current service
            launchctl unload ~/Library/LaunchAgents/mcquack.eblume.forgejo.plist 2>/dev/null || true
            # Atomic swap
            mv forgejo forgejo-old 2>/dev/null || true
            mv forgejo-new forgejo
            # Start new service
            launchctl load ~/Library/LaunchAgents/mcquack.eblume.forgejo.plist
            # Verify it's running
            sleep 5
            curl -sf http://localhost:3001/api/v1/version || exit 1
            echo "Deploy successful!"
            ./forgejo --version
          EOF
 ```
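The workflow above keeps `forgejo-old` but never restores it when the health check fails. A rollback could be sketched as follows. The function names `wait_healthy` and `swap_with_rollback`, the `forgejo-failed` file name, and the `HEALTH_ATTEMPTS` / `HEALTH_DELAY` variables are all assumptions made for illustration; the service restart is left as a comment because it depends on the LaunchAgent setup.

```bash
# Retry a health-check command up to N times, pausing between attempts.
wait_healthy() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Swap forgejo-new into place inside directory $1, then run the health
# check given by the remaining arguments. On failure, restore forgejo-old.
swap_with_rollback() {
  local dir="$1"
  shift
  if [ ! -f "$dir/forgejo-new" ]; then
    echo "forgejo-new missing in $dir" >&2
    return 1
  fi
  if [ -f "$dir/forgejo" ]; then
    mv "$dir/forgejo" "$dir/forgejo-old"
  fi
  mv "$dir/forgejo-new" "$dir/forgejo"
  # (restart the service here, e.g. the launchctl unload/load pair)
  if wait_healthy "${HEALTH_ATTEMPTS:-5}" "${HEALTH_DELAY:-2}" "$@"; then
    echo "deploy ok"
    return 0
  fi
  echo "health check failed" >&2
  if [ -f "$dir/forgejo-old" ]; then
    mv "$dir/forgejo" "$dir/forgejo-failed"   # keep the bad binary for inspection
    mv "$dir/forgejo-old" "$dir/forgejo"
    echo "rolled back to previous binary" >&2
    # (restart the service again here)
  fi
  return 1
}
```

On indri this might be invoked as `swap_with_rollback ~/.local/bin curl -sf http://localhost:3001/api/v1/version`. Retrying the check instead of a fixed `sleep 5` gives a slow start a few chances before the previous binary is restored.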
 ---
 ## Future: Full CI Build (Option D)
 If we want full automation, consider running a native macOS runner on indri:
 ### Native Runner on Indri
 ```bash
 # Install forgejo-runner on indri via mise
 ssh indri 'mise use forgejo-runner'
 # Register as a macOS runner
 ssh indri 'forgejo-runner register \
  --instance https://forge.tail8d86e.ts.net \
  --token "$TOKEN" \
  --name "indri-native" \
  --labels "macos-arm64:host" \
  --no-interactive'
 # Create LaunchAgent for runner
 # (similar to other mcquack services)
 ```
 Then workflow uses:
 ```yaml
 runs-on: macos-arm64
 ```
 This enables full native builds in CI. Document in a future phase if needed.
 ---
 ## Verification Checklist
 - [ ] Forgejo mirrored to forge
 - [ ] Mirror cloned to ~/code/3rd/forgejo
 - [ ] Build succeeds on gilbert
 - [ ] Binary is valid macOS ARM64 executable
 - [ ] Binary deployed to indri ~/.local/bin/
 - [ ] SSH deploy key created and stored in 1Password
 - [ ] Deploy key added to indri authorized_keys
 - [ ] (Optional) k8s SSH secret created
 - [ ] (Optional) Deploy workflow created
 ---
 ## Troubleshooting
 ### Build Fails: Node.js Version
 ```
 error: engine "node" is incompatible
 ```
 Update Node.js: `mise use node@20`
 ### Build Fails: Go Version
 ```
 go: go.mod requires go >= 1.23
 ```
 Update Go: `mise use go@1.23`
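 Both failures above come down to a toolchain older than the build requires. A small pre-flight check could catch that before `make build`. This is only a sketch: `version_ge` is a hypothetical helper and it understands plain dotted numeric versions only:
 ```bash
 # Hypothetical helper: succeed if dotted version $1 >= $2 (numeric fields only).
 version_ge() {
   have=$1 want=$2
   while [ -n "$have" ] || [ -n "$want" ]; do
     h=${have%%.*} w=${want%%.*}
     h=${h:-0}; w=${w:-0}
     if [ "$h" -gt "$w" ]; then return 0; fi
     if [ "$h" -lt "$w" ]; then return 1; fi
     # Drop the field just compared; stop when no dot remains
     case $have in *.*) have=${have#*.} ;; *) have= ;; esac
     case $want in *.*) want=${want#*.} ;; *) want= ;; esac
   done
   return 0
 }
 # Usage:
 #   version_ge "$(go env GOVERSION | sed 's/^go//')" 1.23 || echo "Go is too old"
 ```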
 ### Binary Crashes on indri
 Check if CGO was enabled:
 ```bash
 # If built without CGO, DNS resolution may fail
 ./forgejo --version  # Should work
 ./forgejo web        # May fail to resolve Tailscale hostnames
 ```
 Rebuild with `CGO_ENABLED=1`.
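 Whether a binary was built with CGO can be read from its embedded build settings via `go version -m`, which recent Go toolchains record. A sketch of that check, with `built_with_cgo` as a hypothetical helper that filters the command's output:
 ```bash
 # Hypothetical helper: read `go version -m <binary>` output on stdin and
 # succeed only if the build settings record CGO_ENABLED=1.
 built_with_cgo() {
   grep -Eq '^[[:space:]]*build[[:space:]]+CGO_ENABLED=1$'
 }
 # Usage (requires the Go toolchain on the machine doing the check):
 #   go version -m ./forgejo | built_with_cgo || echo "rebuild with CGO_ENABLED=1"
 ```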
 ### SSH Deploy Fails
 Check runner has SSH access:
 ```bash
 # Test from inside runner pod
 kubectl --context=minikube-indri -n forgejo-runner exec deployment/forgejo-runner -- \
  ssh -i /root/.ssh/id_ed25519 erichblume@indri.tail8d86e.ts.net 'echo ok'
 ```
 ---
 ## Next Phase
 Once Forgejo is building and deploying successfully, proceed to [Phase 4: Self-Deploy](P4_self_deploy.md) for the full mcquack transition.
--- a/plans/ci-cd-bootstrap/P4_self_deploy.md
+++ b/plans/ci-cd-bootstrap/P4_self_deploy.md
@ -1,409 +0,0 @@
 # Phase 4: Self-Deploy & Transition to mcquack
 **Goal**: Complete the bootstrap - Forgejo deploys itself, transition from brew to mcquack LaunchAgent
 **Status**: Planning
 **Prerequisites**: [Phase 3](P3_mirror_forgejo.md) complete (Forgejo builds and deploys to indri)
 ---
 ## Overview
 This phase completes the bootstrap:
 1. First successful CI deploy creates the binary
 2. Transition from brew service to mcquack LaunchAgent
 3. Update ansible role to mcquack pattern
 4. Remove brew forgejo
 After this phase, Forgejo builds and deploys itself on every tagged release.
 ---
 ## Step 1: Prepare indri for mcquack
 ### 1.1 Create Directory Structure
 ```bash
 ssh indri << 'EOF'
  mkdir -p ~/.local/bin
  mkdir -p ~/.config/forgejo
  mkdir -p ~/Library/Logs
 EOF
 ```
 ### 1.2 Prepare Data Directory
 The existing data is at `/opt/homebrew/var/forgejo`. It can either stay there (simpler) or be migrated to `~/forgejo`.
 **Option A: Keep existing path** (recommended for simplicity)
 - Data stays at `/opt/homebrew/var/forgejo`
 - Binary moves to `~/.local/bin/forgejo`
 **Option B: Full migration**
 - Move data to `~/forgejo`
 - Requires updating app.ini paths
 For this plan, we'll use Option A.
 ---
 ## Step 2: First CI Deploy
 ### 2.1 Trigger Build with Deploy
 1. Go to https://forge.tail8d86e.ts.net/eblume/forgejo/actions
 2. Select "Build Forgejo" workflow
 3. Click "Run workflow"
 4. Set deploy=true
 5. Monitor the run
 ### 2.2 Verify Binary Deployed
 ```bash
 ssh indri 'ls -la ~/.local/bin/forgejo && ~/.local/bin/forgejo --version'
 ```
 At this point:
 - New binary is at `~/.local/bin/forgejo`
 - Brew forgejo is still running
 - LaunchAgent doesn't exist yet
 ---
 ## Step 3: Create mcquack LaunchAgent
 ### 3.1 Create Plist Manually (One-Time Bootstrap)
 ```bash
 ssh indri << 'EOF'
 cat > ~/Library/LaunchAgents/mcquack.eblume.forgejo.plist << 'PLIST'
 <?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
 <plist version="1.0">
 <dict>
  <key>Label</key>
  <string>mcquack.eblume.forgejo</string>
  <key>ProgramArguments</key>
  <array>
    <string>/Users/erichblume/.local/bin/forgejo</string>
    <string>web</string>
    <string>--config</string>
    <string>/opt/homebrew/var/forgejo/custom/conf/app.ini</string>
    <string>--work-path</string>
    <string>/opt/homebrew/var/forgejo</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
  <key>StandardOutPath</key>
  <string>/Users/erichblume/Library/Logs/mcquack.forgejo.out.log</string>
  <key>StandardErrorPath</key>
  <string>/Users/erichblume/Library/Logs/mcquack.forgejo.err.log</string>
  <key>EnvironmentVariables</key>
  <dict>
    <key>HOME</key>
    <string>/Users/erichblume</string>
    <key>USER</key>
    <string>erichblume</string>
  </dict>
 </dict>
 </plist>
 PLIST
 EOF
 ```
 ---
 ## Step 4: Cutover from Brew to mcquack
 ### 4.1 Stop Brew Service
 ```bash
 ssh indri 'brew services stop forgejo'
 ```
 ### 4.2 Start mcquack Service
 ```bash
 ssh indri 'launchctl load ~/Library/LaunchAgents/mcquack.eblume.forgejo.plist'
 ```
 ### 4.3 Verify Service Running
 ```bash
 # Check process
 ssh indri 'launchctl list | grep forgejo'
 # Check logs
 ssh indri 'tail -20 ~/Library/Logs/mcquack.forgejo.err.log'
 # Check HTTP
 curl -s https://forge.tail8d86e.ts.net/api/v1/version
 ```
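 A freshly loaded LaunchAgent can take a few seconds to bind its port, so a single check may fail spuriously. A retry loop is one way to handle that. The sketch below uses a hypothetical helper named `wait_for`:
 ```bash
 # Hypothetical helper: retry a command up to $1 times, sleeping $2 seconds
 # between attempts; succeeds as soon as the command does.
 wait_for() {
   attempts=$1 delay=$2
   shift 2
   i=1
   while [ "$i" -le "$attempts" ]; do
     if "$@"; then return 0; fi
     # Only sleep when another attempt is coming
     if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
     i=$((i + 1))
   done
   return 1
 }
 # Usage:
 #   wait_for 10 2 curl -sf https://forge.tail8d86e.ts.net/api/v1/version
 ```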
 ### 4.4 Verify Git Operations
 ```bash
 # SSH test
 ssh -T forgejo@forge.tail8d86e.ts.net
 # Clone test
 git clone ssh://forgejo@forge.tail8d86e.ts.net/eblume/blumeops.git /tmp/test-clone
 rm -rf /tmp/test-clone
 ```
 ---
 ## Step 5: Update Ansible Role
 ### 5.1 Rewrite forgejo Role
 Replace `ansible/roles/forgejo/tasks/main.yml`:
 ```yaml
 ---
 # Forgejo is built from source via CI and deployed automatically.
 # This role manages the configuration and LaunchAgent only.
 #
 # BINARY DEPLOYMENT:
 # The binary at ~/.local/bin/forgejo is deployed by Forgejo Actions CI.
 # If missing, trigger a build at:
 #   https://forge.tail8d86e.ts.net/eblume/forgejo/actions
 #
 # CONFIGURATION:
 # app.ini at /opt/homebrew/var/forgejo/custom/conf/app.ini contains secrets
 # and is NOT managed by ansible. It is backed up by borgmatic.
 - name: Verify forgejo binary exists
  ansible.builtin.stat:
    path: "{{ forgejo_binary }}"
  register: forgejo_binary_stat
 - name: Fail if forgejo binary not found
  ansible.builtin.fail:
    msg: |
      Forgejo binary not found at {{ forgejo_binary }}.
      The binary is deployed by Forgejo Actions CI. To build and deploy:
      1. Go to https://forge.tail8d86e.ts.net/eblume/forgejo/actions
      2. Select "Build Forgejo" workflow
      3. Click "Run workflow" with deploy=true
      Alternatively, build manually on gilbert and scp to indri.
  when: not forgejo_binary_stat.stat.exists
 - name: Check forgejo config exists
  ansible.builtin.stat:
    path: "{{ forgejo_config }}"
  register: forgejo_config_stat
 - name: Fail if forgejo config is missing
  ansible.builtin.fail:
    msg: |
      Forgejo config not found at {{ forgejo_config }}
      This file contains secrets and is not managed by ansible.
      To restore from backup, run:
        borgmatic --config ~/.config/borgmatic/config.yaml extract --archive latest \
        --path {{ forgejo_config }}
  when: not forgejo_config_stat.stat.exists
 - name: Deploy forgejo LaunchAgent plist
  ansible.builtin.template:
    src: forgejo.plist.j2
    dest: ~/Library/LaunchAgents/mcquack.eblume.forgejo.plist
    mode: '0644'
  notify: Restart forgejo
 - name: Check if forgejo LaunchAgent is loaded
  ansible.builtin.command: launchctl list mcquack.eblume.forgejo
  register: forgejo_launchctl_check
  changed_when: false
  failed_when: false
 - name: Load forgejo LaunchAgent if not loaded
  ansible.builtin.command: launchctl load ~/Library/LaunchAgents/mcquack.eblume.forgejo.plist
  when: forgejo_launchctl_check.rc != 0
  changed_when: true
  failed_when: false
 ```
 ### 5.2 Create defaults/main.yml
 ```yaml
 ---
 # Forgejo binary and paths
 forgejo_binary: /Users/erichblume/.local/bin/forgejo
 forgejo_work_path: /opt/homebrew/var/forgejo
 forgejo_config: "{{ forgejo_work_path }}/custom/conf/app.ini"
 forgejo_log_dir: /Users/erichblume/Library/Logs
 # HTTP and SSH ports (must match app.ini)
 forgejo_http_port: 3001
 forgejo_ssh_port: 2200
 ```
 ### 5.3 Create templates/forgejo.plist.j2
 ```xml
 <?xml version="1.0" encoding="UTF-8"?>
 <!-- {{ ansible_managed }} -->
 <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
 <plist version="1.0">
 <dict>
  <key>Label</key>
  <string>mcquack.eblume.forgejo</string>
  <key>ProgramArguments</key>
  <array>
    <string>{{ forgejo_binary }}</string>
    <string>web</string>
    <string>--config</string>
    <string>{{ forgejo_config }}</string>
    <string>--work-path</string>
    <string>{{ forgejo_work_path }}</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
  <key>StandardOutPath</key>
  <string>{{ forgejo_log_dir }}/mcquack.forgejo.out.log</string>
  <key>StandardErrorPath</key>
  <string>{{ forgejo_log_dir }}/mcquack.forgejo.err.log</string>
  <key>EnvironmentVariables</key>
  <dict>
    <key>HOME</key>
    <string>/Users/erichblume</string>
    <key>USER</key>
    <string>erichblume</string>
  </dict>
 </dict>
 </plist>
 ```
 ### 5.4 Update handlers/main.yml
 ```yaml
 ---
 - name: Restart forgejo
  ansible.builtin.shell: |
    launchctl unload ~/Library/LaunchAgents/mcquack.eblume.forgejo.plist 2>/dev/null || true
    launchctl load ~/Library/LaunchAgents/mcquack.eblume.forgejo.plist
  changed_when: true
 ```
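 `launchctl load` and `unload` are legacy subcommands. Current macOS releases also accept service-target forms such as `launchctl kickstart -k gui/<uid>/<label>`, which restarts an already loaded service in one step. A sketch of building that target, assuming the agent runs in the logged-in user's GUI domain (`launchd_target` is a hypothetical helper):
 ```bash
 # Hypothetical helper: build the launchd service target for a per-user agent.
 # $1 is the label; $2 is the numeric uid (defaults to the current user).
 launchd_target() {
   label=$1
   uid=${2:-$(id -u)}
   printf 'gui/%s/%s\n' "$uid" "$label"
 }
 # Usage on indri:
 #   launchctl kickstart -k "$(launchd_target mcquack.eblume.forgejo)"
 ```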
 ---
 ## Step 6: Update Alloy Log Collection
 In `ansible/roles/alloy/defaults/main.yml`, move the forgejo log entries from the brew list to the mcquack list:
 ```yaml
 alloy_brew_logs:
  # Remove forgejo from here
  - path: /opt/homebrew/var/log/tailscaled.log
    service: tailscale
    stream: stdout
 alloy_mcquack_logs:
  # ... existing entries ...
  - path: /Users/erichblume/Library/Logs/mcquack.forgejo.out.log
    service: forgejo
    stream: stdout
  - path: /Users/erichblume/Library/Logs/mcquack.forgejo.err.log
    service: forgejo
    stream: stderr
 ```
 ---
 ## Step 7: Remove Brew Forgejo
 ### 7.1 Uninstall Brew Package
 ```bash
 ssh indri 'brew uninstall forgejo'
 ```
 ### 7.2 Remove Old Logs
 ```bash
 ssh indri 'rm -f /opt/homebrew/var/log/forgejo.log'
 ```
 ---
 ## Step 8: Run Ansible
 ```bash
 mise run provision-indri -- --tags forgejo,alloy
 ```
 ---
 ## Disaster Recovery
 ### If CI Deploy Breaks Forgejo
 1. **Build manually on gilbert**:
   ```bash
   cd ~/code/3rd/forgejo
   git pull
   mise use go node
   CGO_ENABLED=1 TAGS="bindata sqlite sqlite_unlock_notify" make build
   scp gitea indri:~/.local/bin/forgejo
   ```
 2. **Restart service**:
   ```bash
   ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.forgejo.plist; launchctl load ~/Library/LaunchAgents/mcquack.eblume.forgejo.plist'
   ```
 3. **Verify**:
   ```bash
   curl https://forge.tail8d86e.ts.net/api/v1/version
   ```
 ### If Forgejo Won't Start
 1. Check logs: `ssh indri 'tail -100 ~/Library/Logs/mcquack.forgejo.err.log'`
 2. Check binary: `ssh indri '~/.local/bin/forgejo --version'`
 3. Check config: `ssh indri 'head -50 /opt/homebrew/var/forgejo/custom/conf/app.ini'` (the file contains secrets, so avoid pasting the output anywhere)
 4. Try running manually: `ssh indri '~/.local/bin/forgejo web --config /opt/homebrew/var/forgejo/custom/conf/app.ini --work-path /opt/homebrew/var/forgejo'`
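 With `KeepAlive` set, a crashing binary stays loaded in launchd while having no running process, which a plain `launchctl list | grep forgejo` does not make obvious. A sketch for telling the two apart, assuming the `"PID" = <n>;` line that `launchctl list <label>` prints for a running job (`service_pid` is a hypothetical helper):
 ```bash
 # Hypothetical helper: read `launchctl list <label>` output on stdin and print
 # the PID if the job currently has a running process; fail otherwise.
 service_pid() {
   pid=$(sed -n 's/^[[:space:]]*"PID" = \([0-9][0-9]*\);[[:space:]]*$/\1/p')
   [ -n "$pid" ] && printf '%s\n' "$pid"
 }
 # Usage:
 #   ssh indri 'launchctl list mcquack.eblume.forgejo' | service_pid || echo "loaded but not running"
 ```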
 ### Switch ArgoCD to GitHub (Nuclear Option)
 If Forgejo is down and you need to deploy fixes:
 ```bash
 argocd repo add https://github.com/eblume/blumeops.git --username eblume --password "$GITHUB_PAT"
 argocd app set apps --repo https://github.com/eblume/blumeops.git
 argocd app sync apps
 ```
 After recovery, switch back to Forgejo.
 ---
 ## Verification Checklist
 - [ ] CI deploy completed successfully
 - [ ] Binary at `~/.local/bin/forgejo`
 - [ ] mcquack LaunchAgent created
 - [ ] Brew service stopped
 - [ ] mcquack service started
 - [ ] HTTP works (`curl https://forge.tail8d86e.ts.net/api/v1/version`)
 - [ ] SSH works (`ssh -T forgejo@forge.tail8d86e.ts.net`)
 - [ ] Git clone/push works
 - [ ] Ansible role updated
 - [ ] Alloy logs updated
 - [ ] Brew package uninstalled
 - [ ] `mise run provision-indri` succeeds
 ---
 ## Next Phase
 After bootstrap is complete, proceed to [Phase 5: Container Builds](P5_container_builds.md) to set up container image building for ArgoCD.
--- a/plans/ci-cd-bootstrap/P5_container_builds.md
+++ b/plans/ci-cd-bootstrap/P5_container_builds.md
@ -1,505 +0,0 @@
 # Phase 5: Container Image Builds
 **Goal**: Set up CI workflows to build custom container images and push to zot registry
 **Status**: Planning
 **Prerequisites**: [Phase 4](P4_self_deploy.md) complete (Forgejo self-deploying, Actions working)
 ---
 ## Overview
 With Forgejo Actions operational (including custom runner from P2), we can now build container images for:
 - Custom devpi with pre-installed plugins
 - Any other custom images needed for k8s services
 - Release artifacts for Python packages
 **Note**: The custom runner image build is covered in [Phase 2](P2_mirror_and_build.md). This phase focuses on application container builds.
 ---
 ## Use Case 1: devpi Custom Image
 ### Current State
 devpi runs from `registry.tail8d86e.ts.net/blumeops/devpi:latest`, built manually:
 - Base image: python
 - Adds: devpi-server, devpi-web
 - Startup script for auto-initialization
 ### Goal
 Automate builds triggered by:
 - Push to the devpi manifests in the blumeops repo on forge
 - Manual workflow dispatch
 - Optionally: upstream devpi release (via schedule check)
 ---
 ## Step 1: Create Workflow for devpi
 ### 1.1 Ensure the Dockerfile Exists
 The Dockerfile already exists at `argocd/manifests/devpi/Dockerfile`. We'll create a workflow in the blumeops repo that builds it.
 ### 1.2 Create Build Workflow
 Create `.forgejo/workflows/build-devpi.yml` in blumeops repo:
 ```yaml
 name: Build devpi Image
 on:
  push:
    paths:
      - 'argocd/manifests/devpi/Dockerfile'
      - 'argocd/manifests/devpi/start.sh'
      - '.forgejo/workflows/build-devpi.yml'
  workflow_dispatch:
    inputs:
      tag:
        description: 'Image tag (default: latest)'
        required: false
        default: 'latest'
 env:
  REGISTRY: registry.tail8d86e.ts.net
  IMAGE_NAME: blumeops/devpi
 jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Determine tag
        id: tag
        run: |
          if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
            TAG="${{ github.event.inputs.tag }}"
          else
            TAG="latest"
          fi
          echo "tag=$TAG" >> "$GITHUB_OUTPUT"
      - name: Build image
        uses: docker/build-push-action@v5
        with:
          context: argocd/manifests/devpi
          file: argocd/manifests/devpi/Dockerfile
          platforms: linux/arm64
          load: true
          tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.tag.outputs.tag }}
      - name: Push to registry
        run: |
          # Zot has no auth, just push
          docker push ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.tag.outputs.tag }}
      - name: Verify push
        run: |
          # Check image exists in registry
          curl -sf "https://${{ env.REGISTRY }}/v2/${{ env.IMAGE_NAME }}/tags/list" | jq .
 ```
 ### 1.3 Runner Needs Registry Access
 The runner needs to reach `registry.tail8d86e.ts.net`. This should work via Tailscale egress (same as Forgejo access).
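 If not, add egress for the registry in `argocd/manifests/tailscale-operator/`. Treat the manifest below as a sketch: `advertiseRoutes` expects IP CIDRs, so the hostname entry needs to be replaced with the registry's tailnet address before applying: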
 ```yaml
 apiVersion: tailscale.com/v1alpha1
 kind: Connector
 metadata:
  name: egress-registry
  namespace: tailscale-operator
 spec:
  hostname: egress-registry
  subnetRouter:
    advertiseRoutes:
      - registry.tail8d86e.ts.net/32
 ```
 ---
 ## Step 2: Test Build Workflow
 ### 2.1 Push and Trigger
 ```bash
 # Make a small change to trigger
 echo "# Build $(date)" >> argocd/manifests/devpi/Dockerfile
 git add argocd/manifests/devpi/Dockerfile
 git commit -m "Trigger devpi image rebuild"
 git push
 ```
 ### 2.2 Monitor Build
 1. Go to https://forge.tail8d86e.ts.net/eblume/blumeops/actions
 2. Watch "Build devpi Image" workflow
 3. Verify success
 ### 2.3 Verify Image in Registry
 ```bash
 curl -s https://registry.tail8d86e.ts.net/v2/blumeops/devpi/tags/list | jq .
 ```
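 Listing tags shows what is in the registry but does not fail when the expected tag is missing. A stricter check could look like the sketch below. `has_tag` is a hypothetical helper, and it assumes the registry returns the tag list as single-line JSON:
 ```bash
 # Hypothetical helper: read a /v2/<name>/tags/list response on stdin and
 # succeed only if the tag given as $1 is listed. Assumes single-line JSON.
 has_tag() {
   sed -n 's/.*"tags"[[:space:]]*:[[:space:]]*\[\([^]]*\)].*/\1/p' |
     tr ',' '\n' | tr -d '" ' | grep -Fxq -- "$1"
 }
 # Usage:
 #   curl -sf https://registry.tail8d86e.ts.net/v2/blumeops/devpi/tags/list | has_tag latest
 ```
 Where `jq` is available, `jq -e --arg t latest '.tags | index($t)'` is a sturdier equivalent.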
 ### 2.4 Restart devpi to Use New Image
 ```bash
 kubectl --context=minikube-indri -n devpi rollout restart statefulset/devpi
 ```
 ---
 ## Step 3: Reusable Container Build Workflow
 ### 3.1 Create Reusable Workflow
 Create `.forgejo/workflows/build-container.yml`:
 ```yaml
 name: Build Container Image
 on:
  workflow_call:
    inputs:
      context:
        description: 'Build context path'
        required: true
        type: string
      dockerfile:
        description: 'Dockerfile path (relative to context)'
        required: false
        type: string
        default: 'Dockerfile'
      image_name:
        description: 'Image name (without registry)'
        required: true
        type: string
      tag:
        description: 'Image tag'
        required: false
        type: string
        default: 'latest'
      platforms:
        description: 'Target platforms'
        required: false
        type: string
        default: 'linux/arm64'
 env:
  REGISTRY: registry.tail8d86e.ts.net
 jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: ${{ inputs.context }}
          file: ${{ inputs.context }}/${{ inputs.dockerfile }}
          platforms: ${{ inputs.platforms }}
          push: true
          tags: ${{ env.REGISTRY }}/${{ inputs.image_name }}:${{ inputs.tag }}
      - name: Verify push
        run: |
          curl -sf "https://${{ env.REGISTRY }}/v2/${{ inputs.image_name }}/tags/list" | jq .
 ```
 ### 3.2 Use in devpi Workflow
 Simplify `.forgejo/workflows/build-devpi.yml`:
 ```yaml
 name: Build devpi Image
 on:
  push:
    paths:
      - 'argocd/manifests/devpi/**'
  workflow_dispatch:
 jobs:
  build:
    uses: ./.forgejo/workflows/build-container.yml
    with:
      context: argocd/manifests/devpi
      image_name: blumeops/devpi
 ```
 ---
 ## Step 4: Python Package Builds (Optional)
 ### 4.1 Use Case
 Build Python packages from forge repos and publish to devpi.
 Example: `mcquack` package (LaunchAgent management library)
 ### 4.2 Create Python Build Workflow
 Create `.forgejo/workflows/build-python.yml`:
 ```yaml
 name: Build Python Package
 on:
  workflow_call:
    inputs:
      package_path:
        description: 'Path to package (contains pyproject.toml)'
        required: false
        type: string
        default: '.'
      python_version:
        description: 'Python version'
        required: false
        type: string
        default: '3.12'
      publish:
        description: 'Publish to devpi'
        required: false
        type: boolean
        default: false
    secrets:
      DEVPI_PASSWORD:
        required: false
 env:
  DEVPI_URL: https://pypi.tail8d86e.ts.net
 jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ inputs.python_version }}
      - name: Install uv
        run: pip install uv
      - name: Build package
        run: |
          cd ${{ inputs.package_path }}
          uv build
      - name: Upload artifact
        uses: actions/upload-artifact@v3  # v4 depends on a GitHub-only artifact backend
        with:
          name: dist
          path: ${{ inputs.package_path }}/dist/
      - name: Publish to devpi
        if: inputs.publish
        run: |
          cd ${{ inputs.package_path }}
          uv publish \
            --publish-url ${{ env.DEVPI_URL }}/eblume/dev/ \
            --username eblume \
            --password "${{ secrets.DEVPI_PASSWORD }}"
 ```
 ---
 ## Step 5: Scheduled Builds (Cron)
 ### 5.1 Weekly Rebuild
 Keep images fresh with weekly rebuilds:
 ```yaml
 name: Weekly Image Rebuilds
 on:
  schedule:
    # Every Sunday at 3 AM UTC
    - cron: '0 3 * * 0'
  workflow_dispatch:
 jobs:
  devpi:
    uses: ./.forgejo/workflows/build-container.yml
    with:
      context: argocd/manifests/devpi
      image_name: blumeops/devpi
 ```
 ---
 ## Future Improvements
 ### Multi-Arch Builds
 For images that need both ARM64 and AMD64:
 ```yaml
 platforms: linux/arm64,linux/amd64
 ```
 Requires QEMU emulation in the runner, for example by running `docker/setup-qemu-action` before the Buildx step; buildx itself only drives the multi-platform build.
 ### Build Caching
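 Use GitHub/Forgejo cache actions to persist a local layer cache between runs. The build step must also point at the same path via `cache-from: type=local,src=/tmp/.buildx-cache` and `cache-to: type=local,dest=/tmp/.buildx-cache`, otherwise the cached directory is never read or written: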
 ```yaml
 - name: Cache Docker layers
  uses: actions/cache@v4
  with:
    path: /tmp/.buildx-cache
    key: ${{ runner.os }}-buildx-${{ hashFiles('**/Dockerfile') }}
 ```
 ### Security Scanning
 Add Trivy or similar:
 ```yaml
 - name: Run Trivy vulnerability scanner
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: '${{ env.REGISTRY }}/${{ inputs.image_name }}:${{ inputs.tag }}'
 ```
 ---
 ## Step 6: Runner Observability (Logging & Metrics)
 ### 6.1 Problem
 The forgejo-runner pod generates logs and metrics that should be collected for:
 - Debugging failed workflow runs
 - Monitoring runner health and capacity
 - Alerting on runner failures
 ### 6.2 Log Collection via Alloy
 The forgejo-runner namespace needs to be included in Alloy's k8s log collection. Alloy is already configured to scrape logs from k8s pods - verify the runner namespace is included.
 Check current Alloy config:
 ```bash
 ssh indri 'cat ~/.config/alloy/config.alloy | grep -A20 discovery.kubernetes'
 ```
 If using namespace filtering, ensure `forgejo-runner` is included.
 ### 6.3 Metrics Collection
 The forgejo-runner exposes Prometheus metrics. Add a ServiceMonitor or configure Alloy to scrape:
 **Option A: ServiceMonitor (if using Prometheus Operator)**
 Create `argocd/manifests/forgejo-runner/servicemonitor.yaml`:
 ```yaml
 apiVersion: monitoring.coreos.com/v1
 kind: ServiceMonitor
 metadata:
  name: forgejo-runner
  namespace: forgejo-runner
 spec:
  selector:
    matchLabels:
      app: forgejo-runner
  endpoints:
    - port: metrics
      interval: 30s
 ```
 **Option B: Alloy scrape config**
 Add to Alloy's k8s scrape config to discover the runner pod's metrics endpoint.
 ### 6.4 Create Runner Service for Metrics
 Add `argocd/manifests/forgejo-runner/service.yaml`:
 ```yaml
 apiVersion: v1
 kind: Service
 metadata:
  name: forgejo-runner-metrics
  namespace: forgejo-runner
  labels:
    app: forgejo-runner
 spec:
  selector:
    app: forgejo-runner
  ports:
    - name: metrics
      port: 8080
      targetPort: 8080
 ```
 Update kustomization.yaml to include the service.
 ### 6.5 Grafana Dashboard
 Consider creating a dashboard for:
 - Runner status (online/offline)
 - Job queue depth
 - Job execution time
 - Success/failure rates
 ### 6.6 Verification
 ```bash
 # Check runner logs are appearing in Loki
 # Go to Grafana → Explore → Loki
 # Query: {namespace="forgejo-runner"}
 # Check metrics are being scraped
 # Go to Grafana → Explore → Prometheus
 # Query: {__name__=~"forgejo_runner_.*"}
 ```
 ---
 ## Verification Checklist
 - [ ] devpi build workflow created
 - [ ] devpi image builds successfully
 - [ ] Image pushed to zot registry
 - [ ] devpi pod uses new image
 - [ ] Reusable container workflow created
 - [ ] (Optional) Python build workflow created
 - [ ] (Optional) Scheduled builds configured
 - [ ] Runner logs visible in Loki
 - [ ] Runner metrics scraped by Prometheus/Alloy
 ---
 ## Summary
 With this phase complete, we have:
 1. **Forgejo Actions** running with k8s runner
 2. **Forgejo self-deploys** from CI on tagged releases
 3. **Container images** built automatically on push
 4. Infrastructure for Python package builds
 5. **Runner observability** with logs in Loki and metrics in Prometheus
 The CI/CD bootstrap is complete. Future work:
 - Add more container builds as needed
 - Add Python package publishing for internal tools
 - Consider adding a macOS runner on indri for native builds
 - Create Grafana dashboards for CI/CD monitoring
--- a/plans/completed/k8s-migration/00_overview.md
+++ b/plans/completed/k8s-migration/00_overview.md
@ -1,79 +0,0 @@
 # Blumeops Minikube Migration Plan
 **Status**: Completed (2026-01-23)
 This plan detailed the phased migration of blumeops services from direct hosting on indri (Mac Mini M1) to a minikube cluster. The migration is now complete for all services that will be migrated.
 ## Final Status
 | Phase | Name | Status | Notes |
 |-------|------|--------|-------|
 | 0 | [Foundation](P0_foundation.complete.md) | ✅ Complete | Container registry (zot) + minikube cluster |
 | 1 | [K8s Infrastructure](P1_k8s_infrastructure.complete.md) | ✅ Complete | Tailscale operator, ArgoCD, CloudNativePG, PostgreSQL cluster |
 | 2 | [Grafana](P2_grafana.complete.md) | ✅ Complete | Migrated Grafana via ArgoCD |
 | 3 | [PostgreSQL](P3_postgresql.complete.md) | ✅ Complete | Data migration to k8s PostgreSQL |
 | 4 | [Miniflux](P4_miniflux.complete.md) | ✅ Complete | Migrated Miniflux via ArgoCD |
 | 5 | [devpi](P5_devpi.complete.md) | ✅ Complete | Migrated devpi via ArgoCD |
 | 5.1 | [Docker Migration](P5.1_docker_migration.complete.md) | ✅ Complete | Switched minikube to docker driver (not QEMU2) |
 | 6 | [Kiwix](P6_kiwix.complete.md) | ✅ Complete | Migrated Kiwix + Transmission via ArgoCD |
 | 7 | [Forgejo](P7_forgejo.md) | ⏭️ Won't Do | Forgejo stays on indri - see [CI/CD Bootstrap](../../ci-cd-bootstrap/) |
 | 8 | [Woodpecker](P8_woodpecker.md) | ⏭️ Won't Do | Replaced by Forgejo Actions - see [CI/CD Bootstrap](../../ci-cd-bootstrap/) |
 | 9 | [Cleanup](P9_cleanup.md) | ⏭️ Won't Do | Observability cleanup done separately (2026-01-22) |
 ## What Was Migrated to K8s
 | Service | Status | Notes |
 |---------|--------|-------|
 | Grafana | ✅ In k8s | Helm chart via ArgoCD |
 | PostgreSQL | ✅ In k8s | CloudNativePG operator |
 | Miniflux | ✅ In k8s | Using k8s PostgreSQL |
 | devpi | ✅ In k8s | Custom container image |
 | Kiwix | ✅ In k8s | NFS mount from sifaka |
 | Transmission | ✅ In k8s | NFS mount from sifaka |
 | Prometheus | ✅ In k8s | Migrated 2026-01-22 |
 | Loki | ✅ In k8s | Migrated 2026-01-22 |
 | Alloy (k8s) | ✅ In k8s | DaemonSet for pod logs |
 | TeslaMate | ✅ In k8s | Added 2026-01-23 |
 ## What Stays on Indri
 | Service | Reason |
 |---------|--------|
 | **Forgejo** | Critical infrastructure, avoids circular dependency with ArgoCD |
 | **Zot Registry** | K8s needs images to start - must be outside k8s |
 | **Alloy (host)** | Collects host-level metrics and logs |
 | **Borgmatic** | Backup system must survive k8s failures |
 | **Plex** | Uses own NAT traversal, not Tailscale |
 ## Architecture Decisions Made
 ### Minikube Driver: Docker (not QEMU2/Podman)
 - Original plan called for QEMU2, but docker driver proved simpler
 - NFS mounts work via Docker NAT through indri's LAN IP
 - API server accessible via Tailscale TCP passthrough
 ### Forgejo: Stays on Indri
 - Original P7 planned k8s migration
 - Decision changed: Forgejo is critical infrastructure
 - Will be built from source via Forgejo Actions CI
 - See [CI/CD Bootstrap Plan](../../ci-cd-bootstrap/) for details
 ### CI/CD: Forgejo Actions (not Woodpecker)
 - Original P8 planned Woodpecker deployment
 - Decision changed: Use Forgejo's native Actions instead
 - Simpler (one less system), GitHub Actions compatible
 - See [CI/CD Bootstrap Plan](../../ci-cd-bootstrap/) for details
 ### Observability: Migrated to K8s
 - Original plan kept Prometheus/Loki on indri
 - Changed: Migrated both to k8s (2026-01-22)
 - Alloy on indri pushes to k8s endpoints
 - Alloy DaemonSet in k8s collects pod logs
 ## Lessons Learned
 1. **Docker driver is simpler than QEMU2** - Direct NFS mounts work, no VM complexity
 2. **Tailscale operator works well** - Easy service exposure with automatic TLS
 3. **CloudNativePG is production-ready** - Good operator, easy backups
 4. **Keep critical infra outside k8s** - Forgejo and zot must survive k8s failures
 5. **CGO matters on macOS** - Alloy needed `CGO_ENABLED=1` for Tailscale DNS resolution
--- a/plans/completed/k8s-migration/P0_foundation.complete.md
+++ b/plans/completed/k8s-migration/P0_foundation.complete.md
--- a/plans/completed/k8s-migration/P1_k8s_infrastructure.complete.md
+++ b/plans/completed/k8s-migration/P1_k8s_infrastructure.complete.md
@ -1,657 +0,0 @@
 # Phase 1: Kubernetes Infrastructure
 **Goal**: Tailscale operator, ArgoCD, CloudNativePG operator, PostgreSQL cluster
 **Status**: In Progress
 **Prerequisites**: [Phase 0](P0_foundation.complete.md) complete
 ---
 ## Overview
 Phase 1 establishes the k8s control plane infrastructure:
 1. **Tailscale operator** - Exposes services on the tailnet
 2. **ArgoCD** - GitOps continuous delivery
 3. **CloudNativePG** - PostgreSQL operator
 4. **PostgreSQL cluster** - Database for future app migrations
 The deployment follows a bootstrap pattern:
 - First two components deployed via `kubectl apply -k` (no GitOps yet)
 - ArgoCD then takes over management of all components including itself
 - All subsequent deployments use ArgoCD
 ---
 ## Kubernetes Tags Overview
 | Tag | Purpose | Applied To |
 |-----|---------|------------|
 | `tag:k8s-api` | Controls access to the K8s API server | indri (Phase 0.14) |
 | `tag:k8s-operator` | Identifies the Tailscale K8s Operator | OAuth client for operator |
 | `tag:k8s` | Default tag for operator-managed resources | Proxies, services, ingresses created by operator |
 **Ownership chain**: `tag:k8s-operator` must own `tag:k8s` so the operator can assign that tag to devices it creates.
 ---
 ## PostgreSQL Migration Strategy
 The k8s PostgreSQL cluster will eventually replace the brew PostgreSQL on indri.
 | Phase | `pg.tail8d86e.ts.net` points to | Miniflux connects to |
 |-------|--------------------------------|---------------------|
 | Current | brew PostgreSQL (indri) | `pg.tail8d86e.ts.net` |
 | Phase 1 | brew PostgreSQL (indri) | `pg.tail8d86e.ts.net` (no change) |
 | Phase 4 | brew PostgreSQL (indri) | k8s PG (internal, after miniflux migrates to k8s) |
 | Post-Phase 4 | k8s PostgreSQL | k8s PG (internal) |
 | Cleanup | k8s PostgreSQL | k8s PG (internal) |
 This allows zero-downtime migration - the Tailscale service switches after apps are migrated.
 ---
 ## Steps
 ### 1. Update Pulumi ACLs for k8s workloads ✓
 **Status**: Complete
 Added to `pulumi/policy.hujson`:
 - `tag:k8s-operator` - for the operator OAuth client
 - `tag:k8s` - for operator-managed resources (owned by `tag:k8s-operator`)
 - Grant for `tag:k8s` → `tag:registry` access
 ---
 ### 2. Create Tailscale OAuth client ✓
 **Status**: Complete
 OAuth client stored in 1Password (vault: `vg6xf6vvfmoh5hqjjhlhbeoaie`, item: `2it22lavwgbxdskoaxanej354q`)
 **Configuration used:**
 - Tags: `tag:k8s-operator`
 - Devices write scope tag: `tag:k8s`
 - Scopes: Devices Core (R/W), Auth Keys (R/W), Services (Write)
 ---
 ### 3. Deploy Tailscale Kubernetes Operator (Bootstrap)
 Deploy via `kubectl apply -k` - will be migrated to ArgoCD management in Step 5.
 **Setup manifests directory:**
 ```bash
 mkdir -p argocd/manifests/tailscale-operator/crds
 cd argocd/manifests/tailscale-operator
 # Download static manifest from Tailscale repo
 curl -sL https://raw.githubusercontent.com/tailscale/tailscale/main/cmd/k8s-operator/deploy/manifests/operator.yaml -o operator.yaml
 # Download CRDs
 curl -sL https://raw.githubusercontent.com/tailscale/tailscale/main/cmd/k8s-operator/deploy/crds/tailscale.com_connectors.yaml -o crds/connectors.yaml
 curl -sL https://raw.githubusercontent.com/tailscale/tailscale/main/cmd/k8s-operator/deploy/crds/tailscale.com_proxyclasses.yaml -o crds/proxyclasses.yaml
 # ... (other CRDs as needed)
 ```
 **Create kustomization.yaml:**
 ```yaml
 apiVersion: kustomize.config.k8s.io/v1beta1
 kind: Kustomization
 namespace: tailscale-system
 resources:
  - operator.yaml
 secretGenerator:
  - name: operator-oauth
    namespace: tailscale-system
    literals:
      - client_id=PLACEHOLDER
      - client_secret=PLACEHOLDER
 generatorOptions:
  disableNameSuffixHash: true
 ```
 **Deploy:**
 ```bash
 # Get credentials from 1Password and create secret manually (kustomize secretGenerator is for reference)
 CLIENT_ID=$(op --vault vg6xf6vvfmoh5hqjjhlhbeoaie item get 2it22lavwgbxdskoaxanej354q --fields client-id --reveal)
 CLIENT_SECRET=$(op --vault vg6xf6vvfmoh5hqjjhlhbeoaie item get 2it22lavwgbxdskoaxanej354q --fields client-secret --reveal)
 kubectl create namespace tailscale-system
 kubectl create secret generic operator-oauth \
  --namespace tailscale-system \
  --from-literal=client_id="$CLIENT_ID" \
  --from-literal=client_secret="$CLIENT_SECRET"
 # Apply operator manifests
 kubectl apply -k argocd/manifests/tailscale-operator/
 ```
 **Verification:**
 ```bash
 kubectl get pods -n tailscale-system
 # Expected: operator pod Running
 kubectl logs -n tailscale-system -l app.kubernetes.io/name=tailscale-operator
 ```
 ---
 ### 4. Deploy ArgoCD
 Deploy ArgoCD and expose via Tailscale as `argocd.tail8d86e.ts.net`.
 **Prerequisites:**
 - Add `tag:argocd` to Pulumi ACLs
 - Create Tailscale service `argocd` in admin console
 **Setup manifests:**
 ```bash
 mkdir -p argocd/manifests/argocd
 # Download ArgoCD install manifest
 curl -sL https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml -o argocd/manifests/argocd/install.yaml
 ```
 **Create kustomization.yaml:**
 ```yaml
 apiVersion: kustomize.config.k8s.io/v1beta1
 kind: Kustomization
 namespace: argocd
 resources:
  - install.yaml
  - service-tailscale.yaml  # LoadBalancer for Tailscale exposure
 ```
 **Create service-tailscale.yaml:**
 ```yaml
 apiVersion: v1
 kind: Service
 metadata:
  name: argocd-server-tailscale
  namespace: argocd
  annotations:
    tailscale.com/hostname: "argocd"
 spec:
  type: LoadBalancer
  loadBalancerClass: tailscale
  selector:
    app.kubernetes.io/name: argocd-server
  ports:
    - name: https
      port: 443
      targetPort: 8080
 ```
 **Deploy:**
 ```bash
 kubectl create namespace argocd
 kubectl apply -k argocd/manifests/argocd/
 ```
 **Get initial admin password:**
 ```bash
 kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
 ```
 **Verification:**
 - https://argocd.tail8d86e.ts.net loads
 - Can log in with admin / <initial-password>
 **Post-setup:**
 1. Change admin password, store in 1Password
 2. Configure git repo connection to `github.com/eblume/blumeops` (public, no auth needed)
   - Note: Using GitHub mirror since ArgoCD can't easily reach forge without additional networking
 ---
 ### 5. Migrate Tailscale Operator to ArgoCD
 Create ArgoCD Application to manage the Tailscale operator.
 **Create argocd/apps/tailscale-operator.yaml:**
 ```yaml
 apiVersion: argoproj.io/v1alpha1
 kind: Application
 metadata:
  name: tailscale-operator
  namespace: argocd
 spec:
  project: default
  source:
    repoURL: https://github.com/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/tailscale-operator
  destination:
    server: https://kubernetes.default.svc
    namespace: tailscale-system
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
 ```
 **Apply:**
 ```bash
 kubectl apply -f argocd/apps/tailscale-operator.yaml
 ```
 **Note on secrets:** The OAuth secret was created manually in Step 3. For GitOps, consider:
 - Sealed Secrets
 - External Secrets Operator
 - SOPS
 For now, the secret remains manually managed outside of ArgoCD.
 ---
 ### 6. Deploy CloudNativePG via ArgoCD
 **Setup manifests:**
 ```bash
 mkdir -p argocd/manifests/cloudnative-pg
 # Download CNPG operator manifest
 curl -sL https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.24/releases/cnpg-1.24.0.yaml -o argocd/manifests/cloudnative-pg/operator.yaml
 ```
 **Create kustomization.yaml:**
 ```yaml
 apiVersion: kustomize.config.k8s.io/v1beta1
 kind: Kustomization
 resources:
  - operator.yaml
 ```
 **Create ArgoCD Application (argocd/apps/cloudnative-pg.yaml):**
 ```yaml
 apiVersion: argoproj.io/v1alpha1
 kind: Application
 metadata:
  name: cloudnative-pg
  namespace: argocd
 spec:
  project: default
  source:
    repoURL: https://github.com/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/cloudnative-pg
  destination:
    server: https://kubernetes.default.svc
    namespace: cnpg-system
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
 ```
 **Apply:**
 ```bash
 kubectl apply -f argocd/apps/cloudnative-pg.yaml
 ```
 **Verification:**
 ```bash
 kubectl get pods -n cnpg-system
 # Expected: cnpg-controller-manager Running
 ```
 ---
 ### 7. Create PostgreSQL Cluster via ArgoCD
 Create the database cluster. **Not exposed via Tailscale yet** - internal only until apps migrate.
 **Create argocd/manifests/databases/blumeops-pg.yaml:**
 ```yaml
 apiVersion: postgresql.cnpg.io/v1
 kind: Cluster
 metadata:
  name: blumeops-pg
  namespace: databases
 spec:
  instances: 1
  storage:
    size: 10Gi
    storageClass: standard
  monitoring:
    enablePodMonitor: true
  bootstrap:
    initdb:
      database: miniflux
      owner: miniflux
 ```
 **Create kustomization.yaml:**
 ```yaml
 apiVersion: kustomize.config.k8s.io/v1beta1
 kind: Kustomization
 namespace: databases
 resources:
  - blumeops-pg.yaml
 ```
 **Create ArgoCD Application (argocd/apps/blumeops-pg.yaml):**
 ```yaml
 apiVersion: argoproj.io/v1alpha1
 kind: Application
 metadata:
  name: blumeops-pg
  namespace: argocd
 spec:
  project: default
  source:
    repoURL: https://github.com/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/databases
  destination:
    server: https://kubernetes.default.svc
    namespace: databases
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
 ```
 **Apply:**
 ```bash
 kubectl apply -f argocd/apps/blumeops-pg.yaml
 ```
 **Verification:**
 ```bash
 kubectl get cluster -n databases
 # Expected: blumeops-pg with STATUS "Cluster in healthy state"
 kubectl get pods -n databases
 # Expected: blumeops-pg-1 Running
 # Get connection secret
 kubectl -n databases get secret blumeops-pg-app -o jsonpath='{.data.uri}' | base64 -d
 ```
 ---
 ### 8. Create App-of-Apps Root Application
 Once all components are deployed, create a root application to manage all apps.
 **Create argocd/apps/root.yaml:**
 ```yaml
 apiVersion: argoproj.io/v1alpha1
 kind: Application
 metadata:
  name: root
  namespace: argocd
 spec:
  project: default
  source:
    repoURL: https://github.com/eblume/blumeops.git
    targetRevision: main
    path: argocd/apps
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
 ```
 **Apply:**
 ```bash
 kubectl apply -f argocd/apps/root.yaml
 ```
 Now ArgoCD manages every Application defined in `argocd/apps/` via the app-of-apps pattern. (ArgoCD's own install is still applied with `kubectl apply -k` in this plan.)
 ---
 ## New Files Summary
 ```
 argocd/
  apps/
    root.yaml                    # App-of-apps root
    tailscale-operator.yaml      # Tailscale operator app
    cloudnative-pg.yaml          # CNPG operator app
    blumeops-pg.yaml             # PostgreSQL cluster app
  manifests/
    tailscale-operator/
      kustomization.yaml
      operator.yaml
    argocd/
      kustomization.yaml
      install.yaml
      service-tailscale.yaml
    cloudnative-pg/
      kustomization.yaml
      operator.yaml
    databases/
      kustomization.yaml
      blumeops-pg.yaml
 ```
 ---
 ## Pulumi ACL Updates Required
 Add to `pulumi/policy.hujson`:
 ```hujson
 "tag:argocd": ["autogroup:admin", "tag:blumeops"],
 ```
 Add to Erich's test accept list:
 ```hujson
 "accept": [..., "tag:argocd:443"],
 ```
 Add to Allison's deny list:
 ```hujson
 "deny": [..., "tag:argocd:443"],
 ```
 ---
 ## Verification Checklist
 ```bash
 # 1. Tailscale operator running
 kubectl get pods -n tailscale-system
 # 2. ArgoCD accessible
 curl -k https://argocd.tail8d86e.ts.net/healthz
 # 3. CloudNativePG operator running
 kubectl get pods -n cnpg-system
 # 4. PostgreSQL cluster healthy
 kubectl get cluster -n databases
 # 5. All ArgoCD apps synced
 kubectl get applications -n argocd
 # All should show STATUS: Synced, HEALTH: Healthy
 ```
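Item 5 of the checklist can be turned into a pass/fail check. The following is a minimal sketch, not part of the original plan: `check_apps` is a hypothetical helper name, and it assumes its input has one `name sync health` triple per line, as produced by the `jsonpath` expression shown in the comment.

```bash
# check_apps: read "name sync-status health-status" lines on stdin and
# report every Application that is not Synced and Healthy.
# Returns 0 when all apps are ready, 1 otherwise.
#
# Intended usage (assumes kubectl can reach the cluster):
#   kubectl get applications -n argocd -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.sync.status}{" "}{.status.health.status}{"\n"}{end}' | check_apps
check_apps() {
  bad=0
  while read -r name sync health; do
    [ -n "$name" ] || continue          # skip blank lines
    if [ "$sync" != "Synced" ] || [ "$health" != "Healthy" ]; then
      echo "NOT READY: $name (sync=$sync health=$health)"
      bad=1
    fi
  done
  return "$bad"
}
```

Because the function only parses text, it can be exercised with `printf` input before pointing it at a live cluster.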
 ---
 ## Rollback
 ```bash
 # Remove ArgoCD apps (managed resources are deleted too only if the Application
 # carries the resources-finalizer.argocd.argoproj.io finalizer)
 kubectl delete application -n argocd root
 kubectl delete application -n argocd blumeops-pg
 kubectl delete application -n argocd cloudnative-pg
 kubectl delete application -n argocd tailscale-operator
 # Remove ArgoCD
 kubectl delete -k argocd/manifests/argocd/
 kubectl delete namespace argocd
 # Remove namespaces
 kubectl delete namespace databases
 kubectl delete namespace cnpg-system
 kubectl delete namespace tailscale-system
 # Revert ACL changes
 git checkout pulumi/policy.hujson
 mise run tailnet-up
 ```
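Deleting an Application with `kubectl delete` removes its managed resources only when the Application carries Argo CD's resources finalizer, and the Application manifests in this plan do not set one. A hedged sketch of adding it before rollback follows; `enable_cascade` and the `KUBECTL` override are hypothetical names introduced here, and the merge patch replaces any finalizers already present.

```bash
# enable_cascade APP
# Add Argo CD's resources finalizer so that deleting the Application
# also deletes the resources it manages.
# KUBECTL may be set to a wrapper command; it defaults to kubectl.
enable_cascade() {
  "${KUBECTL:-kubectl}" patch application "$1" -n argocd --type merge \
    -p '{"metadata":{"finalizers":["resources-finalizer.argocd.argoproj.io"]}}'
}

# Example (run before the delete commands above):
#   for app in blumeops-pg cloudnative-pg tailscale-operator; do
#     enable_cascade "$app"
#   done
```

Without the finalizer, the namespace deletions in the rollback block are what actually clean up the workloads.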
 ---
 ## Implementation Notes (Deviations from Plan)
 *Added during implementation for retrospective review*
 ### Git Source: Forge Instead of GitHub
 **Plan**: Use GitHub mirror (`github.com/eblume/blumeops`)
 **Actual**: Use internal Forgejo (`ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/blumeops.git`)
 **Why**: User preference for internal infrastructure; the resulting circular dependency (ArgoCD pulls from the forge it will later help deploy) is accepted for now and left to address later.
 **Required changes**:
 - Deploy key added to forge for ArgoCD SSH access
 - Repository secret `repo-forge` with SSH private key from 1Password
 - Discovered: `op read` requires `?ssh-format=openssh` query parameter for ArgoCD-compatible key format
 - Egress proxy service to reach forge from cluster (targets `indri.tail8d86e.ts.net` not `forge.tail8d86e.ts.net` due to Tailscale Serve limitation)
 - DNSConfig CRD for cluster-to-tailnet MagicDNS resolution
 - ACL grant: `tag:k8s` → `tag:homelab` on ports 3001 (HTTP) and 2200 (SSH)
 ### ArgoCD Exposure: Ingress Instead of LoadBalancer
 **Plan**: LoadBalancer service with `tailscale.com/hostname` annotation
 **Actual**: Tailscale Ingress with Let's Encrypt TLS termination
 **Why**: Ingress provides automatic TLS certificates and is the recommended approach.
 **File**: `argocd/manifests/argocd/service-tailscale.yaml` uses `kind: Ingress` with `ingressClassName: tailscale`
 ### Namespace: `tailscale` Instead of `tailscale-system`
 **Plan**: `tailscale-system` namespace
 **Actual**: `tailscale` namespace
 **Why**: Matches upstream Tailscale operator defaults.
 ### Sync Policy: Manual Instead of Automated
 **Plan**: `syncPolicy.automated` with prune and selfHeal
 **Actual**: Manual sync policy for workload apps; auto-sync only for app-of-apps
 **Why**: User preference for explicit control over deployments during initial migration phase.
 **Pattern**:
 - `apps.yaml` (app-of-apps): auto-sync to pick up new Application manifests
 - All workload apps: manual sync, triggered with `argocd app sync <name>`
 ### CloudNativePG: Helm Chart Instead of Raw Manifest
 **Plan**: Download raw CNPG manifest
 **Actual**: Multi-source Application using official Helm chart from `https://cloudnative-pg.github.io/charts`
 **Why**: Helm chart is the officially supported distribution method.
 **Additional fix**: Required `ServerSideApply=true` sync option due to large CRD exceeding annotation size limit.
 ### App-of-Apps: Named `apps` Instead of `root`
 **Plan**: `argocd/apps/root.yaml`
 **Actual**: `argocd/apps/apps.yaml` with Application named `apps`
 **Why**: Clearer naming; `apps` manages apps, `argocd` manages itself.
 ### ArgoCD Self-Management Added
 **Plan**: Not explicitly planned
 **Actual**: `argocd/apps/argocd.yaml` Application for ArgoCD self-management
 **Why**: Standard GitOps pattern - ArgoCD manages its own deployment after bootstrap.
 ### CRI-O Registry Mirror for Zot
 **Plan**: Not in original plan
 **Actual**: Configured CRI-O to use zot as pull-through cache for docker.io, ghcr.io, quay.io
 **Why**: Reduces external bandwidth, speeds up pulls, avoids rate limits.
 **Implementation**: Ansible `minikube` role applies `/etc/containers/registries.conf.d/zot-mirror.conf` inside minikube VM using stable hostname `host.containers.internal:5050`.
 ### ProxyClass for CRI-O Image Compatibility
 **Plan**: Not mentioned
 **Actual**: Required `ProxyClass` with fully-qualified image paths (`docker.io/tailscale/...`)
 **Why**: CRI-O requires fully-qualified image references; default Tailscale operator uses short names.
 ### Actual File Structure
 ```
 argocd/
  apps/
    apps.yaml                    # App-of-apps (auto-sync)
    argocd.yaml                  # ArgoCD self-management (manual sync)
    tailscale-operator.yaml      # Tailscale operator (manual sync)
    cloudnative-pg.yaml          # CNPG operator via Helm (manual sync)
  manifests/
    tailscale-operator/
      kustomization.yaml
      operator.yaml
      proxyclass.yaml            # CRI-O compatibility
      dnsconfig.yaml             # Cluster-to-tailnet DNS
      egress-forge.yaml          # Egress proxy for forge
      secret.yaml.tpl            # OAuth secret template (manual)
      README.md
    argocd/
      kustomization.yaml         # Uses remote base from upstream
      service-tailscale.yaml     # Ingress (not LoadBalancer)
      argocd-cmd-params-cm.yaml  # Disable HTTPS redirect
      repo-forge-secret.yaml.tpl # SSH key template (manual)
      README.md
    cloudnative-pg/
      values.yaml                # Helm values (currently minimal)
      README.md
 ```
 ### Bootstrap Commands (Actual)
 ```bash
 # 1. Create namespaces
 kubectl create namespace tailscale
 kubectl create namespace argocd
 # 2. Apply secrets (manual, uses 1Password)
 op inject -i argocd/manifests/tailscale-operator/secret.yaml.tpl | kubectl apply -f -
 PRIV_KEY=$(op read "op://vg6xf6vvfmoh5hqjjhlhbeoaie/csjncynh6htjvnh2l2da65y32q/private key?ssh-format=openssh")$'\n' && \
 kubectl create secret generic repo-forge -n argocd \
  --from-literal=type=git \
  --from-literal=url='ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/blumeops.git' \
  --from-literal=insecure=true \
  --from-literal=sshPrivateKey="$PRIV_KEY" && \
 kubectl label secret repo-forge -n argocd argocd.argoproj.io/secret-type=repository
 # 3. Bootstrap tailscale-operator
 kubectl apply -k argocd/manifests/tailscale-operator/
 # 4. Bootstrap ArgoCD
 kubectl apply -k argocd/manifests/argocd/
 # 5. Login and change password
 argocd login argocd.tail8d86e.ts.net --username admin --grpc-web
 argocd account update-password
 # 6. Apply ArgoCD Applications
 kubectl apply -f argocd/apps/argocd.yaml
 kubectl apply -f argocd/apps/apps.yaml
 # 7. Sync workloads
 argocd app sync tailscale-operator
 argocd app sync cloudnative-pg
 ```
--- a/plans/completed/k8s-migration/P2_grafana.complete.md
+++ b/plans/completed/k8s-migration/P2_grafana.complete.md
@ -1,396 +0,0 @@
 # Phase 2: Grafana Migration (Pilot)
 **Goal**: Migrate Grafana as lowest-risk pilot service
 **Status**: Complete (2026-01-19)
 **Prerequisites**: [Phase 1](P1_k8s_infrastructure.complete.md) complete
 ---
 ## Overview
 This phase migrates Grafana from Homebrew/Ansible on indri to Kubernetes, establishing the pattern for future service migrations. Additionally, we establish the pattern of mirroring Helm chart repositories to forge for resilience and GitOps consistency.
 ---
 ## Key Decisions
 ### Helm Chart Mirroring
 **Problem**: P1 uses external Helm repos which creates external dependencies.
 **Solution**: Mirror Helm chart Git repositories to forge, reference charts from git path.
 ArgoCD auto-detects Helm charts when a directory contains `Chart.yaml`. No build step needed.
 | Chart | Upstream Git Repo | Forge Mirror | Chart Path |
 |-------|-------------------|--------------|------------|
 | cloudnative-pg | `github.com/cloudnative-pg/charts` | `forge/eblume/cloudnative-pg-charts` | `charts/cloudnative-pg/` |
 | grafana | `github.com/grafana/helm-charts` | `forge/eblume/grafana-helm-charts` | `charts/grafana/` |
 ### Database Storage
 Use SQLite with 1Gi PVC (not k8s PostgreSQL). Grafana stores minimal persistent data and dashboards are git-provisioned.
 ### Datasource URLs
 From k8s pods, use `host.containers.internal` to reach indri services:
 - Prometheus: `http://host.containers.internal:9090`
 - Loki: `http://host.containers.internal:3100` (requires ansible change to bind 0.0.0.0)
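Both datasource URLs depend on indri services listening on an address the pods can reach, so it can help to probe them from inside the cluster before deploying Grafana. A minimal sketch, assuming the public `docker.io/curlimages/curl` image is pullable and that Prometheus and Loki expose their usual `/-/ready` and `/ready` endpoints; `probe_from_pod`, `probe_all`, and the `KUBECTL` override are hypothetical names introduced here.

```bash
# probe_from_pod NAME URL
# Start a throwaway curl pod named NAME and fetch URL from inside the cluster.
# The image reference is fully qualified because CRI-O rejects short names.
probe_from_pod() {
  "${KUBECTL:-kubectl}" run "$1" --rm -i --restart=Never \
    --image=docker.io/curlimages/curl --command -- \
    curl -fsS --max-time 10 "$2"
}

# probe_all: check both datasources; stops at the first failure.
probe_all() {
  probe_from_pod probe-prometheus http://host.containers.internal:9090/-/ready &&
  probe_from_pod probe-loki http://host.containers.internal:3100/ready
}
```

If the Loki probe fails while Prometheus succeeds, the likely cause under this plan is that Loki is still bound to localhost only.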
 ### Ingress
 Tailscale Ingress with Let's Encrypt TLS (following ArgoCD pattern), with `crio-compat` proxy class.
 ### Secrets Management
 Admin password stored in 1Password, injected manually via `op inject`. Future: migrate to External Secrets Operator or similar.
 ---
 ## Prerequisites
 ### 0.1 Mirror Helm Chart Repos to Forge
 **User action**: Create mirrors in forge:
 1. **CloudNativePG charts** (fix existing P1 app):
   - Mirror: `https://github.com/cloudnative-pg/charts`
   - To: `forge.tail8d86e.ts.net/eblume/cloudnative-pg-charts`
 2. **Grafana helm-charts** (new):
   - Mirror: `https://github.com/grafana/helm-charts`
   - To: `forge.tail8d86e.ts.net/eblume/grafana-helm-charts`
 ### 0.2 Update Loki to Bind 0.0.0.0
 **File**: `ansible/roles/loki/templates/loki-config.yaml.j2`
 Add under `server:`:
 ```yaml
 http_listen_address: 0.0.0.0
 ```
 Deploy: `mise run provision-indri -- --tags loki`
 ---
 ## Steps
 ### 1. Fix CloudNativePG to Use Forge Mirror
 Update `argocd/apps/cloudnative-pg.yaml` to use forge-mirrored chart:
 ```yaml
 sources:
  - repoURL: ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/cloudnative-pg-charts.git
    targetRevision: cloudnative-pg-v0.23.0  # git tag
    path: charts/cloudnative-pg
    helm:
      releaseName: cloudnative-pg
      valueFiles:
        - $values/argocd/manifests/cloudnative-pg/values.yaml
  - repoURL: ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/blumeops.git
    targetRevision: main
    ref: values
 ```
 ---
 ### 2. Create Grafana Helm Values
 **File**: `argocd/manifests/grafana/values.yaml`
 ```yaml
 admin:
  existingSecret: grafana-admin
  userKey: admin-user
  passwordKey: admin-password
 persistence:
  enabled: true
  type: pvc
  size: 1Gi
 grafana.ini:
  server:
    root_url: https://grafana.tail8d86e.ts.net
  analytics:
    check_for_updates: false
    reporting_enabled: false
 datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        uid: prometheus
        url: http://host.containers.internal:9090
        isDefault: true
        editable: false
      - name: Loki
        type: loki
        access: proxy
        uid: loki
        url: http://host.containers.internal:3100
        editable: false
 sidecar:
  dashboards:
    enabled: true
    label: grafana_dashboard
    labelValue: "1"
 service:
  type: ClusterIP
  port: 80
 resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "512Mi"
    cpu: "500m"
 ```
 ---
 ### 3. Create Grafana ArgoCD Application
 **File**: `argocd/apps/grafana.yaml`
 ```yaml
 apiVersion: argoproj.io/v1alpha1
 kind: Application
 metadata:
  name: grafana
  namespace: argocd
 spec:
  project: default
  sources:
    - repoURL: ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/grafana-helm-charts.git
      targetRevision: grafana-8.8.2
      path: charts/grafana
      helm:
        releaseName: grafana
        valueFiles:
          - $values/argocd/manifests/grafana/values.yaml
    - repoURL: ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/blumeops.git
      targetRevision: main
      ref: values
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
 ```
 ---
 ### 4. Create Grafana Config Application
 **File**: `argocd/apps/grafana-config.yaml`
 Deploys Tailscale Ingress and Dashboard ConfigMaps from `argocd/manifests/grafana-config/`.
 ---
 ### 5. Create Grafana Config Manifests
 **Directory**: `argocd/manifests/grafana-config/`
 Contents:
 - `kustomization.yaml`
 - `ingress-tailscale.yaml` - Tailscale Ingress for `grafana.tail8d86e.ts.net`
 - `secret-admin.yaml.tpl` - Admin password template (1Password-backed)
 - `README.md` - Notes on secrets management
 - `dashboards/configmap-*.yaml` - 9 dashboard ConfigMaps
 **Ingress**:
 ```yaml
 apiVersion: networking.k8s.io/v1
 kind: Ingress
 metadata:
  name: grafana-tailscale
  namespace: monitoring
  annotations:
    tailscale.com/proxy-class: "crio-compat"
 spec:
  ingressClassName: tailscale
  defaultBackend:
    service:
      name: grafana
      port:
        number: 80
  tls:
    - hosts:
        - grafana
 ```
 **Secret template** (`secret-admin.yaml.tpl`):
 ```yaml
 # Apply: op inject -i secret-admin.yaml.tpl | kubectl apply -f -
 apiVersion: v1
 kind: Secret
 metadata:
  name: grafana-admin
  namespace: monitoring
 type: Opaque
 stringData:
  admin-user: admin
  admin-password: {{ op://vg6xf6vvfmoh5hqjjhlhbeoaie/oxkcr3xtxnewy7noep2izvyr6y/password }}
 ```
 **Dashboard ConfigMaps**: Convert each JSON from `ansible/roles/grafana/files/dashboards/` to ConfigMap with label `grafana_dashboard: "1"`.
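The conversion can be scripted rather than done by hand. The sketch below is one possible approach, not the method actually used: `dashboard_to_configmap` is a hypothetical helper, the `grafana-dashboard-` name prefix and the lowercase/underscore mapping are assumptions, and it expects each JSON file to start at column 0 so the YAML block scalar indents cleanly.

```bash
# dashboard_to_configmap FILE.json
# Print a ConfigMap manifest that embeds one dashboard JSON file and carries
# the grafana_dashboard: "1" label that the sidecar in values.yaml selects on.
dashboard_to_configmap() {
  json_file=$1
  base=$(basename "$json_file" .json)
  # ConfigMap names must be lowercase DNS-style names (assumed mapping).
  name=$(printf '%s' "$base" | tr 'A-Z_' 'a-z-')
  printf 'apiVersion: v1\n'
  printf 'kind: ConfigMap\n'
  printf 'metadata:\n'
  printf '  name: grafana-dashboard-%s\n' "$name"
  printf '  namespace: monitoring\n'
  printf '  labels:\n'
  printf '    grafana_dashboard: "1"\n'
  printf 'data:\n'
  printf '  %s.json: |\n' "$base"
  # Indent the JSON by four spaces so it sits inside the block scalar.
  sed 's/^/    /' "$json_file"
}

# Example:
#   for f in ansible/roles/grafana/files/dashboards/*.json; do
#     dashboard_to_configmap "$f" \
#       > "argocd/manifests/grafana-config/dashboards/configmap-$(basename "$f" .json).yaml"
#   done
```

Each generated file still has to be listed under `resources:` in the `grafana-config` kustomization.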
 ---
 ### 6. Deploy to Kubernetes
 ```bash
 # Create namespace and secret
 ki create namespace monitoring
 op inject -i argocd/manifests/grafana-config/secret-admin.yaml.tpl | ki apply -f -
 # Push changes and sync
 argocd app sync grafana
 argocd app sync grafana-config
 ```
 ---
 ### 7. Tailscale Service Cutover
 Remove `svc:grafana` from `ansible/roles/tailscale_serve/defaults/main.yml`, then:
 ```bash
 mise run provision-indri -- --tags tailscale-serve
 ```
 ---
 ### 8. Stop Brew Grafana
 ```bash
 ssh indri 'brew services stop grafana'
 ```
 ---
 ### 9. Retire Ansible Grafana Role
 Once k8s Grafana is verified working:
 1. **Remove role from playbook** - Delete grafana role entry from `ansible/playbooks/indri.yml`
 2. **Delete the role directory** - `rm -rf ansible/roles/grafana/`
 3. **Update zk documentation** - Note in `~/code/personal/zk/1767747119-YCPO.md` that Grafana is now k8s-hosted
 ---
 ## New Files
 | Path | Purpose |
 |------|---------|
 | `argocd/apps/grafana.yaml` | Grafana Helm chart Application |
 | `argocd/apps/grafana-config.yaml` | Grafana config Application |
 | `argocd/manifests/grafana/values.yaml` | Helm values |
 | `argocd/manifests/grafana-config/kustomization.yaml` | Kustomize config |
 | `argocd/manifests/grafana-config/ingress-tailscale.yaml` | Tailscale Ingress |
 | `argocd/manifests/grafana-config/secret-admin.yaml.tpl` | Admin password template |
 | `argocd/manifests/grafana-config/README.md` | Secrets management notes |
 | `argocd/manifests/grafana-config/dashboards/configmap-*.yaml` | 9 dashboard ConfigMaps |
 ## Modified Files
 | Path | Change |
 |------|--------|
 | `argocd/apps/cloudnative-pg.yaml` | Switch to forge-mirrored chart |
 | `ansible/roles/loki/templates/loki-config.yaml.j2` | Add `http_listen_address: 0.0.0.0` |
 | `ansible/roles/tailscale_serve/defaults/main.yml` | Remove `svc:grafana` |
 | `ansible/playbooks/indri.yml` | Remove grafana role |
 ## Deleted Files
 | Path | Reason |
 |------|--------|
 | `ansible/roles/grafana/` | Replaced by k8s deployment |
 ---
 ## Verification
 - [x] Loki accessible from k8s pods
 - [x] Prometheus accessible from k8s pods
 - [x] Grafana pod running in `monitoring` namespace
 - [x] Grafana Ingress active
 - [x] https://grafana.tail8d86e.ts.net loads
 - [x] All 9 dashboards visible
 - [x] Prometheus datasource queries work
 - [x] Loki datasource queries work
 ---
 ## Rollback
 1. Re-add `svc:grafana` to ansible tailscale_serve
 2. `mise run provision-indri -- --tags tailscale-serve,grafana`
 3. `argocd app delete grafana grafana-config --cascade`
 ---
 ## Implementation Notes
 *Added during implementation for retrospective review*
 ### SSH Credential Management
 **Issue**: Initial plan used HTTPS URLs for forge-mirrored Helm chart repos, but ArgoCD in cluster couldn't resolve `forge.tail8d86e.ts.net` (MagicDNS not available inside cluster).
 **Solution**: Use SSH URLs for all forge repos. Created a **credential template** (`repo-creds-forge`) that matches all repos under `ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/` using URL prefix matching. This allows a single SSH key (added to Forgejo user, not as deploy key) to work for all repos.
 ### SSH Host Key for ArgoCD
 **Issue**: ArgoCD's known_hosts didn't include indri's SSH host key, causing `knownhosts: key is unknown` errors.
 **Solution**: Added `argocd-ssh-known-hosts-cm.yaml` as a kustomize patch to include indri's host key alongside the upstream defaults.
 **Gotcha**: Kustomize patches must **not specify namespace** - the namespace transformation happens *after* patch matching. Our patch had `namespace: argocd` which caused "no matches for Id" errors until removed.
 ### Tailscale Hostname Cutover
 **Issue**: After removing `svc:grafana` from ansible's tailscale_serve config, the k8s Ingress still got a numbered hostname (`grafana-1.tail8d86e.ts.net`).
 **Solution**: The old `svc:grafana` service remained registered in Tailscale admin console even after clearing its serve config. **Manual deletion in Tailscale admin console** was required to free the `grafana` hostname for the k8s Ingress to claim. After deletion, recreating the Ingress picked up the correct hostname.
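A numbered hostname is easy to miss, so a small check after each cutover may be useful. This is a sketch under assumptions: `is_numbered_hostname` is a hypothetical helper, and the `kubectl` query in the comment assumes the Tailscale operator reports the assigned name in the Ingress status.

```bash
# is_numbered_hostname EXPECTED ACTUAL
# Succeeds (exit 0) when ACTUAL looks like "EXPECTED-<digits>.<tailnet>",
# i.e. the proxy received a numbered name because EXPECTED was still taken.
is_numbered_hostname() {
  expected=$1
  actual=$2
  suffix=${actual#"$expected"-}            # strip "EXPECTED-" if present
  [ "$suffix" != "$actual" ] || return 1   # prefix did not match
  number=${suffix%%.*}                     # text before the first dot
  case $number in
    ''|*[!0-9]*) return 1 ;;               # empty or not purely digits
    *) return 0 ;;
  esac
}

# Example:
#   host=$(kubectl get ingress grafana-tailscale -n monitoring \
#     -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
#   is_numbered_hostname grafana "$host" && echo "stale svc:grafana still registered?"
```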
 ### ArgoCD Workflow Decision
 During implementation, we established the pattern for GitOps workflow:
 - **All apps target `main` branch** (not feature branches)
 - Manual sync policy on workload apps = merge doesn't auto-deploy
 - Workflow: feature branch → PR → merge to main → `argocd app sync <name>`
 - For testing: temporarily set one app to feature branch via `argocd app set --revision`
 This avoids the friction of switching `targetRevision` in manifests during development.
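The testing step of this workflow can be wrapped in two small helpers. A minimal sketch for a single-source Application; `test_branch`, `back_to_main`, and the `ARGOCD` override are hypothetical names, and multi-source Applications such as `grafana` may need extra flags to select which source to change.

```bash
# test_branch APP BRANCH
# Point APP at a feature branch in the live cluster and sync it.
# The manifest in git keeps targetRevision: main.
test_branch() {
  "${ARGOCD:-argocd}" app set "$1" --revision "$2" &&
  "${ARGOCD:-argocd}" app sync "$1"
}

# back_to_main APP
# After the PR is merged, return APP to main and sync again.
back_to_main() {
  "${ARGOCD:-argocd}" app set "$1" --revision main &&
  "${ARGOCD:-argocd}" app sync "$1"
}

# Example:
#   test_branch grafana-config feature/new-dashboard
#   back_to_main grafana-config
```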
 ### Bootstrap Dependencies
 Some resources must be applied manually before ArgoCD can manage itself:
 1. **SSH known_hosts** - chicken-and-egg: ArgoCD can't sync the config that adds the host key
 2. **Credential secrets** - `repo-creds-forge` must exist before ArgoCD can pull from forge
 These are documented in `argocd/manifests/argocd/README.md` as bootstrap steps.
 ### Actual Versions Used
 - Grafana Helm chart: `grafana-8.8.2` (tag in grafana-helm-charts repo)
 - CloudNativePG Helm chart: `cloudnative-pg-v0.23.0` (tag in cloudnative-pg-charts repo)
 - Grafana version: 11.4.0
--- a/plans/completed/k8s-migration/P3_postgresql.complete.md
+++ b/plans/completed/k8s-migration/P3_postgresql.complete.md
@ -1,359 +0,0 @@
 # Phase 3: PostgreSQL Disaster Recovery & Backup
 **Goal**: Test disaster recovery and configure borgmatic backups for k8s-pg
 **Status**: Complete (2026-01-19)
 **Prerequisites**: [Phase 2](P2_grafana.complete.md) complete
 ---
 ## Overview
 Phase 3 establishes disaster recovery capabilities for the k8s PostgreSQL cluster:
 1. **Fix borgmatic backup issues** - Resolve `borg: command not found` error
 2. **Test disaster recovery** - Restore miniflux data from borgmatic backup to k8s-pg
 3. **Create borgmatic user** - Read-only backup user in k8s-pg via CloudNativePG
 4. **Configure dual database backup** - Back up both brew PostgreSQL and k8s-pg during migration
 This phase prepares for Phase 4 (miniflux migration) by verifying we can restore data to k8s-pg.
 ---
 ## Key Decisions
 ### Backup Both Databases During Transition
 **Decision**: Configure borgmatic to back up both `localhost:5432/miniflux` (brew) and `k8s-pg.tail8d86e.ts.net:5432/miniflux` (k8s) until the migration is complete.
 **Why**: Provides redundancy during migration. After Phase 4, remove localhost entry.
 ### Reuse Existing borgmatic Password
 **Decision**: Use same borgmatic password from 1Password for k8s-pg user.
 **Why**: Simpler credential management, password already proven secure.
 ### CloudNativePG Managed Roles
 **Decision**: Declare borgmatic user via CloudNativePG `managed.roles` instead of SQL commands.
 **Why**: Declarative, version-controlled, matches eblume user pattern.
 ### Disable selfHeal on apps App
 **Decision**: Remove `selfHeal: true` from `argocd/apps/apps.yaml`.
 **Why**: Allows temporarily pointing child apps to feature branches during development without ArgoCD reverting the change.
 ---
 ## Steps
 ### 1. Fix borgmatic borg path issue
 **Problem**: borgmatic fails with `borg: command not found`
 **Cause**: The LaunchAgent environment doesn't have homebrew in PATH, so the `borg` binary is not found.
 **Solution**: Add `local_path` to borgmatic config template.
 **File**: `ansible/roles/borgmatic/templates/config.yaml.j2`
 ```yaml
 # Path to borg binary (LaunchAgent doesn't have homebrew in PATH)
 local_path: {{ borgmatic_local_path }}
 ```
 **File**: `ansible/roles/borgmatic/defaults/main.yml`
 ```yaml
 borgmatic_local_path: /opt/homebrew/bin/borg
 ```
 ---
 ### 2. Run manual backup to verify fix
 ```bash
 mise run provision-indri -- --tags borgmatic
 ssh indri '/opt/homebrew/bin/borgmatic --verbosity 1'
 ```
 ---
 ### 3. Extract miniflux dump from borgmatic
 ```bash
 ssh indri 'borgmatic list --archive latest'
 ssh indri 'borgmatic restore --archive latest --destination /tmp/restore'
 ```
 ---
 ### 4. Add ACL grant for homelab → k8s
 **Problem**: Connections from indri to k8s-pg were blocked; the Tailscale proxy logs showed "no rules matched"
 **Solution**: Add ACL grant in Pulumi.
 **File**: `pulumi/policy.hujson`
 ```hujson
 // Homelab can reach k8s PostgreSQL for borgmatic backups
 {
  "src": ["tag:homelab"],
  "dst": ["tag:k8s"],
  "ip":  ["tcp:5432"],
 },
 ```
 Deploy: `mise run tailnet-up`
 ---
 ### 5. Restore data to k8s-pg
 ```bash
 # Using eblume superuser credentials from 1Password
 ssh indri "psql 'postgres://eblume@k8s-pg.tail8d86e.ts.net:5432/miniflux' -f /tmp/restore/localhost/miniflux/miniflux"
 ```
 **Verification**:
 ```bash
 psql 'postgres://eblume@k8s-pg.tail8d86e.ts.net:5432/miniflux' -c 'SELECT COUNT(*) FROM users; SELECT COUNT(*) FROM feeds; SELECT COUNT(*) FROM entries;'
 # Result: 2 users, 2 feeds, 44 entries
 ```
 ---
 ### 6. Create borgmatic user in k8s-pg via CloudNativePG
 **File**: `argocd/manifests/databases/secret-borgmatic.yaml.tpl`
 ```yaml
 # Template for borgmatic backup user password
 # Apply with: op inject -i secret-borgmatic.yaml.tpl | kubectl apply -f -
 apiVersion: v1
 kind: Secret
 metadata:
  name: blumeops-pg-borgmatic
  namespace: databases
 type: kubernetes.io/basic-auth
 stringData:
  username: borgmatic
  password: {{ op://vg6xf6vvfmoh5hqjjhlhbeoaie/mw2bv5we7woicjza7hc6s44yvy/db-password }}
 ```
 **File**: `argocd/manifests/databases/blumeops-pg.yaml` (add to managed roles)
 ```yaml
 managed:
  roles:
    # ... existing eblume role ...
    # borgmatic read-only user for backups
    - name: borgmatic
      login: true
      connectionLimit: -1
      ensure: present
      inherit: true
      inRoles:
        - pg_read_all_data
      passwordSecret:
        name: blumeops-pg-borgmatic
 ```
 **Deploy**:
 ```bash
 op inject -i argocd/manifests/databases/secret-borgmatic.yaml.tpl | kubectl apply -f -
 argocd app set blumeops-pg --revision feature/p3-postgresql-borgmatic
 argocd app sync blumeops-pg
 ```
 ---
 ### 7. Configure borgmatic for dual database backup
 **File**: `ansible/roles/borgmatic/defaults/main.yml`
 ```yaml
 borgmatic_postgresql_databases:
  # Brew PostgreSQL on indri (current production)
  - name: miniflux
    hostname: localhost
    port: 5432
    username: borgmatic
  # k8s PostgreSQL (CloudNativePG) - backup both during migration
  - name: miniflux
    hostname: k8s-pg.tail8d86e.ts.net
    port: 5432
    username: borgmatic
 ```
 **File**: `ansible/roles/postgresql/tasks/main.yml` (update .pgpass)
 ```yaml
 - name: Write .pgpass file for borgmatic backups
  ansible.builtin.copy:
    content: |
      # Managed by ansible - only read-only roles
      localhost:{{ postgresql_port }}:*:borgmatic:{{ postgresql_user_passwords['borgmatic'] }}
      k8s-pg.tail8d86e.ts.net:5432:*:borgmatic:{{ postgresql_user_passwords['borgmatic'] }}
    dest: ~/.pgpass
    mode: '0600'
  no_log: true
 ```
 ---
 ### 8. Verify complete backup pipeline
 ```bash
 mise run provision-indri -- --tags borgmatic,postgresql
 ssh indri '/opt/homebrew/bin/borgmatic --verbosity 1'
 ssh indri 'borgmatic list --archive latest'
 ```
 **Expected output**: Archive contains both dumps:
 - `localhost/miniflux/miniflux`
 - `k8s-pg.tail8d86e.ts.net/miniflux/miniflux`
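 The check for both dumps can be scripted rather than read by eye. A minimal sketch, assuming the archive listing is piped in on stdin and contains the dump paths as plain substrings (the exact listing format is not shown in this plan); `archive_has_dumps` is a hypothetical helper name:
 ```bash
 # Hypothetical helper: succeed only if every expected path appears
 # somewhere in the listing read from stdin.
 archive_has_dumps() {
   local listing missing=0 path
   listing=$(cat)
   for path in "$@"; do
     if ! grep -qF -- "$path" <<<"$listing"; then
       echo "missing dump: $path" >&2
       missing=1
     fi
   done
   return "$missing"
 }
 ```
 It could be used as `ssh indri 'borgmatic list --archive latest' | archive_has_dumps localhost/miniflux/miniflux k8s-pg.tail8d86e.ts.net/miniflux/miniflux`.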
 ---
 ### 9. Fix ArgoCD drift from CNPG defaults
 **Problem**: ArgoCD showed blumeops-pg as OutOfSync due to CNPG operator adding default values.
 **Solution**: Add CNPG defaults explicitly to managed roles.
 **File**: `argocd/manifests/databases/blumeops-pg.yaml`
 ```yaml
 managed:
  roles:
    - name: eblume
      # ... existing fields ...
      connectionLimit: -1
      ensure: present
      inherit: true
    - name: borgmatic
      # ... existing fields ...
      connectionLimit: -1
      ensure: present
      inherit: true
 ```
 ---
 ### 10. Update zk documentation
 Updated:
 - `~/code/personal/zk/borgmatic.md` - k8s-pg backup documentation and log entry
 - `~/code/personal/zk/postgresql.md` - k8s PostgreSQL section and log entry
 ---
 ## New Files
 | Path | Purpose |
 |------|---------|
 | `argocd/manifests/databases/secret-borgmatic.yaml.tpl` | borgmatic user password template |
 ## Modified Files
 | Path | Change |
 |------|--------|
 | `ansible/roles/borgmatic/defaults/main.yml` | Added `borgmatic_local_path`, k8s-pg database entry |
 | `ansible/roles/borgmatic/templates/config.yaml.j2` | Added `local_path` option |
 | `ansible/roles/postgresql/tasks/main.yml` | Added k8s-pg to .pgpass |
 | `argocd/apps/apps.yaml` | Disabled selfHeal |
 | `argocd/manifests/databases/blumeops-pg.yaml` | Added borgmatic managed role, CNPG defaults |
 | `pulumi/policy.hujson` | Added ACL grant homelab → k8s on tcp:5432 |
 ---
 ## Verification
 - [x] borgmatic backup runs successfully
 - [x] Miniflux data restored to k8s-pg (2 users, 2 feeds, 44 entries)
 - [x] borgmatic user created in k8s-pg with pg_read_all_data role
 - [x] Both localhost and k8s-pg databases in backup archive
 - [x] ArgoCD shows blumeops-pg as Synced
 - [x] zk documentation updated
 ---
 ## Rollback
 Keep brew PostgreSQL running until Phase 4 verified. To revert:
 1. Remove k8s-pg entry from borgmatic databases
 2. Remove k8s-pg from .pgpass
 3. `mise run provision-indri -- --tags borgmatic,postgresql`
 ---
 ## Implementation Notes
 *Added during implementation for retrospective review*
 ### borgmatic LaunchAgent PATH Issue
 **Problem**: borgmatic LaunchAgent failed with `borg: command not found`
 **Root cause**: LaunchAgents run with minimal PATH that doesn't include `/opt/homebrew/bin`
 **Solution**: Added `local_path: /opt/homebrew/bin/borg` to borgmatic config. This was already done for `pg_dump_command` but not for borg itself.
 **Lesson**: Any tool invoked by borgmatic needs an absolute path when borgmatic runs from a LaunchAgent.
 ### 1Password Field Name Mismatch
 **Issue**: Initial secret template used `password` field but 1Password item had `db-password`.
 **Discovery**: The error message from `op inject` indicated that the field was not found.
 **Fix**: Updated template to use correct field name `db-password`.
 ### ACL Grant Discovery
 **Problem**: Connection from indri (tag:homelab) to k8s-pg (tag:k8s) failed.
 **Diagnosis**: Checked the Tailscale operator proxy logs, which showed "no rules matched", a clear indication of a missing ACL.
 **Solution**: Added explicit grant in `pulumi/policy.hujson` for `tag:homelab` → `tag:k8s` on `tcp:5432`.
 ### ArgoCD selfHeal and Feature Branch Development
 **Problem**: When testing changes, temporarily pointed blumeops-pg app to feature branch via `argocd app set --revision`. ArgoCD's selfHeal kept reverting it back to main.
 **Discussion**: Two options considered:
 - Option A: Disable selfHeal on apps app (manual sync required for new apps)
 - Option B: Keep selfHeal, use different workflow
 **Decision**: Option A chosen. The apps app now only has `prune: true`, not selfHeal. This allows:
 1. Temporarily testing feature branches
 2. Manual control over when app manifest changes are applied
 **Trade-off**: Must manually sync apps app when adding/removing Application manifests.
 ### CloudNativePG Managed Role Reconciliation
 **Issue**: After creating borgmatic secret with correct password, CNPG didn't immediately update the user.
 **Solution**: Annotated the Cluster to trigger reconciliation:
 ```bash
 kubectl annotate cluster blumeops-pg -n databases cnpg.io/reconcile=$(date +%s) --overwrite
 ```
 ### ArgoCD Drift from CNPG Defaults
 **Problem**: blumeops-pg showed OutOfSync despite successful syncs.
 **Cause**: CNPG operator adds default values (`connectionLimit: -1`, `ensure: present`, `inherit: true`) to managed roles that weren't in our spec.
 **Solution**: Added these defaults explicitly to our spec to match what CNPG generates.
 **Comment added**: Documented in blumeops-pg.yaml that these are "CNPG defaults added to prevent ArgoCD drift".
 ### Git Workflow for Phase 3
 1. Created feature branch: `feature/p3-postgresql-borgmatic`
 2. Made commits throughout implementation
 3. Pointed blumeops-pg app to feature branch for testing
 4. Created PR #32 for review
 5. After merge, reset app to main: `argocd app set blumeops-pg --revision main`
 This workflow was enabled by disabling selfHeal (see above).
--- a/plans/completed/k8s-migration/P4_miniflux.complete.md
+++ b/plans/completed/k8s-migration/P4_miniflux.complete.md
@ -1,162 +0,0 @@
 # Phase 4: Miniflux Migration to Kubernetes
 **Goal**: Migrate Miniflux entirely off indri and onto k8s, retire brew PostgreSQL, rename k8s-pg to pg
 **Status**: Complete (2026-01-20)
 **Prerequisites**: [Phase 3](P3_postgresql.complete.md) complete
 ---
 ## Overview
 This phase completed the miniflux migration and retired brew PostgreSQL:
 1. Deployed miniflux container in k8s via ArgoCD
 2. Exposed via Tailscale Ingress at `feed.tail8d86e.ts.net`
 3. Removed all miniflux infrastructure from indri (ansible role, brew service, Tailscale serve)
 4. Retired brew PostgreSQL (no longer needed)
 5. Renamed k8s-pg to pg (canonical Tailscale hostname)
 6. Updated borgmatic to back up only `pg.tail8d86e.ts.net`
 7. Updated all zk documentation
 ---
 ## New Files
 | Path | Purpose |
 |------|---------|
 | `argocd/apps/miniflux.yaml` | ArgoCD Application definition |
 | `argocd/manifests/miniflux/deployment.yaml` | Miniflux Deployment |
 | `argocd/manifests/miniflux/service.yaml` | ClusterIP Service |
 | `argocd/manifests/miniflux/ingress-tailscale.yaml` | Tailscale Ingress for `feed.tail8d86e.ts.net` |
 | `argocd/manifests/miniflux/secret-db.yaml.tpl` | Database URL secret documentation |
 | `argocd/manifests/miniflux/kustomization.yaml` | Kustomize configuration |
 | `argocd/manifests/miniflux/README.md` | Setup instructions |
 ## Modified Files
 | Path | Change |
 |------|--------|
 | `ansible/playbooks/indri.yml` | Removed miniflux and postgresql roles, simplified pre_tasks |
 | `ansible/roles/tailscale_serve/defaults/main.yml` | Removed `svc:feed` and `svc:pg` entries |
 | `ansible/roles/alloy/defaults/main.yml` | Removed miniflux and postgresql logs, disabled postgres metrics |
 | `ansible/roles/borgmatic/defaults/main.yml` | Updated to back up only `pg.tail8d86e.ts.net` |
 | `ansible/roles/borgmatic/tasks/main.yml` | Added .pgpass file management |
 | `argocd/manifests/databases/service-tailscale.yaml` | Renamed hostname from k8s-pg to pg |
 ## Deleted Files
 | Path | Reason |
 |------|--------|
 | `ansible/roles/miniflux/` | Entire role no longer needed |
 | `ansible/roles/postgresql/` | Brew PostgreSQL no longer needed |
 ---
 ## Verification
 - [x] Miniflux pod healthy in k8s
 - [x] https://feed.tail8d86e.ts.net accessible
 - [x] User `eblume` can log in
 - [x] Feeds visible and entries readable
 - [x] `pg.tail8d86e.ts.net` resolves to k8s PostgreSQL
 - [x] Old `k8s-pg` and `feed` devices removed from Tailscale
 - [x] brew miniflux and postgresql services stopped
 - [x] Tailscale serve entries cleared from indri
 - [x] zk documentation updated
 ---
 ## Implementation Notes
 *Lessons learned and issues encountered*
 ### CNPG-Generated Password vs 1Password
 **Problem**: Initial secret template used 1Password for miniflux database password, but CNPG auto-generates the bootstrap owner password.
 **Solution**: Reference the CNPG-generated password from `blumeops-pg-app` secret:
 ```bash
 kubectl create secret generic miniflux-db -n miniflux \
  --from-literal=url="$(kubectl -n databases get secret blumeops-pg-app -o jsonpath='{.data.uri}' | base64 -d)"
 ```
 ### Table Ownership Issue After P3 Restore
 **Problem**: Miniflux pod crashed with "permission denied for table schema_version".
 **Root cause**: P3 restore was run as the `eblume` superuser, so all tables were created owned by `eblume`, not `miniflux`.
 **Solution**: Transfer ownership of all tables to miniflux:
 ```sql
 DO $$
 DECLARE r RECORD;
 BEGIN
    FOR r IN (SELECT tablename FROM pg_tables WHERE schemaname = 'public') LOOP
        EXECUTE 'ALTER TABLE public.' || quote_ident(r.tablename) || ' OWNER TO miniflux';
    END LOOP;
 END$$;
 ```
 ### Tailscale Ingress Hostname Suffix
 **Behavior**: When requesting a Tailscale hostname that's already taken, the operator adds a suffix (e.g., `feed-1`).
 **Workflow**:
 1. Deploy initially - gets `feed-1.tail8d86e.ts.net`
 2. Clear old `svc:feed` from indri
 3. Delete old `feed` device from Tailscale admin
 4. Delete and recreate the Ingress - now claims `feed`
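 Step 1 of this workflow is easy to miss, because the Ingress still comes up healthy under the suffixed name. A minimal sketch of a check, assuming the suffix always takes the numeric `-N` form seen in the `feed-1` example; `hostname_was_suffixed` is a hypothetical helper name:
 ```bash
 # Hypothetical helper: succeed if the assigned FQDN's first label is
 # the requested name plus a numeric suffix (e.g. feed -> feed-1).
 hostname_was_suffixed() {
   local requested=$1 fqdn=$2
   local label=${fqdn%%.*}   # first DNS label, e.g. "feed-1"
   [[ $label =~ ^"$requested"-[0-9]+$ ]]
 }
 ```
 For example, `hostname_was_suffixed feed feed-1.tail8d86e.ts.net` succeeds, while `hostname_was_suffixed feed feed.tail8d86e.ts.net` fails.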
 ### Renaming Tailscale Service Hostname
 **Problem**: Changing the `tailscale.com/hostname` annotation doesn't automatically update the Tailscale device.
 **Solution**: Delete the service and let ArgoCD recreate it:
 ```bash
 kubectl -n databases delete service blumeops-pg-tailscale
 argocd app sync blumeops-pg
 ```
 ### .pgpass Management Migration
 **Issue**: The postgresql role managed `~/.pgpass` for borgmatic. With postgresql role deleted, borgmatic couldn't authenticate.
 **Solution**: Moved .pgpass management to the borgmatic role. Password is still fetched in playbook pre_tasks as `borgmatic_db_password`.
 ### Ansible Check Mode and Registered Variables
 **Problem**: Running `provision-indri --check --diff` failed in the podman role with "Conditional result (True) was derived from value of type 'str'" errors.
 **Root cause**: Command tasks are skipped in check mode, leaving registered variables undefined or with unexpected types when used in conditionals.
 **Solution**: Added `check_mode: false` to read-only command tasks that gather information:
 ```yaml
 - name: Check if podman machine exists
  ansible.builtin.command:
    cmd: podman machine list --format json
  register: podman_machine_list
  changed_when: false
  check_mode: false  # Safe to run in check mode - read-only
 ```
 **Lesson**: Any task that registers a variable used in conditionals should have `check_mode: false` if the command is read-only/safe.
 ### 1Password CLI on Headless Hosts
 **Issue**: Attempted to run `op` commands on indri, but 1Password CLI requires interactive authentication (biometrics/password).
 **Solution**: All `op` commands must be in `pre_tasks` of the playbook with `delegate_to: localhost` so they run on gilbert (the workstation with GUI auth).
 ### Git Workflow for Phase 4
 1. Created feature branch: `feature/p4-miniflux`
 2. Made incremental commits throughout implementation
 3. Pointed `miniflux` and `blumeops-pg` apps to feature branch for testing
 4. Created PR #33 for review
 5. After merge, reset apps to main:
   ```bash
   argocd app set miniflux --revision main
   argocd app set blumeops-pg --revision main
   argocd app sync apps
   ```
--- a/plans/completed/k8s-migration/P5.1_docker_migration.complete.md
+++ b/plans/completed/k8s-migration/P5.1_docker_migration.complete.md
@ -1,208 +0,0 @@
 # Phase 5.1: Migrate Minikube from QEMU2 to Docker Driver
 **Goal**: Replace the qemu2 driver with docker to fix remote API access and simplify volume mounts
 **Status**: Complete (2026-01-21) - Cluster running, ArgoCD deployed, apps synced
 **Prerequisites**: [Phase 5](P5_devpi.complete.md) complete
 ---
 ## Background
 ### Original Problem (Podman → QEMU2)
 During Phase 6 (Kiwix/Transmission migration), we discovered that the **podman driver has fundamental limitations** that prevent mounting external volumes:
 1. **SMB CSI driver fails** with "Operation not permitted" - the rootless container lacks kernel-level mount capabilities
 2. **`minikube mount` fails** - 9p mount gets "permission denied" inside the podman VM
 3. **hostPath volumes** only work for paths inside the minikube container, not the macOS host
 We migrated to QEMU2 to get a full VM with kernel capabilities.
 ### New Problem (QEMU2 → Docker)
 The QEMU2 driver introduced a **new problem**: the Kubernetes API server is inside the VM at `192.168.105.2:6443`, and Tailscale's TCP proxy cannot forward to it properly:
 - TCP connections succeed (nc -zv works)
 - TLS handshake times out
 - Root cause unknown, but likely related to Tailscale serve's handling of non-localhost upstreams
 Additionally, the volume mount solution with QEMU2 was complex:
 - Required NFS mount from sifaka → indri
 - Then `minikube mount` to pass through to VM
 - Two LaunchAgents/LaunchDaemons for persistence
 - macOS GUI approval required for network access
 ### Why Docker?
 The **docker driver** solves both problems:
 1. **API Server on localhost**: Docker Desktop handles port forwarding from container to localhost automatically, so `tailscale serve --tcp=443 tcp://localhost:PORT` works
 2. **Simpler volume mounts**: Docker Desktop has built-in macOS file sharing. Paths shared with Docker are accessible inside containers.
 3. **Official Tailscale recommendation**: Tailscale's own [Kubernetes guide](https://tailscale.com/learn/managing-access-to-kubernetes-with-tailscale) uses minikube with the docker driver.
 ---
 ## Implementation Summary
 ### Infrastructure Changes
 1. **Docker Desktop installed** (manual via `brew install --cask docker`)
   - Configured with 12GB memory in Docker Desktop settings
   - Kubernetes option disabled (using minikube instead)
 2. **Docker minikube cluster created**:
   ```bash
   minikube start \
     --driver=docker \
     --container-runtime=docker \
     --cpus=6 \
     --memory=11264 \
     --disk-size=200g \
     --apiserver-names=k8s.tail8d86e.ts.net,indri \
     --apiserver-port=6443 \
     --listen-address=0.0.0.0
   ```
 3. **Tailscale serve configured** for k8s API:
   - API server on localhost (port is dynamic with docker driver)
   - `tailscale serve --service=svc:k8s --tcp=443 tcp://localhost:<PORT>`
 4. **Remote kubectl access working** from gilbert:
   - Created `mise-tasks/ensure-minikube-indri-kubectl-config` script
   - Fetches certs from indri and sets up `~/.kube/minikube-indri/config.yml`
 ### Ansible Roles Updated
 - `ansible/roles/minikube/` - docker driver, removed qemu2/NFS/socket_vmnet
 - `ansible/roles/tailscale_serve/` - removed svc:k8s (minikube role handles dynamic port)
 - Containerd registry mirrors configured for zot pull-through cache
 ### ArgoCD Bootstrap
 All apps deployed and synced from `feature/p5.1-qemu2-migration` branch:
 | App | Status | Notes |
 |-----|--------|-------|
 | tailscale-operator | Healthy | Manages Tailscale ingresses |
 | argocd | Healthy | Self-managed |
 | cloudnative-pg | Healthy | PostgreSQL operator |
 | blumeops-pg | Progressing | PostgreSQL cluster starting |
 | grafana | Progressing | Needs grafana-admin secret |
 | grafana-config | Healthy | Dashboards and ingress |
 | miniflux | Progressing | Needs miniflux-config secret |
 | devpi | Progressing | Starting up |
 ### Secrets Still Needed
 After PR merge, apply these secrets manually:
 ```bash
 # Grafana admin password
 op inject -i argocd/manifests/grafana-config/secret-admin.yaml.tpl | kubectl --context=minikube-indri apply -f -
 # Miniflux config
 op inject -i argocd/manifests/miniflux/secret.yaml.tpl | kubectl --context=minikube-indri apply -f -
 ```
 ---
 ## Technical Notes
 ### API Server Port
 With docker driver, the API server port is **dynamic** - Docker maps a random host port to 6443 inside the container.
 The minikube ansible role queries the port after cluster start and configures tailscale serve accordingly.
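 How the role queries the port is not shown in this plan. One plausible approach, sketched here as an assumption rather than the role's actual logic, is to read the `server` URL from the kubeconfig on indri and take the port from it. `port_from_server_url` is a hypothetical helper name:
 ```bash
 # Hypothetical helper: print the port of a URL such as
 # https://127.0.0.1:32771, or fail if no numeric port is present.
 port_from_server_url() {
   local url=$1
   local hostport=${url#*://}   # drop the scheme
   hostport=${hostport%%/*}     # drop any path
   local port=${hostport##*:}
   if [[ $hostport == *:* && $port =~ ^[0-9]+$ ]]; then
     printf '%s\n' "$port"
   else
     return 1
   fi
 }
 # Assumed usage on indri (not taken from the role):
 #   url=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
 #   tailscale serve --service=svc:k8s --tcp=443 "tcp://localhost:$(port_from_server_url "$url")"
 ```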
 ### Registry Mirror Configuration
 Containerd uses `/etc/containerd/certs.d/<registry>/hosts.toml` files. The ansible role configures mirrors for:
 - `registry.tail8d86e.ts.net` (private images)
 - `docker.io`
 - `ghcr.io`
 - `quay.io`
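 Each of these registries gets its own `hosts.toml`, which names the upstream `server` and one or more `[host."..."]` mirror entries. A minimal sketch of rendering one file, assuming a single pull-through mirror; `render_hosts_toml` is a hypothetical helper and the mirror URL in the usage note is a placeholder, not the real zot address:
 ```bash
 # Hypothetical helper: print a containerd hosts.toml that sends pulls
 # for UPSTREAM through MIRROR first.
 # Usage: render_hosts_toml UPSTREAM_URL MIRROR_URL
 render_hosts_toml() {
   local upstream=$1 mirror=$2
   printf 'server = "%s"\n\n[host."%s"]\n  capabilities = ["pull", "resolve"]\n' \
     "$upstream" "$mirror"
 }
 ```
 For example, `render_hosts_toml https://ghcr.io http://mirror.example:5050` would produce the content for `/etc/containerd/certs.d/ghcr.io/hosts.toml`.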
 ### ProxyClass Renamed
 Changed from `crio-compat` to `default` - the old name was misleading since we're no longer using CRI-O.
 ### Volume Mounts for P6 (Kiwix/Transmission)
 **Solution: Direct NFS from pods to sifaka** ✅ TESTED AND WORKING
 Docker NATs outbound traffic through indri's LAN IP (192.168.1.50), so sifaka's NFS exports need to allow `192.168.1.0/24`.
 Sifaka NFS exports configured:
 - `192.168.1.0/24` - Docker containers via indri NAT
 - `100.64.0.0/10` - Tailscale clients
 Pods can mount NFS directly:
 ```yaml
 volumes:
  - name: torrents
    nfs:
      server: sifaka
      path: /volume1/torrents
 ```
 No LaunchAgents, no `minikube mount`, no SMB CSI driver needed.
 ---
 ## Verification Checklist
 - [x] Docker Desktop installed and running on indri
 - [x] QEMU2 minikube deleted
 - [x] Docker minikube running (6 CPUs, 11GB RAM)
 - [x] API server accessible on localhost
 - [x] Tailscale serve configured for svc:k8s
 - [x] Remote kubectl access working from gilbert
 - [x] Ansible roles updated for docker driver
 - [x] socket_vmnet stopped
 - [x] ArgoCD deployed and synced
 - [x] All apps synced to feature branch
 - [x] Apply app secrets (grafana-admin, miniflux-db, devpi-root, eblume, borgmatic)
 - [x] Verify all apps healthy after secrets applied
 - [x] Miniflux database restored from borgmatic backup
 - [ ] Merge PR and reset apps to main branch
 - [ ] `mise run indri-services-check` passes
 ---
 ## Post-Merge Steps
 After PR is merged:
 ```bash
 # Reset all blumeops apps to main branch
 argocd app set apps --revision main
 argocd app set argocd --revision main
 argocd app set blumeops-pg --revision main
 argocd app set devpi --revision main
 argocd app set grafana-config --revision main
 argocd app set miniflux --revision main
 argocd app set tailscale-operator --revision main
 # Sync all apps
 argocd app sync apps
 argocd app sync argocd
 argocd app sync tailscale-operator
 argocd app sync blumeops-pg
 argocd app sync grafana-config
 argocd app sync miniflux
 argocd app sync devpi
 ```
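 The list above repeats the same two commands per app, so it can also be generated. A minimal sketch that only prints the commands so they can be reviewed before running; `argocd_reset_commands` is a hypothetical helper name, and the app names are passed in rather than discovered:
 ```bash
 # Hypothetical helper: print (do not run) the argocd commands that
 # point each app at REVISION and then sync it.
 # Usage: argocd_reset_commands REVISION APP...
 argocd_reset_commands() {
   local revision=$1 app
   shift
   for app in "$@"; do
     printf 'argocd app set %s --revision %s\n' "$app" "$revision"
   done
   for app in "$@"; do
     printf 'argocd app sync %s\n' "$app"
   done
 }
 ```
 After reviewing the output, it can be piped to a shell, e.g. `argocd_reset_commands main apps argocd devpi | sh`.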
 ---
 ## Rollback Plan
 If the Docker driver doesn't work:
 1. Delete Docker minikube: `minikube delete`
 2. Recreate QEMU2 cluster (restore old ansible config from git)
 3. Accept the Tailscale TCP forwarding limitation and use SSH tunnel for remote kubectl
--- a/plans/completed/k8s-migration/P5_devpi.complete.md
+++ b/plans/completed/k8s-migration/P5_devpi.complete.md
@ -1,102 +0,0 @@
 # Phase 5: devpi Migration to Kubernetes
 **Goal**: Migrate devpi PyPI caching proxy from indri to k8s
 **Status**: Complete (2026-01-20)
 **Prerequisites**: [Phase 4](P4_miniflux.complete.md) complete
 ---
 ## Summary
 Successfully migrated devpi from mcquack LaunchAgent on indri to Kubernetes:
 - Custom container image with devpi-server + devpi-web + auto-init startup script
 - StatefulSet with 50Gi PVC for data persistence
 - Tailscale Ingress at `pypi.tail8d86e.ts.net`
 - Root password from 1Password secret, auto-initialized on first run
 - Verified pip caching proxy and mcquack package upload
 ---
 ## Key Learnings
 ### Registry Mirror Configuration
 - Minikube's CRI-O can't resolve Tailscale hostnames directly
 - Added registry mirror config to redirect `registry.tail8d86e.ts.net` → `host.containers.internal:5050`
 - Also added direct insecure registry entry for `host.containers.internal:5050`
 - Config in `ansible/roles/minikube/files/zot-mirror.conf`
 ### Memory Requirements
 - devpi-web's Whoosh search indexer needs significant memory during PyPI index build
 - Initial 512Mi limit caused OOMKills
 - Solution: High limit (2Gi) with low request (256Mi) - memory reclaimed after indexing
 ### Environment Variable Conflicts
 - Kubernetes auto-sets `DEVPI_PORT` for service discovery
 - Conflicted with our port config - renamed to `DEVPI_LISTEN_PORT`
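 The conflict arises because Kubernetes injects Docker-links-style variables for every Service visible to a pod: the Service name is upper-cased, dashes become underscores, and suffixes such as `_SERVICE_HOST`, `_SERVICE_PORT` and `_PORT` are appended. A Service named `devpi` therefore yields `DEVPI_PORT`, set to a `tcp://...` URL rather than a bare port number. A minimal sketch for predicting the names to avoid (the helper names are hypothetical, and only three of the injected variables are listed):
 ```bash
 # Hypothetical helpers: derive the env var prefix Kubernetes uses
 # for a Service name, and list a few variables built from it.
 k8s_service_env_prefix() {
   printf '%s' "$1" | tr 'a-z-' 'A-Z_'
   printf '\n'
 }
 k8s_service_env_names() {
   local prefix
   prefix=$(k8s_service_env_prefix "$1")
   printf '%s_SERVICE_HOST\n%s_SERVICE_PORT\n%s_PORT\n' \
     "$prefix" "$prefix" "$prefix"
 }
 ```
 Checking an app's own configuration variables against `k8s_service_env_names <service>` before deploying would flag a clash like this one.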
 ### Tailscale Serve Cleanup
 - Use `tailscale serve status --json` to see entries (non-JSON output can be empty)
 - Use `tailscale serve clear svc:<name>` to remove entries
 ### ArgoCD Workflow
 - Changed `apps` to manual sync (was auto-sync with prune)
 - Workflow: sync apps → set revision to feature branch → sync service → test → reset to main after merge
 ---
 ## Verification Checklist
 - [x] devpi pod healthy in k8s
 - [x] https://pypi.tail8d86e.ts.net accessible
 - [x] Web interface shows root/pypi index
 - [x] `pip install <package>` works through proxy
 - [x] mcquack v1.0.0 uploaded to eblume/dev
 - [x] `pip install --index-url https://pypi.tail8d86e.ts.net/eblume/dev/+simple/ mcquack` works
 - [x] Old devpi service removed from indri
 - [x] zk documentation updated
 ---
 ## Files Changed
 ### New Files
 | Path | Purpose |
 |------|---------|
 | `argocd/apps/devpi.yaml` | ArgoCD Application definition |
 | `argocd/manifests/devpi/Dockerfile` | Container image with startup script |
 | `argocd/manifests/devpi/start.sh` | Auto-init startup script |
 | `argocd/manifests/devpi/statefulset.yaml` | StatefulSet with PVC |
 | `argocd/manifests/devpi/service.yaml` | ClusterIP Service |
 | `argocd/manifests/devpi/ingress-tailscale.yaml` | Tailscale Ingress |
 | `argocd/manifests/devpi/kustomization.yaml` | Kustomize configuration |
 | `argocd/manifests/devpi/secret-root.yaml.tpl` | 1Password secret template |
 | `argocd/manifests/devpi/README.md` | Setup documentation |
 ### Modified Files
 | Path | Change |
 |------|--------|
 | `CLAUDE.md` | Added k8s/ArgoCD workflow documentation |
 | `ansible/playbooks/indri.yml` | Removed devpi and devpi_metrics roles |
 | `ansible/roles/tailscale_serve/defaults/main.yml` | Removed svc:pypi |
 | `ansible/roles/alloy/defaults/main.yml` | Removed devpi log collection |
 | `ansible/roles/borgmatic/defaults/main.yml` | Removed devpi backup paths |
 | `ansible/roles/minikube/files/zot-mirror.conf` | Added registry mirror for Tailscale hostname |
 | `argocd/apps/apps.yaml` | Changed to manual sync policy |
 ### Roles Kept (not deleted)
 - `ansible/roles/devpi/` - Kept for reference
 - `ansible/roles/devpi_metrics/` - Kept for reference
 ---
 ## Post-Merge Cleanup
 After PR merge, reset ArgoCD apps to main:
 ```fish
 argocd app set apps --revision main
 argocd app sync apps
 argocd app set devpi --revision main
 argocd app sync devpi
 ```
--- a/plans/completed/k8s-migration/P6_kiwix.complete.md
+++ b/plans/completed/k8s-migration/P6_kiwix.complete.md
--- a/plans/completed/k8s-migration/P7_forgejo.md
+++ b/plans/completed/k8s-migration/P7_forgejo.md
@ -1,394 +0,0 @@
 # Phase 7: Forgejo Migration to Kubernetes
 **Goal**: Migrate Forgejo from indri (macOS Homebrew) to Kubernetes via ArgoCD
 **Status**: Planning (2026-01-21)
 **Prerequisites**: [Phase 6](P6_kiwix.complete.md) complete
 ---
 ## Critical Risks & Mitigations
 ### 1. Circular Dependency (Highest Risk)
 ArgoCD pulls manifests from Forgejo. If k8s Forgejo fails, we cannot redeploy it.
 **Mitigation**: blumeops is mirrored to `github.com/eblume/blumeops`. DR procedure documented to switch ArgoCD to GitHub temporarily (see Disaster Recovery section).
 ### 2. Split Hostnames Required
 The Tailscale k8s operator [cannot expose both HTTPS and TCP/SSH on the same hostname](https://github.com/tailscale/tailscale/issues/15539). See also [user comment](https://github.com/tailscale/tailscale/issues/15539#issuecomment-3782368432).
 **Solution**:
 - **HTTPS (web UI)**: `forge.tail8d86e.ts.net` via Tailscale Ingress
 - **SSH (git operations)**: `git.tail8d86e.ts.net` via Tailscale LoadBalancer
 ---
 ## Current State
 ### Forgejo on indri
 | Component | Location/Details |
 |-----------|------------------|
 | Data directory | `/opt/homebrew/var/forgejo/` (~426MB) |
 | SQLite database | `/opt/homebrew/var/forgejo/data/forgejo.db` (4.1MB) |
 | Git repositories | `/opt/homebrew/var/forgejo/data/forgejo-repositories/` (~418MB) |
 | Configuration | `/opt/homebrew/var/forgejo/custom/conf/app.ini` (contains secrets) |
 | HTTP port | 3001 (localhost) |
 | SSH port | 2200 (localhost) |
 | Tailscale | `svc:forge` with tcp:22→2200 and https:443→3001 |
 | Backup | borgmatic backs up to sifaka |
 ### Hosted Repositories (8 total)
 - blumeops (mirrored to GitHub)
 - cloudnative-pg-charts
 - csi-driver-smb
 - devpi
 - dotfiles
 - grafana-helm-charts
 - mcquack
 - zot
 ---
 ## Architecture Decision: Helm Chart via ArgoCD
 Following established pattern from cloudnative-pg and grafana:
 1. Mirror `https://code.forgejo.org/forgejo-helm/forgejo-helm` to forge
 2. ArgoCD Application with multi-source (chart + values)
 3. Values file in `argocd/manifests/forgejo/values.yaml`
 ---
 ## All `forge` References Requiring Update
 ### SSH URLs (change to `git.tail8d86e.ts.net:22`)
 | File | Current | After |
 |------|---------|-------|
 | `argocd/apps/apps.yaml` | `ssh://forgejo@indri.tail8d86e.ts.net:2200/...` | `ssh://forgejo@git.tail8d86e.ts.net/...` |
 | `argocd/apps/argocd.yaml` | same | same |
 | `argocd/apps/blumeops-pg.yaml` | same | same |
 | `argocd/apps/cloudnative-pg.yaml` | same | same |
 | `argocd/apps/devpi.yaml` | same | same |
 | `argocd/apps/grafana.yaml` | same | same |
 | `argocd/apps/grafana-config.yaml` | same | same |
 | `argocd/apps/kiwix.yaml` | same | same |
 | `argocd/apps/miniflux.yaml` | same | same |
 | `argocd/apps/tailscale-operator.yaml` | same | same |
 | `argocd/apps/torrent.yaml` | same | same |
 | `argocd/manifests/argocd/repo-forge-secret.yaml.tpl` | `ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/` | `ssh://forgejo@git.tail8d86e.ts.net/eblume/` |
 | `ansible/group_vars/all.yml` | `ssh://forgejo@forge.tail8d86e.ts.net/...` | `ssh://forgejo@git.tail8d86e.ts.net/...` |
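 Every row in this table is the same textual substitution, so it can be applied mechanically. A minimal sketch as a stdin-to-stdout filter, assuming the only old forms are the two shown in the table (`indri...:2200` and `forge...`); `rewrite_forge_ssh_url` is a hypothetical helper name:
 ```bash
 # Hypothetical helper: rewrite old Forgejo SSH URLs to the new
 # git.tail8d86e.ts.net host. HTTPS URLs are left untouched because
 # the pattern only matches the ssh://forgejo@ prefix.
 rewrite_forge_ssh_url() {
   sed -E 's#ssh://forgejo@(indri\.tail8d86e\.ts\.net:2200|forge\.tail8d86e\.ts\.net)/#ssh://forgejo@git.tail8d86e.ts.net/#g'
 }
 ```
 For example, `rewrite_forge_ssh_url < argocd/apps/devpi.yaml` prints the updated manifest for review without modifying the file.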
 ### SSH Known Hosts (add `git.tail8d86e.ts.net`)
 | File | Change |
 |------|--------|
 | `argocd/manifests/argocd/argocd-ssh-known-hosts-cm.yaml` | Add `git.tail8d86e.ts.net ssh-ed25519 AAAA...` |
 ### HTTPS URLs (stay as `forge.tail8d86e.ts.net`)
 These remain unchanged:
 - `CLAUDE.md:135` - Mirror location
 - `mise-tasks/pr-comments:23` - Forge API base
 - `mise-tasks/indri-services-check:65` - HTTP health check (update to check k8s)
 ### Ansible/Indri Cleanup (remove after migration)
 | File | Action |
 |------|--------|
 | `ansible/playbooks/indri.yml:36-37` | Remove forgejo role |
 | `ansible/roles/tailscale_serve/defaults/main.yml:6` | Remove `svc:forge` entry |
 | `ansible/roles/alloy/defaults/main.yml:31-32` | Remove forgejo log collection |
 | `ansible/roles/borgmatic/defaults/main.yml:17` | Update backup path |
 ### Tailscale/Pulumi (update after hostname cutover)
 | File | Change |
 |------|--------|
 | `argocd/manifests/tailscale-operator/egress-forge.yaml` | Delete (no longer needed) |
 | `pulumi/policy.hujson` | Update `tag:forge` ACLs for k8s source |
 ---
 ## Pre-Migration Checklist
 - [ ] GitHub mirror verified current
 - [ ] Full borgmatic backup completed and verified
 - [ ] Manual backup of `/opt/homebrew/var/forgejo` on indri
 - [ ] Document all SSH deploy keys and webhooks
 - [ ] **User action**: Mirror forgejo-helm chart to forge
 - [ ] Extract secrets from app.ini to 1Password:
  - `INTERNAL_TOKEN`
  - `SECRET_KEY`
  - `JWT_SECRET`
  - Any OAuth/webhook secrets
 ---
 ## Steps
 ### Phase A: Create k8s Manifests
 **New Files:**
 ```
 argocd/apps/forgejo.yaml                    # ArgoCD Application (multi-source Helm)
 argocd/manifests/forgejo/values.yaml        # Helm chart values
 argocd/manifests/forgejo/kustomization.yaml # Kustomize config
 argocd/manifests/forgejo/pvc.yaml           # 10Gi PersistentVolumeClaim
 argocd/manifests/forgejo/secret-app.yaml.tpl # Secrets from 1Password
 ```
 **Key values.yaml settings:**
 ```yaml
 service:
  ssh:
    type: LoadBalancer
    loadBalancerClass: tailscale
    port: 22
    annotations:
      tailscale.com/hostname: "git-1"  # Test hostname first
 ingress:
  enabled: true
  className: tailscale
  hosts:
    - host: forge-1  # Test hostname first
 gitea:
  config:
    server:
      DOMAIN: forge-1.tail8d86e.ts.net
      ROOT_URL: https://forge-1.tail8d86e.ts.net/
      SSH_DOMAIN: git-1.tail8d86e.ts.net
      SSH_PORT: 22
    database:
      DB_TYPE: sqlite3
      PATH: /data/forgejo.db
 ```
 ---
 ### Phase B: Deploy to Test Hostnames
 1. Create feature branch, push to forge
 2. Sync ArgoCD apps: `argocd app sync apps`
 3. Point forgejo app to feature branch: `argocd app set forgejo --revision feature/p7-forgejo`
 4. Sync forgejo app: `argocd app sync forgejo`
 5. Verify pods running (empty data initially)
 ---
 ### Phase C: Data Migration (~10 min downtime)
 1. **Stop indri Forgejo**
   ```bash
   ssh indri 'brew services stop forgejo'
   ```
 2. **Copy data** (option A: rsync via NFS staging)
   ```bash
   ssh indri 'rsync -avP /opt/homebrew/var/forgejo/ sifaka:/volume1/forgejo-migration/'
   ```
 3. **Copy to PVC and fix permissions**
   ```bash
   kubectl exec -n forgejo deployment/forgejo -- rsync -avP /staging/ /data/
   kubectl exec -n forgejo deployment/forgejo -- chown -R 1000:1000 /data
   ```
 4. **Restart Forgejo**
   ```bash
   kubectl rollout restart deployment/forgejo -n forgejo
   ```
 ---
 ### Phase D: Validation (Critical)
 - [ ] Web UI accessible at `forge-1.tail8d86e.ts.net`
 - [ ] SSH works: `ssh -T forgejo@git-1.tail8d86e.ts.net`
 - [ ] All 8 repos visible and accessible
 - [ ] Git clone works
 - [ ] Git push works (test on non-critical repo)
 - [ ] eblume user preserved with correct permissions
 - [ ] PR history intact
 - [ ] Webhooks functioning
 - [ ] GitHub mirror push still works
 ---
 ### Phase E: Hostname Cutover
 1. **Clear indri Tailscale serve**
   ```bash
   ssh indri 'tailscale serve clear svc:forge'
   ```
 2. **User action**: Delete `svc:forge` and `forge-1` devices from Tailscale admin
 3. **Update manifests**: Change `forge-1` → `forge`, `git-1` → `git`
 4. **Sync ArgoCD**
 5. **Verify hostnames claimed**
   ```bash
   curl https://forge.tail8d86e.ts.net/api/v1/version
   ssh -T forgejo@git.tail8d86e.ts.net
   ```
 ---
 ### Phase F: Update ArgoCD to Use New Forgejo
 1. **Get SSH host key from k8s Forgejo**
   ```bash
   kubectl exec -n forgejo deployment/forgejo -- cat /data/ssh/ssh_host_ed25519_key.pub
   ```
 2. **Update known_hosts ConfigMap** with `git.tail8d86e.ts.net` key
 3. **Update repo-creds-forge secret** (manual kubectl commands)
 4. **Update all ArgoCD Application manifests** with new repoURL
 5. **Delete egress-forge.yaml** (no longer needed)
 6. **Sync ArgoCD** and verify all apps sync successfully
 ---
 ### Phase G: Update Local Git Remotes
 ```bash
 cd ~/code/personal/blumeops
 git remote set-url origin ssh://forgejo@git.tail8d86e.ts.net/eblume/blumeops.git
 # Repeat for all 8 repos
 ```
 ---
 ### Phase H: Cleanup
 1. Remove forgejo role from `ansible/playbooks/indri.yml`
 2. Remove `svc:forge` from `ansible/roles/tailscale_serve/defaults/main.yml`
 3. Remove forgejo log collection from `ansible/roles/alloy/defaults/main.yml`
 4. Delete `argocd/manifests/tailscale-operator/egress-forge.yaml`
 5. Update `mise-tasks/indri-services-check`
 6. Run ansible to clean up indri: `mise run provision-indri -- --tags tailscale-serve,alloy`
 7. Update zk documentation (forgejo, argocd, blumeops cards)
 8. Merge PR
 9. Reset ArgoCD to main
 ---
 ## Disaster Recovery Procedure
 **Add to [[forgejo]] zk card:**
 ### When Forgejo is Unavailable
 1. **Add GitHub repository to ArgoCD**
   ```bash
   argocd repo add https://github.com/eblume/blumeops.git \
     --username eblume \
      --password "$(op read "op://<vault>/<item>/github-pat")"
   ```
 2. **Point critical apps to GitHub**
   ```bash
   argocd app set apps --repo https://github.com/eblume/blumeops.git
   argocd app set forgejo --repo https://github.com/eblume/blumeops.git
   argocd app sync forgejo
   ```
 3. **Fix Forgejo** (restore from backup, fix config, etc.)
 4. **Verify Forgejo is healthy**
   ```bash
   curl https://forge.tail8d86e.ts.net/api/v1/version
   ssh -T forgejo@git.tail8d86e.ts.net
   ```
 5. **Switch back to Forgejo**
   ```bash
   argocd app set apps --repo ssh://forgejo@git.tail8d86e.ts.net/eblume/blumeops.git
   argocd app set forgejo --repo ssh://forgejo@git.tail8d86e.ts.net/eblume/blumeops.git
   argocd app sync apps
   argocd repo rm https://github.com/eblume/blumeops.git
   ```
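After a restore, Forgejo may take a while to answer, so the health check in step 4 might be wrapped in a retry loop before switching ArgoCD back. A minimal sketch follows; `retry` and `forgejo_healthy` are hypothetical helper names, and the attempt count and delay in the example are arbitrary choices rather than values from this plan.

```bash
# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the final failure.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Wrapper around the HTTPS health check from step 4.
forgejo_healthy() {
  curl --fail --silent --show-error --max-time 10 \
    https://forge.tail8d86e.ts.net/api/v1/version > /dev/null
}

# Example: wait up to about 5 minutes before running step 5.
# retry 30 10 forgejo_healthy && echo "Forgejo is answering"
```

The SSH check from step 4 is left out here because its exit status depends on how the server handles non-interactive sessions.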
 ---
 ## Files Summary
 ### New Files
 | Path | Purpose |
 |------|---------|
 | `argocd/apps/forgejo.yaml` | ArgoCD Application (multi-source Helm) |
 | `argocd/manifests/forgejo/values.yaml` | Helm chart values |
 | `argocd/manifests/forgejo/kustomization.yaml` | Kustomize config |
 | `argocd/manifests/forgejo/pvc.yaml` | 10Gi PersistentVolumeClaim |
 | `argocd/manifests/forgejo/secret-app.yaml.tpl` | Secrets template |
 ### Modified Files
 | Path | Change |
 |------|--------|
 | All `argocd/apps/*.yaml` | Update repoURL to `git.tail8d86e.ts.net` |
 | `argocd/manifests/argocd/argocd-ssh-known-hosts-cm.yaml` | Add `git.tail8d86e.ts.net` |
 | `argocd/manifests/argocd/repo-forge-secret.yaml.tpl` | Update URL |
 | `ansible/playbooks/indri.yml` | Remove forgejo role |
 | `ansible/roles/tailscale_serve/defaults/main.yml` | Remove `svc:forge` |
 | `ansible/roles/alloy/defaults/main.yml` | Remove forgejo logs |
 ### Files to Delete
 | Path | Reason |
 |------|--------|
 | `argocd/manifests/tailscale-operator/egress-forge.yaml` | No longer needed |
 ---
 ## Rollback
 If the migration fails at any point:
 1. **Delete k8s resources**
   ```bash
   argocd app delete forgejo --cascade
   kubectl delete namespace forgejo
   ```
 2. **Restart indri Forgejo**
   ```bash
   ssh indri 'brew services start forgejo'
   ```
 3. **Re-enable Tailscale serve**
   ```bash
   mise run provision-indri -- --tags tailscale-serve
   ```
 4. **Revert ArgoCD apps to indri URLs** (if changed)
 ---
 ## Verification Checklist
 - [ ] GitHub mirror verified current
 - [ ] Helm chart mirrored to forge
 - [ ] Secrets extracted to 1Password
 - [ ] k8s Forgejo pod running
 - [ ] All 8 repos accessible
 - [ ] SSH clone/push works via `git.tail8d86e.ts.net`
 - [ ] HTTPS works via `forge.tail8d86e.ts.net`
 - [ ] ArgoCD syncs from new URL
 - [ ] All local remotes updated
 - [ ] Indri cleanup complete
 - [ ] zk docs updated
 - [ ] DR procedure documented in [[forgejo]] card
--- a/plans/completed/k8s-migration/P8_woodpecker.md
+++ b/plans/completed/k8s-migration/P8_woodpecker.md
@ -1,32 +0,0 @@
 # Phase 8: CI/CD (Woodpecker)
 **Goal**: Deploy Woodpecker CI integrated with Forgejo
 **Status**: Pending
 **Prerequisites**: [Phase 7](P7_forgejo.md) complete
 ---
 ## Steps
 ### 1. Create Forgejo OAuth application
 - Callback: https://ci.tail8d86e.ts.net/authorize
 - Store in 1Password
 ---
 ### 2. Deploy Woodpecker Server + Agent
 ---
 ### 3. Configure Tailscale LoadBalancer
 Tag: `svc:ci`
 ---
 ### 4. Test pipeline
 Create `.woodpecker.yaml` in test repo
--- a/plans/completed/k8s-migration/P9_cleanup.md
+++ b/plans/completed/k8s-migration/P9_cleanup.md
@ -1,52 +0,0 @@
 # Phase 9: Cleanup
 **Goal**: Remove deprecated services, harden system
 **Status**: Pending
 **Prerequisites**: [Phase 8](P8_woodpecker.md) complete
 ---
 ## Steps
 ### 1. Stop/remove unused brew services
 - postgresql@18
 - grafana
 - miniflux
 - forgejo
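Stopping the four services listed above could be sketched as a loop with a dry-run default, so the commands can be reviewed before anything is stopped. The function names and the `DRY_RUN` variable are hypothetical. The sketch only stops services and does not uninstall the formulae.

```bash
# Services listed in step 1.
DEPRECATED_SERVICES=(postgresql@18 grafana miniflux forgejo)

# run CMD...: print the command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Stop each deprecated service via Homebrew.
stop_deprecated_services() {
  local svc
  for svc in "${DEPRECATED_SERVICES[@]}"; do
    run brew services stop "$svc"
  done
}

# Dry run by default; set DRY_RUN=0 to actually stop the services.
# stop_deprecated_services
# DRY_RUN=0 stop_deprecated_services
```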
 ---
 ### 2. Update ansible playbook
 - Remove migrated service roles
 - Add k8s deployment references
 ---
 ### 3. Configure Velero backups (optional)
 - Install with MinIO on sifaka
 - Schedule daily cluster backups
 ---
 ### 4. Update zk documentation
 - New architecture
 - Runbooks
 - DR procedures
 ---
 ## Plan Completion
 When all phases are complete and verified:
 ```bash
 # Rename this folder to indicate completion
 git mv plans/k8s-migration plans/k8s-migration.complete
 git commit -m "Complete k8s migration plan"
 ```