Expand Phase 1 plan with ArgoCD and GitOps pattern

Major updates to Phase 1:
- Added ArgoCD deployment as step 4 (exposed at argocd.tail8d86e.ts.net)
- Bootstrap pattern: Tailscale operator deployed first via kubectl,
  then ArgoCD takes over management of all components
- App-of-apps pattern with argocd/apps/ and argocd/manifests/ structure
- PostgreSQL migration strategy documented (zero-downtime switchover)
- Using GitHub mirror for ArgoCD git source (public, no auth needed)

New Phase 1 steps:
1. Update Pulumi ACLs ✓
2. Create Tailscale OAuth client ✓
3. Deploy Tailscale operator (bootstrap)
4. Deploy ArgoCD
5. Migrate Tailscale operator to ArgoCD
6. Deploy CloudNativePG via ArgoCD
7. Create PostgreSQL cluster via ArgoCD
8. Create app-of-apps root

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Erich Blume 2026-01-18 16:05:46 -08:00
commit 91cd7260fd
2 changed files with 383 additions and 104 deletions


@ -7,14 +7,14 @@ This plan details a phased migration of blumeops services from direct hosting on
| Phase | Name | Status | Description |
|-------|------|--------|-------------|
| 0 | [Foundation](P0_foundation.complete.md) | Complete | Container registry + minikube cluster |
| 1 | [K8s Infrastructure](P1_k8s_infrastructure.md) | In Progress | Tailscale operator + CloudNativePG |
| 2 | [Grafana](P2_grafana.md) | Pending | Migrate Grafana (pilot) |
| 3 | [PostgreSQL](P3_postgresql.md) | Pending | Migrate to CloudNativePG |
| 4 | [Miniflux](P4_miniflux.md) | Pending | Migrate Miniflux |
| 5 | [devpi](P5_devpi.md) | Pending | Migrate devpi |
| 6 | [Kiwix](P6_kiwix.md) | Pending | Migrate Kiwix |
| 7 | [Forgejo](P7_forgejo.md) | Pending | Migrate Forgejo (highest risk) |
| 8 | [Woodpecker](P8_woodpecker.md) | Pending | Deploy CI/CD |
| 1 | [K8s Infrastructure](P1_k8s_infrastructure.md) | In Progress | Tailscale operator, ArgoCD, CloudNativePG, PostgreSQL cluster |
| 2 | [Grafana](P2_grafana.md) | Pending | Migrate Grafana (pilot) via ArgoCD |
| 3 | [PostgreSQL](P3_postgresql.md) | Pending | Data migration to k8s PostgreSQL |
| 4 | [Miniflux](P4_miniflux.md) | Pending | Migrate Miniflux via ArgoCD |
| 5 | [devpi](P5_devpi.md) | Pending | Migrate devpi via ArgoCD |
| 6 | [Kiwix](P6_kiwix.md) | Pending | Migrate Kiwix via ArgoCD |
| 7 | [Forgejo](P7_forgejo.md) | Pending | Migrate Forgejo (highest risk) via ArgoCD |
| 8 | [Woodpecker](P8_woodpecker.md) | Pending | Deploy CI/CD via ArgoCD |
| 9 | [Cleanup](P9_cleanup.md) | Pending | Remove deprecated services |
## Architecture Overview


@ -1,6 +1,6 @@
# Phase 1: Kubernetes Infrastructure
**Goal**: Tailscale operator + CloudNativePG operator
**Goal**: Tailscale operator, ArgoCD, CloudNativePG operator, PostgreSQL cluster
**Status**: In Progress
@ -8,9 +8,22 @@
---
## Kubernetes Tags Overview
## Overview
Phase 1 introduces three Tailscale tags for Kubernetes:
Phase 1 establishes the k8s control plane infrastructure:
1. **Tailscale operator** - Exposes services on the tailnet
2. **ArgoCD** - GitOps continuous delivery
3. **CloudNativePG** - PostgreSQL operator
4. **PostgreSQL cluster** - Database for future app migrations
The deployment follows a bootstrap pattern:
- First two components deployed via `kubectl apply -k` (no GitOps yet)
- ArgoCD then takes over management of all components including itself
- All subsequent deployments use ArgoCD
---
## Kubernetes Tags Overview
| Tag | Purpose | Applied To |
|-----|---------|------------|
@ -22,118 +35,278 @@ Phase 1 introduces three Tailscale tags for Kubernetes:
---
## PostgreSQL Migration Strategy
The k8s PostgreSQL cluster will eventually replace the brew PostgreSQL on indri.
| Phase | `pg.tail8d86e.ts.net` points to | Miniflux connects to |
|-------|--------------------------------|---------------------|
| Current | brew PostgreSQL (indri) | `pg.tail8d86e.ts.net` |
| Phase 1 | brew PostgreSQL (indri) | `pg.tail8d86e.ts.net` (no change) |
| Phase 4 | brew PostgreSQL (indri) | k8s PG (internal, after miniflux migrates to k8s) |
| Post-Phase 4 | k8s PostgreSQL | k8s PG (internal) |
| Cleanup | k8s PostgreSQL | k8s PG (internal) |
This allows zero-downtime migration - the Tailscale service switches after apps are migrated.
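Because the hostname stays the same across the switchover, it can help to confirm which server is actually answering before and after the Tailscale service moves. A minimal sketch, assuming `psql` is installed and that the `miniflux` user and database exist on whichever server is live; the `PG_HOST` variable and the function name are hypothetical, not part of this plan:

```bash
# Print the address and version of the PostgreSQL server that currently
# answers on the shared tailnet hostname.
# ASSUMPTION: user/database default to "miniflux"; override via PGUSER/PGDATABASE.
pg_backend_info() {
  psql -h "${PG_HOST:-pg.tail8d86e.ts.net}" \
    -U "${PGUSER:-miniflux}" -d "${PGDATABASE:-miniflux}" \
    -Atc "SELECT inet_server_addr(), current_setting('server_version')"
}
```

Running `pg_backend_info` before and after the switch should show a different address, and possibly a different version, once the hostname points at the k8s cluster.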
---
## Steps
### 1. Update Pulumi ACLs for k8s workloads
### 1. Update Pulumi ACLs for k8s workloads
Add the operator and workload tags to `pulumi/policy.hujson`.
**Status**: Complete
**Changes to tagOwners:**
```hujson
// Tailscale K8s Operator tags (Phase 1)
"tag:k8s-operator": ["autogroup:admin", "tag:blumeops"],
"tag:k8s": ["autogroup:admin", "tag:blumeops", "tag:k8s-operator"],
Added to `pulumi/policy.hujson`:
- `tag:k8s-operator` - for the operator OAuth client
- `tag:k8s` - for operator-managed resources (owned by `tag:k8s-operator`)
- Grant for `tag:k8s` → `tag:registry` access
---
### 2. Create Tailscale OAuth client ✓
**Status**: Complete
OAuth client stored in 1Password (vault: `vg6xf6vvfmoh5hqjjhlhbeoaie`, item: `2it22lavwgbxdskoaxanej354q`)
**Configuration used:**
- Tags: `tag:k8s-operator`
- Devices write scope tag: `tag:k8s`
- Scopes: Devices Core (R/W), Auth Keys (R/W), Services (Write)
---
### 3. Deploy Tailscale Kubernetes Operator (Bootstrap)
Deploy via `kubectl apply -k` - will be migrated to ArgoCD management in Step 5.
**Setup manifests directory:**
```bash
mkdir -p argocd/manifests/tailscale-operator/crds  # crds/ must exist for the curl -o calls below
cd argocd/manifests/tailscale-operator
# Download static manifest from Tailscale repo
curl -sL https://raw.githubusercontent.com/tailscale/tailscale/main/cmd/k8s-operator/deploy/manifests/operator.yaml -o operator.yaml
# Download CRDs
curl -sL https://raw.githubusercontent.com/tailscale/tailscale/main/cmd/k8s-operator/deploy/crds/tailscale.com_connectors.yaml -o crds/connectors.yaml
curl -sL https://raw.githubusercontent.com/tailscale/tailscale/main/cmd/k8s-operator/deploy/crds/tailscale.com_proxyclasses.yaml -o crds/proxyclasses.yaml
# ... (other CRDs as needed)
```
**Add grant for k8s→registry access:**
```hujson
// k8s workloads (e.g., Woodpecker CI) can push/pull from registry
{
  "src": ["tag:k8s"],
  "dst": ["tag:registry"],
  "ip": ["tcp:443"],
},
```
**Add test case:**
```hujson
{
  "src": "tag:k8s",
  "accept": ["tag:registry:443"],
},
**Create kustomization.yaml:**
```yaml
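apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: tailscale-system
resources:
- operator.yaml
# NOTE: as written, `kubectl apply -k` (and later ArgoCD) renders this Secret
# with the PLACEHOLDER values and overwrites a manually created operator-oauth.
# Remove this generator before applying if the secret is managed by hand.
secretGenerator:
- name: operator-oauth
  namespace: tailscale-system
  literals:
  - client_id=PLACEHOLDER
  - client_secret=PLACEHOLDER
generatorOptions:
  disableNameSuffixHash: true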
```
**Deploy:**
```bash
mise run tailnet-preview # Review changes
mise run tailnet-up # Apply changes
```
# Get credentials from 1Password and create secret manually (kustomize secretGenerator is for reference)
CLIENT_ID=$(op --vault vg6xf6vvfmoh5hqjjhlhbeoaie item get 2it22lavwgbxdskoaxanej354q --fields client-id --reveal)
CLIENT_SECRET=$(op --vault vg6xf6vvfmoh5hqjjhlhbeoaie item get 2it22lavwgbxdskoaxanej354q --fields client-secret --reveal)
---
kubectl create namespace tailscale-system
kubectl create secret generic operator-oauth \
  --namespace tailscale-system \
  --from-literal=client_id="$CLIENT_ID" \
  --from-literal=client_secret="$CLIENT_SECRET"
### 2. Create Tailscale OAuth client (MANUAL)
Go to https://login.tailscale.com/admin/settings/oauth and create an OAuth client:
**Configuration:**
- **Description**: `k8s-operator`
- **Tags**: `tag:k8s-operator`
- **Scopes**:
  - Devices: Core (Read & Write)
  - Auth Keys: Read & Write
  - Services: Write
**After creation:**
1. Copy the Client ID and Client Secret
2. Store in 1Password (vault: `vg6xf6vvfmoh5hqjjhlhbeoaie`)
   - Item name: `Tailscale K8s Operator OAuth`
   - Fields: `client-id`, `client-secret`
---
### 3. Deploy Tailscale Kubernetes Operator
```bash
# Add helm repo
helm repo add tailscale https://pkgs.tailscale.com/helmcharts
helm repo update
# Get credentials from 1Password
CLIENT_ID=$(op --vault vg6xf6vvfmoh5hqjjhlhbeoaie item get "Tailscale K8s Operator OAuth" --fields client-id --reveal)
CLIENT_SECRET=$(op --vault vg6xf6vvfmoh5hqjjhlhbeoaie item get "Tailscale K8s Operator OAuth" --fields client-secret --reveal)
# Install operator
helm install tailscale-operator tailscale/tailscale-operator \
  --namespace tailscale-system --create-namespace \
  --set oauth.clientId="$CLIENT_ID" \
  --set oauth.clientSecret="$CLIENT_SECRET"
# Apply operator manifests
kubectl apply -k argocd/manifests/tailscale-operator/
```
**Verification:**
```bash
kubectl get pods -n tailscale-system
# Expected: tailscale-operator pod Running
# Expected: operator pod Running
# Check operator logs
kubectl logs -n tailscale-system -l app.kubernetes.io/name=tailscale-operator
```
---
### 4. Deploy CloudNativePG operator
### 4. Deploy ArgoCD
Deploy ArgoCD and expose via Tailscale as `argocd.tail8d86e.ts.net`.
**Prerequisites:**
- Add `tag:argocd` to Pulumi ACLs
- Create Tailscale service `argocd` in admin console
**Setup manifests:**
```bash
kubectl apply -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.24/releases/cnpg-1.24.0.yaml
mkdir -p argocd/manifests/argocd
# Download ArgoCD install manifest
curl -sL https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml -o argocd/manifests/argocd/install.yaml
```
**Create kustomization.yaml:**
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: argocd
resources:
- install.yaml
- service-tailscale.yaml # LoadBalancer for Tailscale exposure
```
**Create service-tailscale.yaml:**
```yaml
apiVersion: v1
kind: Service
metadata:
  name: argocd-server-tailscale
  namespace: argocd
  annotations:
    tailscale.com/hostname: "argocd"
spec:
  type: LoadBalancer
  loadBalancerClass: tailscale
  selector:
    app.kubernetes.io/name: argocd-server
  ports:
  - name: https
    port: 443
    targetPort: 8080
```
**Deploy:**
```bash
kubectl create namespace argocd
kubectl apply -k argocd/manifests/argocd/
```
**Get initial admin password:**
```bash
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
```
**Verification:**
- https://argocd.tail8d86e.ts.net loads
- Can login with admin / <initial-password>
**Post-setup:**
1. Change admin password, store in 1Password
2. Configure git repo connection to `github.com/eblume/blumeops` (public, no auth needed)
   - Note: Using GitHub mirror since ArgoCD can't easily reach forge without additional networking
---
### 5. Migrate Tailscale Operator to ArgoCD
Create ArgoCD Application to manage the Tailscale operator.
**Create argocd/apps/tailscale-operator.yaml:**
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: tailscale-operator
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/tailscale-operator
  destination:
    server: https://kubernetes.default.svc
    namespace: tailscale-system
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```
**Apply:**
```bash
kubectl apply -f argocd/apps/tailscale-operator.yaml
```
**Note on secrets:** The OAuth secret was created manually in Step 3. For GitOps, consider:
- Sealed Secrets
- External Secrets Operator
- SOPS
For now, the secret remains manually managed outside of ArgoCD.
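Since the secret lives outside Git, a quick pre-sync check can catch the case where it is missing or was overwritten by the `PLACEHOLDER` literals from `kustomization.yaml`. A minimal sketch using only `kubectl` and `base64`; the function name is hypothetical:

```bash
# Succeed only if operator-oauth holds real values for both keys.
check_operator_oauth() {
  for key in client_id client_secret; do
    # Secret data is base64-encoded; fetch one key at a time.
    encoded=$(kubectl get secret operator-oauth -n tailscale-system \
      -o "jsonpath={.data.$key}") || return 1
    decoded=$(printf '%s' "$encoded" | base64 -d) || return 1
    if [ -z "$decoded" ] || [ "$decoded" = "PLACEHOLDER" ]; then
      echo "operator-oauth: $key is unset or still PLACEHOLDER" >&2
      return 1
    fi
  done
}
```

Running `check_operator_oauth && kubectl apply -f argocd/apps/tailscale-operator.yaml` would stop the hand-off to ArgoCD if the credentials are not in place.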
---
### 6. Deploy CloudNativePG via ArgoCD
**Setup manifests:**
```bash
mkdir -p argocd/manifests/cloudnative-pg
# Download CNPG operator manifest
curl -sL https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.24/releases/cnpg-1.24.0.yaml -o argocd/manifests/cloudnative-pg/operator.yaml
```
**Create kustomization.yaml:**
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- operator.yaml
```
**Create ArgoCD Application (argocd/apps/cloudnative-pg.yaml):**
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cloudnative-pg
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/cloudnative-pg
  destination:
    server: https://kubernetes.default.svc
    namespace: cnpg-system
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
```
**Apply:**
```bash
kubectl apply -f argocd/apps/cloudnative-pg.yaml
```
**Verification:**
```bash
kubectl get pods -n cnpg-system
# Expected: cnpg-controller-manager pod Running
# Expected: cnpg-controller-manager Running
```
---
### 5. Create PostgreSQL cluster
### 7. Create PostgreSQL Cluster via ArgoCD
Create namespace and cluster manifest:
```bash
kubectl create namespace databases
```
Create the database cluster. **Not exposed via Tailscale yet** - internal only until apps migrate.
**Create argocd/manifests/databases/blumeops-pg.yaml:**
```yaml
# argocd/manifests/databases/blumeops-pg.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
@ -146,10 +319,48 @@ spec:
    storageClass: standard
  monitoring:
    enablePodMonitor: true
  bootstrap:
    initdb:
      database: miniflux
      owner: miniflux
```
**Create kustomization.yaml:**
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: databases
resources:
- blumeops-pg.yaml
```
**Create ArgoCD Application (argocd/apps/blumeops-pg.yaml):**
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: blumeops-pg
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/databases
  destination:
    server: https://kubernetes.default.svc
    namespace: databases
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
```
**Apply:**
```bash
kubectl apply -f ansible/k8s/databases/blumeops-pg.yaml
kubectl apply -f argocd/apps/blumeops-pg.yaml
```
**Verification:**
@ -158,30 +369,92 @@ kubectl get cluster -n databases
# Expected: blumeops-pg with STATUS "Cluster in healthy state"
kubectl get pods -n databases
# Expected: blumeops-pg-1 pod Running
# Expected: blumeops-pg-1 Running
# Get connection secret
kubectl -n databases get secret blumeops-pg-app -o jsonpath='{.data.uri}' | base64 -d
```
---
### 6. Update Alloy config
### 8. Create App-of-Apps Root Application
Add kubernetes_sd_configs for k8s metrics scraping.
Once all components are deployed, create a root application to manage all apps.
**Files to modify:**
- `ansible/roles/alloy/templates/config.alloy.j2`
**Create argocd/apps/root.yaml:**
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/eblume/blumeops.git
    targetRevision: main
    path: argocd/apps
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```
**Changes:**
- Add scrape config for CloudNativePG metrics
- Add scrape config for Tailscale operator metrics (if exposed)
**Apply:**
```bash
kubectl apply -f argocd/apps/root.yaml
```
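ArgoCD now manages every Application under `argocd/apps/` via the app-of-apps pattern. No Application in this plan targets `argocd/manifests/argocd`, so ArgoCD's own install stays `kubectl`-managed until one is added.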
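For ArgoCD to manage its own install as well, an Application pointing at `argocd/manifests/argocd` would be needed. A minimal sketch, assuming a hypothetical `argocd/apps/argocd.yaml` that is not among the files this plan lists. It reuses the repo URL and layout from the Applications above and leaves `prune` off as a cautious default:

```bash
# Write a hypothetical self-management Application next to the other apps.
# ASSUMPTION: the file name argocd/apps/argocd.yaml is illustrative only.
mkdir -p argocd/apps
cat > argocd/apps/argocd.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: argocd
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/eblume/blumeops.git
    targetRevision: main
    path: argocd/manifests/argocd
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      selfHeal: true  # prune intentionally omitted for the self-managed app
EOF
```

Once committed under `argocd/apps/`, the root application would pick the file up on its next sync.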
---
## New Files
## New Files Summary
| File | Purpose |
|------|---------|
| `ansible/k8s/operators/` | Operator deployment notes/scripts |
| `ansible/k8s/databases/blumeops-pg.yaml` | PostgreSQL cluster manifest |
```
argocd/
  apps/
    root.yaml                 # App-of-apps root
    tailscale-operator.yaml   # Tailscale operator app
    cloudnative-pg.yaml       # CNPG operator app
    blumeops-pg.yaml          # PostgreSQL cluster app
  manifests/
    tailscale-operator/
      kustomization.yaml
      operator.yaml
    argocd/
      kustomization.yaml
      install.yaml
      service-tailscale.yaml
    cloudnative-pg/
      kustomization.yaml
      operator.yaml
    databases/
      kustomization.yaml
      blumeops-pg.yaml
```
---
## Pulumi ACL Updates Required
Add to `pulumi/policy.hujson`:
```hujson
"tag:argocd": ["autogroup:admin", "tag:blumeops"],
```
Add to Erich's test accept list:
```hujson
"accept": [..., "tag:argocd:443"],
```
Add to Allison's deny list:
```hujson
"deny": [..., "tag:argocd:443"],
```
---
@ -191,16 +464,18 @@ Add kubernetes_sd_configs for k8s metrics scraping.
# 1. Tailscale operator running
kubectl get pods -n tailscale-system
# 2. CloudNativePG operator running
# 2. ArgoCD accessible
curl -k https://argocd.tail8d86e.ts.net/healthz
# 3. CloudNativePG operator running
kubectl get pods -n cnpg-system
# 3. PostgreSQL cluster healthy
# 4. PostgreSQL cluster healthy
kubectl get cluster -n databases
kubectl get pods -n databases
# 4. Test database connection (from indri)
kubectl -n databases get secret blumeops-pg-app -o jsonpath='{.data.uri}' | base64 -d
# Use the URI to connect via psql
# 5. All ArgoCD apps synced
kubectl get applications -n argocd
# All should show STATUS: Synced, HEALTH: Healthy
```
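The final check can also be scripted so it fails loudly instead of relying on reading the table by eye. A minimal sketch, assuming `kubectl` and `awk` are available; the function name is hypothetical:

```bash
# List every ArgoCD Application that is not Synced/Healthy.
# Returns 1 if any such Application exists, 0 otherwise.
check_argocd_apps() {
  local bad
  bad=$(kubectl get applications -n argocd \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.sync.status}{" "}{.status.health.status}{"\n"}{end}' \
    | awk '$2 != "Synced" || $3 != "Healthy"')
  if [ -n "$bad" ]; then
    printf 'Not ready:\n%s\n' "$bad" >&2
    return 1
  fi
}
```

An Application with no reported health status is treated as not ready, which errs on the side of flagging apps that have not finished their first sync.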
---
@ -208,15 +483,19 @@ kubectl -n databases get secret blumeops-pg-app -o jsonpath='{.data.uri}' | base
## Rollback
```bash
# Remove PostgreSQL cluster
kubectl delete cluster -n databases blumeops-pg
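# Remove ArgoCD apps (managed resources are cascade-deleted only if the Application
# sets the resources-finalizer.argocd.argoproj.io finalizer; the manifests above do not)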
kubectl delete application -n argocd root
kubectl delete application -n argocd blumeops-pg
kubectl delete application -n argocd cloudnative-pg
kubectl delete application -n argocd tailscale-operator
# Remove ArgoCD
kubectl delete -k argocd/manifests/argocd/
kubectl delete namespace argocd
# Remove namespaces
kubectl delete namespace databases
# Remove CloudNativePG operator
kubectl delete -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.24/releases/cnpg-1.24.0.yaml
# Remove Tailscale operator
helm uninstall tailscale-operator -n tailscale-system
kubectl delete namespace cnpg-system
kubectl delete namespace tailscale-system
# Revert ACL changes