blumeops/argocd/manifests/databases
Erich Blume 17023085cb Migrate observability stack to Kubernetes (#42)
Note: the name of this branch was chosen before the scope widened to encompass the entire observability stack.

Summary

  - Fix Grafana data source URLs (docker driver uses host.minikube.internal, not host.containers.internal)
  - Migrate Prometheus and Loki from indri to Kubernetes with Tailscale Ingresses
  - Expose CNPG PostgreSQL metrics via Tailscale and update dashboard to use cnpg_* metrics
  - Update Alloy to push metrics/logs to k8s endpoints (prometheus.tail8d86e.ts.net, loki.tail8d86e.ts.net)
  - Add ACL rule for port 9187 (CNPG metrics)
  - Delete obsolete ansible roles for prometheus and loki

Changes

  - argocd/manifests/prometheus/ - New Prometheus StatefulSet with 20Gi PVC and Tailscale Ingress
  - argocd/manifests/loki/ - New Loki StatefulSet with 20Gi PVC and Tailscale Ingress
  - argocd/apps/prometheus.yaml, argocd/apps/loki.yaml - ArgoCD Applications
  - argocd/manifests/grafana/values.yaml - Data sources now use k8s internal DNS
  - argocd/manifests/databases/service-metrics-tailscale.yaml - CNPG metrics endpoint
  - argocd/manifests/grafana-config/dashboards/configmap-postgresql.yaml - Updated to cnpg_* metrics
  - ansible/roles/alloy/defaults/main.yml - Push to k8s Tailscale endpoints
  - pulumi/policy.hujson - ACL for port 9187
  - Deleted ansible/roles/prometheus/ and ansible/roles/loki/

Deployment and Testing

  - Stop prometheus and loki on indri
  - Sync ArgoCD apps (apps, prometheus, loki, grafana)
  - Run mise run provision-indri -- --tags alloy
  - Verify Grafana dashboards show data

🤖 Generated with https://claude.ai/claude-code

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/42
2026-01-22 12:06:02 -08:00
| File | Last commit | Date |
|------|-------------|------|
| blumeops-pg.yaml | Add CNPG default values to prevent ArgoCD drift | 2026-01-19 18:02:42 -08:00 |
| kustomization.yaml | Migrate observability stack to Kubernetes (#42) | 2026-01-22 12:06:02 -08:00 |
| README.md | P3: PostgreSQL disaster recovery test and borgmatic k8s-pg backup (#32) | 2026-01-19 18:00:32 -08:00 |
| secret-borgmatic.yaml.tpl | P3: PostgreSQL disaster recovery test and borgmatic k8s-pg backup (#32) | 2026-01-19 18:00:32 -08:00 |
| secret-eblume.yaml.tpl | K8s Migration Phase 1: Infrastructure Setup (#29) | 2026-01-19 09:49:52 -08:00 |
| service-metrics-tailscale.yaml | Migrate observability stack to Kubernetes (#42) | 2026-01-22 12:06:02 -08:00 |
| service-tailscale.yaml | P5.1: Migrate minikube from podman to QEMU2 driver (#38) | 2026-01-21 16:03:37 -08:00 |

Database Manifests

PostgreSQL clusters managed by the CloudNativePG operator.

blumeops-pg

Single-instance PostgreSQL cluster for blumeops services.

Configuration

  • Instances: 1 (single-node for minikube)
  • Storage: 10Gi on standard storage class
  • Initial database: miniflux owned by miniflux user

Users/Roles

| User | Role | Purpose | Password Source |
|------|------|---------|-----------------|
| postgres | superuser | CNPG internal (avoid using) | blumeops-pg-superuser secret |
| miniflux | app owner | Owns miniflux database | blumeops-pg-app secret |
| eblume | superuser | Admin access (matches brew pg) | blumeops-pg-eblume secret (manual) |
| borgmatic | pg_read_all_data | Backup access for borgmatic | blumeops-pg-borgmatic secret (manual) |

Manual Secret Setup

Before deploying, create the password secrets:

# Create namespace first
kubectl create namespace databases

# Apply eblume password from 1Password
op inject -i argocd/manifests/databases/secret-eblume.yaml.tpl | kubectl apply -f -

# Apply borgmatic password from 1Password
op inject -i argocd/manifests/databases/secret-borgmatic.yaml.tpl | kubectl apply -f -

The miniflux user password is auto-generated by CloudNativePG and stored in blumeops-pg-app.
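As a pre-flight check, something like the following could confirm that both manually created secrets exist before the cluster is synced. This is only a sketch: the function name `check_pg_secrets` is hypothetical, and the secret names are assumed to match the Password Source column of the Users/Roles table.

```shell
# Hypothetical helper: report whether each manually managed secret exists.
# Secret names are assumed from the Users/Roles table; namespace is `databases`.
check_pg_secrets() {
  local missing=0 name
  for name in blumeops-pg-eblume blumeops-pg-borgmatic; do
    if kubectl -n databases get secret "$name" >/dev/null 2>&1; then
      echo "ok: $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  # Non-zero exit status if any secret was not found.
  return "$missing"
}
```

Calling `check_pg_secrets` prints one line per secret and returns a non-zero status if either is absent, so it can gate a deployment script.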

Connection Information

After the cluster is healthy:

# Connect via Tailscale (temporary hostname during migration)
psql -h k8s-pg.tail8d86e.ts.net -U eblume -W -d miniflux

# Or with password from 1Password
PGPASSWORD=$(op --vault blumeops item get guxu3j7ajhjyey6xxl2ovsl2ui --fields password --reveal) \
  psql -h k8s-pg.tail8d86e.ts.net -U eblume -d miniflux

# Get miniflux app credentials (for applications)
kubectl -n databases get secret blumeops-pg-app -o jsonpath='{.data.uri}' | base64 -d

# Get postgres superuser credentials (emergency only)
kubectl -n databases get secret blumeops-pg-superuser -o jsonpath='{.data.password}' | base64 -d
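The two `kubectl ... | base64 -d` commands above follow the same pattern, so they could be wrapped in a small helper. This is a sketch, not part of the repository; the name `pg_secret_field` is hypothetical.

```shell
# Hypothetical helper: print one decoded field of a secret in `databases`.
# Usage: pg_secret_field <secret-name> <field>
pg_secret_field() {
  # Secret data is stored base64-encoded, so decode it after extraction.
  kubectl -n databases get secret "$1" -o "jsonpath={.data.$2}" | base64 -d
}
```

For example, `pg_secret_field blumeops-pg-app uri` would print the application connection URI, and `pg_secret_field blumeops-pg-superuser password` the emergency superuser password.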

Connecting via kubectl port-forward

Use this as an alternative if the Tailscale service is unavailable:

# Terminal 1: Port-forward to the primary
kubectl -n databases port-forward svc/blumeops-pg-rw 5432:5432

# Terminal 2: Connect as eblume
PGPASSWORD=$(op --vault blumeops item get guxu3j7ajhjyey6xxl2ovsl2ui --fields password --reveal) \
  psql -h localhost -U eblume -d miniflux
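If a second terminal is inconvenient, the port-forward can be run in the background for the duration of a single command. The following is a rough sketch under stated assumptions: `with_pg_forward` is a hypothetical name, and the fixed two-second wait is a guess that may need tuning.

```shell
# Hypothetical helper: run a command while a port-forward to the primary is open.
# Usage: with_pg_forward psql -h localhost -U eblume -d miniflux
with_pg_forward() {
  # Start the tunnel in the background and remember its PID.
  kubectl -n databases port-forward svc/blumeops-pg-rw 5432:5432 >/dev/null 2>&1 &
  local pf_pid=$! rc=0
  sleep 2   # crude wait for the tunnel to come up (assumption; tune as needed)
  "$@" || rc=$?
  # Tear the tunnel down and pass through the command's exit status.
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$rc"
}
```

The wrapped command still needs a password, for example through `PGPASSWORD` as in the two-terminal version above.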

Status

# Check cluster health
kubectl -n databases get cluster blumeops-pg

# Check pods
kubectl -n databases get pods -l cnpg.io/cluster=blumeops-pg

# Check managed roles status
kubectl -n databases get cluster blumeops-pg -o jsonpath='{.status.managedRolesStatus}' | jq

# Instance pod logs (this label selects the PostgreSQL pods, not the operator)
kubectl -n databases logs -l cnpg.io/cluster=blumeops-pg
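For scripted use, the health check can be turned into a polling loop. This sketch assumes the Cluster resource exposes `.spec.instances` and `.status.readyInstances` and treats the cluster as ready when the two match; `wait_for_pg` and its defaults are hypothetical.

```shell
# Hypothetical helper: poll until ready instances match the requested count.
# Usage: wait_for_pg [attempts] [delay-seconds]
wait_for_pg() {
  local attempts=${1:-30} delay=${2:-10} want ready i
  for i in $(seq 1 "$attempts"); do
    # Assumed fields: .spec.instances (desired) and .status.readyInstances (actual).
    want=$(kubectl -n databases get cluster blumeops-pg -o 'jsonpath={.spec.instances}' 2>/dev/null || true)
    ready=$(kubectl -n databases get cluster blumeops-pg -o 'jsonpath={.status.readyInstances}' 2>/dev/null || true)
    if [ -n "$want" ] && [ "$want" = "$ready" ]; then
      echo "blumeops-pg ready ($ready/$want)"
      return 0
    fi
    sleep "$delay"
  done
  echo "blumeops-pg not ready after $attempts attempts" >&2
  return 1
}
```

With the defaults, `wait_for_pg` gives up after roughly five minutes; `wait_for_pg 6 5` would try for about thirty seconds.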

Tailscale Exposure

Current: Temporary Service

k8s-pg.tail8d86e.ts.net - LoadBalancer service for testing during migration.

Phase 4: Production Service

After miniflux migrates to k8s, the pg.tail8d86e.ts.net Tailscale service will switch from brew PostgreSQL (indri) to this k8s cluster. At that point:

  1. Delete service-tailscale.yaml (the k8s-pg service)
  2. Update or create a service annotated with tailscale.com/hostname: "pg"
  3. Verify the orphaned k8s-pg device is removed from the tailnet
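The last step can be checked from any machine on the tailnet. The sketch below assumes the host name appears in the second column of `tailscale status` output; `tailnet_has_device` is a hypothetical name.

```shell
# Hypothetical helper: succeed if a host name still appears in `tailscale status`.
# Assumes the host name is the second whitespace-separated column of each line.
# Usage: tailnet_has_device k8s-pg && echo "k8s-pg is still on the tailnet"
tailnet_has_device() {
  tailscale status | awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}
```

Matching the whole column rather than a substring avoids confusing `pg` with `k8s-pg` once both names have existed on the tailnet.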