blumeops

Author	SHA1	Message	Date
Erich Blume	9fac4439b1	Migrate minikube ansible role from qemu2 to docker driver - Change driver from qemu2 to docker - Remove socket_vmnet and qemu dependencies - Remove NFS mount and minikube mount LaunchAgent/LaunchDaemon - Remove old podman zot-mirror.conf - Update containerd registry mirror config for docker driver - Uses host.minikube.internal:5050 to reach zot - Configures pull-through cache for docker.io, ghcr.io, quay.io - Add dynamic tailscale serve configuration for k8s API (port is dynamic with docker driver, not fixed at 6443) - Remove svc:k8s from tailscale_serve defaults (minikube role handles it) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-21 13:52:52 -08:00
Erich Blume	201c90b27e	Add mise task for minikube-indri kubectl config Creates reusable script that fetches certificates from indri and sets up kubeconfig at ~/.kube/minikube-indri/config.yml for remote kubectl access. Part of P5.1 migration to docker driver. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-21 13:44:30 -08:00
Erich Blume	5724b61fb4	save some work	2026-01-21 13:27:27 -08:00
Erich Blume	2c28a3fc54	Update tailscale_serve for qemu2 API server address The k8s API server is now at 192.168.105.2:6443 (inside qemu2 VM) instead of localhost:44491 (old podman port mapping). Note: TCP passthrough via tailscale svc:k8s is configured but connection times out - may need admin console approval or debugging. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-21 11:45:31 -08:00
Erich Blume	b096df4c71	Fix ansible idempotency and document macOS network permission - Check containerd registry config before writing to avoid unnecessary changes - Fix ansible_env deprecation warnings (use ansible_facts['env']) - Document macOS network permission popup for minikube mount - Document passwordless sudo configuration for indri - Add checks to skip sudo tasks when state already matches Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-21 11:24:44 -08:00
Erich Blume	40376b635f	Add LaunchDaemon/LaunchAgent for persistent NFS and minikube mounts - LaunchDaemon: mounts sifaka:/volume1/torrents to /Volumes/torrents-nfs at boot - LaunchAgent: runs minikube mount to pass through to /mnt/torrents in VM - Handlers to load both services when plist files change Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-21 08:22:53 -08:00
Erich Blume	26ec02e1be	P5.1: Add VM config to ansible role, mark phase complete - Add hosts file entry for registry.tail8d86e.ts.net in VM - Configure containerd registry mirror to use local zot - Update P5.1 doc with implementation notes and manual steps - Mark P5.1 as complete Manual steps still required after cluster creation: 1. sudo brew services start socket_vmnet (once per reboot) 2. sudo mount -t nfs sifaka:/volume1/torrents /Volumes/torrents-nfs 3. minikube mount /Volumes/torrents-nfs:/mnt/torrents (GUI session) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-21 08:03:21 -08:00
Erich Blume	4b2c1a346f	Add socket_vmnet for proper qemu2 networking - Install socket_vmnet via homebrew - Start socket_vmnet service (requires sudo) - Add --network=socket_vmnet to minikube start Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-20 21:41:47 -08:00
Erich Blume	0474962e89	Increase minikube resources to 6 CPUs and 12GB RAM Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-20 21:20:04 -08:00
Erich Blume	919f926241	P5.1: Update minikube role for QEMU2 driver - Change minikube driver from podman to qemu2 - Change container runtime from cri-o to containerd - Add qemu installation to minikube role - Remove podman role from indri.yml playbook - Update handlers for containerd instead of cri-o - Temporarily disable registry mirror config (needs containerd format) - Add k8s-storage synology user creation steps to P5.1 doc - Add post-migration tasks for zot registry mirror reconfiguration Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-20 21:06:53 -08:00
Erich Blume	7b60cca31e	Document P6 blocker and add P5.1 QEMU2 migration plan (#37 ) ## Summary - Document P6 (Kiwix/Transmission) blocker: podman driver cannot mount external volumes - Add P5.1 plan to migrate minikube from podman to QEMU2 driver - Update overview with corrected phase statuses and driver information ## Background P6 implementation (`feature/p6-kiwix-transmission`) was completed but blocked because all volume mount approaches failed with the podman driver: \| Approach \| Result \| \|----------\|--------\| \| NFS volume \| Failed - CAP_SYS_ADMIN required \| \| SMB CSI driver \| Failed - EPERM in rootless container \| \| `minikube mount` (9p) \| Failed - permission denied \| \| hostPath \| Failed - path doesn't exist in container \| Root cause: Podman driver runs minikube in a rootless container lacking kernel capabilities for filesystem mounts. ## What's Next 1. Merge this documentation PR 2. Execute P5.1 (QEMU2 migration) in a fresh session 3. Retry P6 with the QEMU2 driver ## Deployment and Testing - [x] No deployment needed - documentation only - [x] ArgoCD apps reset to main - [x] Cluster healthy (except kiwix/transmission intentionally offline) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/37	2026-01-20 20:49:48 -08:00
Erich Blume	b97d461a5a	P6: Kiwix and Transmission migration planning (#35 ) ## Summary - Detailed planning document for Phase 6 of k8s migration - Transmission as standalone general-purpose torrent service with web UI at torrent.tail8d86e.ts.net - NFS storage on sifaka (/volume1/torrents) shared between both services - Declarative ZIM torrent list in kiwix's ConfigMap, synced to transmission via sidecar - ZIM watcher CronJob for automatic kiwix restart when new archives complete - Supports both GitOps (declarative) and interactive (web UI) torrent management ## Architecture Highlights - torrent namespace: Standalone transmission with Tailscale ingress - kiwix namespace: kiwix-serve with torrent-sync sidecar - Shared NFS PV: Single PV referenced by PVCs in both namespaces - No backup needed: Sifaka is RAID 5/6 and already the backup target ## Deployment and Testing - [ ] Review plan document - [ ] Verify NFS export on sifaka is feasible - [ ] Approve architecture decisions 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/35	2026-01-20 18:42:11 -08:00
Erich Blume	f98103a58d	P5 done	2026-01-20 15:04:46 -08:00
Erich Blume	0439fbb704	P5: Migrate devpi to Kubernetes (#34 ) ## Summary - Migrate devpi PyPI caching proxy from indri LaunchAgent to Kubernetes - Custom container image with devpi-server + devpi-web + auto-init - StatefulSet with 50Gi PVC, Tailscale Ingress at pypi.tail8d86e.ts.net - Remove devpi from ansible playbooks and update CLAUDE.md with k8s workflow ## Key Changes - Add CRI-O registry mirror config for registry.tail8d86e.ts.net - Change ArgoCD apps to manual sync (was auto-sync causing issues) - 2Gi memory limit for Whoosh indexer (reclaimed after startup) ## Deployment and Testing - [x] devpi pod healthy in k8s - [x] pip install through proxy works - [x] mcquack 1.0.0 uploaded and installable - [x] Old devpi stopped on indri ## Post-Merge Reset ArgoCD to main: ``` argocd app set apps --revision main && argocd app sync apps argocd app set devpi --revision main && argocd app sync devpi ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/34	2026-01-20 14:55:37 -08:00
Erich Blume	b2307412fc	Add P4 implementation notes and mark complete	2026-01-20 09:10:23 -08:00
Erich Blume	735b643429	P4: Miniflux migration + PostgreSQL consolidation (#33 ) ## Summary - Deploy miniflux in k8s via ArgoCD - Expose via Tailscale Ingress at feed.tail8d86e.ts.net - Retire brew PostgreSQL (no longer needed) - Rename k8s-pg to pg (canonical hostname) - Remove ansible miniflux and postgresql roles - Update borgmatic to backup pg.tail8d86e.ts.net - Update all zk documentation ## Deployment and Testing - [x] Miniflux pod running in k8s - [x] User login works at https://feed.tail8d86e.ts.net - [x] Feeds and entries visible - [x] brew miniflux and postgresql stopped - [x] Tailscale services migrated (feed, pg) - [x] zk documentation updated - [x] Run ansible to apply role removals - [ ] Verify borgmatic backup with new pg hostname 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/33	2026-01-20 09:04:47 -08:00
Erich Blume	463f476374	P3 done Updated P3_postgresql.complete.md with full implementation notes including: - borgmatic borg path fix - Disaster recovery testing - CloudNativePG managed roles for borgmatic user - Dual database backup configuration - ACL grant for homelab → k8s - ArgoCD selfHeal disabled for feature branch workflow - CNPG default values to prevent drift Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 18:19:33 -08:00
Erich Blume	e69a3df2d4	P3 done	2026-01-19 18:03:48 -08:00
Erich Blume	0c6f0a13c3	Add CNPG default values to prevent ArgoCD drift CloudNativePG operator fills in connectionLimit, ensure, and inherit defaults on managed roles. Adding these explicitly keeps ArgoCD in sync. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 18:02:42 -08:00
Erich Blume	eb952aae01	P3: PostgreSQL disaster recovery test and borgmatic k8s-pg backup (#32 ) ## Summary - Fixed borgmatic `borg: command not found` by adding `local_path` config option - Successfully tested disaster recovery: restored miniflux data from borgmatic backup to k8s-pg - Added borgmatic user to k8s-pg via CloudNativePG managed roles - Configured borgmatic to backup both localhost and k8s-pg PostgreSQL databases - Added Tailscale ACL grant for `tag:homelab` → `tag:k8s` on port 5432 - Disabled selfHeal on apps app to allow manual revision changes during development ## Changes - `ansible/roles/borgmatic/` - Added `local_path` and k8s-pg database entry - `ansible/roles/postgresql/tasks/main.yml` - Added k8s-pg to `.pgpass` - `argocd/apps/apps.yaml` - Disabled selfHeal - `argocd/manifests/databases/blumeops-pg.yaml` - Added borgmatic managed role - `argocd/manifests/databases/secret-borgmatic.yaml.tpl` - New secret template - `pulumi/policy.hujson` - Added ACL grant for backup access ## Deployment and Testing - [x] Borgmatic backup runs successfully - [x] Miniflux data restored to k8s-pg (2 users, 2 feeds, 44 entries verified) - [x] borgmatic user created in k8s-pg with pg_read_all_data role - [x] Both localhost and k8s-pg databases in backup archive - [x] zk documentation updated (borgmatic.md, postgresql.md) - [ ] After merge: set blumeops-pg app back to main revision 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/32	2026-01-19 18:00:32 -08:00
Erich Blume	f2541c3f77	Fix minikube role idempotency for zot mirror config (#31 ) ## Summary - Fixed trailing newline mismatch in config comparison (ansible command module strips whitespace, slurp preserves it) - Only copy temp file when config actually needs updating (avoids spurious changes) - Task now properly skips when config is already correct ## Deployment and Testing - [x] Verified idempotency: `changed=0` on repeated runs - [x] Verified change detection: corrupted config triggers proper update - [x] ansible-lint passes 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/31	2026-01-19 16:19:52 -08:00
Erich Blume	130c044523	Fix hanging minikube provision	2026-01-19 15:49:11 -08:00
Erich Blume	f0c28a3cdd	Rename P2 plan to .complete.md	2026-01-19 15:06:27 -08:00
Erich Blume	45dfefa8df	Mark P2 complete with implementation notes Documents lessons learned: - SSH credential template for all forge repos - Kustomize patches must omit namespace for matching - Tailscale hostname cutover requires manual admin console deletion - ArgoCD workflow: all apps target main, manual sync for control Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 15:06:14 -08:00
Erich Blume	258c88f2f7	Fix kustomize patch: remove namespace for proper matching Kustomize matches patches before namespace transformation, so the patch file shouldn't specify namespace (kustomization.yaml adds it). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 15:00:33 -08:00
Erich Blume	623b122f58	Fix kustomization: known_hosts as resource not patch The argocd-ssh-known-hosts-cm ConfigMap needs to be a resource, not a patch, because the upstream install.yaml includes it inline in a way kustomize can't patch. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 14:45:33 -08:00
Erich Blume	7e6742ad24	K8s Migration Phase 2: Grafana to Kubernetes (#30 ) ## Summary - Migrate Grafana from Homebrew/Ansible to Kubernetes deployment - Switch CloudNativePG to use forge-mirrored Helm chart (HTTPS, no auth needed) - Add Grafana Helm chart deployment via ArgoCD with multi-source pattern - Add Grafana config (Tailscale Ingress, 9 dashboard ConfigMaps) - Update Loki to bind 0.0.0.0 for k8s pod access via `host.containers.internal` ## Key Changes - `argocd/apps/grafana.yaml` - Grafana Helm chart Application - `argocd/apps/grafana-config.yaml` - Ingress + dashboard ConfigMaps - `argocd/apps/cloudnative-pg.yaml` - Now uses forge mirror instead of external Helm repo - `ansible/roles/loki/templates/loki-config.yaml.j2` - Bind 0.0.0.0 ## Deployment and Testing - [x] Deploy Loki config change: `mise run provision-indri -- --tags loki` - [x] Create namespace: `ki create namespace monitoring` - [x] Create secret: `op inject -i argocd/manifests/grafana-config/secret-admin.yaml.tpl \| ki apply -f -` - [x] Sync ArgoCD apps (grafana, grafana-config) - [x] Verify Grafana works at https://grafana.tail8d86e.ts.net - [x] Remove svc:grafana from ansible tailscale_serve - [x] Stop brew grafana: `ssh indri 'brew services stop grafana'` - [x] Delete ansible grafana role 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/30	2026-01-19 14:40:25 -08:00
Erich Blume	4c1c4b92e1	Scan full repo history in trufflehog	2026-01-19 10:12:56 -08:00
Erich Blume	680ad1095b	Rename P1 to complete	2026-01-19 10:03:52 -08:00
Erich Blume	a8f4d00294	K8s Migration Phase 1: Infrastructure Setup (#29 ) ## Summary - Split k8s migration plan into phases folder for easier navigation - Added `tag:k8s` to Pulumi ACLs for Kubernetes workloads - Phase 1 work in progress ## Phase 1 Goals - Tailscale Kubernetes Operator - CloudNativePG Operator - PostgreSQL cluster for future app migrations ## Deployment and Testing - [ ] Review Phase 1 plan - [ ] `mise run tailnet-preview` to verify ACL changes - [ ] `mise run tailnet-up` to apply ACL changes - [ ] Create Tailscale OAuth client (manual) - [ ] Deploy operators and PostgreSQL cluster 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/29	2026-01-19 09:49:52 -08:00
Erich Blume	61dced048b	Fix borgmatic-metrics script PATH issue (#28 ) ## Summary - Fixed borgmatic-metrics script failing in LaunchAgent context - Changed from `mise x -- borg` to absolute paths (`/opt/homebrew/bin/borg`, `/opt/homebrew/bin/jq`) - This fixes the Grafana dashboard showing "DOWN" for Repository Status and missing time series data ## Deployment and Testing - [ ] Run `mise run provision-indri -- --tags borgmatic-metrics` to deploy the fix - [ ] Wait for the hourly metrics collection (or manually run `ssh indri '~/bin/borgmatic-metrics'`) - [ ] Verify Grafana dashboard shows "UP" status and populated graphs 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/28	2026-01-18 14:57:35 -08:00
Erich Blume	3679124ebd	Expose Kubernetes API as Tailscale service (Step 0.14) (#27 ) ## Summary - Add `tag:k8s-api` to Pulumi ACLs and indri device tags - Configure Tailscale serve with TCP passthrough for k8s API at `k8s.tail8d86e.ts.net` - Update minikube role to include `k8s.tail8d86e.ts.net` in certificate SANs - Add `apiserver_port` config option (internal port 6443, dynamic host port with podman driver) - Document Step 0.14 in k8s-migration plan (added post-Phase 0 completion) The Kubernetes API is now accessible at `https://k8s.tail8d86e.ts.net` using TCP passthrough to preserve mTLS authentication. ## Deployment and Testing - [x] Pulumi ACLs applied - [x] Tailscale service created and approved in admin console - [x] Minikube cluster recreated with new cert SANs - [x] tailscale serve configured with TCP passthrough - [x] 1Password credentials updated with new certs - [x] Kubeconfig updated on gilbert - [x] `mise run indri-services-check` passes - [x] `kubectl --context=minikube-indri get nodes` works via Tailscale 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/27	2026-01-18 12:49:20 -08:00
Erich Blume	19a82373d5	K8s Migration Phase 0: Foundation Infrastructure (#26 ) ## Summary - Step 0.1: Update Pulumi ACLs with tag:registry - Step 0.3: Create Zot registry ansible role with mcquack LaunchAgent - Step 0.4: Add Zot to Tailscale Serve configuration - Step 0.5: Create Zot metrics role for Prometheus scraping - Step 0.6: Add Zot log collection to Alloy - Step 0.7: Update indri-services-check with zot checks - Step 0.8: Add podman role for container runtime - Step 0.9: Add minikube role for Kubernetes cluster - Step 0.10: Configure remote kubectl access with 1Password credentials ## Remaining Steps - [ ] Step 0.11: Add minikube to indri-services-check - [ ] Step 0.12: Create zettelkasten documentation - [ ] Step 0.13: Verify main playbook (already done - roles added) ## Deployment and Testing - [x] Zot registry deployed and accessible at https://registry.tail8d86e.ts.net - [x] Podman machine running on indri - [x] Minikube cluster running on indri - [x] kubectl access from gilbert working with 1Password credentials - [ ] indri-services-check passes all checks 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/26	2026-01-18 12:06:28 -08:00
Erich Blume	ee196b0c10	Fix Phase 0 plan based on review feedback (#25 ) ## Summary - Step 0.3: Use launchctl unload/load pattern for handlers (consistent with existing handlers) - Step 0.6: Correct file path - add zot logs to alloy defaults/main.yml - Step 0.9: Use cri-o runtime instead of containerd - Step 0.10: Simplify kubeconfig instructions - focus on goal not implementation ## Deployment and Testing - [x] Documentation-only change, no deployment needed 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/25	2026-01-17 20:07:10 -08:00
Erich Blume	c8433467c1	Add Kubernetes migration plan documentation (#24 ) ## Summary - Comprehensive phased plan for migrating blumeops services to minikube - Technical decisions documented: Zot registry, Podman driver, CloudNativePG, Tailscale Operator - 9 migration phases with verification and rollback procedures - LaunchAgent absolute path requirements documented - Observability requirements (zk docs, logging, metrics, dashboards) for new services ## Deployment and Testing - [x] Plan document created at `docs/k8s-migration.md` - [ ] Review plan phases for completeness - [ ] Validate technical decisions align with requirements 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/24	2026-01-17 17:34:53 -08:00
Erich Blume	e6d302b40b	Harden Tailscale ACL policy with least-privilege grants (#23 ) ## Summary - Replace permissive wildcard ACL (`` -> ``) with specific service grants - Admin: full access to all services including NAS - Member: user-facing services only (no Grafana/Loki/NAS) - Add device tagging for gilbert (workstation) and sifaka (NAS) via Pulumi - SSH hardening: remove root access, use "check" action with MFA - Add ACL tests to validate policy behavior ## Deployment and Testing - [x] Pulumi preview passes - [x] HuJSON syntax validated - [x] ACL tests defined and passing - [ ] Deploy with `mise run tailnet-up` - [ ] Verify SSH access from gilbert to indri - [ ] Verify Allison cannot access Grafana/Loki/NAS 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/23	2026-01-17 11:58:04 -08:00
Erich Blume	0918764e93	Rename Node Exporter dashboard to macOS (#22 ) ## Summary - Renamed dashboard from "Node Exporter - macOS" to just "macOS" since it now uses Alloy - Updated filename, title, uid, and tags to reflect the change ## Deployment and Testing - [ ] Deploy with `mise run provision-indri -- --tags grafana` - [ ] Verify dashboard accessible at https://grafana.tail8d86e.ts.net 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/22	2026-01-17 09:29:19 -08:00
Erich Blume	3962e5a7de	Fix borgmatic PostgreSQL backup and update backup sources (#21 ) ## Summary - Fix PostgreSQL backup failure by adding explicit `pg_dump_command` path (was failing with "pg_dump: command not found" in LaunchAgent) - Remove `~/code/3rd/kiwix-tools` from backups (was just symlinks to ZIM archives in transmission) - Enable Loki log backup by removing from exclude_patterns ## Deployment and Testing - [x] Dry run with `--check --diff` shows expected changes - [ ] Deploy with `mise run provision-indri -- --tags borgmatic` - [ ] Verify config deployed: `ssh indri 'cat ~/.config/borgmatic/config.yaml'` - [ ] Run manual backup to test: `ssh indri 'mise x -- borgmatic create --verbosity 1'` - [ ] Verify PostgreSQL dump succeeds (no "pg_dump: command not found" error) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/21	2026-01-17 09:22:01 -08:00
Erich Blume	75426be1dc	Remove ansible role meta dependencies to fix duplicate execution (#20 ) ## Summary - Remove all `meta/main.yml` dependencies from ansible roles - Role ordering is now controlled entirely by `indri.yml` playbook - Fix incorrect roles path in CLAUDE.md (`playbooks/roles` → `roles`) ## Why Ansible's tag accumulation behavior prevents proper role deduplication when using meta dependencies. When a role is pulled in as a dependency, the parent role's tags are added to the dependency's tags (e.g., `[loki]` becomes `[alloy, loki]`), making them appear as different invocations to Ansible and causing roles to run multiple times. ## Deployment and Testing - [x] Verified with `ansible-playbook --list-tasks` that each role now appears exactly once - [x] Run full provision to verify no regressions 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/20	2026-01-16 22:50:34 -08:00
Erich Blume	9931829d03	Add pre-commit hooks for code quality (#19 ) ## Summary - Add pre-commit framework with hooks for YAML, Ansible, Python, shell, TOML, JSON, and secret detection - Fix all 91+ ansible-lint violations (variable naming, handler capitalization, changed_when) - Fix shellcheck warnings in mise-tasks scripts - Document pre-commit setup in README.md ## Deployment and Testing - [x] All pre-commit hooks pass (`uvx pre-commit run --all-files`) - [x] Test ansible playbook with `--check` mode - [x] Run `mise run indri-services-check` after deploy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/19	2026-01-16 19:33:02 -08:00
Erich Blume	78f14f8bde	Reworded CLAUDE.md	2026-01-16 18:47:47 -08:00
Erich Blume	d3d3041b27	Decouple ZIM/torrent ansible tasks for faster provisioning (#18 ) ## Summary - Simplify kiwix role from 213 lines to 151 lines (-30%) - Replace per-archive torrent status loops with single shell command - Decouple kiwix startup from declared inventory - now serves whatever completed ZIM files exist - Fix tailscale_serve role to handle empty JSON in check mode ## Performance improvement - Before: ~132 operations (44 archives × 3 loops for status check, recheck, symlink) - After: ~5 operations (1 shell script + 1 find + conditional symlinks) - Expected reduction: ~3 minutes per ansible run ## Test plan - [x] Ran `mise run provision-indri -- --check --diff` to preview changes - [x] Ran `mise run provision-indri` to apply changes - [x] Ran `mise run indri-services-check` - all services healthy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/18	2026-01-16 15:14:00 -08:00
Erich Blume	812b78bf61	Use explicit PostgreSQL superuser name and fix check mode (#17 ) ## Summary - Add `postgresql_superuser` variable (`eblume`) to prevent PostgreSQL from inheriting OS username during initdb - Update all psql/createdb commands to use explicit `-U` flag - Add `check_mode: false` to op commands so 1Password fetches run during `--check` mode - Add PostgreSQL and Miniflux health checks to indri-services-check ## Test plan - [x] Renamed existing superuser from `erichblume` to `eblume` - [x] Ran `mise run provision-indri -- --tags postgresql --check --diff` successfully - [x] Verified connection as `eblume` superuser via Tailscale - [x] Ran `mise run indri-services-check` - all services healthy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/17	2026-01-16 14:41:36 -08:00
Erich Blume	adf6f4fbe9	Add PostgreSQL and Miniflux services to tailnet (#16 ) ## Summary - Add PostgreSQL 18 as a new service at `pg.tail8d86e.ts.net:5432` - Add Miniflux RSS/Atom feed reader at `feed.tail8d86e.ts.net` - Both services managed via homebrew/brew services - Pulumi ACL tags added (tag:pg, tag:feed) - Alloy log collection configured for both services - Zettelkasten documentation updated ## Manual Setup Required Before running ansible, the following steps are needed on indri: ### 1. Apply Pulumi tags ```bash mise run tailnet-up ``` Then apply tags to indri in Tailscale admin console. ### 2. Create 1Password entries - miniflux PostgreSQL user password - miniflux admin password (for first run) ### 3. Set PostgreSQL user password (after ansible installs postgres) ```bash ssh indri '/opt/homebrew/opt/postgresql@18/bin/psql -c "ALTER USER miniflux PASSWORD '\''your-password'\'';"' ``` ### 4. Create password files on indri ```bash ssh indri 'echo "your-db-password" > ~/.miniflux-db-password && chmod 600 ~/.miniflux-db-password' ssh indri 'echo "your-admin-password" > ~/.miniflux-admin-password && chmod 600 ~/.miniflux-admin-password' ``` ### 5. Create ~/.pgpass for borgmatic ```bash ssh indri 'echo "localhost:5432:miniflux:miniflux:YOUR_PASSWORD" > ~/.pgpass && chmod 600 ~/.pgpass' ``` ### 6. Run ansible with first-run admin creation ```bash mise run provision-indri -- -e miniflux_create_admin=1 ``` ### 7. Update borgmatic config Add to `~/.config/borgmatic/config.yaml` on indri: ```yaml postgresql_databases: - name: miniflux hostname: localhost port: 5432 username: miniflux ``` ### 8. Cleanup after first run ```bash ssh indri 'rm ~/.miniflux-admin-password' ``` ## Test plan - [ ] Run `mise run tailnet-up` and verify Pulumi changes - [ ] Apply tags to indri in Tailscale admin - [ ] Run `mise run provision-indri -- --check --diff` for dry run - [ ] Run `mise run provision-indri -- -e miniflux_create_admin=1` - [ ] Approve services in Tailscale admin - [ ] Verify PostgreSQL: `ssh indri '/opt/homebrew/opt/postgresql@18/bin/pg_isready'` - [ ] Verify Miniflux: `curl https://feed.tail8d86e.ts.net/healthcheck` - [ ] Run `mise run indri-services-check` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/16	2026-01-16 12:30:20 -08:00
Erich Blume	3f4e40f3ae	Add Pulumi for tailnet IaC management (#15 ) ## Summary - Manage tail8d86e.ts.net ACLs, tags, and DNS via Pulumi + Python - State stored in Pulumi Cloud (free tier) to avoid circular dependency - OAuth authentication via 1Password for secure credential management - New mise tasks: `tailnet-preview`, `tailnet-up` ## Architecture Two-layer approach: - Layer 1 (Pulumi): Tailnet-wide config (ACLs, tags, DNS) - Layer 2 (Ansible): Node-local `tailscale serve` config (unchanged) ## Test plan - [x] Exported current ACL from Tailscale API - [x] Imported existing ACL into Pulumi state - [x] Verified `mise run tailnet-preview` shows no changes - [x] Verified `mise run tailnet-up` applies successfully 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/15	2026-01-15 20:55:25 -08:00
Erich Blume	72c2dd7096	Add blumeops-tasks mise task for Todoist integration (#14 ) ## Summary - Add `mise run blumeops-tasks` to fetch and display tasks from Todoist - Uses uv run script with inline dependencies (httpx, rich) - Fetches API credential securely via 1Password CLI - Sorts tasks by custom priority order: p1, p2, p4, p3 (backlog last) - Documents the task discovery workflow in CLAUDE.md ## Test plan - [x] Verified `mise run blumeops-tasks` fetches and displays tasks correctly - [x] Confirmed priority sorting works as expected 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/14	2026-01-15 18:03:19 -08:00
Erich Blume	ae1513e7e9	Add Plex Media Server observability (#13 ) ## Summary - Add `plex_metrics` ansible role with textfile collector for Prometheus metrics - Add Plex log collection to Alloy (forwards to Loki) - Add Grafana dashboard for Plex monitoring (status, library counts, sessions, transcoding, logs) ## Metrics Collected - `plex_up` - server health - `plex_version_info` - server version - `plex_sessions_total/playing/paused` - active sessions - `plex_transcode_sessions_total/video/audio` - transcoding status - `plex_library_items{library,type}` - library item counts ## Prerequisites Plex token must be stored at `~/.plex-token` on indri (already done). ## Test plan - [x] Dry-run passed (`mise run provision-indri -- --check --diff`) - [ ] Apply changes (`mise run provision-indri`) - [ ] Verify metrics: `ssh indri 'cat /opt/homebrew/var/node_exporter/textfile/plex.prom'` - [ ] Verify logs in Grafana Explore: `{service="plex"}` - [ ] Check Plex dashboard in Grafana 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/13	2026-01-15 15:27:59 -08:00
Erich Blume	2a1359a3b6	Fix ansible handler timeouts for alloy and loki restarts (#12 ) ## Summary - Use async with poll: 0 for alloy and loki restart handlers - Fire-and-forget approach prevents ansible from hanging on graceful shutdown ## Test plan - [x] Manually verified `brew services restart grafana-alloy` works - [x] Run full ansible playbook and verify it completes without timeout 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/12	2026-01-15 13:56:11 -08:00
Erich Blume	ba5cd75ee2	Fix ansible handler timeouts for alloy and loki restarts Use async with poll: 0 to fire-and-forget service restarts. These services have graceful shutdown periods that can exceed ansible's default command timeout. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-15 12:39:28 -08:00
Erich Blume	242c1880de	Add Grafana Alloy and Loki for unified observability (#11 ) ## Summary - Add Grafana Alloy to replace node_exporter for metrics collection - Add Loki for log aggregation and storage - Configure Alloy to collect logs from all services (grafana, forgejo, prometheus, tailscale, transmission, devpi, kiwix, borgmatic) - Update Prometheus to accept metrics via remote_write - Add Loki datasource to Grafana ## Test plan - [ ] Run \`mise run provision-indri -- --check --diff\` to verify changes - [ ] Apply with \`mise run provision-indri\` - [ ] Verify services: \`mise run indri-services-check\` - [ ] Check Grafana Explore with Loki datasource - [ ] Query logs: \`{service="grafana"}\` - [ ] Verify metrics still flowing to Prometheus dashboards 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/11	2026-01-15 12:24:13 -08:00
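
The custom priority order mentioned in commit 72c2dd7096 (p1, p2, p4, p3, with backlog last) could be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: the task records, the `priority` field holding a `"p1"`-style label, and the names `PRIORITY_ORDER` and `sort_tasks` are all assumptions made for the example.

```python
# Hypothetical sketch of a custom priority sort: p1, p2, p4, p3 (backlog last).
# The task shape and the "priority" label field are assumptions for illustration.
PRIORITY_ORDER = ["p1", "p2", "p4", "p3"]
RANK = {label: index for index, label in enumerate(PRIORITY_ORDER)}


def sort_tasks(tasks):
    """Return tasks ordered by the custom rank; unknown labels sort last."""
    fallback = len(PRIORITY_ORDER)
    # sorted() is stable, so tasks with equal rank keep their input order.
    return sorted(tasks, key=lambda task: RANK.get(task["priority"], fallback))


tasks = [
    {"title": "backlog idea", "priority": "p3"},
    {"title": "routine chore", "priority": "p4"},
    {"title": "urgent fix", "priority": "p1"},
    {"title": "planned work", "priority": "p2"},
]

for task in sort_tasks(tasks):
    print(task["priority"], task["title"])
```

A lookup table keeps the unusual ordering explicit in one place, which is easier to adjust than a chain of comparisons if the order changes again.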

91 commits