blumeops

Author	SHA1	Message	Date
Erich Blume	0956e0ed2b	Update P4 plan with implementation notes and mark complete	2026-01-19 21:44:32 -08:00
Erich Blume	d8fbe7fbee	Remove miniflux and postgresql ansible roles - Remove postgresql and miniflux roles from playbook - Delete ansible/roles/miniflux/ and ansible/roles/postgresql/ - Update borgmatic to backup only pg.tail8d86e.ts.net (k8s) - Move .pgpass management to borgmatic role - Disable postgres metrics in alloy (k8s CNPG metrics TBD) - Remove svc:pg and svc:feed from tailscale_serve	2026-01-19 21:31:18 -08:00
Erich Blume	b9fb7f53cf	Rename k8s-pg to pg (canonical PostgreSQL hostname)	2026-01-19 19:42:02 -08:00
Erich Blume	ad2ad22ccf	Fix miniflux secret to use CNPG-generated password The miniflux user password is auto-generated by CloudNativePG and stored in blumeops-pg-app secret. Updated README and secret template to document the correct setup process.	2026-01-19 19:14:50 -08:00
Erich Blume	8875cc4a36	Add miniflux k8s manifests and ArgoCD app	2026-01-19 18:33:53 -08:00
Erich Blume	463f476374	P3 done Updated P3_postgresql.complete.md with full implementation notes including: - borgmatic borg path fix - Disaster recovery testing - CloudNativePG managed roles for borgmatic user - Dual database backup configuration - ACL grant for homelab → k8s - ArgoCD selfHeal disabled for feature branch workflow - CNPG default values to prevent drift Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 18:19:33 -08:00
Erich Blume	e69a3df2d4	P3 done	2026-01-19 18:03:48 -08:00
Erich Blume	0c6f0a13c3	Add CNPG default values to prevent ArgoCD drift CloudNativePG operator fills in connectionLimit, ensure, and inherit defaults on managed roles. Adding these explicitly keeps ArgoCD in sync. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 18:02:42 -08:00
Erich Blume	eb952aae01	P3: PostgreSQL disaster recovery test and borgmatic k8s-pg backup (#32 ) ## Summary - Fixed borgmatic `borg: command not found` by adding `local_path` config option - Successfully tested disaster recovery: restored miniflux data from borgmatic backup to k8s-pg - Added borgmatic user to k8s-pg via CloudNativePG managed roles - Configured borgmatic to backup both localhost and k8s-pg PostgreSQL databases - Added Tailscale ACL grant for `tag:homelab` → `tag:k8s` on port 5432 - Disabled selfHeal on apps app to allow manual revision changes during development ## Changes - `ansible/roles/borgmatic/` - Added `local_path` and k8s-pg database entry - `ansible/roles/postgresql/tasks/main.yml` - Added k8s-pg to `.pgpass` - `argocd/apps/apps.yaml` - Disabled selfHeal - `argocd/manifests/databases/blumeops-pg.yaml` - Added borgmatic managed role - `argocd/manifests/databases/secret-borgmatic.yaml.tpl` - New secret template - `pulumi/policy.hujson` - Added ACL grant for backup access ## Deployment and Testing - [x] Borgmatic backup runs successfully - [x] Miniflux data restored to k8s-pg (2 users, 2 feeds, 44 entries verified) - [x] borgmatic user created in k8s-pg with pg_read_all_data role - [x] Both localhost and k8s-pg databases in backup archive - [x] zk documentation updated (borgmatic.md, postgresql.md) - [ ] After merge: set blumeops-pg app back to main revision 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/32	2026-01-19 18:00:32 -08:00
Erich Blume	f2541c3f77	Fix minikube role idempotency for zot mirror config (#31 ) ## Summary - Fixed trailing newline mismatch in config comparison (ansible command module strips whitespace, slurp preserves it) - Only copy temp file when config actually needs updating (avoids spurious changes) - Task now properly skips when config is already correct ## Deployment and Testing - [x] Verified idempotency: `changed=0` on repeated runs - [x] Verified change detection: corrupted config triggers proper update - [x] ansible-lint passes 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/31	2026-01-19 16:19:52 -08:00
Erich Blume	130c044523	Fix hanging minikube provision	2026-01-19 15:49:11 -08:00
Erich Blume	f0c28a3cdd	Rename P2 plan to .complete.md	2026-01-19 15:06:27 -08:00
Erich Blume	45dfefa8df	Mark P2 complete with implementation notes Documents lessons learned: - SSH credential template for all forge repos - Kustomize patches must omit namespace for matching - Tailscale hostname cutover requires manual admin console deletion - ArgoCD workflow: all apps target main, manual sync for control Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 15:06:14 -08:00
Erich Blume	258c88f2f7	Fix kustomize patch: remove namespace for proper matching Kustomize matches patches before namespace transformation, so the patch file shouldn't specify namespace (kustomization.yaml adds it). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 15:00:33 -08:00
Erich Blume	623b122f58	Fix kustomization: known_hosts as resource not patch The argocd-ssh-known-hosts-cm ConfigMap needs to be a resource, not a patch, because the upstream install.yaml includes it inline in a way kustomize can't patch. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 14:45:33 -08:00
Erich Blume	7e6742ad24	K8s Migration Phase 2: Grafana to Kubernetes (#30 ) ## Summary - Migrate Grafana from Homebrew/Ansible to Kubernetes deployment - Switch CloudNativePG to use forge-mirrored Helm chart (HTTPS, no auth needed) - Add Grafana Helm chart deployment via ArgoCD with multi-source pattern - Add Grafana config (Tailscale Ingress, 9 dashboard ConfigMaps) - Update Loki to bind 0.0.0.0 for k8s pod access via `host.containers.internal` ## Key Changes - `argocd/apps/grafana.yaml` - Grafana Helm chart Application - `argocd/apps/grafana-config.yaml` - Ingress + dashboard ConfigMaps - `argocd/apps/cloudnative-pg.yaml` - Now uses forge mirror instead of external Helm repo - `ansible/roles/loki/templates/loki-config.yaml.j2` - Bind 0.0.0.0 ## Deployment and Testing - [x] Deploy Loki config change: `mise run provision-indri -- --tags loki` - [x] Create namespace: `ki create namespace monitoring` - [x] Create secret: `op inject -i argocd/manifests/grafana-config/secret-admin.yaml.tpl | ki apply -f -` - [x] Sync ArgoCD apps (grafana, grafana-config) - [x] Verify Grafana works at https://grafana.tail8d86e.ts.net - [x] Remove svc:grafana from ansible tailscale_serve - [x] Stop brew grafana: `ssh indri 'brew services stop grafana'` - [x] Delete ansible grafana role 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/30	2026-01-19 14:40:25 -08:00
Erich Blume	4c1c4b92e1	Scan full repo history in trufflehog	2026-01-19 10:12:56 -08:00
Erich Blume	680ad1095b	Rename P1 to complete	2026-01-19 10:03:52 -08:00
Erich Blume	a8f4d00294	K8s Migration Phase 1: Infrastructure Setup (#29 ) ## Summary - Split k8s migration plan into phases folder for easier navigation - Added `tag:k8s` to Pulumi ACLs for Kubernetes workloads - Phase 1 work in progress ## Phase 1 Goals - Tailscale Kubernetes Operator - CloudNativePG Operator - PostgreSQL cluster for future app migrations ## Deployment and Testing - [ ] Review Phase 1 plan - [ ] `mise run tailnet-preview` to verify ACL changes - [ ] `mise run tailnet-up` to apply ACL changes - [ ] Create Tailscale OAuth client (manual) - [ ] Deploy operators and PostgreSQL cluster 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/29	2026-01-19 09:49:52 -08:00
Erich Blume	61dced048b	Fix borgmatic-metrics script PATH issue (#28 ) ## Summary - Fixed borgmatic-metrics script failing in LaunchAgent context - Changed from `mise x -- borg` to absolute paths (`/opt/homebrew/bin/borg`, `/opt/homebrew/bin/jq`) - This fixes the Grafana dashboard showing "DOWN" for Repository Status and missing time series data ## Deployment and Testing - [ ] Run `mise run provision-indri -- --tags borgmatic-metrics` to deploy the fix - [ ] Wait for the hourly metrics collection (or manually run `ssh indri '~/bin/borgmatic-metrics'`) - [ ] Verify Grafana dashboard shows "UP" status and populated graphs 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/28	2026-01-18 14:57:35 -08:00
Erich Blume	3679124ebd	Expose Kubernetes API as Tailscale service (Step 0.14) (#27 ) ## Summary - Add `tag:k8s-api` to Pulumi ACLs and indri device tags - Configure Tailscale serve with TCP passthrough for k8s API at `k8s.tail8d86e.ts.net` - Update minikube role to include `k8s.tail8d86e.ts.net` in certificate SANs - Add `apiserver_port` config option (internal port 6443, dynamic host port with podman driver) - Document Step 0.14 in k8s-migration plan (added post-Phase 0 completion) The Kubernetes API is now accessible at `https://k8s.tail8d86e.ts.net` using TCP passthrough to preserve mTLS authentication. ## Deployment and Testing - [x] Pulumi ACLs applied - [x] Tailscale service created and approved in admin console - [x] Minikube cluster recreated with new cert SANs - [x] tailscale serve configured with TCP passthrough - [x] 1Password credentials updated with new certs - [x] Kubeconfig updated on gilbert - [x] `mise run indri-services-check` passes - [x] `kubectl --context=minikube-indri get nodes` works via Tailscale 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/27	2026-01-18 12:49:20 -08:00
Erich Blume	19a82373d5	K8s Migration Phase 0: Foundation Infrastructure (#26 ) ## Summary - Step 0.1: Update Pulumi ACLs with tag:registry - Step 0.3: Create Zot registry ansible role with mcquack LaunchAgent - Step 0.4: Add Zot to Tailscale Serve configuration - Step 0.5: Create Zot metrics role for Prometheus scraping - Step 0.6: Add Zot log collection to Alloy - Step 0.7: Update indri-services-check with zot checks - Step 0.8: Add podman role for container runtime - Step 0.9: Add minikube role for Kubernetes cluster - Step 0.10: Configure remote kubectl access with 1Password credentials ## Remaining Steps - [ ] Step 0.11: Add minikube to indri-services-check - [ ] Step 0.12: Create zettelkasten documentation - [ ] Step 0.13: Verify main playbook (already done - roles added) ## Deployment and Testing - [x] Zot registry deployed and accessible at https://registry.tail8d86e.ts.net - [x] Podman machine running on indri - [x] Minikube cluster running on indri - [x] kubectl access from gilbert working with 1Password credentials - [ ] indri-services-check passes all checks 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/26	2026-01-18 12:06:28 -08:00
Erich Blume	ee196b0c10	Fix Phase 0 plan based on review feedback (#25 ) ## Summary - Step 0.3: Use launchctl unload/load pattern for handlers (consistent with existing handlers) - Step 0.6: Correct file path - add zot logs to alloy defaults/main.yml - Step 0.9: Use cri-o runtime instead of containerd - Step 0.10: Simplify kubeconfig instructions - focus on goal not implementation ## Deployment and Testing - [x] Documentation-only change, no deployment needed 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/25	2026-01-17 20:07:10 -08:00
Erich Blume	c8433467c1	Add Kubernetes migration plan documentation (#24 ) ## Summary - Comprehensive phased plan for migrating blumeops services to minikube - Technical decisions documented: Zot registry, Podman driver, CloudNativePG, Tailscale Operator - 9 migration phases with verification and rollback procedures - LaunchAgent absolute path requirements documented - Observability requirements (zk docs, logging, metrics, dashboards) for new services ## Deployment and Testing - [x] Plan document created at `docs/k8s-migration.md` - [ ] Review plan phases for completeness - [ ] Validate technical decisions align with requirements 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/24	2026-01-17 17:34:53 -08:00
Erich Blume	e6d302b40b	Harden Tailscale ACL policy with least-privilege grants (#23 ) ## Summary - Replace permissive wildcard ACL (`*` -> `*`) with specific service grants - Admin: full access to all services including NAS - Member: user-facing services only (no Grafana/Loki/NAS) - Add device tagging for gilbert (workstation) and sifaka (NAS) via Pulumi - SSH hardening: remove root access, use "check" action with MFA - Add ACL tests to validate policy behavior ## Deployment and Testing - [x] Pulumi preview passes - [x] HuJSON syntax validated - [x] ACL tests defined and passing - [ ] Deploy with `mise run tailnet-up` - [ ] Verify SSH access from gilbert to indri - [ ] Verify Allison cannot access Grafana/Loki/NAS 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/23	2026-01-17 11:58:04 -08:00
Erich Blume	0918764e93	Rename Node Exporter dashboard to macOS (#22 ) ## Summary - Renamed dashboard from "Node Exporter - macOS" to just "macOS" since it now uses Alloy - Updated filename, title, uid, and tags to reflect the change ## Deployment and Testing - [ ] Deploy with `mise run provision-indri -- --tags grafana` - [ ] Verify dashboard accessible at https://grafana.tail8d86e.ts.net 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/22	2026-01-17 09:29:19 -08:00
Erich Blume	3962e5a7de	Fix borgmatic PostgreSQL backup and update backup sources (#21 ) ## Summary - Fix PostgreSQL backup failure by adding explicit `pg_dump_command` path (was failing with "pg_dump: command not found" in LaunchAgent) - Remove `~/code/3rd/kiwix-tools` from backups (was just symlinks to ZIM archives in transmission) - Enable Loki log backup by removing from exclude_patterns ## Deployment and Testing - [x] Dry run with `--check --diff` shows expected changes - [ ] Deploy with `mise run provision-indri -- --tags borgmatic` - [ ] Verify config deployed: `ssh indri 'cat ~/.config/borgmatic/config.yaml'` - [ ] Run manual backup to test: `ssh indri 'mise x -- borgmatic create --verbosity 1'` - [ ] Verify PostgreSQL dump succeeds (no "pg_dump: command not found" error) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/21	2026-01-17 09:22:01 -08:00
Erich Blume	75426be1dc	Remove ansible role meta dependencies to fix duplicate execution (#20 ) ## Summary - Remove all `meta/main.yml` dependencies from ansible roles - Role ordering is now controlled entirely by `indri.yml` playbook - Fix incorrect roles path in CLAUDE.md (`playbooks/roles` → `roles`) ## Why Ansible's tag accumulation behavior prevents proper role deduplication when using meta dependencies. When a role is pulled in as a dependency, the parent role's tags are added to the dependency's tags (e.g., `[loki]` becomes `[alloy, loki]`), making them appear as different invocations to Ansible and causing roles to run multiple times. ## Deployment and Testing - [x] Verified with `ansible-playbook --list-tasks` that each role now appears exactly once - [x] Run full provision to verify no regressions 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/20	2026-01-16 22:50:34 -08:00
Erich Blume	9931829d03	Add pre-commit hooks for code quality (#19 ) ## Summary - Add pre-commit framework with hooks for YAML, Ansible, Python, shell, TOML, JSON, and secret detection - Fix all 91+ ansible-lint violations (variable naming, handler capitalization, changed_when) - Fix shellcheck warnings in mise-tasks scripts - Document pre-commit setup in README.md ## Deployment and Testing - [x] All pre-commit hooks pass (`uvx pre-commit run --all-files`) - [x] Test ansible playbook with `--check` mode - [x] Run `mise run indri-services-check` after deploy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/19	2026-01-16 19:33:02 -08:00
Erich Blume	78f14f8bde	Reworded CLAUDE.md	2026-01-16 18:47:47 -08:00
Erich Blume	d3d3041b27	Decouple ZIM/torrent ansible tasks for faster provisioning (#18 ) ## Summary - Simplify kiwix role from 213 lines to 151 lines (-30%) - Replace per-archive torrent status loops with single shell command - Decouple kiwix startup from declared inventory - now serves whatever completed ZIM files exist - Fix tailscale_serve role to handle empty JSON in check mode ## Performance improvement - Before: ~132 operations (44 archives × 3 loops for status check, recheck, symlink) - After: ~5 operations (1 shell script + 1 find + conditional symlinks) - Expected reduction: ~3 minutes per ansible run ## Test plan - [x] Ran `mise run provision-indri -- --check --diff` to preview changes - [x] Ran `mise run provision-indri` to apply changes - [x] Ran `mise run indri-services-check` - all services healthy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/18	2026-01-16 15:14:00 -08:00
Erich Blume	812b78bf61	Use explicit PostgreSQL superuser name and fix check mode (#17 ) ## Summary - Add `postgresql_superuser` variable (`eblume`) to prevent PostgreSQL from inheriting OS username during initdb - Update all psql/createdb commands to use explicit `-U` flag - Add `check_mode: false` to op commands so 1Password fetches run during `--check` mode - Add PostgreSQL and Miniflux health checks to indri-services-check ## Test plan - [x] Renamed existing superuser from `erichblume` to `eblume` - [x] Ran `mise run provision-indri -- --tags postgresql --check --diff` successfully - [x] Verified connection as `eblume` superuser via Tailscale - [x] Ran `mise run indri-services-check` - all services healthy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/17	2026-01-16 14:41:36 -08:00
Erich Blume	adf6f4fbe9	Add PostgreSQL and Miniflux services to tailnet (#16 ) ## Summary - Add PostgreSQL 18 as a new service at `pg.tail8d86e.ts.net:5432` - Add Miniflux RSS/Atom feed reader at `feed.tail8d86e.ts.net` - Both services managed via homebrew/brew services - Pulumi ACL tags added (tag:pg, tag:feed) - Alloy log collection configured for both services - Zettelkasten documentation updated ## Manual Setup Required Before running ansible, the following steps are needed on indri: ### 1. Apply Pulumi tags ```bash mise run tailnet-up ``` Then apply tags to indri in Tailscale admin console. ### 2. Create 1Password entries - miniflux PostgreSQL user password - miniflux admin password (for first run) ### 3. Set PostgreSQL user password (after ansible installs postgres) ```bash ssh indri '/opt/homebrew/opt/postgresql@18/bin/psql -c "ALTER USER miniflux PASSWORD '\''your-password'\'';"' ``` ### 4. Create password files on indri ```bash ssh indri 'echo "your-db-password" > ~/.miniflux-db-password && chmod 600 ~/.miniflux-db-password' ssh indri 'echo "your-admin-password" > ~/.miniflux-admin-password && chmod 600 ~/.miniflux-admin-password' ``` ### 5. Create ~/.pgpass for borgmatic ```bash ssh indri 'echo "localhost:5432:miniflux:miniflux:YOUR_PASSWORD" > ~/.pgpass && chmod 600 ~/.pgpass' ``` ### 6. Run ansible with first-run admin creation ```bash mise run provision-indri -- -e miniflux_create_admin=1 ``` ### 7. Update borgmatic config Add to `~/.config/borgmatic/config.yaml` on indri: ```yaml postgresql_databases: - name: miniflux hostname: localhost port: 5432 username: miniflux ``` ### 8. Cleanup after first run ```bash ssh indri 'rm ~/.miniflux-admin-password' ``` ## Test plan - [ ] Run `mise run tailnet-up` and verify Pulumi changes - [ ] Apply tags to indri in Tailscale admin - [ ] Run `mise run provision-indri -- --check --diff` for dry run - [ ] Run `mise run provision-indri -- -e miniflux_create_admin=1` - [ ] Approve services in Tailscale admin - [ ] Verify PostgreSQL: `ssh indri '/opt/homebrew/opt/postgresql@18/bin/pg_isready'` - [ ] Verify Miniflux: `curl https://feed.tail8d86e.ts.net/healthcheck` - [ ] Run `mise run indri-services-check` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/16	2026-01-16 12:30:20 -08:00
Erich Blume	3f4e40f3ae	Add Pulumi for tailnet IaC management (#15 ) ## Summary - Manage tail8d86e.ts.net ACLs, tags, and DNS via Pulumi + Python - State stored in Pulumi Cloud (free tier) to avoid circular dependency - OAuth authentication via 1Password for secure credential management - New mise tasks: `tailnet-preview`, `tailnet-up` ## Architecture Two-layer approach: - Layer 1 (Pulumi): Tailnet-wide config (ACLs, tags, DNS) - Layer 2 (Ansible): Node-local `tailscale serve` config (unchanged) ## Test plan - [x] Exported current ACL from Tailscale API - [x] Imported existing ACL into Pulumi state - [x] Verified `mise run tailnet-preview` shows no changes - [x] Verified `mise run tailnet-up` applies successfully 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/15	2026-01-15 20:55:25 -08:00
Erich Blume	72c2dd7096	Add blumeops-tasks mise task for Todoist integration (#14 ) ## Summary - Add `mise run blumeops-tasks` to fetch and display tasks from Todoist - Uses uv run script with inline dependencies (httpx, rich) - Fetches API credential securely via 1Password CLI - Sorts tasks by custom priority order: p1, p2, p4, p3 (backlog last) - Documents the task discovery workflow in CLAUDE.md ## Test plan - [x] Verified `mise run blumeops-tasks` fetches and displays tasks correctly - [x] Confirmed priority sorting works as expected 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/14	2026-01-15 18:03:19 -08:00
Erich Blume	ae1513e7e9	Add Plex Media Server observability (#13 ) ## Summary - Add `plex_metrics` ansible role with textfile collector for Prometheus metrics - Add Plex log collection to Alloy (forwards to Loki) - Add Grafana dashboard for Plex monitoring (status, library counts, sessions, transcoding, logs) ## Metrics Collected - `plex_up` - server health - `plex_version_info` - server version - `plex_sessions_total/playing/paused` - active sessions - `plex_transcode_sessions_total/video/audio` - transcoding status - `plex_library_items{library,type}` - library item counts ## Prerequisites Plex token must be stored at `~/.plex-token` on indri (already done). ## Test plan - [x] Dry-run passed (`mise run provision-indri -- --check --diff`) - [ ] Apply changes (`mise run provision-indri`) - [ ] Verify metrics: `ssh indri 'cat /opt/homebrew/var/node_exporter/textfile/plex.prom'` - [ ] Verify logs in Grafana Explore: `{service="plex"}` - [ ] Check Plex dashboard in Grafana 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/13	2026-01-15 15:27:59 -08:00
Erich Blume	2a1359a3b6	Fix ansible handler timeouts for alloy and loki restarts (#12 ) ## Summary - Use async with poll: 0 for alloy and loki restart handlers - Fire-and-forget approach prevents ansible from hanging on graceful shutdown ## Test plan - [x] Manually verified `brew services restart grafana-alloy` works - [x] Run full ansible playbook and verify it completes without timeout 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/12	2026-01-15 13:56:11 -08:00
Erich Blume	ba5cd75ee2	Fix ansible handler timeouts for alloy and loki restarts Use async with poll: 0 to fire-and-forget service restarts. These services have graceful shutdown periods that can exceed ansible's default command timeout. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-15 12:39:28 -08:00
Erich Blume	242c1880de	Add Grafana Alloy and Loki for unified observability (#11 ) ## Summary - Add Grafana Alloy to replace node_exporter for metrics collection - Add Loki for log aggregation and storage - Configure Alloy to collect logs from all services (grafana, forgejo, prometheus, tailscale, transmission, devpi, kiwix, borgmatic) - Update Prometheus to accept metrics via remote_write - Add Loki datasource to Grafana ## Test plan - [ ] Run `mise run provision-indri -- --check --diff` to verify changes - [ ] Apply with `mise run provision-indri` - [ ] Verify services: `mise run indri-services-check` - [ ] Check Grafana Explore with Loki datasource - [ ] Query logs: `{service="grafana"}` - [ ] Verify metrics still flowing to Prometheus dashboards 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/11	2026-01-15 12:24:13 -08:00
Erich Blume	070f26dc6d	Add zk-docs mise task for zettelkasten documentation (#10 ) ## Summary - Add `mise run zk-docs` task to concatenate all blumeops-tagged zettelkasten cards - Main project card is shown first, followed by service management logs - Uses `bat` for output (added to Brewfile) - Args are passed through to bat for custom formatting - Update CLAUDE.md to use zk-docs command with plain output options - Update README.md to note zettelkasten is private with contact email ## Test plan - [x] `mise run zk-docs` displays all 6 blumeops cards - [x] `mise run zk-docs -- --style=header --color=never --decorations=always` shows filenames without decoration 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/10	2026-01-15 11:25:02 -08:00
Erich Blume	2e326eb30d	Critical security note for claude	2026-01-15 09:02:27 -08:00
Erich Blume	c660674891	Remove settings.local.json from repo and add to gitignore Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-15 08:59:24 -08:00
Erich Blume	d8a0ef6482	Add devpi PyPI caching proxy role for indri (#9 ) ## Summary - Add ansible role for devpi-server as a transparent PyPI caching proxy - LaunchAgent with KeepAlive runs via `mise x -- devpi-server` - Listens on port 3141, data stored in `~/devpi` - Health checks added to `indri-services-check` script ## Manual Setup Required (on indri, before provisioning) 1. Add to `~/.config/mise/config.toml`: ```toml [tools] "pipx:devpi-server" = "latest" "pipx:devpi-web" = "latest" "pipx:devpi-client" = "latest" ``` 2. Run `mise install` 3. Initialize: `mise x -- devpi-init --serverdir ~/devpi` ## Post-Provisioning - Set up Tailscale service `pypi` on port 443 → 3141 - Configure client pip.conf with index-url ## Test plan - [x] Ansible syntax check passes - [x] Dry-run: `mise run provision-indri -- --check --diff` - [x] Apply: `mise run provision-indri` - [x] Health check: `mise run indri-services-check` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/9	2026-01-15 08:31:09 -08:00
Erich Blume	50c713b5de	Add macOS-compatible Node Exporter Grafana dashboard (#8 ) ## Summary - Adds a new Grafana dashboard for Node Exporter metrics on macOS hosts - Uses macOS-native memory metrics (node_memory_total_bytes, node_memory_active_bytes, etc.) instead of Linux-specific ones - Includes dropdown selectors for instance, disk, and network device filtering ## Details The standard Node Exporter dashboards show "No Data" for memory panels on macOS because they query Linux-specific metrics like `node_memory_MemTotal_bytes`. macOS node_exporter exports different metrics: | Linux | macOS | |-------|-------| | node_memory_MemTotal_bytes | node_memory_total_bytes | | node_memory_MemFree_bytes | node_memory_free_bytes | | node_memory_Buffers_bytes | (not available) | | node_memory_Cached_bytes | (not available) | macOS has unique memory categories: Wired, Active, Compressed, Inactive, Free. ## Test plan - [x] Dashboard deployed to indri via ansible - [x] All panels showing data for indri - [x] Instance selector works to switch between hosts - [x] Disk and network device filters work 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/8	2026-01-14 20:53:57 -08:00
Erich Blume	d9be8c27bc	Add 32 devdocs ZIM archives for programming documentation (#7 ) ## Summary - Adds offline documentation for: bash, c, click, cmake, cpp, css, django-rest-framework, django, docker, duckdb, fish, gcc, git, go, godot, hammerspoon, homebrew, javascript, kubectl, kubernetes, latex, lua, markdown, nginx, nix, postgresql, python, redis, sqlite, typescript, werkzeug, zig - All January 2026 versions from download.kiwix.org/zim/devdocs/ - Downloads via BitTorrent through transmission ## Test plan - [x] Deployed to indri via `mise run provision-indri` - [x] All 32 torrents added and downloaded (small files, completed instantly) - [x] 43 ZIM files now available in kiwix directory 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/7	2026-01-14 18:28:34 -08:00
Erich Blume	10012a4cf2	Add upload/download ratio and period transfer panels to Transmission dashboard (#6 ) ## Summary - Adds Upload/Download Ratio stat panel with color thresholds (red < 0.5, yellow < 1, green >= 1) - Adds Downloaded (Period) stat panel showing bytes downloaded in selected time range - Adds Uploaded (Period) stat panel showing bytes uploaded in selected time range Uses PromQL `increase()` on existing counter metrics - no new metrics collection needed. ## Test plan - [x] Deployed to indri via `mise run provision-indri` - [x] Grafana restarted successfully 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/6	2026-01-14 18:08:39 -08:00
Erich Blume	ba03af15eb	Set MISE_TASK_OUTPUT=interleave in provision-indri Shows ansible output in real-time instead of buffered. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-14 14:15:11 -08:00
Erich Blume	2f28b151f5	Fix launchctl idempotency in kiwix and borgmatic roles Check if LaunchAgent is already loaded before attempting to load it. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-14 14:14:52 -08:00
Erich Blume	e534e59556	Add provision-indri mise task and fix idempotency - Add mise-tasks/provision-indri script to run ansible playbook - Fix transmission_metrics launchctl load to be idempotent - Update CLAUDE.md to reference mise run provision-indri Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-14 14:10:30 -08:00
Erich Blume	e264b39cd6	Add total torrent size metric and dashboard panel - Query torrent-get RPC to sum totalSize of all torrents - Add transmission_torrents_size_bytes gauge metric - Add "Total Torrent Size" timeseries panel to dashboard Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-14 14:00:52 -08:00
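
The total-size metric described in commit e264b39cd6 could be sketched roughly as below. This is an illustration under assumptions, not the repository's actual script: the function name `torrents_size_metric` and the sample response are hypothetical, and the HELP text is invented. Only the `torrent-get` RPC, the `totalSize` field, and the metric name `transmission_torrents_size_bytes` come from the commit message.

```python
import json


def torrents_size_metric(rpc_response: dict) -> str:
    """Render a Prometheus gauge (text format) from a parsed torrent-get reply.

    Hypothetical helper: sums the totalSize of every torrent in the reply
    and emits a single gauge sample. A missing or empty reply yields 0.
    """
    torrents = rpc_response.get("arguments", {}).get("torrents", [])
    total = sum(t.get("totalSize", 0) for t in torrents)
    return (
        "# HELP transmission_torrents_size_bytes Total size of all torrents in bytes.\n"
        "# TYPE transmission_torrents_size_bytes gauge\n"
        f"transmission_torrents_size_bytes {total}\n"
    )


# Made-up reply, shaped like a torrent-get response requesting only totalSize.
sample = json.loads(
    '{"result": "success", "arguments": {"torrents": '
    '[{"totalSize": 1024}, {"totalSize": 2048}]}}'
)
print(torrents_size_metric(sample), end="")
```

Keeping the summation separate from the RPC call means the rendering can be checked without a running Transmission daemon.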


80 commits