Commit graph

77 commits

Author SHA1 Message Date
bcd96d86f0 Phase 0 review fixes
- Bump podman disk-size to 220G (> minikube's 200G)
- Fix Step 0.3 test to use curl instead of podman (not installed yet)
- Simplify Step 0.5 zot metrics to just zot_up for now
- Add Backup Strategy section to Technical Decisions
- Add zot restart handler to Step 0.3
- Move dashboard steps to Phase 0 Follow-up section
- Renumber steps (0.14->0.12, 0.15->0.13)
- Fix Modified Files Summary (tag:k8s deferred to Phase 1)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 17:32:55 -08:00
a31e8935c9 Remove Step 0.16 (NFS on Sifaka)
Nothing in Phase 0 requires NFS, and it's per-share config anyway.
Will add when actually needed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 17:11:34 -08:00
5f35084176 Add Step 0.16: Enable NFS on Sifaka, bump minikube disk to 200g
- Add manual step for enabling NFS on Synology DSM
- Document NFS permissions config for k8s-volumes share
- Include verification commands for testing NFS mount
- Bump minikube disk-size from 100g to 200g
- Add note explaining storage options (hostPath, NFS, SMB)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 17:09:49 -08:00
2c8ced07b4 Tighten podman ansible tasks based on manual testing
- Use 'started successfully' instead of just 'started' for changed_when
- Use specific failed_when: rc not in [0, 125] instead of false
- 125 = already exists (init) or already running (start)

Tested manually on indri - podman machine initialized and running.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 16:58:35 -08:00
b333b7ff2c Move plan to plans/ directory, add completion step
- Rename docs/k8s-migration.md to plans/k8s-migration.md
- Create plans/completed/ for finished plans
- Add Plan Completion section with instructions to archive when done

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 16:47:29 -08:00
b703abe4d1 Remove manual alloy restart from Step 0.6
Ansible handler restarts alloy automatically when config changes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 16:37:50 -08:00
bf1664d117 Move tailscale URL tests from Step 0.3 to Step 0.4
registry.tail8d86e.ts.net isn't available until tailscale serve
is configured in Step 0.4. Keep localhost tests in Step 0.3.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 16:31:35 -08:00
cff951a0f9 Add quay.io to zot sync config and namespace convention
Config template and namespace docs now match defaults/main.yml

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 16:11:43 -08:00
9edecf78dd Defer tag:k8s to Phase 1, clarify kubeconfig setup
- Remove tag:k8s from Phase 0 Step 0.1 (not needed until Tailscale
  Kubernetes Operator is deployed)
- Add tag:k8s ACL setup as new Step 1 in Phase 1
- Clarify Step 0.10: no special Tailscale service needed for K8s API
  (admin wildcard grant covers it)
- Add sed commands to replace localhost with indri in kubeconfig

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 14:29:09 -08:00
546fe08d9c Fix Zot paths in Technical Decisions section
Update to match Phase 0 details:
- Built from source, not homebrew
- Config at ~/.config/zot/config.json
- Data at ~/zot/
- Binary path documented

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 14:23:22 -08:00
6df7153766 Update Step 0.3 with verified zot build process
- Use localhost:3001 for forge clone (hairpinning limitation)
- Document mise go@1.25 setup in repo directory
- Correct build command: mise x -- make binary
- Mark prerequisites as already completed with verification

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 14:19:29 -08:00
c9d7acfafe Fix example: mcquack is not a mirror, use devpi instead
mcquack is Erich's own project, not a third-party mirror.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 14:01:36 -08:00
6d84ff7bca Use forge mirror for zot, add third-party project guidance
- Updated Step 0.3 to clone zot from forge mirror instead of GitHub
- Added "Third-Party Projects" section to CLAUDE.md explaining:
  - Ask user to mirror 3rd party repos to forge first
  - Clone from mirror to ~/code/3rd/
  - Avoids external dependencies

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 14:00:54 -08:00
f064ba3afa Update Zot installation: clone to ~/code/3rd/ and build from source
Zot isn't in homebrew. Following existing pattern (like kiwix-tools),
clone to ~/code/3rd/zot on indri and build with 'make binary'.
Updated defaults and LaunchAgent template to use built binary path.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 13:56:17 -08:00
adba123ad4 Document both registry modes: pull-through cache + private images
- Added Zot config.json template showing sync extension for pull-through
- Documented namespace convention:
  - registry.../docker.io/* → cached from Docker Hub
  - registry.../ghcr.io/* → cached from GHCR
  - registry.../blumeops/* → private images
- Added testing steps for both pull-through and private push
- Updated zk template with namespace table and build/push commands
- Updated verification checklist

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 13:51:46 -08:00
950604bf25 Add tag:k8s grant for registry access (Woodpecker CI)
K8s workloads (like Woodpecker CI) need to push/pull images from Zot.
They'll get Tailscale identity via the operator (Phase 1) with tag:k8s.
Added grant and test case for tag:k8s → tag:registry access.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 13:47:24 -08:00
97dce31171 Remove member grant for registry - admins only
Registry access restricted to admins (who already have full access).
Members don't need to push/pull container images.
K8s accesses registry locally on indri, not via Tailscale.
Added note about Zot htpasswd auth for future reference.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 13:45:03 -08:00
ee42f0f1a2 Fix Step 0.1: Use correct policy.hujson structure
- Use 'grants' not 'acls' (that's the newer format)
- Show exact line numbers and locations for each change
- Include tagOwners, grants, and tests sections
- Follow existing pattern with tag:blumeops in tagOwners

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 13:42:19 -08:00
26113aee42 Remove Brewfile from Phase 0 (it's for gilbert tooling only)
Brewfile is for development tooling on gilbert, not for indri services.
Ansible roles handle homebrew installations on indri directly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 13:33:09 -08:00
ace4822305 Expand Phase 0 with detailed implementation steps
- Add 16 numbered steps with specific files, code, and testing commands
- Add Tailscale service creation order warning (must create in admin
  console BEFORE running tailscale serve)
- Add comprehensive verification checklist and rollback procedures
- Document indri-services-check updates for zot and minikube
- Include zk documentation templates

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 13:27:04 -08:00
4d916a46d3 Add Kubernetes migration plan documentation
Comprehensive phased plan for migrating blumeops services from direct
hosting on indri to a minikube cluster. Documents technical decisions
(Zot registry, Podman driver, CloudNativePG, Tailscale Operator) and
9 migration phases with verification and rollback procedures.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 13:12:09 -08:00
e6d302b40b Harden Tailscale ACL policy with least-privilege grants (#23)
## Summary
- Replace permissive wildcard ACL (`*` -> `*`) with specific service grants
- Admin: full access to all services including NAS
- Member: user-facing services only (no Grafana/Loki/NAS)
- Add device tagging for gilbert (workstation) and sifaka (NAS) via Pulumi
- SSH hardening: remove root access, use "check" action with MFA
- Add ACL tests to validate policy behavior

## Deployment and Testing
- [x] Pulumi preview passes
- [x] HuJSON syntax validated
- [x] ACL tests defined and passing
- [ ] Deploy with `mise run tailnet-up`
- [ ] Verify SSH access from gilbert to indri
- [ ] Verify Allison cannot access Grafana/Loki/NAS

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/23
2026-01-17 11:58:04 -08:00
0918764e93 Rename Node Exporter dashboard to macOS (#22)
## Summary
- Renamed dashboard from "Node Exporter - macOS" to just "macOS" since it now uses Alloy
- Updated filename, title, uid, and tags to reflect the change

## Deployment and Testing
- [ ] Deploy with `mise run provision-indri -- --tags grafana`
- [ ] Verify dashboard accessible at https://grafana.tail8d86e.ts.net

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/22
2026-01-17 09:29:19 -08:00
3962e5a7de Fix borgmatic PostgreSQL backup and update backup sources (#21)
## Summary
- Fix PostgreSQL backup failure by adding explicit `pg_dump_command` path (was failing with "pg_dump: command not found" in LaunchAgent)
- Remove `~/code/3rd/kiwix-tools` from backups (was just symlinks to ZIM archives in transmission)
- Enable Loki log backup by removing from exclude_patterns

## Deployment and Testing
- [x] Dry run with `--check --diff` shows expected changes
- [ ] Deploy with `mise run provision-indri -- --tags borgmatic`
- [ ] Verify config deployed: `ssh indri 'cat ~/.config/borgmatic/config.yaml'`
- [ ] Run manual backup to test: `ssh indri 'mise x -- borgmatic create --verbosity 1'`
- [ ] Verify PostgreSQL dump succeeds (no "pg_dump: command not found" error)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/21
2026-01-17 09:22:01 -08:00
75426be1dc Remove ansible role meta dependencies to fix duplicate execution (#20)
## Summary
- Remove all `meta/main.yml` dependencies from ansible roles
- Role ordering is now controlled entirely by `indri.yml` playbook
- Fix incorrect roles path in CLAUDE.md (`playbooks/roles` → `roles`)

## Why
Ansible's tag accumulation behavior prevents proper role deduplication when using meta dependencies. When a role is pulled in as a dependency, the parent role's tags are added to the dependency's tags (e.g., `[loki]` becomes `[alloy, loki]`), making them appear as different invocations to Ansible and causing roles to run multiple times.

## Deployment and Testing
- [x] Verified with `ansible-playbook --list-tasks` that each role now appears exactly once
- [x] Run full provision to verify no regressions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/20
2026-01-16 22:50:34 -08:00
9931829d03 Add pre-commit hooks for code quality (#19)
## Summary
- Add pre-commit framework with hooks for YAML, Ansible, Python, shell, TOML, JSON, and secret detection
- Fix all 91+ ansible-lint violations (variable naming, handler capitalization, changed_when)
- Fix shellcheck warnings in mise-tasks scripts
- Document pre-commit setup in README.md

## Deployment and Testing
- [x] All pre-commit hooks pass (`uvx pre-commit run --all-files`)
- [x] Test ansible playbook with `--check` mode
- [x] Run `mise run indri-services-check` after deploy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/19
2026-01-16 19:33:02 -08:00
78f14f8bde Reworded CLAUDE.md 2026-01-16 18:47:47 -08:00
d3d3041b27 Decouple ZIM/torrent ansible tasks for faster provisioning (#18)
## Summary
- Simplify kiwix role from 213 lines to 151 lines (-30%)
- Replace per-archive torrent status loops with single shell command
- Decouple kiwix startup from declared inventory - now serves whatever completed ZIM files exist
- Fix tailscale_serve role to handle empty JSON in check mode

## Performance improvement
- **Before**: ~132 operations (44 archives × 3 loops for status check, recheck, symlink)
- **After**: ~5 operations (1 shell script + 1 find + conditional symlinks)
- Expected reduction: ~3 minutes per ansible run

## Test plan
- [x] Ran `mise run provision-indri -- --check --diff` to preview changes
- [x] Ran `mise run provision-indri` to apply changes
- [x] Ran `mise run indri-services-check` - all services healthy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/18
2026-01-16 15:14:00 -08:00
812b78bf61 Use explicit PostgreSQL superuser name and fix check mode (#17)
## Summary
- Add `postgresql_superuser` variable (`eblume`) to prevent PostgreSQL from inheriting OS username during initdb
- Update all psql/createdb commands to use explicit `-U` flag
- Add `check_mode: false` to op commands so 1Password fetches run during `--check` mode
- Add PostgreSQL and Miniflux health checks to indri-services-check

## Test plan
- [x] Renamed existing superuser from `erichblume` to `eblume`
- [x] Ran `mise run provision-indri -- --tags postgresql --check --diff` successfully
- [x] Verified connection as `eblume` superuser via Tailscale
- [x] Ran `mise run indri-services-check` - all services healthy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/17
2026-01-16 14:41:36 -08:00
adf6f4fbe9 Add PostgreSQL and Miniflux services to tailnet (#16)
## Summary
- Add PostgreSQL 18 as a new service at `pg.tail8d86e.ts.net:5432`
- Add Miniflux RSS/Atom feed reader at `feed.tail8d86e.ts.net`
- Both services managed via homebrew/brew services
- Pulumi ACL tags added (tag:pg, tag:feed)
- Alloy log collection configured for both services
- Zettelkasten documentation updated

## Manual Setup Required

Before running ansible, the following steps are needed on indri:

### 1. Apply Pulumi tags
```bash
mise run tailnet-up
```
Then apply tags to indri in Tailscale admin console.

### 2. Create 1Password entries
- miniflux PostgreSQL user password
- miniflux admin password (for first run)

### 3. Set PostgreSQL user password (after ansible installs postgres)
```bash
ssh indri '/opt/homebrew/opt/postgresql@18/bin/psql -c "ALTER USER miniflux PASSWORD '\''your-password'\'';"'
```

### 4. Create password files on indri
```bash
ssh indri 'echo "your-db-password" > ~/.miniflux-db-password && chmod 600 ~/.miniflux-db-password'
ssh indri 'echo "your-admin-password" > ~/.miniflux-admin-password && chmod 600 ~/.miniflux-admin-password'
```

### 5. Create ~/.pgpass for borgmatic
```bash
ssh indri 'echo "localhost:5432:miniflux:miniflux:YOUR_PASSWORD" > ~/.pgpass && chmod 600 ~/.pgpass'
```

### 6. Run ansible with first-run admin creation
```bash
mise run provision-indri -- -e miniflux_create_admin=1
```

### 7. Update borgmatic config
Add to `~/.config/borgmatic/config.yaml` on indri:
```yaml
postgresql_databases:
    - name: miniflux
      hostname: localhost
      port: 5432
      username: miniflux
```

### 8. Cleanup after first run
```bash
ssh indri 'rm ~/.miniflux-admin-password'
```

## Test plan
- [ ] Run `mise run tailnet-up` and verify Pulumi changes
- [ ] Apply tags to indri in Tailscale admin
- [ ] Run `mise run provision-indri -- --check --diff` for dry run
- [ ] Run `mise run provision-indri -- -e miniflux_create_admin=1`
- [ ] Approve services in Tailscale admin
- [ ] Verify PostgreSQL: `ssh indri '/opt/homebrew/opt/postgresql@18/bin/pg_isready'`
- [ ] Verify Miniflux: `curl https://feed.tail8d86e.ts.net/healthcheck`
- [ ] Run `mise run indri-services-check`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/16
2026-01-16 12:30:20 -08:00
3f4e40f3ae Add Pulumi for tailnet IaC management (#15)
## Summary
- Manage tail8d86e.ts.net ACLs, tags, and DNS via Pulumi + Python
- State stored in Pulumi Cloud (free tier) to avoid circular dependency
- OAuth authentication via 1Password for secure credential management
- New mise tasks: `tailnet-preview`, `tailnet-up`

## Architecture
Two-layer approach:
- **Layer 1 (Pulumi)**: Tailnet-wide config (ACLs, tags, DNS)
- **Layer 2 (Ansible)**: Node-local `tailscale serve` config (unchanged)

## Test plan
- [x] Exported current ACL from Tailscale API
- [x] Imported existing ACL into Pulumi state
- [x] Verified `mise run tailnet-preview` shows no changes
- [x] Verified `mise run tailnet-up` applies successfully

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/15
2026-01-15 20:55:25 -08:00
72c2dd7096 Add blumeops-tasks mise task for Todoist integration (#14)
## Summary
- Add `mise run blumeops-tasks` to fetch and display tasks from Todoist
- Uses uv run script with inline dependencies (httpx, rich)
- Fetches API credential securely via 1Password CLI
- Sorts tasks by custom priority order: p1, p2, p4, p3 (backlog last)
- Documents the task discovery workflow in CLAUDE.md

## Test plan
- [x] Verified `mise run blumeops-tasks` fetches and displays tasks correctly
- [x] Confirmed priority sorting works as expected

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/14
2026-01-15 18:03:19 -08:00
ae1513e7e9 Add Plex Media Server observability (#13)
## Summary
- Add `plex_metrics` ansible role with textfile collector for Prometheus metrics
- Add Plex log collection to Alloy (forwards to Loki)
- Add Grafana dashboard for Plex monitoring (status, library counts, sessions, transcoding, logs)

## Metrics Collected
- `plex_up` - server health
- `plex_version_info` - server version
- `plex_sessions_total/playing/paused` - active sessions
- `plex_transcode_sessions_total/video/audio` - transcoding status
- `plex_library_items{library,type}` - library item counts

## Prerequisites
Plex token must be stored at `~/.plex-token` on indri (already done).

## Test plan
- [x] Dry-run passed (`mise run provision-indri -- --check --diff`)
- [ ] Apply changes (`mise run provision-indri`)
- [ ] Verify metrics: `ssh indri 'cat /opt/homebrew/var/node_exporter/textfile/plex.prom'`
- [ ] Verify logs in Grafana Explore: `{service="plex"}`
- [ ] Check Plex dashboard in Grafana

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/13
2026-01-15 15:27:59 -08:00
2a1359a3b6 Fix ansible handler timeouts for alloy and loki restarts (#12)
## Summary
- Use async with poll: 0 for alloy and loki restart handlers
- Fire-and-forget approach prevents ansible from hanging on graceful shutdown

## Test plan
- [x] Manually verified `brew services restart grafana-alloy` works
- [x] Run full ansible playbook and verify it completes without timeout

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/12
2026-01-15 13:56:11 -08:00
ba5cd75ee2 Fix ansible handler timeouts for alloy and loki restarts
Use async with poll: 0 to fire-and-forget service restarts.
These services have graceful shutdown periods that can exceed
ansible's default command timeout.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 12:39:28 -08:00
242c1880de Add Grafana Alloy and Loki for unified observability (#11)
## Summary
- Add Grafana Alloy to replace node_exporter for metrics collection
- Add Loki for log aggregation and storage
- Configure Alloy to collect logs from all services (grafana, forgejo, prometheus, tailscale, transmission, devpi, kiwix, borgmatic)
- Update Prometheus to accept metrics via remote_write
- Add Loki datasource to Grafana

## Test plan
- [ ] Run \`mise run provision-indri -- --check --diff\` to verify changes
- [ ] Apply with \`mise run provision-indri\`
- [ ] Verify services: \`mise run indri-services-check\`
- [ ] Check Grafana Explore with Loki datasource
- [ ] Query logs: \`{service="grafana"}\`
- [ ] Verify metrics still flowing to Prometheus dashboards

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/11
2026-01-15 12:24:13 -08:00
070f26dc6d Add zk-docs mise task for zettelkasten documentation (#10)
## Summary
- Add `mise run zk-docs` task to concatenate all blumeops-tagged zettelkasten cards
- Main project card is shown first, followed by service management logs
- Uses `bat` for output (added to Brewfile)
- Args are passed through to bat for custom formatting
- Update CLAUDE.md to use zk-docs command with plain output options
- Update README.md to note zettelkasten is private with contact email

## Test plan
- [x] `mise run zk-docs` displays all 6 blumeops cards
- [x] `mise run zk-docs -- --style=header --color=never --decorations=always` shows filenames without decoration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/10
2026-01-15 11:25:02 -08:00
2e326eb30d Critical security note for claude 2026-01-15 09:02:27 -08:00
c660674891 Remove settings.local.json from repo and add to gitignore
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 08:59:24 -08:00
d8a0ef6482 Add devpi PyPI caching proxy role for indri (#9)
## Summary
- Add ansible role for devpi-server as a transparent PyPI caching proxy
- LaunchAgent with KeepAlive runs via `mise x -- devpi-server`
- Listens on port 3141, data stored in `~/devpi`
- Health checks added to `indri-services-check` script

## Manual Setup Required (on indri, before provisioning)
1. Add to `~/.config/mise/config.toml`:
   ```toml
   [tools]
   "pipx:devpi-server" = "latest"
   "pipx:devpi-web" = "latest"
   "pipx:devpi-client" = "latest"
   ```
2. Run `mise install`
3. Initialize: `mise x -- devpi-init --serverdir ~/devpi`

## Post-Provisioning
- Set up Tailscale service `pypi` on port 443 → 3141
- Configure client pip.conf with index-url

## Test plan
- [x] Ansible syntax check passes
- [x] Dry-run: `mise run provision-indri -- --check --diff`
- [x] Apply: `mise run provision-indri`
- [x] Health check: `mise run indri-services-check`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/9
2026-01-15 08:31:09 -08:00
50c713b5de Add macOS-compatible Node Exporter Grafana dashboard (#8)
## Summary
- Adds a new Grafana dashboard for Node Exporter metrics on macOS hosts
- Uses macOS-native memory metrics (node_memory_total_bytes, node_memory_active_bytes, etc.) instead of Linux-specific ones
- Includes dropdown selectors for instance, disk, and network device filtering

## Details
The standard Node Exporter dashboards show "No Data" for memory panels on macOS because they query Linux-specific metrics like `node_memory_MemTotal_bytes`. macOS node_exporter exports different metrics:

| Linux | macOS |
|-------|-------|
| node_memory_MemTotal_bytes | node_memory_total_bytes |
| node_memory_MemFree_bytes | node_memory_free_bytes |
| node_memory_Buffers_bytes | (not available) |
| node_memory_Cached_bytes | (not available) |

macOS has unique memory categories: Wired, Active, Compressed, Inactive, Free.

## Test plan
- [x] Dashboard deployed to indri via ansible
- [x] All panels showing data for indri
- [x] Instance selector works to switch between hosts
- [x] Disk and network device filters work

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/8
2026-01-14 20:53:57 -08:00
d9be8c27bc Add 32 devdocs ZIM archives for programming documentation (#7)
## Summary
- Adds offline documentation for: bash, c, click, cmake, cpp, css, django-rest-framework, django, docker, duckdb, fish, gcc, git, go, godot, hammerspoon, homebrew, javascript, kubectl, kubernetes, latex, lua, markdown, nginx, nix, postgresql, python, redis, sqlite, typescript, werkzeug, zig
- All January 2026 versions from download.kiwix.org/zim/devdocs/
- Downloads via BitTorrent through transmission

## Test plan
- [x] Deployed to indri via `mise run provision-indri`
- [x] All 32 torrents added and downloaded (small files, completed instantly)
- [x] 43 ZIM files now available in kiwix directory

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/7
2026-01-14 18:28:34 -08:00
10012a4cf2 Add upload/download ratio and period transfer panels to Transmission dashboard (#6)
## Summary
- Adds Upload/Download Ratio stat panel with color thresholds (red < 0.5, yellow < 1, green >= 1)
- Adds Downloaded (Period) stat panel showing bytes downloaded in selected time range
- Adds Uploaded (Period) stat panel showing bytes uploaded in selected time range

Uses PromQL `increase()` on existing counter metrics - no new metrics collection needed.

## Test plan
- [x] Deployed to indri via `mise run provision-indri`
- [x] Grafana restarted successfully

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/6
2026-01-14 18:08:39 -08:00
ba03af15eb Set MISE_TASK_OUTPUT=interleave in provision-indri
Shows ansible output in real-time instead of buffered.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 14:15:11 -08:00
2f28b151f5 Fix launchctl idempotency in kiwix and borgmatic roles
Check if LaunchAgent is already loaded before attempting to load it.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 14:14:52 -08:00
e534e59556 Add provision-indri mise task and fix idempotency
- Add mise-tasks/provision-indri script to run ansible playbook
- Fix transmission_metrics launchctl load to be idempotent
- Update CLAUDE.md to reference mise run provision-indri

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 14:10:30 -08:00
e264b39cd6 Add total torrent size metric and dashboard panel
- Query torrent-get RPC to sum totalSize of all torrents
- Add transmission_torrents_size_bytes gauge metric
- Add "Total Torrent Size" timeseries panel to dashboard

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 14:00:52 -08:00
d18c3a6f3c Add node_exporter and transmission-metrics to service checks
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 13:57:08 -08:00
0afd34590d Fix transmission-metrics session ID parsing
Transmission doesn't support HEAD requests, so use -i flag with sed to
parse only the HTTP headers (stopping at the blank line before body).
Also anchor grep pattern to line start to avoid matching HTML content.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 13:56:08 -08:00
7468023cd2 Add transmission dashboard to grafana
- Add node_exporter ansible role to enable textfile collector
- Add transmission_metrics role with script and LaunchAgent
  - Collects metrics every 60s via transmission RPC
  - Writes to /opt/homebrew/var/node_exporter/textfile/transmission.prom
- Update grafana role to provision dashboards from files
- Add transmission.json dashboard with:
  - Status indicator, torrent counts
  - Transfer speeds, cumulative stats
  - Time series graphs

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 13:46:51 -08:00