Commit graph

55 commits

Author SHA1 Message Date
deedeecef9 C0: adopt AGENTS.md as canonical agent config 2026-04-18 20:15:30 -07:00
d7c3c687f4 Document DR rebuild procedure and update restart-indri
- New how-to: rebuild-minikube-cluster with full bootstrap procedure
  validated during 2026-04-13 DR event
- Update restart-indri: warn about minikube delete, macOS permission
  dialog on first Tailscale SSH, forgejo_actions_secrets dep cycle
- Update disaster-recovery reference: link to rebuild procedure
- Update CLAUDE.md: never run minikube delete

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:07:54 -07:00
8d80a4a3a5 Rewrite runner-logs: API-based log fetching, multi-repo support
Replace broken SSH+filesystem log retrieval with Forgejo web API
endpoint. Fix CLI to use run numbers (not task IDs), add --repo
for querying any forge repo (e.g. sporks), --limit/-n for listing
size. Document runner-logs as the way to verify build success in
CLAUDE.md and container build docs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 09:42:58 -07:00
f9206bf10b Build custom Kingfisher container from sporked deploy branch (#318)
All checks were successful
Build Container / detect (push) Successful in 2s
Build Container / build-nix (kingfisher) (push) Successful in 12s
## Summary

- Add Dockerfile for Kingfisher built from source (sporked deploy branch)
- Multi-stage: Rust build with Boost/vectorscan, debian-slim runtime
- Switch CronJob from upstream `ghcr.io/mongodb/kingfisher` to `registry.ops.eblu.me/blumeops/kingfisher`
- Add kingfisher to service-versions.yaml (version tracks upstream main SHA)
- Document spork workflow in CLAUDE.md

## Test plan

- [ ] Build container: `mise run container-build-and-release kingfisher 1d37d29`
- [ ] Verify image on registry: `mise run container-list`
- [ ] Update kustomization newTag
- [ ] Sync ArgoCD kingfisher app from branch
- [ ] Trigger manual CronJob and verify scan completes
- [ ] Verify reports on sifaka

Reviewed-on: #318
2026-03-30 06:34:49 -07:00
d121006086 Exclude docs from ai-sources, mention ai-sources in CLAUDE.md
ai-sources now skips docs/ to avoid duplicating ai-docs output.
CLAUDE.md notes ai-sources as available for deep context on problems
with a large surface area.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 18:40:35 -07:00
40f1568088 Remove unused Mosquitto MQTT broker from ringtail
Mosquitto has been dormant since frigate-notify switched from MQTT to
webapi polling (529ba10). Tear down live infra (ArgoCD app, namespace)
and remove all manifests, service-versions entry, services-check, and
doc references.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 18:37:31 -07:00
b3d5478020 Use towncrier orphan fragment naming for C0 changes
C0 changes have no branch name, so `main.<type>.md` fragments collide.
Switch to towncrier's `+<slug>.<type>.md` orphan convention and rename
existing `main.*` fragments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 15:30:00 -08:00
81a8ca24b9 Clarify that changelog fragments apply to all change levels (C0–C2)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 13:15:06 -08:00
23dc79058e Bake default display options into ai-docs mise task
The --style=header --color=never --decorations=always flags are now built
into the script so callers can just run `mise run ai-docs`. Also adds a
note to CLAUDE.md to never truncate the output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 17:42:47 -08:00
66b5b32f1d Formalize C0/C1/C2 change classification (#259)
## Summary
- **C0 (Quick Fix):** Now explicitly allows direct-to-main commits with no PR required — for low-risk, fix-forward-safe changes
- **C1 (Human Review):** New docs-first workflow with branch deployment (ArgoCD `--revision`, Ansible from checkout). Includes upgrade criteria for escalation to C2
- **C2 (Mikado Chain):** Introduces the **Mikado Branch Invariant** — strict commit ordering where card-introducing commits come first, followed by code progress, followed by card closures. Branch resets required when new prerequisites are discovered

Updates CLAUDE.md rules (3, 4, 8, 9) to reflect that C0 bypasses branching/PR requirements. Also updates ai-assistance-guide, how-to index, and docs-mikado task description.

## Files changed
- `CLAUDE.md` — rules and classification table
- `docs/how-to/agent-change-process.md` — full process rewrite
- `docs/tutorials/ai-assistance-guide.md` — branching and pitfalls sections
- `docs/how-to/how-to.md` — index description
- `mise-tasks/docs-mikado` — task description
- `docs/changelog.d/formalize-change-classification.doc.md` — changelog fragment

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/259
2026-02-23 16:19:54 -08:00
c1f7b2a9a3 Add agent change process (C0/C1/C2) and docs-mikado tool (#225)
## Summary
- Introduce C0/C1/C2 change classification based on the Mikado method, where documentation cards serve as persistent context for agents across sessions
- Add `docs-mikado` mise task to visualize active Mikado dependency chains from `status: active` and `requires` frontmatter fields
- Rename `zk-docs` task to `ai-docs`

## Changes
- **New:** `docs/how-to/agent-change-process.md` — methodology card
- **New:** `mise-tasks/docs-mikado` — Python uv script for dependency graph visualization
- **Renamed:** `mise-tasks/zk-docs` → `mise-tasks/ai-docs`
- **Updated:** `CLAUDE.md` — added Change Classification section, updated references
- **Updated:** `ai-assistance-guide.md`, `exploring-the-docs.md`, `how-to.md` — updated references and index

## Verification
- [x] `mise run ai-docs` works
- [x] `mise run docs-mikado` runs (no active chains yet, as expected)
- [x] `docs-check-links` — all valid
- [x] `docs-check-index` — all indexed
- [x] `docs-check-frontmatter` — all valid
- [x] All pre-commit hooks pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/225
2026-02-20 08:15:20 -08:00
291fff345c Fix services-check and update docs for Frigate migration to ringtail (#218)
## Summary
- Move mosquitto, ntfy, frigate, frigate-notify pod checks from `minikube-indri` to `k3s-ringtail` context in `services-check`
- Add `nvidia-device-plugin` pod check for ringtail k3s
- Rename "Kubernetes pods" section to "Indri minikube pods" for clarity
- Update 8 documentation files to reflect the migration completed in PRs #216/#217

## Files Changed
| File | Change |
|------|--------|
| `mise-tasks/services-check` | Move 4 pod checks to k3s-ringtail, add nvidia-device-plugin |
| `docs/reference/services/frigate.md` | Image→tensorrt, detector→ONNX/CUDA, shm→512Mi |
| `docs/reference/infrastructure/ringtail.md` | List actual k3s workloads |
| `docs/reference/infrastructure/indri.md` | Note frigate migration |
| `docs/explanation/architecture.md` | Add ringtail to diagram + compute layer |
| `docs/reference/kubernetes/cluster.md` | Note two clusters, add k3s section |
| `docs/reference/reference.md` | Update frigate/ntfy location |
| `docs/how-to/plans/completed/operationalize-reolink-camera.md` | Add post-completion migration note |
| `CLAUDE.md` | Add k3s-ringtail context guidance |

## Test plan
- [ ] `mise run services-check` — all checks pass
- [ ] Review each doc for accuracy against deployed state

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/218
2026-02-19 14:38:21 -08:00
5fbe70d1ba Port ntfy to locally built container image (#202)
All checks were successful
Build Container / build (push) Successful in 6m28s
## Summary
- Add `containers/ntfy/Dockerfile` — three-stage build (Node web UI, Go+CGO server, Alpine runtime) pinned to commit SHA `a03a37fe` (v2.17.0), sourced from forge mirror
- Update ntfy deployment image from `binwiederhier/ntfy:v2.17.0` to `registry.ops.eblu.me/blumeops/ntfy:v1.0.0`
- Note fish shell in CLAUDE.md

## Deployment
After merge, release the container image:
```fish
mise run container-tag-and-release ntfy v1.0.0
```
Then sync:
```fish
argocd app sync ntfy
```

## Test plan
- [x] `docker build` succeeds
- [x] `dagger call build --src=. --container-name=ntfy` succeeds (exit 0, container ID printed)
- [x] `ntfy --help` works in built container
- [ ] Tag and release `ntfy-v1.0.0` after merge
- [ ] Verify ntfy pod starts with new image
- [ ] Verify health endpoint responds at `ntfy.ops.eblu.me/v1/health`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/202
2026-02-17 10:18:20 -08:00
54c3b0a5f3 Expanded some CLAUDE.md stuff manualy 2026-02-17 07:54:34 -08:00
22f418d0dc Doc review: connect-to-postgres, create-release-artifact-workflow, deploy-k8s-service (#191)
## Summary

Review session covering 3 docs, plus a codebase-wide cleanup:

### Docs reviewed
- **connect-to-postgres** — verified end-to-end (psql connection tested), stamped
- **create-release-artifact-workflow** — clarified that `build-blumeops.yaml` is only a version bump example (not a packages API example)
- **deploy-k8s-service** — fixed stale repoURL (`indri:2200` → `forge.ops.eblu.me:2222`), wrong Caddy config keys (`upstream` → `backend`, added missing `host`), updated Homepage group to "Services", added Tailscale tag documentation

### Codebase cleanup
- Migrated all remaining `op item get --fields` calls to `op read` URI syntax across 7 files (docs, READMEs, YAML comments)
- Simplified the `op read` vs `op item get` guidance in CLAUDE.md

## Side findings (not addressed)
- New `immich-pg` CNPG cluster not yet documented in the postgresql reference card

## Test plan
- [x] `psql` connection to `pg.ops.eblu.me` verified
- [x] All pre-commit hooks pass
- [x] `docs-check-links`, `docs-check-index`, `docs-check-frontmatter` pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/191
2026-02-15 07:42:01 -08:00
0d5f48e2c2 Document op read vs op item get convention (#143)
## Summary
- Adds guidance to CLAUDE.md: use `op read` for secret values, `op item get` only for metadata
- Fixes the argocd login example which used `op item get --fields`
- `op item get --fields` wraps multi-line values in quotes, which corrupts keys and other secrets

Discovered while verifying the sifaka borg repokey in 1Password — hashes didn't match until we switched to `op read`.

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/143
2026-02-10 13:09:55 -08:00
e6cf7e47e0 Restrict flyio-proxy ACLs to dedicated tag:flyio-target endpoints (#126)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m8s
## Summary
- Introduce `tag:flyio-target` so services must explicitly opt in to be reachable by the fly.io proxy
- Replace broad `tag:k8s` and `tag:homelab` grants with the new tag in the ACL rule and test
- Add `tailscale.com/tags: "tag:k8s,tag:flyio-target"` annotation to docs, loki, and prometheus Ingresses
- Switch Alloy push endpoints from `*.ops.eblu.me` (Caddy) to `*.tail8d86e.ts.net` (Tailscale Ingress)
- Update docs: flyio-proxy, caddy, tailscale, forgejo (future public access + security checklist), expose-service-publicly

## Manual step (not in PR)
Update the k8s operator OAuth client in the Tailscale admin console to include `tag:flyio-target` in its scope. Without this, the operator cannot assign the new tag to Ingress proxy nodes.

## Deployment order
1. **Pulumi ACLs** — `mise run tailnet-preview && mise run tailnet-up`
2. **OAuth client** — Manual update in Tailscale admin console
3. **K8s Ingresses** — `argocd app sync apps && argocd app sync docs loki prometheus`
4. **Fly.io proxy** — `mise run fly-deploy`
5. **Verify** — `mise run services-check`, check Grafana dashboards

## Test plan
- [ ] `mise run tailnet-preview` shows clean diff
- [ ] `argocd app diff docs`, `argocd app diff loki`, `argocd app diff prometheus` show only annotation additions
- [ ] After deploy: Grafana dashboards show continued log/metric flow
- [ ] `curl -sf https://docs.eblu.me` returns 200
- [ ] `mise run services-check` passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/126
2026-02-08 21:54:18 -08:00
e720b524d3 Rename indri-services-check to services-check (#103)
## Summary
- Rename `indri-services-check` task to `services-check` since it checks all services (indri native, Kubernetes, HTTP endpoints), not just indri-specific ones
- Update references in CLAUDE.md, ai-assistance-guide.md, and troubleshooting.md

## Deployment and Testing
- [ ] Run `mise run services-check` to verify the task works under its new name

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/103
2026-02-04 07:49:15 -08:00
1e13d4b83d Fix Navidrome automatic library scan schedule (#101)
## Summary
- Fix env var name from `ND_SCANSCHEDULE` to `ND_SCANNER_SCHEDULE` (Navidrome uses viper config where dots become underscores)
- Use explicit `@every 1h` format for clarity
- Reorder CLAUDE.md rules to emphasize running zk-docs first

## Root Cause
Navidrome logs showed "Periodic scan is DISABLED" at startup despite the env var being set. The config key is `scanner.schedule`, which translates to `ND_SCANNER_SCHEDULE` (not `ND_SCANSCHEDULE`).

## Deployment and Testing
- [ ] Sync navidrome app: `argocd app sync navidrome`
- [ ] Verify pod restarts with new env var
- [ ] Check logs for "Scheduling scanner" message instead of "Periodic scan is DISABLED"
- [ ] Wait ~1 hour and confirm scan runs automatically

🤖 Generated with [Claude Code](https://claude.ai/code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/101
2026-02-04 07:23:12 -08:00
72f9f21d46 Remove iCloud Photos from borgmatic backup (#100)
## Summary
- Remove ~/Pictures from borgmatic source directories
- Update borgmatic and backup policy documentation
- Add Sifaka-Native Data section to clarify that photos (via Immich), music (via Navidrome), and video (via Jellyfin) are stored directly on Sifaka

## Deployment and Testing
- [ ] Run `mise run provision-indri -- --tags borgmatic --check --diff` to preview changes
- [ ] Run `mise run provision-indri -- --tags borgmatic` to apply
- [ ] Verify borgmatic config no longer includes ~/Pictures

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/100
2026-02-04 07:09:28 -08:00
f8f11121eb Complete Phase 6: documentation cleanup and integration (#97)
## Summary
- Delete `docs/zk/` directory - all useful content migrated to structured docs
- Delete `docs/README.md` - `docs/index.md` is now the documentation root
- Add `devpi` reference card and `use-pypi-proxy` how-to guide
- Add maintenance notes to `indri` reference (sleep prevention, passwordless sudo)
- Add iCloud Photos backup note to `borgmatic` reference
- Rewrite `zk-docs` mise task to prime AI context with key docs instead of legacy cards
- Update `CLAUDE.md` and `README.md` to remove zk references
- Update `exploring-the-docs` with AI context priming section

This completes the Diataxis documentation restructuring. All six phases are now done.

## Deployment and Testing
- [x] Pre-commit hooks pass (including doc-links validator)
- [ ] Build and deploy to docs.ops.eblu.me to verify rendering

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/97
2026-02-03 20:52:37 -08:00
d0e37b8b91 Fix towncrier fragment format: use flat <name>.<type>.md
The towncrier config uses the type's `directory` field as the type
identifier in filenames, NOT as subdirectories. Correct format:
  docs/changelog.d/<name>.<type>.md

NOT:
  docs/changelog.d/<type>/<name>.md

- Move fragments to root with type suffix
- Remove empty type subdirectories
- Fix CLAUDE.md instructions
- Fix tutorial examples in contributing.md and ai-assistance-guide.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 19:06:14 -08:00
c036562dfe Fix towncrier fragment naming and CLAUDE.md instructions
Fragments in subdirectories should be named `<name>.md`, not
`<name>.<type>.md` - the type is already indicated by the directory.

- Rename feature/auto-deploy-docs.feature.md → feature/auto-deploy-docs.md
- Rename misc/+container-tag-no-confirm.misc.md → misc/+container-tag-no-confirm.md
- Update CLAUDE.md with correct fragment path format

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 19:00:26 -08:00
9a8587b83f Add towncrier changelog system (#86)
## Summary
- Configure towncrier with custom types (feature, bugfix, infra, doc, misc)
- Build initial v0.1.0 changelog from zk management log entries
- Integrate towncrier into build-blumeops workflow
- Update README to mark Phase 1b complete

## How It Works
1. Add changelog fragments to `docs/changelog.d/` as `<id>.<type>.md`
2. When running build-blumeops workflow, towncrier collects fragments
3. CHANGELOG.md is updated and fragments are removed
4. Changes are committed back to main before docs build

## Testing
- [x] Tested `uvx towncrier build` locally
- [ ] Test workflow execution (after merge)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/86
2026-02-03 11:48:13 -08:00
c89e69e25f Add docs/ with blumeops zk cards (#82)
## Summary
- Move 21 blumeops-tagged zettelkasten cards from ~/code/personal/zk/ to docs/
- Create symlink ~/code/personal/zk/blumeops -> blumeops/docs for obsidian integration
- Update zk-docs mise task to read from local docs/ directory
- Add blumeops workspace to obsidian.nvim config (strict=true)

## Benefits
- Docs are now git-managed in the blumeops repo (visible on GitHub)
- Wiki links between blumeops docs continue to work via symlink
- obsidian-sync isolation: docs don't sync to work laptop
- Direct editing via obsidian.nvim with dedicated workspace

## Testing
- [x] Files moved to docs/ (21 files)
- [x] Symlink created: ~/code/personal/zk/blumeops -> blumeops/docs
- [x] zk-docs mise task updated and working
- [ ] Verify obsidian.nvim link resolution (after merge)
- [ ] Verify obsidian backlinks work

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/82
2026-02-02 21:40:53 -08:00
46081f5f10 Update rule 10: also require permission to push to main
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 20:45:44 -08:00
55522579dc Add rule: never merge PRs without explicit user request
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 20:43:51 -08:00
8621996343 Add Immich photo management + migrate forge URLs (#62)
## Summary
- Migrate all ArgoCD app repo URLs from `indri.tail8d86e.ts.net:2200` to `forge.ops.eblu.me:2222`
- Add Immich self-hosted photo management service with:
  - Helm chart deployment via ArgoCD
  - PostgreSQL cluster with pgvecto.rs for AI vector search (immich-pg)
  - NFS storage on sifaka for photo library (2Ti)
  - Tailscale Ingress + Caddy proxy for `photos.ops.eblu.me`
  - Machine learning service for face/object recognition

## Deployment and Testing
- [x] Update ArgoCD repo-creds-forge secret with new URL (one-time manual step)
- [ ] Sync `apps` to pick up new applications
- [ ] Sync all existing apps to verify new forge URL works
- [ ] Sync `blumeops-pg` to deploy immich-pg cluster
- [ ] Wait for immich-pg to be healthy
- [ ] Create immich-db secret from auto-generated password
- [ ] Sync `immich-storage` (PV, PVC, Ingress)
- [ ] Sync `immich` (Helm chart)
- [ ] Run `mise run provision-indri -- --tags caddy` to add photos.ops.eblu.me
- [ ] Verify Immich UI is accessible

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/62
2026-01-26 11:20:11 -08:00
66badfafd1 Migrate k8s services to Caddy (*.ops.eblu.me) (#59)
All checks were successful
Build Container / build (push) Successful in 13s
## Summary
- Add Caddy reverse proxy routes for all k8s services (grafana, argocd, prometheus, loki, miniflux, devpi, kiwix, torrent, teslamate)
- Add PostgreSQL via Caddy L4 TCP proxy on port 5432
- Caddy proxies to existing Tailscale endpoints - traffic stays local on indri
- Both `*.ops.eblu.me` and `*.tail8d86e.ts.net` URLs continue to work

## Updated References
- Alloy: prometheus/loki push endpoints → `*.ops.eblu.me`
- Borgmatic: PostgreSQL backup host → `pg.ops.eblu.me`
- Devpi: DEVPI_OUTSIDE_URL → `pypi.ops.eblu.me`
- indri-services-check: health check URLs
- CLAUDE.md: argocd login command

## Deployment and Testing
- [ ] Run `mise run provision-indri -- --tags caddy` to deploy new Caddy config
- [ ] Test HTTP services: `curl https://grafana.ops.eblu.me/api/health`
- [ ] Test PostgreSQL: `pg_isready -h pg.ops.eblu.me -p 5432`
- [ ] Run `mise run provision-indri -- --tags alloy` to update Alloy endpoints
- [ ] Run `mise run provision-indri -- --tags borgmatic` to update borgmatic
- [ ] Sync devpi in ArgoCD: `argocd app sync devpi`
- [ ] Re-login to ArgoCD: `argocd login argocd.ops.eblu.me ...`
- [ ] Run `mise run indri-services-check` to verify all services

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/59
2026-01-25 12:56:31 -08:00
9c1b7c7ca1 Update docs for Caddy migration (#57)
## Summary
- Update CLAUDE.md with new service routing documentation
- Document the two DNS domains: `*.ops.eblu.me` (Caddy) vs `*.tail8d86e.ts.net` (Tailscale)
- Fix incorrect service listings (Prometheus/Loki are in k8s, not indri)

## ZK Updates (not in this PR)
Also updated the blumeops zk card with:
- Source code URL (forge is primary, GitHub is mirror)
- Services split into Caddy vs Tailscale sections
- Updated port map for Caddy
- Updated "Adding a New Service" instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/57
2026-01-25 11:52:35 -08:00
1184b4de1d Add Caddy layer4 for Forgejo SSH (#56)
## Summary
- Add layer4 TCP proxy configuration to Caddyfile template for SSH services
- Configure Forgejo SSH on port 2222 → localhost:2200
- Switch HTTPS from port 8443 (testing) to 443 (production)
- Requires Caddy rebuilt with `github.com/mholt/caddy-l4` plugin

## What This Enables
Git+SSH access via `forge.ops.eblu.me:2222` is now accessible from:
- Tailnet clients (gilbert)
- Docker containers on indri
- Kubernetes pods in minikube

This solves the DNS resolution issues where containers couldn't reach Tailscale MagicDNS names.

## Testing Done
- [x] Caddy rebuilt with layer4 plugin
- [x] Validated Caddyfile syntax
- [x] Cleared `svc:forge` from tailscale serve
- [x] Verified HTTPS works: `curl https://forge.ops.eblu.me`
- [x] Verified SSH works: `ssh -p 2222 forgejo@forge.ops.eblu.me`
- [x] Verified git clone works via new endpoint
- [x] Verified minikube pods can reach both HTTPS and SSH endpoints

## Deployment
Caddy is already running with the new config on indri. This PR captures the ansible changes.

## Next Steps
- Update zk docs with new git remote format
- Migrate registry and other services to Caddy
- Retire tailscale_services ansible role

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/56
2026-01-25 11:37:23 -08:00
13b8d16eb2 Remove mention of plans dir
All checks were successful
Test CI / test (push) Successful in 3s
2026-01-24 16:22:39 -08:00
8ca8798121 Switch to Buildah for container builds (#51)
All checks were successful
Test CI / test (push) Successful in 4s
## Summary
- Replace Docker with Buildah for container image builds
- No Docker socket required - buildah is daemonless
- Cleaner security model (no privileged containers or socket mounting)
- Remove Docker-related security context from deployment

## Changes
- Update Dockerfile to install buildah/podman instead of docker-cli
- Configure buildah storage with overlay driver and fuse-overlayfs
- Update composite action to use `buildah bud` and `buildah push`
- Add `imagePullPolicy: Always` to ensure fresh image pulls
- Update test workflow to verify buildah/podman

## Testing
- [ ] Runner pod starts successfully
- [ ] Buildah is available in runner
- [ ] Test workflow verifies buildah/podman versions
- [ ] Container build workflow builds and pushes to zot

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/51
2026-01-24 13:30:26 -08:00
7893c41020 Enable Forgejo Actions (Phase 1) (#48)
All checks were successful
Test CI / test (push) Successful in 0s
## Summary
- Refactor Forgejo app.ini to be managed by ansible with secrets from 1Password
- Enable Forgejo Actions in config (`[actions] ENABLED = true`)
- Add `repo.actions` to DEFAULT_REPO_UNITS
- Clean up unused MySQL database fields (we use SQLite)

## Phase 1 Progress
This PR covers the first part of Phase 1 (ci-cd-bootstrap plan):
- [x] Refactor app.ini to ansible template
- [x] Store secrets in 1Password
- [x] Enable Actions in config
- [ ] Deploy config changes (pending review)
- [ ] Create runner registration token
- [ ] Deploy runner to k8s
- [ ] Test with simple workflow

## Deployment and Testing
- [ ] Run `mise run provision-indri -- --tags forgejo` to deploy
- [ ] Verify Forgejo restarts correctly
- [ ] Verify Actions tab appears in repo settings

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/48
2026-01-23 17:00:12 -08:00
2e7ca8a5ff Add mise task to list unresolved PR comments (#40)
## Summary
- New `pr-comments` mise task queries Forge API for unresolved review comments on a PR
- Task takes a PR number as argument and displays all comments without a resolver
- Updated CLAUDE.md to include using this task after user reviews PRs

## Deployment and Testing
- [x] Tested task on PR #39 (shows no unresolved comments since all were resolved)
- [x] Tested error handling with non-existent PR #9999

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/40
2026-01-21 19:14:27 -08:00
0439fbb704 P5: Migrate devpi to Kubernetes (#34)
## Summary
- Migrate devpi PyPI caching proxy from indri LaunchAgent to Kubernetes
- Custom container image with devpi-server + devpi-web + auto-init
- StatefulSet with 50Gi PVC, Tailscale Ingress at pypi.tail8d86e.ts.net
- Remove devpi from ansible playbooks and update CLAUDE.md with k8s workflow

## Key Changes
- Add CRI-O registry mirror config for registry.tail8d86e.ts.net
- Change ArgoCD apps to manual sync (was auto-sync causing issues)
- 2Gi memory limit for Whoosh indexer (reclaimed after startup)

## Deployment and Testing
- [x] devpi pod healthy in k8s
- [x] pip install through proxy works
- [x] mcquack 1.0.0 uploaded and installable
- [x] Old devpi stopped on indri

## Post-Merge
Reset ArgoCD to main:
```
argocd app set apps --revision main && argocd app sync apps
argocd app set devpi --revision main && argocd app sync devpi
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/34
2026-01-20 14:55:37 -08:00
c8433467c1 Add Kubernetes migration plan documentation (#24)
## Summary
- Comprehensive phased plan for migrating blumeops services to minikube
- Technical decisions documented: Zot registry, Podman driver, CloudNativePG, Tailscale Operator
- 9 migration phases with verification and rollback procedures
- LaunchAgent absolute path requirements documented
- Observability requirements (zk docs, logging, metrics, dashboards) for new services

## Deployment and Testing
- [x] Plan document created at `docs/k8s-migration.md`
- [ ] Review plan phases for completeness
- [ ] Validate technical decisions align with requirements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/24
2026-01-17 17:34:53 -08:00
3962e5a7de Fix borgmatic PostgreSQL backup and update backup sources (#21)
## Summary
- Fix PostgreSQL backup failure by adding explicit `pg_dump_command` path (was failing with "pg_dump: command not found" in LaunchAgent)
- Remove `~/code/3rd/kiwix-tools` from backups (was just symlinks to ZIM archives in transmission)
- Enable Loki log backup by removing from exclude_patterns

## Deployment and Testing
- [x] Dry run with `--check --diff` shows expected changes
- [ ] Deploy with `mise run provision-indri -- --tags borgmatic`
- [ ] Verify config deployed: `ssh indri 'cat ~/.config/borgmatic/config.yaml'`
- [ ] Run manual backup to test: `ssh indri 'mise x -- borgmatic create --verbosity 1'`
- [ ] Verify PostgreSQL dump succeeds (no "pg_dump: command not found" error)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/21
2026-01-17 09:22:01 -08:00
75426be1dc Remove ansible role meta dependencies to fix duplicate execution (#20)
## Summary
- Remove all `meta/main.yml` dependencies from ansible roles
- Role ordering is now controlled entirely by `indri.yml` playbook
- Fix incorrect roles path in CLAUDE.md (`playbooks/roles` → `roles`)

## Why
Ansible's tag accumulation behavior prevents proper role deduplication when using meta dependencies. When a role is pulled in as a dependency, the parent role's tags are added to the dependency's tags (e.g., `[loki]` becomes `[alloy, loki]`), making them appear as different invocations to Ansible and causing roles to run multiple times.

## Deployment and Testing
- [x] Verified with `ansible-playbook --list-tasks` that each role now appears exactly once
- [x] Run full provision to verify no regressions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/20
2026-01-16 22:50:34 -08:00
78f14f8bde Reworded CLAUDE.md 2026-01-16 18:47:47 -08:00
adf6f4fbe9 Add PostgreSQL and Miniflux services to tailnet (#16)
## Summary
- Add PostgreSQL 18 as a new service at `pg.tail8d86e.ts.net:5432`
- Add Miniflux RSS/Atom feed reader at `feed.tail8d86e.ts.net`
- Both services managed via homebrew/brew services
- Pulumi ACL tags added (tag:pg, tag:feed)
- Alloy log collection configured for both services
- Zettelkasten documentation updated

## Manual Setup Required

Before running ansible, the following steps are needed on indri:

### 1. Apply Pulumi tags
```bash
mise run tailnet-up
```
Then apply tags to indri in Tailscale admin console.

### 2. Create 1Password entries
- miniflux PostgreSQL user password
- miniflux admin password (for first run)

### 3. Set PostgreSQL user password (after ansible installs postgres)
```bash
ssh indri '/opt/homebrew/opt/postgresql@18/bin/psql -c "ALTER USER miniflux PASSWORD '\''your-password'\'';"'
```

### 4. Create password files on indri
```bash
ssh indri 'echo "your-db-password" > ~/.miniflux-db-password && chmod 600 ~/.miniflux-db-password'
ssh indri 'echo "your-admin-password" > ~/.miniflux-admin-password && chmod 600 ~/.miniflux-admin-password'
```

### 5. Create ~/.pgpass for borgmatic
```bash
ssh indri 'echo "localhost:5432:miniflux:miniflux:YOUR_PASSWORD" > ~/.pgpass && chmod 600 ~/.pgpass'
```

### 6. Run ansible with first-run admin creation
```bash
mise run provision-indri -- -e miniflux_create_admin=1
```

### 7. Update borgmatic config
Add to `~/.config/borgmatic/config.yaml` on indri:
```yaml
postgresql_databases:
    - name: miniflux
      hostname: localhost
      port: 5432
      username: miniflux
```

### 8. Cleanup after first run
```bash
ssh indri 'rm ~/.miniflux-admin-password'
```

## Test plan
- [ ] Run `mise run tailnet-up` and verify Pulumi changes
- [ ] Apply tags to indri in Tailscale admin
- [ ] Run `mise run provision-indri -- --check --diff` for dry run
- [ ] Run `mise run provision-indri -- -e miniflux_create_admin=1`
- [ ] Approve services in Tailscale admin
- [ ] Verify PostgreSQL: `ssh indri '/opt/homebrew/opt/postgresql@18/bin/pg_isready'`
- [ ] Verify Miniflux: `curl https://feed.tail8d86e.ts.net/healthcheck`
- [ ] Run `mise run indri-services-check`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/16
2026-01-16 12:30:20 -08:00
72c2dd7096 Add blumeops-tasks mise task for Todoist integration (#14)
## Summary
- Add `mise run blumeops-tasks` to fetch and display tasks from Todoist
- Uses uv run script with inline dependencies (httpx, rich)
- Fetches API credential securely via 1Password CLI
- Sorts tasks by custom priority order: p1, p2, p4, p3 (backlog last)
- Documents the task discovery workflow in CLAUDE.md

## Test plan
- [x] Verified `mise run blumeops-tasks` fetches and displays tasks correctly
- [x] Confirmed priority sorting works as expected

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/14
2026-01-15 18:03:19 -08:00
070f26dc6d Add zk-docs mise task for zettelkasten documentation (#10)
## Summary
- Add `mise run zk-docs` task to concatenate all blumeops-tagged zettelkasten cards
- Main project card is shown first, followed by service management logs
- Uses `bat` for output (added to Brewfile)
- Args are passed through to bat for custom formatting
- Update CLAUDE.md to use zk-docs command with plain output options
- Update README.md to note zettelkasten is private with contact email

## Test plan
- [x] `mise run zk-docs` displays all 6 blumeops cards
- [x] `mise run zk-docs -- --style=header --color=never --decorations=always` shows filenames without decoration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/10
2026-01-15 11:25:02 -08:00
2e326eb30d Critical security note for claude 2026-01-15 09:02:27 -08:00
e534e59556 Add provision-indri mise task and fix idempotency
- Add mise-tasks/provision-indri script to run ansible playbook
- Fix transmission_metrics launchctl load to be idempotent
- Update CLAUDE.md to reference mise run provision-indri

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 14:10:30 -08:00
9c0ff8bb9b Add mise task for indri service health checks
- Create mise-tasks/indri-services-check script
- Checks all indri services (prometheus, grafana, kiwix, transmission, forgejo)
- Verifies both local service status and HTTP endpoints
- Transmission RPC checked via SSH since it's localhost-only (secure)
- Update CLAUDE.md with instructions to run after service changes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 13:23:05 -08:00
4add1684c3 Enable additional ZIM archives for kiwix
New archives (~95G total):
- Project Gutenberg 2023 (72G) - 60,000+ public domain books
- iFixit (3.3G) - Repair guides
- Stack Exchange: SuperUser (3.7G), Math (6.9G)
- LibreTexts: Biology, Chemistry, Engineering, Mathematics, Physics, Humanities

Also:
- Fix transmission to only restart when config changes
- Update CLAUDE.md to use full ansible paths

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 12:47:54 -08:00
2a6fcd915c Add reminder to always branch from main
Prevents accidentally including commits from other feature branches.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 12:00:00 -08:00
763899e856 Document tea pr create syntax in CLAUDE.md
Add full example with heredoc for multi-line descriptions and note
the difference from gh CLI (--description vs --body).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 10:35:27 -08:00
9fcf27a4c5 Add note to commit often while working 2026-01-14 07:25:15 -08:00