Commit graph

38 commits

Author SHA1 Message Date
49ec05041c Update UniFi Pulumi plan: switch to ubiquiti-community provider (#187)
## Summary
- Switch provider from filipowm/unifi (inactive maintainer, showstopper bug #94 wiping firewall rules) to ubiquiti-community/unifi (actively maintained, API key auth)
- Add UX7 config backup prerequisite before adopting IaC
- Fix safety guard: check default route interface instead of hostname (runs from gilbert, not indri)
- Update 1Password paths to match actual item (`op://blumeops/unifi/credential`)
- Fix ringtail references: not a Raspberry Pi, stays on WiFi (removed from wired topology)
- Update doc steps for already-existing reference files

## Test plan
- [x] Pre-commit hooks pass
- [x] `docs-check-links` pass
- [x] `docs-check-index` pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/187
2026-02-13 20:02:16 -08:00
81690dae0f Review add-ansible-role doc (#185)
## Summary
- Replace `op item get --fields` with `op read` for secrets (matches playbook and CLAUDE.md guidance)
- Change `tags: [<role>]` to `tags: <role>` to match actual playbook style
- Remove redundant `listen:` from handler example, add `changed_when: true`
- Name handler after specific service (e.g. `Restart <service>`) to match real roles
- Add `last-reviewed: 2026-02-13` frontmatter

## Also noted (not fixed here)
Two other docs still use the old `op item get` pattern:
- `docs/how-to/troubleshooting.md:72` (ArgoCD login command)
- `docs/how-to/gandi-operations.md:35` (Gandi token export)

These can be fixed in their own review cycles.

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/185
2026-02-13 16:54:42 -08:00
517080aeab Add reference/tools/ category with Dagger, ArgoCD CLI, Ansible, and Pulumi cards (#178)
## Summary

- Create `docs/reference/tools/` with four reference cards: Dagger (build engine), ArgoCD CLI (deployment workflows), Ansible (config management), and Pulumi (DNS/Tailscale IaC)
- Move `ansible/roles.md` → `tools/ansible.md`, broadened with CLI patterns and dry-run usage
- Update `reference.md` index: add "Tools" section, remove old "Ansible" section
- Update `update-documentation.md` to reflect Dagger build process (workflow steps, manual build recipe, runner environment)
- Update `adopt-dagger-ci.md` plan to note how-to articles were handled via reference card + existing how-to updates
- Fix all broken `[[roles]]` wiki-links across 5 files → `[[ansible]]`

## Verification

- `docs-check-links` ✓ — no broken wiki-links
- `docs-check-index` ✓ — all docs referenced in category index
- `docs-check-filenames` ✓ — no duplicate filenames
- All pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/178
2026-02-12 19:18:46 -08:00
9717863f65 Update CV release to v1.0.3, add X-Clacks-Overhead header (#176)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m5s
## Summary
- Update CV release URL from v1.0.2 to v1.0.3
- Add `X-Clacks-Overhead: GNU Terry Pratchett` header to both `docs.eblu.me` and `cv.eblu.me` server blocks in the Fly.io proxy nginx config

## Deployment and Testing
- [ ] Sync CV app: `argocd app sync cv`
- [ ] Verify CV is serving v1.0.3 content
- [ ] Deploy fly proxy (workflow or `mise run fly-deploy`)
- [ ] Verify header: `curl -sI https://docs.eblu.me | grep -i clacks`
- [ ] Verify header: `curl -sI https://cv.eblu.me | grep -i clacks`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/176
2026-02-12 17:08:22 -08:00
545433055e Add release artifact workflow how-to and changelog fragments (#170)
All checks were successful
Build Container / build (push) Successful in 17s
## Summary
- How-to guide for creating release artifact workflows with Forgejo packages
- Changelog fragment for the multi-repo forgejo_actions_secrets Ansible role change
- Changelog fragment for the new docs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/170
2026-02-12 11:20:05 -08:00
f3319c753c Update deploy-k8s-service doc with ProxyGroup ingress pattern (#166)
## Summary
- Updated the "Configure Ingress" section to use the current ProxyGroup pattern (`proxy-group: "ingress"`, `defaultBackend`, `tls.hosts`)
- Replaced the old per-ingress proxy example that used `rules:` with `host:` (which breaks ProxyGroup routing)
- Added key points explaining why `defaultBackend` is required and what each annotation does
- Updated checklist to mention ProxyGroup

## Test plan
- [ ] Review rendered doc for accuracy against existing ingress manifests

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/166
2026-02-11 21:10:42 -08:00
1fd54dbab7 Close Dagger CI plan (Phases 1-3 complete) (#165)
## Summary

- Move `adopt-dagger-ci.md` to new `docs/how-to/plans/completed/` archive
- Update plan status to "Phases 1–3 complete" with all verification checklists checked
- Update runner description to reflect what actually stayed (Docker CLI, Node.js needed)
- Document known issue: changelog dates still show UTC
- Create completed plans index, update plans and how-to indexes
- Changelog fragment

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/165
2026-02-11 18:04:39 -08:00
95364dcb48 Simplify runner image (Dagger Phase 3) (#162)
All checks were successful
Build Container / build (push) Successful in 1m13s
## Summary

With Phases 1 and 2 complete, the runner image no longer needs most of its bundled tools. This PR strips it down and adds what was missing.

**Removed** (now inside Dagger containers):
- Node.js 24.x
- Docker CLI + buildx plugin
- skopeo
- gnupg, lsb-release, xz-utils

**Added:**
- `tzdata` — fixes the TZ env var (#159, #160, #161) so `TZ=America/Los_Angeles` actually works
- `flyctl` — was being installed from scratch every release

**Workflow changes:**
- Remove "Ensure Dagger CLI" bootstrap steps from both workflows (Dagger is in the image)
- Remove "Install flyctl" step from build-blumeops (flyctl is in the image)
- Remove job-level `TZ` from build-blumeops (moved to runner configmap `runner.envs`)
- Set `TZ: America/Los_Angeles` in runner configmap so all job containers inherit it

## Deployment

After merge:
1. Build and release the new runner image: `mise run container-release forgejo-runner v2.0.0`
2. Sync the runner: `argocd app sync forgejo-runner`
3. Verify: `kubectl -n forgejo-runner exec deploy/forgejo-runner -c runner -- date` (but the real test is running a docs release and checking the changelog date)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/162
2026-02-11 17:24:20 -08:00
b0bac91ca9 Fix frontmatter field name for Quartz date display (#158)
## Summary

- Rename `date-modified` -> `modified` in all 80 docs and the `docs-check-frontmatter` task

Quartz's `CreatedModifiedDate` plugin recognizes `modified`, `lastmod`, `updated`, and `last-modified` — but not `date-modified`. The wrong field name caused Quartz to ignore frontmatter dates entirely and fall through to filesystem timestamps (UTC inside Dagger), showing Feb 12 on pages built late on Feb 11 PST.

## Test plan

- [x] `mise run docs-check-frontmatter` passes
- [ ] Kick off docs release after merge — verify rendered dates match frontmatter values

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/158
2026-02-11 16:45:12 -08:00
b197bd5f58 Adopt Dagger CI for docs build (Phase 2) (#157)
## Summary

Migrates the docs build pipeline to Dagger (Phase 2 of the Dagger CI adoption plan).

- **Backfill `date-modified` frontmatter** on all 80 docs — Dagger's `--src=.` excludes `.git`, so Quartz can't use git history for page dates. Frontmatter dates work with or without git.
- **New `docs-check-frontmatter` mise task + pre-commit hook** — validates all docs have `title`, `tags`, and `date-modified`
- **New Dagger functions** — `build_changelog` (towncrier in Python container) and `build_docs` (chains changelog → Quartz build in Node container, returns tarball)
- **Simplified CI workflow** — the ~44-line inline Quartz build (clone, npm ci, build, tar, cleanup) is replaced by `dagger call build-docs`. Changelog step remains local on the runner since towncrier needs to modify the host working tree for the git commit.

### Design decisions

- **Towncrier runs twice in CI**: once inside Dagger (for the docs tarball) and once on the runner (for the git commit). This is intentional — Dagger's directory export is additive and can't delete the consumed changelog fragments from the host.
- **Artifact hosting stays on Forgejo Releases** (not migrated to Forgejo Packages as the plan doc originally suggested). That migration can happen independently.
- **`date-modified` frontmatter** preserved even though `build_changelog` installs git — the git there is only for towncrier's `git add` call, not for history. The local iteration story (`dagger call build-docs --src=. --version=dev` with uncommitted changes) depends on frontmatter dates.

### Local iteration

```bash
dagger call build-docs --src=. --version=dev export --path=./docs-dev.tar.gz
tar tf docs-dev.tar.gz | head -20
```

## Deployment and Testing

- [x] `dagger call build-docs --src=. --version=dev` produces valid 1.1MB tarball (149 HTML pages)
- [x] Pre-commit hooks pass (including new `docs-check-frontmatter`)
- [ ] Full `workflow_dispatch` run after merge

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/157
2026-02-11 16:33:16 -08:00
738b39a321 Mark Phase 1 verification checklist items as complete
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 15:49:49 -08:00
1bc2b421a8 Adopt Dagger CI for container builds (Phase 1) (#156)
All checks were successful
Build Container / build (push) Successful in 13s
## Summary

- Add Dagger Python module (`.dagger/`) with `build` and `publish` functions for container images
- Replace Docker buildx + skopeo composite action with `dagger call publish` in `build-container.yaml`
- BuildKit's native push is compatible with Zot — **skopeo workaround eliminated**
- Add Dagger CLI (v0.19.11) to forgejo-runner Dockerfile, bump runner to v2.6.0
- Bootstrap step in workflow curl-installs dagger if not in runner (for first build on v2.5.1 runner)
- Delete old `.forgejo/actions/build-push-image/` composite action
- Add GPLv3 LICENSE

## Verified locally

- `dagger call build --src=. --container-name=nettest` — builds ✓
- `dagger call publish --src=. --container-name=nettest --version=dagger-test` — pushed to Zot ✓
- `dagger call build --src=. --container-name=forgejo-runner` — new runner image builds ✓
- Dagger CLI accessible inside built runner image ✓

## Deployment sequence (after merge)

1. `mise run container-tag-and-release forgejo-runner v2.6.0` — old runner bootstraps dagger via curl, builds new runner
2. `argocd app sync forgejo-runner` — runner restarts with v2.6.0 (dagger baked in)
3. `mise run container-tag-and-release nettest v0.13.0` — end-to-end test of new pipeline
4. `mise run container-list` — verify tags

## Not included (future phases)

- Phase 2: docs build + Forgejo packages migration
- Phase 3: runner simplification (remove skopeo, Node.js, etc.)
- Phase 4: future workflows

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/156
2026-02-11 15:38:31 -08:00
834c9fa57b Bump Fly.io proxy VM to 512MB, fix TruffleHog scanning (#152)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m37s
## Summary
- Bump Fly.io proxy VM memory from 256MB to 512MB — Alloy was OOM-killed, causing the Grafana Fly.io dashboard to lose metrics
- Fix TruffleHog pre-commit hook to scan only staged changes (`--since-commit HEAD`) instead of full repo history
- Sanitize example credential URL in Reolink camera plan doc

## Deployment and Testing
- [ ] Fly.io deploy triggers automatically on merge (workflow watches `fly/**`)
- [ ] After deploy, verify Alloy is running: `fly ssh console -a blumeops-proxy -C "ps aux"` should show alloy process
- [ ] Grafana Fly.io dashboard should start populating within ~1 minute

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/152
2026-02-11 12:03:51 -08:00
651fed8f1a Transcribe backlog tasks into plan documents (#151)
## Summary
- **adopt-oidc-provider:** Dex-based OIDC identity provider for SSO across services (status: Planning — service dependency/recovery design needed)
- **harden-zot-registry:** OIDC + API key auth and tag immutability for zot (depends on OIDC provider + Dagger CI)
- **forgejo-actions-dashboard:** Custom textfile Prometheus exporter + Grafana dashboard for Forgejo Actions CI metrics
- **operationalize-reolink-camera:** Cloud-free Frigate NVR with ONNX detection, NFS ring buffer recording to sifaka (depends on network segmentation)
- **add-unifi-pulumi-stack:** Expanded with NFS security motivation, BlumeOps Services subnet, IoT/appliance segregation, firewall rules

## Test plan
- [x] Pre-commit hooks pass (all 3 commits)
- [x] `docs-check-links` passes
- [x] `docs-check-index` passes
- [x] `docs-check-filenames` passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/151
2026-02-11 11:47:23 -08:00
430f2c6ec5 Add plans for Dagger CI/CD and upstream fork strategy (#150)
## Summary

Two new plan documents in `docs/how-to/plans/`:

- **adopt-dagger-ci** — Migrate CI/CD build logic from Forgejo Actions YAML to Dagger (Python SDK). Forgejo Actions stays as a thin trigger layer. Covers:
  - Container builds with local iteration (`dagger call build ... terminal`)
  - Docs builds with Forgejo packages migration (replacing Forgejo releases)
  - Runner simplification (only Docker + dagger CLI needed)
  - Secrets handling via Dagger's `Secret` type
  - Future: forked project builds, Python packages, pre-merge validation

- **upstream-fork-strategy** — Stacked-branch pattern for maintaining forks of upstream projects. Covers:
  - Daily automated rebase with conflict detection and issue creation
  - Branch model: `upstream/main` → `blumeops` → `feature/*`
  - Quartz fork as first instance, enabling `last-reviewed` frontmatter rendering in docs
  - Upstream PR path for contributing changes back

## Context

These plans emerged from evaluating alternatives to the GHA ecosystem (BuildKite, Concourse, Earthly) for CI/CD. Dagger was chosen for its local iteration story, Python-native pipelines, and zero-infrastructure requirements. The fork strategy is a prerequisite for customizing Quartz and other upstream tools.

Neither plan is ready for execution yet — they are design documents for future work.

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/150
2026-02-11 10:20:14 -08:00
0dce806107 Add plan and reference card for UniFi Express 7 Pulumi stack (#145)
## Summary
- Rewrites the UniFi Pulumi plan doc to use filipowm/unifi Terraform provider via `pulumi package add terraform-provider` (replaces pulumiverse_unifi approach)
- Adds network segmentation goals (main/guest/IoT WiFi zones) and API key auth
- Creates UniFi reference card (`docs/reference/infrastructure/unifi.md`) with topology diagram
- Updates all documentation indexes (plans.md, how-to.md, hosts.md, reference.md)

## What's Deferred
Actual stack scaffolding (`pulumi/unifi/`), mise tasks, and `pulumi import` are blocked on switch purchase and cabling. The plan doc captures everything needed for a future execution session.

## Verification
- `docs-check-links` passes (all wiki-links resolve)
- `docs-check-index` passes (unifi.md referenced in reference.md)
- Pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/145
2026-02-10 15:36:13 -08:00
54afa0750b Add how-to guide for restoring 1Password backup from borgmatic (#141)
## Summary
- New how-to guide at `docs/how-to/restore-1password-backup.md` with step-by-step procedure for extracting and decrypting a 1Password `.1pux` export from borgmatic backup
- **End-to-end verified**: extracted from today's borg archive, decrypted age key with openssl, decrypted .1pux with age → valid 31MB zip with vault data
- Cross-links added from: disaster-recovery, 1password, borgmatic, backups policy, and how-to index
- Updated disaster-recovery.md from TBD stub to include a procedures table

## Deployment and Testing
- [x] Verified full extraction + decryption flow against live borgmatic archive
- [x] `docs-check-links` passes — all wiki-links valid
- [ ] Review guide for clarity and completeness

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/141
2026-02-10 10:55:00 -08:00
b5746e62c2 Add migration plan for Forgejo brew-to-source transition (#140)
## Summary
- Add `docs/how-to/plans/migrate-forgejo-from-brew.md` — full Diataxis-style plan covering background, one-time migration steps, Ansible role changes (with exact code), verification checklist, and future considerations
- Add `docs/how-to/plans/plans.md` — new plans subdirectory index for upcoming migration/transition plans
- Update `docs/how-to/how-to.md` with a Plans section
- Update `docs/tutorials/exploring-the-docs.md` to mention plans in the doc structure table and quick-path sections for Owner and AI audiences

## Test plan
- [x] `docs-check-links` passes
- [x] `docs-check-index` passes
- [x] All pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/140
2026-02-10 10:18:53 -08:00
41dfae1f80 Add CNI conflict troubleshooting to restart-indri how-to (#139)
## Summary
- Documents a troubleshooting procedure for broken pod networking after unclean shutdown
- During minikube recovery, a stale `1-k8s.conflist` CNI config can override kindnet's `10-kindnet.conflist`, causing new pods to use bridge+firewall networking instead of kindnet's ptp — breaking pod-to-pod communication
- Covers symptoms (DNS failures, liveness probe timeouts), diagnosis steps, and the fix

## Context
Encountered this during the 2026-02-10 power outage. Immich, kiwix, and transmission were all crash-looping for ~8 hours due to the CNI conflict. The minikube ansible role's clean boot detection has been improved (#137) so this may not recur, but the troubleshooting guide is valuable if it does.

## Test plan
- [x] Documentation only — no code changes
- [x] Pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/139
2026-02-10 07:24:42 -08:00
9e361cf38f Add docs-review task with last-reviewed frontmatter tracking (#129)
## Summary
- New `docs-review` mise task replaces `docs-review-random` — sorts docs by `last-reviewed` frontmatter field (never-reviewed first, then oldest)
- Updated review-documentation how-to to explain the new workflow and how to mark cards as reviewed
- Updated ai-assistance-guide task table to reference `docs-review`

## Test plan
- [x] `mise run docs-review` runs and shows staleness table + most stale doc
- [x] `mise run docs-review -- --limit 5` respects the limit flag
- [x] All pre-commit checks pass (links, index, filenames)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/129
2026-02-09 07:29:45 -08:00
e6cf7e47e0 Restrict flyio-proxy ACLs to dedicated tag:flyio-target endpoints (#126)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m8s
## Summary
- Introduce `tag:flyio-target` so services must explicitly opt in to be reachable by the fly.io proxy
- Replace broad `tag:k8s` and `tag:homelab` grants with the new tag in the ACL rule and test
- Add `tailscale.com/tags: "tag:k8s,tag:flyio-target"` annotation to docs, loki, and prometheus Ingresses
- Switch Alloy push endpoints from `*.ops.eblu.me` (Caddy) to `*.tail8d86e.ts.net` (Tailscale Ingress)
- Update docs: flyio-proxy, caddy, tailscale, forgejo (future public access + security checklist), expose-service-publicly

## Manual step (not in PR)
Update the k8s operator OAuth client in the Tailscale admin console to include `tag:flyio-target` in its scope. Without this, the operator cannot assign the new tag to Ingress proxy nodes.

## Deployment order
1. **Pulumi ACLs** — `mise run tailnet-preview && mise run tailnet-up`
2. **OAuth client** — Manual update in Tailscale admin console
3. **K8s Ingresses** — `argocd app sync apps && argocd app sync docs loki prometheus`
4. **Fly.io proxy** — `mise run fly-deploy`
5. **Verify** — `mise run services-check`, check Grafana dashboards

## Test plan
- [ ] `mise run tailnet-preview` shows clean diff
- [ ] `argocd app diff docs`, `argocd app diff loki`, `argocd app diff prometheus` show only annotation additions
- [ ] After deploy: Grafana dashboards show continued log/metric flow
- [ ] `curl -sf https://docs.eblu.me` returns 200
- [ ] `mise run services-check` passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/126
2026-02-08 21:54:18 -08:00
64a78422b1 Add Fly.io public reverse proxy for docs.eblu.me (#120)
Some checks failed
Deploy Fly.io Proxy / deploy (push) Failing after 9s
## Summary

- Adds a Fly.io reverse proxy (`blumeops-proxy`) that tunnels public traffic to homelab services over Tailscale
- First service exposed: `docs.eblu.me` — the Quartz static docs site
- Includes Pulumi IaC for Tailscale auth key/ACLs and Gandi DNS CNAME
- Adds mise tasks (`fly-deploy`, `fly-setup`, `fly-shutoff`) and Forgejo CI workflow

## Key details

- Fly.io Firecracker VMs support TUN devices natively — no userspace networking needed
- Tailscale auth key is `preauthorized=True` to avoid device approval hangs on container restarts
- nginx caches aggressively for the static site; health check is on the default_server block
- ACLs restrict `tag:flyio-proxy` to `tag:k8s` on port 443 only
- DNS CNAME deployed and verified: `docs.eblu.me` → `blumeops-proxy.fly.dev`

## Test plan

- [x] `curl -sf https://blumeops-proxy.fly.dev/healthz` returns `ok`
- [x] `curl -I -H "Host: docs.eblu.me" https://blumeops-proxy.fly.dev/` returns 200 with `X-Cache-Status`
- [x] `curl -I https://docs.eblu.me/` returns 200 with valid Let's Encrypt cert
- [x] `dig forge.ops.eblu.me` still resolves to 100.98.163.89 (private services unaffected)
- [x] Set `FLY_DEPLOY_TOKEN` Forgejo Actions secret for CI auto-deploy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/120
2026-02-08 02:36:19 -08:00
fbedaf2833 docs/expose-service-publicly pt2 - fly.io (#119)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/119
2026-02-08 00:38:27 -08:00
ae2445c99a Add how-to guide for public service exposure via Cloudflare (#118)
## Summary
- Adds `docs/how-to/expose-service-publicly.md` documenting the full plan for exposing `docs.eblu.me` to the public internet
- Covers Cloudflare Tunnel + CDN architecture, DNS migration from Gandi, Caddy TLS changes, Pulumi IaC, k8s cloudflared deployment, and verification steps
- Pattern is reusable for future public services
- Marked as "Plan — not yet implemented" status

## Test plan
- [x] `docs-check-links` passes
- [x] `docs-check-index` passes
- [x] All pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/118
2026-02-07 22:09:52 -08:00
dc46eb7def Update all docs titles to human-readable (#117)
## Summary
- Updated frontmatter `title:` in all 63 doc cards from slug-case to human-readable (e.g. `borgmatic` → `Borgmatic`, `ai-assistance-guide` → `AI Assistance Guide`)
- Titles now closely match file stems so `[[wiki-links]]` render naturally without alternate anchor text
- Corrected titles that diverged from stems (e.g. `host-inventory` → `Hosts`, `grafana-alloy` → `Alloy`, `argocd-applications` → `Apps`)
- Deleted `title-test-alpha.md` and `title-test-beta.md` test cards and removed their reference index entry

## Deployment and Testing
- [x] `docs-check-links` passes — all wiki-links valid
- [x] `docs-check-index` passes
- [x] `docs-check-filenames` passes
- [ ] Verify titles render correctly on docs site after deploy

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/117
2026-02-07 21:44:57 -08:00
8e4afe77e0 Add Gandi DNS docs and rewrite homepage intro (#115)
## Summary
- New reference card (`docs/reference/infrastructure/gandi.md`) covering DNS records, Pulumi config, TLS integration
- New how-to guide (`docs/how-to/gandi-operations.md`) for DNS deployment and PAT cycling with `pbpaste` shortcut
- Rewritten homepage intro for wider audience ahead of public docs.eblu.me
- Cross-linked from reference index, routing, caddy, and how-to index
- Fixed PAT expiration inaccuracy in `pulumi/gandi/README.md` (max is 90 days, not 30)

## Test plan
- [ ] Verify wiki-links resolve in Quartz build
- [ ] Review gandi reference card for accuracy
- [ ] Review gandi-operations how-to for accuracy
- [ ] Check homepage reads well for external visitors

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/115
2026-02-07 21:02:10 -08:00
2cb76de5a2 Fix doc tag inconsistencies and add missing ai changelog type (#114)
## Summary
- Add missing `ai` changelog fragment type to the update-documentation how-to guide (comment and table)
- Consolidate `cicd` → `ci-cd` tag on forgejo.md
- Consolidate `network` → `networking` tag on routing.md and tailscale.md

Found during random doc review via `docs-review-tags`.

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/114
2026-02-06 18:52:36 -08:00
cb343a2e35 Rename doc-* mise tasks to docs-check-* / docs-review-* (#113)
## Summary
- Rename 4 automated-check tasks: `doc-titles` → `docs-check-titles`, `doc-filenames` → `docs-check-filenames`, `doc-links` → `docs-check-links`, `doc-index` → `docs-check-index`
- Rename 3 interactive-review tasks: `doc-random` → `docs-review-random`, `doc-tags` → `docs-review-tags`, `doc-stale` → `docs-review-stale`
- Update all references in `.pre-commit-config.yaml`, `ai-assistance-guide.md`, and `review-documentation.md`
- Historical changelog entries left as-is

## Test plan
- [x] `mise run docs-check-titles` exits 0
- [x] `mise run docs-check-links` exits 0
- [x] `mise run docs-review-tags` exits 0
- [x] `mise run doc-titles` fails with "no task found"
- [x] All pre-commit hooks pass (including renamed hook IDs)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/113
2026-02-06 07:08:46 -08:00
060c7a24e3 Review exploring-the-docs and add doc consistency checks (#112)
## Summary
- Reviewed and cleaned up exploring-the-docs tutorial: simplified wiki-links, fixed broken replication/ reference, added Related section, corrected zk-docs flags to match CLAUDE.md
- Added orphan detection to doc-links (finds docs not linked from any other doc)
- Added new doc tooling: `doc-index` (checks category index coverage), `doc-stale` (staleness report), `doc-tags` (tag inventory)
- Added `doc-index` as a pre-commit hook
- Updated use-pypi-proxy to document env-var-based proxy toggle for pip/uv
- Updated ai-assistance-guide with new doc task descriptions

## Test plan
- [ ] Run `mise run doc-links` — passes
- [ ] Run `mise run doc-index` — passes
- [ ] Run `mise run doc-stale` — informational output
- [ ] Run `mise run doc-tags` — informational output
- [ ] Pre-commit hooks pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/112
2026-02-05 21:12:06 -08:00
61c5328ec2 Update restart-indri docs after power outage recovery (#111)
## Summary
- Simplified restart-indri startup procedure to match reality (most services autostart via mcquack LaunchAgents and brew services)
- Added minikube tailscale serve port fix step (`mise run provision-indri -- --tags minikube`)
- Added Anker SOLIX F2000 GaNPrime UPS to indri reference card

## Context
After a power outage, discovered that the restart-indri docs overstated what needs manual intervention. Docker Desktop, Forgejo, Caddy, and all mcquack services autostart. Only Amphetamine, AutoMounter, and minikube need manual action.

## Test plan
- [ ] Verify restart-indri doc reads clearly
- [ ] Verify indri ref card UPS entry renders correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/111
2026-02-05 21:05:35 -08:00
3da455e49c Enforce unique doc filenames and simple wiki-links (#109)
## Summary
- Rename section index files to match their titles (tutorials.md, reference.md, how-to.md, explanation.md) so all filenames are unique
- Convert all ~47 path-based wiki-links to simple filename format across 15 files
- Update doc-filenames task to no longer skip index.md files
- Update doc-links task to reject path-based links containing '/'

This ensures all wiki-links work correctly in obsidian.nvim by making links resolvable by filename alone.

## Testing
- `mise run doc-filenames` - all unique
- `mise run doc-links` - no broken or path-based links
- `mise run doc-titles` - no duplicates

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/109
2026-02-04 17:21:34 -08:00
7aa0e60b27 Add how-to guide for restarting indri (#108)
## Summary
- Add `docs/how-to/restart-indri.md` with safe shutdown and startup procedures
- Add `docs/reference/services/automounter.md` documenting the SMB share automounter app
- Update indri reference card with GUI applications section
- Update how-to and reference indexes

## Test plan
- [ ] Review restart-indri.md for accuracy
- [ ] Verify AutoMounter details are correct
- [ ] Perform the actual restart using this guide

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/108
2026-02-04 14:39:48 -08:00
03bda41de4 Fix Quartz build to preserve git history for accurate file dates (#105)
## Summary

Fixes the "isn't yet tracked by git, dates will be inaccurate" warnings in the Build docs step by restructuring how Quartz builds the documentation.

## Problem

Previously, we copied docs into Quartz's content folder. Since this was inside a fresh Quartz clone with no history of our files, the `CreatedModifiedDate` plugin couldn't determine accurate dates.

## Solution

Build Quartz from within the blumeops repo instead:
1. Copy Quartz's build system (quartz/, package.json, etc.) into the workspace
2. Symlink `content` -> `docs` (preserves git history)
3. Symlink `docs/CHANGELOG.md` -> `../CHANGELOG.md`
4. Build from workspace root where git can trace file history
5. Clean up artifacts after creating tarball

## Deployment and Testing

- [ ] Run build workflow and verify no "not tracked by git" warnings
- [ ] Verify file dates appear correctly on built docs site

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/105
2026-02-04 08:25:46 -08:00
efdd569285 Improve build workflow with version bump selection and changelog in releases (#104)
## Summary

- Add `version_type` choice input with options: BUMP_PATCH (default), BUMP_MINOR, BUMP_MAJOR, SPECIFIC_VERSION
- Add optional `specific_version` input for explicit version selection
- Include changelog content in Forgejo release body under "What's Changed" section
- Move CHANGELOG.md to repository root (still copied into docs during Quartz build)
- Add CHANGELOG link to docs index page
- Update doc-links script to recognize build-time docs from repo root

## Changes

**Workflow inputs:**
- Previously: single optional `version` string input
- Now: `version_type` choice dropdown (defaults to BUMP_PATCH) + optional `specific_version` for explicit versions

**Release body:**
- Previously: just asset download instructions
- Now: includes "What's Changed" section with changelog entries for this release

**CHANGELOG.md location:**
- Previously: `docs/CHANGELOG.md`
- Now: `CHANGELOG.md` (repo root), copied into docs content during build

## Deployment and Testing

- [ ] Run build workflow with BUMP_PATCH (default)
- [ ] Run build workflow with BUMP_MINOR
- [ ] Verify changelog appears in release body
- [ ] Verify docs site includes CHANGELOG page

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/104
2026-02-04 08:13:16 -08:00
e720b524d3 Rename indri-services-check to services-check (#103)
## Summary
- Rename `indri-services-check` task to `services-check` since it checks all services (indri native, Kubernetes, HTTP endpoints), not just indri-specific ones
- Update references in CLAUDE.md, ai-assistance-guide.md, and troubleshooting.md

## Deployment and Testing
- [ ] Run `mise run services-check` to verify the task works under its new name

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/103
2026-02-04 07:49:15 -08:00
5c79a8dbe2 Add doc-random task and documentation improvements (#98)
## Summary
- Add `doc-random` mise task that selects a random documentation card for review
- Add how-to/knowledgebase section with review-documentation guide
- Add Caddy reference card with proxy configuration details
- Fix replication tutorial sequence (tailscale-setup now links to core-services)
- Fix "BluemeOps" typo in tailscale-setup
- Clean up obsolete zk/ directory references from doc-links

## Deployment and Testing
- [x] `mise run doc-random` works and displays a random card
- [x] `mise run doc-links` passes (all wiki-links valid)
- [x] Pre-commit hooks pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/98
2026-02-03 21:17:58 -08:00
f8f11121eb Complete Phase 6: documentation cleanup and integration (#97)
## Summary
- Delete `docs/zk/` directory - all useful content migrated to structured docs
- Delete `docs/README.md` - `docs/index.md` is now the documentation root
- Add `devpi` reference card and `use-pypi-proxy` how-to guide
- Add maintenance notes to `indri` reference (sleep prevention, passwordless sudo)
- Add iCloud Photos backup note to `borgmatic` reference
- Rewrite `zk-docs` mise task to prime AI context with key docs instead of legacy cards
- Update `CLAUDE.md` and `README.md` to remove zk references
- Update `exploring-the-docs` with AI context priming section

This completes the Diataxis documentation restructuring. All six phases are now done.

## Deployment and Testing
- [x] Pre-commit hooks pass (including doc-links validator)
- [ ] Build and deploy to docs.ops.eblu.me to verify rendering

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/97
2026-02-03 20:52:37 -08:00
e311b36b3c Add Phase 4: how-to guides documentation (#95)
## Summary
- Create `docs/how-to/` directory with index and four how-to guides
- deploy-k8s-service: Quick reference for Kubernetes deployments via ArgoCD
- add-ansible-role: Adding new Ansible roles for indri services
- update-tailscale-acls: Modifying Tailscale ACL policies via Pulumi
- troubleshooting: Diagnosing and fixing common issues
- Update exploring-the-docs to include How-to section links
- Update README.md to mark Phase 4 as complete

## Deployment and Testing
- [x] Pre-commit hooks pass (including doc-links validator)
- [ ] Build and deploy to docs.ops.eblu.me to verify rendering

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/95
2026-02-03 20:17:24 -08:00