Commit graph

24 commits

Author SHA1 Message Date
430f2c6ec5 Add plans for Dagger CI/CD and upstream fork strategy (#150)
## Summary

Two new plan documents in `docs/how-to/plans/`:

- **adopt-dagger-ci** — Migrate CI/CD build logic from Forgejo Actions YAML to Dagger (Python SDK). Forgejo Actions stays as a thin trigger layer. Covers:
  - Container builds with local iteration (`dagger call build ... terminal`)
  - Docs builds with Forgejo packages migration (replacing Forgejo releases)
  - Runner simplification (only Docker + dagger CLI needed)
  - Secrets handling via Dagger's `Secret` type
  - Future: forked project builds, Python packages, pre-merge validation

- **upstream-fork-strategy** — Stacked-branch pattern for maintaining forks of upstream projects. Covers:
  - Daily automated rebase with conflict detection and issue creation
  - Branch model: `upstream/main` → `blumeops` → `feature/*`
  - Quartz fork as first instance, enabling `last-reviewed` frontmatter rendering in docs
  - Upstream PR path for contributing changes back

## Context

These plans emerged from evaluating alternatives to the GHA ecosystem (BuildKite, Concourse, Earthly) for CI/CD. Dagger was chosen for its local iteration story, Python-native pipelines, and zero-infrastructure requirements. The fork strategy is a prerequisite for customizing Quartz and other upstream tools.

Neither plan is ready for execution yet — they are design documents for future work.

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/150
2026-02-11 10:20:14 -08:00
0dce806107 Add plan and reference card for UniFi Express 7 Pulumi stack (#145)
## Summary
- Rewrites the UniFi Pulumi plan doc to use filipowm/unifi Terraform provider via `pulumi package add terraform-provider` (replaces pulumiverse_unifi approach)
- Adds network segmentation goals (main/guest/IoT WiFi zones) and API key auth
- Creates UniFi reference card (`docs/reference/infrastructure/unifi.md`) with topology diagram
- Updates all documentation indexes (plans.md, how-to.md, hosts.md, reference.md)

## What's Deferred
Actual stack scaffolding (`pulumi/unifi/`), mise tasks, and `pulumi import` are blocked on switch purchase and cabling. The plan doc captures everything needed for a future execution session.

## Verification
- `docs-check-links` passes (all wiki-links resolve)
- `docs-check-index` passes (unifi.md referenced in reference.md)
- Pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/145
2026-02-10 15:36:13 -08:00
54afa0750b Add how-to guide for restoring 1Password backup from borgmatic (#141)
## Summary
- New how-to guide at `docs/how-to/restore-1password-backup.md` with step-by-step procedure for extracting and decrypting a 1Password `.1pux` export from borgmatic backup
- **End-to-end verified**: extracted from today's borg archive, decrypted age key with openssl, decrypted .1pux with age → valid 31MB zip with vault data
- Cross-links added from: disaster-recovery, 1password, borgmatic, backups policy, and how-to index
- Updated disaster-recovery.md from TBD stub to include a procedures table

## Deployment and Testing
- [x] Verified full extraction + decryption flow against live borgmatic archive
- [x] `docs-check-links` passes — all wiki-links valid
- [ ] Review guide for clarity and completeness

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/141
2026-02-10 10:55:00 -08:00
b5746e62c2 Add migration plan for Forgejo brew-to-source transition (#140)
## Summary
- Add `docs/how-to/plans/migrate-forgejo-from-brew.md` — full Diataxis-style plan covering background, one-time migration steps, Ansible role changes (with exact code), verification checklist, and future considerations
- Add `docs/how-to/plans/plans.md` — new plans subdirectory index for upcoming migration/transition plans
- Update `docs/how-to/how-to.md` with a Plans section
- Update `docs/tutorials/exploring-the-docs.md` to mention plans in the doc structure table and quick-path sections for Owner and AI audiences

## Test plan
- [x] `docs-check-links` passes
- [x] `docs-check-index` passes
- [x] All pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/140
2026-02-10 10:18:53 -08:00
41dfae1f80 Add CNI conflict troubleshooting to restart-indri how-to (#139)
## Summary
- Documents a troubleshooting procedure for broken pod networking after unclean shutdown
- During minikube recovery, a stale `1-k8s.conflist` CNI config can override kindnet's `10-kindnet.conflist`, causing new pods to use bridge+firewall networking instead of kindnet's ptp — breaking pod-to-pod communication
- Covers symptoms (DNS failures, liveness probe timeouts), diagnosis steps, and the fix

## Context
Encountered this during the 2026-02-10 power outage. Immich, kiwix, and transmission were all crash-looping for ~8 hours due to the CNI conflict. The minikube ansible role's clean boot detection has been improved (#137) so this may not recur, but the troubleshooting guide is valuable if it does.

## Test plan
- [x] Documentation only — no code changes
- [x] Pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/139
2026-02-10 07:24:42 -08:00
9e361cf38f Add docs-review task with last-reviewed frontmatter tracking (#129)
## Summary
- New `docs-review` mise task replaces `docs-review-random` — sorts docs by `last-reviewed` frontmatter field (never-reviewed first, then oldest)
- Updated review-documentation how-to to explain the new workflow and how to mark cards as reviewed
- Updated ai-assistance-guide task table to reference `docs-review`

## Test plan
- [x] `mise run docs-review` runs and shows staleness table + most stale doc
- [x] `mise run docs-review -- --limit 5` respects the limit flag
- [x] All pre-commit checks pass (links, index, filenames)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/129
2026-02-09 07:29:45 -08:00
e6cf7e47e0 Restrict flyio-proxy ACLs to dedicated tag:flyio-target endpoints (#126)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m8s
## Summary
- Introduce `tag:flyio-target` so services must explicitly opt in to be reachable by the fly.io proxy
- Replace broad `tag:k8s` and `tag:homelab` grants with the new tag in the ACL rule and test
- Add `tailscale.com/tags: "tag:k8s,tag:flyio-target"` annotation to docs, loki, and prometheus Ingresses
- Switch Alloy push endpoints from `*.ops.eblu.me` (Caddy) to `*.tail8d86e.ts.net` (Tailscale Ingress)
- Update docs: flyio-proxy, caddy, tailscale, forgejo (future public access + security checklist), expose-service-publicly

## Manual step (not in PR)
Update the k8s operator OAuth client in the Tailscale admin console to include `tag:flyio-target` in its scope. Without this, the operator cannot assign the new tag to Ingress proxy nodes.

## Deployment order
1. **Pulumi ACLs** — `mise run tailnet-preview && mise run tailnet-up`
2. **OAuth client** — Manual update in Tailscale admin console
3. **K8s Ingresses** — `argocd app sync apps && argocd app sync docs loki prometheus`
4. **Fly.io proxy** — `mise run fly-deploy`
5. **Verify** — `mise run services-check`, check Grafana dashboards

## Test plan
- [ ] `mise run tailnet-preview` shows clean diff
- [ ] `argocd app diff docs`, `argocd app diff loki`, `argocd app diff prometheus` show only annotation additions
- [ ] After deploy: Grafana dashboards show continued log/metric flow
- [ ] `curl -sf https://docs.eblu.me` returns 200
- [ ] `mise run services-check` passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/126
2026-02-08 21:54:18 -08:00
64a78422b1 Add Fly.io public reverse proxy for docs.eblu.me (#120)
Some checks failed
Deploy Fly.io Proxy / deploy (push) Failing after 9s
## Summary

- Adds a Fly.io reverse proxy (`blumeops-proxy`) that tunnels public traffic to homelab services over Tailscale
- First service exposed: `docs.eblu.me` — the Quartz static docs site
- Includes Pulumi IaC for Tailscale auth key/ACLs and Gandi DNS CNAME
- Adds mise tasks (`fly-deploy`, `fly-setup`, `fly-shutoff`) and Forgejo CI workflow

## Key details

- Fly.io Firecracker VMs support TUN devices natively — no userspace networking needed
- Tailscale auth key is `preauthorized=True` to avoid device approval hangs on container restarts
- nginx caches aggressively for the static site; health check is on the default_server block
- ACLs restrict `tag:flyio-proxy` to `tag:k8s` on port 443 only
- DNS CNAME deployed and verified: `docs.eblu.me` → `blumeops-proxy.fly.dev`

## Test plan

- [x] `curl -sf https://blumeops-proxy.fly.dev/healthz` returns `ok`
- [x] `curl -I -H "Host: docs.eblu.me" https://blumeops-proxy.fly.dev/` returns 200 with `X-Cache-Status`
- [x] `curl -I https://docs.eblu.me/` returns 200 with valid Let's Encrypt cert
- [x] `dig forge.ops.eblu.me` still resolves to 100.98.163.89 (private services unaffected)
- [x] Set `FLY_DEPLOY_TOKEN` Forgejo Actions secret for CI auto-deploy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/120
2026-02-08 02:36:19 -08:00
fbedaf2833 docs/expose-service-publicly pt2 - fly.io (#119)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/119
2026-02-08 00:38:27 -08:00
ae2445c99a Add how-to guide for public service exposure via Cloudflare (#118)
## Summary
- Adds `docs/how-to/expose-service-publicly.md` documenting the full plan for exposing `docs.eblu.me` to the public internet
- Covers Cloudflare Tunnel + CDN architecture, DNS migration from Gandi, Caddy TLS changes, Pulumi IaC, k8s cloudflared deployment, and verification steps
- Pattern is reusable for future public services
- Marked as "Plan — not yet implemented" status

## Test plan
- [x] `docs-check-links` passes
- [x] `docs-check-index` passes
- [x] All pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/118
2026-02-07 22:09:52 -08:00
dc46eb7def Update all docs titles to human-readable (#117)
## Summary
- Updated frontmatter `title:` in all 63 doc cards from slug-case to human-readable (e.g. `borgmatic` → `Borgmatic`, `ai-assistance-guide` → `AI Assistance Guide`)
- Titles now closely match file stems so `[[wiki-links]]` render naturally without alternate anchor text
- Corrected titles that diverged from stems (e.g. `host-inventory` → `Hosts`, `grafana-alloy` → `Alloy`, `argocd-applications` → `Apps`)
- Deleted `title-test-alpha.md` and `title-test-beta.md` test cards and removed their reference index entry

## Deployment and Testing
- [x] `docs-check-links` passes — all wiki-links valid
- [x] `docs-check-index` passes
- [x] `docs-check-filenames` passes
- [ ] Verify titles render correctly on docs site after deploy

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/117
2026-02-07 21:44:57 -08:00
8e4afe77e0 Add Gandi DNS docs and rewrite homepage intro (#115)
## Summary
- New reference card (`docs/reference/infrastructure/gandi.md`) covering DNS records, Pulumi config, TLS integration
- New how-to guide (`docs/how-to/gandi-operations.md`) for DNS deployment and PAT cycling with `pbpaste` shortcut
- Rewritten homepage intro for wider audience ahead of public docs.eblu.me
- Cross-linked from reference index, routing, caddy, and how-to index
- Fixed PAT expiration inaccuracy in `pulumi/gandi/README.md` (max is 90 days, not 30)

## Test plan
- [ ] Verify wiki-links resolve in Quartz build
- [ ] Review gandi reference card for accuracy
- [ ] Review gandi-operations how-to for accuracy
- [ ] Check homepage reads well for external visitors

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/115
2026-02-07 21:02:10 -08:00
2cb76de5a2 Fix doc tag inconsistencies and add missing ai changelog type (#114)
## Summary
- Add missing `ai` changelog fragment type to the update-documentation how-to guide (comment and table)
- Consolidate `cicd` → `ci-cd` tag on forgejo.md
- Consolidate `network` → `networking` tag on routing.md and tailscale.md

Found during random doc review via `docs-review-tags`.

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/114
2026-02-06 18:52:36 -08:00
cb343a2e35 Rename doc-* mise tasks to docs-check-* / docs-review-* (#113)
## Summary
- Rename 4 automated-check tasks: `doc-titles` → `docs-check-titles`, `doc-filenames` → `docs-check-filenames`, `doc-links` → `docs-check-links`, `doc-index` → `docs-check-index`
- Rename 3 interactive-review tasks: `doc-random` → `docs-review-random`, `doc-tags` → `docs-review-tags`, `doc-stale` → `docs-review-stale`
- Update all references in `.pre-commit-config.yaml`, `ai-assistance-guide.md`, and `review-documentation.md`
- Historical changelog entries left as-is

## Test plan
- [x] `mise run docs-check-titles` exits 0
- [x] `mise run docs-check-links` exits 0
- [x] `mise run docs-review-tags` exits 0
- [x] `mise run doc-titles` fails with "no task found"
- [x] All pre-commit hooks pass (including renamed hook IDs)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/113
2026-02-06 07:08:46 -08:00
060c7a24e3 Review exploring-the-docs and add doc consistency checks (#112)
## Summary
- Reviewed and cleaned up exploring-the-docs tutorial: simplified wiki-links, fixed broken replication/ reference, added Related section, corrected zk-docs flags to match CLAUDE.md
- Added orphan detection to doc-links (finds docs not linked from any other doc)
- Added new doc tooling: `doc-index` (checks category index coverage), `doc-stale` (staleness report), `doc-tags` (tag inventory)
- Added `doc-index` as a pre-commit hook
- Updated use-pypi-proxy to document env-var-based proxy toggle for pip/uv
- Updated ai-assistance-guide with new doc task descriptions

## Test plan
- [ ] Run `mise run doc-links` — passes
- [ ] Run `mise run doc-index` — passes
- [ ] Run `mise run doc-stale` — informational output
- [ ] Run `mise run doc-tags` — informational output
- [ ] Pre-commit hooks pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/112
2026-02-05 21:12:06 -08:00
61c5328ec2 Update restart-indri docs after power outage recovery (#111)
## Summary
- Simplified restart-indri startup procedure to match reality (most services autostart via mcquack LaunchAgents and brew services)
- Added minikube tailscale serve port fix step (`mise run provision-indri -- --tags minikube`)
- Added Anker SOLIX F2000 GaNPrime UPS to indri reference card

## Context
After a power outage, discovered that the restart-indri docs overstated what needs manual intervention. Docker Desktop, Forgejo, Caddy, and all mcquack services autostart. Only Amphetamine, AutoMounter, and minikube need manual action.

## Test plan
- [ ] Verify restart-indri doc reads clearly
- [ ] Verify indri ref card UPS entry renders correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/111
2026-02-05 21:05:35 -08:00
3da455e49c Enforce unique doc filenames and simple wiki-links (#109)
## Summary
- Rename section index files to match their titles (tutorials.md, reference.md, how-to.md, explanation.md) so all filenames are unique
- Convert all ~47 path-based wiki-links to simple filename format across 15 files
- Update doc-filenames task to no longer skip index.md files
- Update doc-links task to reject path-based links containing '/'

This ensures all wiki-links work correctly in obsidian.nvim by making links resolvable by filename alone.

## Testing
- `mise run doc-filenames` - all unique
- `mise run doc-links` - no broken or path-based links
- `mise run doc-titles` - no duplicates

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/109
2026-02-04 17:21:34 -08:00
7aa0e60b27 Add how-to guide for restarting indri (#108)
## Summary
- Add `docs/how-to/restart-indri.md` with safe shutdown and startup procedures
- Add `docs/reference/services/automounter.md` documenting the SMB share automounter app
- Update indri reference card with GUI applications section
- Update how-to and reference indexes

## Test plan
- [ ] Review restart-indri.md for accuracy
- [ ] Verify AutoMounter details are correct
- [ ] Perform the actual restart using this guide

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/108
2026-02-04 14:39:48 -08:00
03bda41de4 Fix Quartz build to preserve git history for accurate file dates (#105)
## Summary

Fixes the "isn't yet tracked by git, dates will be inaccurate" warnings in the Build docs step by restructuring how Quartz builds the documentation.

## Problem

Previously, we copied docs into Quartz's content folder. Since this was inside a fresh Quartz clone with no history of our files, the `CreatedModifiedDate` plugin couldn't determine accurate dates.

## Solution

Build Quartz from within the blumeops repo instead:
1. Copy Quartz's build system (quartz/, package.json, etc.) into the workspace
2. Symlink `content` -> `docs` (preserves git history)
3. Symlink `docs/CHANGELOG.md` -> `../CHANGELOG.md`
4. Build from workspace root where git can trace file history
5. Clean up artifacts after creating tarball

## Deployment and Testing

- [ ] Run build workflow and verify no "not tracked by git" warnings
- [ ] Verify file dates appear correctly on built docs site

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/105
2026-02-04 08:25:46 -08:00
efdd569285 Improve build workflow with version bump selection and changelog in releases (#104)
## Summary

- Add `version_type` choice input with options: BUMP_PATCH (default), BUMP_MINOR, BUMP_MAJOR, SPECIFIC_VERSION
- Add optional `specific_version` input for explicit version selection
- Include changelog content in Forgejo release body under "What's Changed" section
- Move CHANGELOG.md to repository root (still copied into docs during Quartz build)
- Add CHANGELOG link to docs index page
- Update doc-links script to recognize build-time docs from repo root

## Changes

**Workflow inputs:**
- Previously: single optional `version` string input
- Now: `version_type` choice dropdown (defaults to BUMP_PATCH) + optional `specific_version` for explicit versions

**Release body:**
- Previously: just asset download instructions
- Now: includes "What's Changed" section with changelog entries for this release

**CHANGELOG.md location:**
- Previously: `docs/CHANGELOG.md`
- Now: `CHANGELOG.md` (repo root), copied into docs content during build

## Deployment and Testing

- [ ] Run build workflow with BUMP_PATCH (default)
- [ ] Run build workflow with BUMP_MINOR
- [ ] Verify changelog appears in release body
- [ ] Verify docs site includes CHANGELOG page

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/104
2026-02-04 08:13:16 -08:00
e720b524d3 Rename indri-services-check to services-check (#103)
## Summary
- Rename `indri-services-check` task to `services-check` since it checks all services (indri native, Kubernetes, HTTP endpoints), not just indri-specific ones
- Update references in CLAUDE.md, ai-assistance-guide.md, and troubleshooting.md

## Deployment and Testing
- [ ] Run `mise run services-check` to verify the task works under its new name

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/103
2026-02-04 07:49:15 -08:00
5c79a8dbe2 Add doc-random task and documentation improvements (#98)
## Summary
- Add `doc-random` mise task that selects a random documentation card for review
- Add how-to/knowledgebase section with review-documentation guide
- Add Caddy reference card with proxy configuration details
- Fix replication tutorial sequence (tailscale-setup now links to core-services)
- Fix "BluemeOps" typo in tailscale-setup
- Clean up obsolete zk/ directory references from doc-links

## Deployment and Testing
- [x] `mise run doc-random` works and displays a random card
- [x] `mise run doc-links` passes (all wiki-links valid)
- [x] Pre-commit hooks pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/98
2026-02-03 21:17:58 -08:00
f8f11121eb Complete Phase 6: documentation cleanup and integration (#97)
## Summary
- Delete `docs/zk/` directory - all useful content migrated to structured docs
- Delete `docs/README.md` - `docs/index.md` is now the documentation root
- Add `devpi` reference card and `use-pypi-proxy` how-to guide
- Add maintenance notes to `indri` reference (sleep prevention, passwordless sudo)
- Add iCloud Photos backup note to `borgmatic` reference
- Rewrite `zk-docs` mise task to prime AI context with key docs instead of legacy cards
- Update `CLAUDE.md` and `README.md` to remove zk references
- Update `exploring-the-docs` with AI context priming section

This completes the Diataxis documentation restructuring. All six phases are now done.

## Deployment and Testing
- [x] Pre-commit hooks pass (including doc-links validator)
- [ ] Build and deploy to docs.ops.eblu.me to verify rendering

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/97
2026-02-03 20:52:37 -08:00
e311b36b3c Add Phase 4: how-to guides documentation (#95)
## Summary
- Create `docs/how-to/` directory with index and four how-to guides
- deploy-k8s-service: Quick reference for Kubernetes deployments via ArgoCD
- add-ansible-role: Adding new Ansible roles for indri services
- update-tailscale-acls: Modifying Tailscale ACL policies via Pulumi
- troubleshooting: Diagnosing and fixing common issues
- Update exploring-the-docs to include How-to section links
- Update README.md to mark Phase 4 as complete

## Deployment and Testing
- [x] Pre-commit hooks pass (including doc-links validator)
- [ ] Build and deploy to docs.ops.eblu.me to verify rendering

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/95
2026-02-03 20:17:24 -08:00