Commit graph

154 commits

Author SHA1 Message Date
eec1edf43d Add how-to guide for connecting to PostgreSQL via psql (#188)
## Summary
- Add new how-to guide (`connect-to-postgres.md`) with the `psql` command using `op read` for 1Password credentials
- Add "Database" section to the how-to index linking to the new guide
- Link the new guide from the PostgreSQL reference card's Related section

## Test plan
- [x] Verified `psql` connection works from gilbert using the documented command
- [ ] Review doc formatting and content

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/188
2026-02-14 07:18:06 -08:00
49ec05041c Update UniFi Pulumi plan: switch to ubiquiti-community provider (#187)
## Summary
- Switch provider from filipowm/unifi (inactive maintainer, showstopper bug #94 wiping firewall rules) to ubiquiti-community/unifi (actively maintained, API key auth)
- Add UX7 config backup prerequisite before adopting IaC
- Fix safety guard: check default route interface instead of hostname (runs from gilbert, not indri)
- Update 1Password paths to match actual item (`op://blumeops/unifi/credential`)
- Fix ringtail references: not a Raspberry Pi, stays on WiFi (removed from wired topology)
- Update doc steps for already-existing reference files

## Test plan
- [x] Pre-commit hooks pass
- [x] `docs-check-links` pass
- [x] `docs-check-index` pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/187
2026-02-13 20:02:16 -08:00
b3747f6c95 Tier 1 version bumps (#186)
All checks were successful
Build Container / build (push) Successful in 8s
## Summary

Audit and upgrade of all deployed images, helm charts, and custom container Dockerfiles to latest stable versions. This PR covers Tier 1 (low-risk minor/patch bumps only).

### Upstream images
| Image | Old | New |
|-------|-----|-----|
| kube-state-metrics | v2.13.0 | v2.18.0 |
| prometheus | v3.2.1 | v3.9.1 |
| loki | 3.3.2 | 3.6.5 |
| alloy | v1.5.1 | v1.13.1 |
| tailscale (proxy + operator) | v1.92.5 | v1.94.1 |
| navidrome | :latest | v0.60.3 (pinned) |

### Helm charts
| Chart | Old | New |
|-------|-----|-----|
| CloudNativePG | v0.27.0 | v0.27.1 |
| 1Password Connect | 2.2.1 | 2.3.0 |

### Custom containers (Dockerfiles updated, images not yet tagged)
| Container | Changes | New tag |
|-----------|---------|---------|
| miniflux | 2.2.16→2.2.17 (security), alpine 3.22 | v1.1.0 |
| kubectl | v1.34.1→v1.34.4, alpine 3.22 | v1.1.0 |
| kiwix-serve | alpine 3.22 | v1.1.0 |
| nettest | alpine 3.22 | v0.14.0 |
| transmission | alpine 3.22, pkg 4.0.6-r4 | v1.1.0 |

All custom containers verified with local `dagger call build`.

### Deferred to Tier 2 (separate PRs)
- Forgejo runner 6→12 (major version scheme change)
- Docker DinD 27→29
- Grafana chart 8→11 (repo migration)
- External Secrets 1→2 (breaking changes)
- Python 3.12→3.13, Elixir 1.18→1.19, Node 22→24
- Transmission 4.0.6→4.1.0 (not in Alpine yet)

## Deployment

After merge:
1. Tag custom containers: `mise run container-tag-and-release <name> <version>` for each
2. Wait for CI builds to complete
3. `argocd app sync apps` then sync individual apps, or let ArgoCD auto-detect

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/186
2026-02-13 17:16:37 -08:00
81690dae0f Review add-ansible-role doc (#185)
## Summary
- Replace `op item get --fields` with `op read` for secrets (matches playbook and CLAUDE.md guidance)
- Change `tags: [<role>]` to `tags: <role>` to match actual playbook style
- Remove redundant `listen:` from handler example, add `changed_when: true`
- Name handler after specific service (e.g. `Restart <service>`) to match real roles
- Add `last-reviewed: 2026-02-13` frontmatter

## Also noted (not fixed here)
Two other docs still use the old `op item get` pattern:
- `docs/how-to/troubleshooting.md:72` (ArgoCD login command)
- `docs/how-to/gandi-operations.md:35` (Gandi token export)

These can be fixed in their own review cycles.

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/185
2026-02-13 16:54:42 -08:00
5b91a1c315 Review why-gitops doc (#184)
## Summary
- Fix misleading `[[tailscale|Pulumi]]` wiki-link → `[[pulumi]]`
- Simplify `[[ansible|Ansible]]` and `[[argocd|ArgoCD]]` to plain wiki-links per convention
- Rename "Tailnet" layer to "Network" to reflect Pulumi's full scope (Tailscale ACLs + Gandi DNS)
- Fix `apt install` → `brew install` (indri is macOS)
- Add `[[pulumi]]` to Related section
- Add `last-reviewed: 2026-02-13` frontmatter

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/184
2026-02-13 16:48:06 -08:00
d5c00192d5 Configure DinD to use Zot as pull-through registry mirror (#183)
## Summary
- Add `daemon.json` with `registry-mirrors` to the forgejo-runner ConfigMap, pointing DinD at `http://host.minikube.internal:5050`
- Mount `daemon.json` into the DinD sidecar at `/etc/docker/daemon.json` via `subPath`
- Docker Hub pulls during Dagger CI builds will now route through Zot's pull-through cache, reducing bandwidth and avoiding rate limits

## Deployment and Testing
- [ ] `argocd app sync forgejo-runner`
- [ ] Exec into DinD container: `docker info` should show the registry mirror
- [ ] Trigger a workflow build and check Zot logs for cache hits

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/183
2026-02-13 12:36:03 -08:00
e364bdd238 Upgrade Node.js from 20 to 22 LTS (#182)
Some checks failed
Build Container / build (push) Failing after 11m14s
## Summary
- Upgrade Dagger docs build image from `node:20-slim` to `node:22-slim`
- Upgrade forgejo-runner container from Node 20 to Node 22
- Fixes Quartz 4.5.2 `EBADENGINE` warning (requires Node >= 22)
- Node 20 EOL is 2026-04-30

Both builds verified locally via Dagger.

## Deployment
1. Merge this PR
2. Tag and release forgejo-runner v3.2.0: `mise run container-tag-and-release forgejo-runner v3.2.0`
3. Update RUNNER_LABELS version in `argocd/manifests/forgejo-runner/deployment.yaml` from `v3.1.0` to `v3.2.0`
4. `argocd app sync forgejo-runner`

The Dagger docs build change takes effect immediately on merge (no container release needed).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/182
2026-02-13 11:07:41 -08:00
Forgejo Actions
02b1397f1a Update docs release to v1.8.2
- Built changelog from towncrier fragments

[skip ci]
2026-02-13 10:36:04 -08:00
0098ac37e0 Move non-secret runner env vars to deployment spec (#181)
## Summary
- Move FORGEJO_URL, RUNNER_NAME, and RUNNER_LABELS from ExternalSecret template to deployment env vars
- ExternalSecret now only contains the actual secret (RUNNER_TOKEN)
- Image version changes in RUNNER_LABELS now trigger automatic pod rollouts

## Deployment
1. Merge this PR
2. `argocd app sync forgejo-runner` — the deployment spec change will auto-roll the pod

No manual restart needed — that's the whole point :)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/181
2026-02-13 10:29:23 -08:00
2fad8db639 Add yq to forgejo-runner and replace sed YAML edits (#180)
All checks were successful
Build Container / build (push) Successful in 1m31s
## Summary
- Install yq in the forgejo-runner container image for structured YAML editing
- Replace fragile `sed` regex patterns with `yq` in `build-blumeops.yaml` and `cv-deploy.yaml` workflows

## Deployment
1. Merge this PR
2. Tag and release forgejo-runner v3.1.0: `mise run container-tag-and-release forgejo-runner v3.1.0`
3. Update runner label in `argocd/manifests/forgejo-runner/external-secret.yaml` from `v3.0.2` to `v3.1.0`
4. Sync the forgejo-runner app: `argocd app sync forgejo-runner`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/180
2026-02-13 10:20:27 -08:00
48ce5b4120 Recategorize homepage into Content and Misc groups (#179)
## Summary
- Replace the three homepage groups (Apps, Observability, Infrastructure) with two cleaner groups
- **Content**: Immich, Kiwix, Miniflux, DJ, Grafana
- **Misc**: CV, TeslaMate, Transmission, Docs, Prometheus, PyPI

## Deployment and Testing
- [ ] Sync affected ingresses via ArgoCD (all 11 services)
- [ ] Verify homepage shows the two new groups correctly

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/179
2026-02-13 09:09:22 -08:00
Forgejo Actions
e21277ae83 Update docs release to v1.8.0
- Built changelog from towncrier fragments

[skip ci]
2026-02-12 19:20:27 -08:00
517080aeab Add reference/tools/ category with Dagger, ArgoCD CLI, Ansible, and Pulumi cards (#178)
## Summary

- Create `docs/reference/tools/` with four reference cards: Dagger (build engine), ArgoCD CLI (deployment workflows), Ansible (config management), and Pulumi (DNS/Tailscale IaC)
- Move `ansible/roles.md` → `tools/ansible.md`, broadened with CLI patterns and dry-run usage
- Update `reference.md` index: add "Tools" section, remove old "Ansible" section
- Update `update-documentation.md` to reflect Dagger build process (workflow steps, manual build recipe, runner environment)
- Update `adopt-dagger-ci.md` plan to note how-to articles were handled via reference card + existing how-to updates
- Fix all broken `[[roles]]` wiki-links across 5 files → `[[ansible]]`

## Verification

- `docs-check-links` ✓ — no broken wiki-links
- `docs-check-index` ✓ — all docs referenced in category index
- `docs-check-filenames` ✓ — no duplicate filenames
- All pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/178
2026-02-12 19:18:46 -08:00
9c789a1868 Fix cache hit rate on APM and Fly.io dashboards (#177)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m19s
## Summary
- Remove `match_all = true` from `flyio_nginx_cache_requests_total` in Alloy so the metric only counts requests that go through the proxy cache (excludes health checks with empty `cache_status`)
- Change dashboard queries from `rate(...[5m])` to `increase(...[$__range])` — aggregates over the full dashboard time window instead of a 5-minute sliding window, giving meaningful ratios for low-traffic static sites
- Add null/NaN value mapping to show "No traffic" in neutral color instead of blank/red

## Root cause
Health check requests from Fly.io hit the default nginx server block (no `proxy_cache`), producing entries with empty `upstream_cache_status`. With `match_all = true`, these were counted in the cache metric, diluting the Fly.io dashboard ratio. For APM dashboards, `rate()[5m]` on low-traffic sites with 24h cache validity almost always returns either all-HITs (100%) or no data (blank → red background).

## Deployment
- Fly.io proxy redeploy needed for Alloy config change
- ArgoCD sync for dashboard ConfigMap changes

## Test plan
- [ ] Redeploy Fly.io proxy
- [ ] Sync grafana-config in ArgoCD
- [ ] Verify CV APM cache hit ratio shows a real percentage (not 100%)
- [ ] Verify Docs APM shows "No traffic" in neutral color when idle, real ratio when visited
- [ ] Verify Fly.io proxy dashboard cache ratio excludes health checks

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/177
2026-02-12 18:40:48 -08:00
9717863f65 Update CV release to v1.0.3, add X-Clacks-Overhead header (#176)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m5s
## Summary
- Update CV release URL from v1.0.2 to v1.0.3
- Add `X-Clacks-Overhead: GNU Terry Pratchett` header to both `docs.eblu.me` and `cv.eblu.me` server blocks in the Fly.io proxy nginx config

## Deployment and Testing
- [ ] Sync CV app: `argocd app sync cv`
- [ ] Verify CV is serving v1.0.3 content
- [ ] Deploy fly proxy (workflow or `mise run fly-deploy`)
- [ ] Verify header: `curl -sI https://docs.eblu.me | grep -i clacks`
- [ ] Verify header: `curl -sI https://cv.eblu.me | grep -i clacks`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/176
2026-02-12 17:08:22 -08:00
ed5c9c9b48 Update CV release to v1.0.2 (#175)
## Summary
- Update `CV_RELEASE_URL` in cv deployment from v1.0.1 to v1.0.2

## Deployment and Testing
- [ ] `argocd app sync cv` after merge
- [ ] Verify cv.eblu.me serves updated content

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/175
2026-02-12 16:18:55 -08:00
Forgejo Actions
70d8881959 Update docs release to v1.7.1
- Built changelog from towncrier fragments

[skip ci]
2026-02-12 14:13:12 -08:00
7dc03c0af1 Add CV to services-check, update homepage link (#174)
## Summary
- Add CV to services-check (tailnet endpoint + public cv.eblu.me)
- Update CV homepage annotation to point to cv.eblu.me instead of cv.ops.eblu.me

## Deployment and Testing
- [ ] `argocd app sync cv` (homepage link change)
- [ ] `mise run services-check` passes

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/174
2026-02-12 14:10:03 -08:00
df372fccb6 Expose CV publicly at cv.eblu.me (#173)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m57s
## Summary
- Add nginx server block for `cv.eblu.me` (static site, same pattern as docs)
- Add DNS CNAME record in Pulumi (`cv.eblu.me` → `blumeops-proxy.fly.dev`)
- Add `cv.eblu.me` cert to `fly-setup` mise task
- Tag CV Tailscale ingress with `tag:flyio-target` for ACL access
- Remove `/_error` test endpoint from docs proxy

## Deployment and Testing
- [ ] `argocd app set cv --revision cv/public-cv-eblu-me && argocd app sync cv`
- [ ] `fly certs add cv.eblu.me -a blumeops-proxy`
- [ ] `mise run fly-deploy`
- [ ] Verify proxy: `curl -I -H "Host: cv.eblu.me" https://blumeops-proxy.fly.dev/`
- [ ] `mise run dns-preview` then `mise run dns-up`
- [ ] Verify live: `curl -I https://cv.eblu.me`
- [ ] Merge, then `argocd app set cv --revision main && argocd app sync cv`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/173
2026-02-12 14:05:00 -08:00
a68542a602 Update CV release to v1.0.1 (#172)
## Summary
- Update `CV_RELEASE_URL` from v0.1.0 to v1.0.1 in the CV deployment manifest

## Deployment and Testing
- [ ] `argocd app sync cv` after merge
- [ ] Verify cv.ops.eblu.me serves updated resume

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/172
2026-02-12 13:38:05 -08:00
Forgejo Actions
200be39492 Update docs release to v1.7.0
- Built changelog from towncrier fragments

[skip ci]
2026-02-12 11:46:38 -08:00
63e67b927a Add CV service reference card and docs updates (#171)
## Summary
- Add CV service reference card (`docs/reference/services/cv.md`)
- Add CV to services index, ArgoCD apps registry, and Caddy proxied services table
- Changelog fragment

## Files changed
- `docs/reference/services/cv.md` (new) — CV reference card
- `docs/reference/reference.md` — Add CV to services table
- `docs/reference/kubernetes/apps.md` — Add CV to app registry
- `docs/reference/services/caddy.md` — Add CV to proxied k8s services
- `docs/changelog.d/cv-docs.doc.md` (new) — Changelog fragment

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/171
2026-02-12 11:45:32 -08:00
545433055e Add release artifact workflow how-to and changelog fragments (#170)
All checks were successful
Build Container / build (push) Successful in 17s
## Summary
- How-to guide for creating release artifact workflows with Forgejo packages
- Changelog fragment for the multi-repo forgejo_actions_secrets Ansible role change
- Changelog fragment for the new docs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/170
2026-02-12 11:20:05 -08:00
01e19023ee Add CV/resume web app at cv.ops.eblu.me (#169)
## Summary
- nginx container (`containers/cv/`) downloads and serves a content tarball at startup (same pattern as quartz)
- ArgoCD app + k8s manifests (deployment, service, Tailscale ingress)
- Caddy route for `cv.ops.eblu.me`
- Deploy workflow: resolves "latest" or specific version from Forgejo packages, updates deployment, syncs ArgoCD
- Content is built and released from the separate [cv repo](https://forge.ops.eblu.me/eblume/cv)

## Deployment steps (after merge)
1. `mise run container-tag-and-release cv v1.0.0`
2. Run "Release CV" workflow in cv repo (SPECIFIC_VERSION `v0.1.0`)
3. Run "Deploy CV" workflow in blumeops (default: latest)
4. `mise run provision-indri -- --tags caddy`
5. Verify at `https://cv.ops.eblu.me/`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/169
2026-02-12 11:09:41 -08:00
Forgejo Actions
a800bdc8b9 Update docs release to v1.6.9
- Built changelog from towncrier fragments

[skip ci]
2026-02-11 21:28:40 -08:00
1fc0c1b18b Fix Dagger towncrier container using UTC instead of local time (#167)
## Summary
- The `build_changelog` Dagger container (`python:3.12-slim`) defaults to UTC, causing towncrier to stamp tomorrow's date when releases are cut in the evening PST.
- This is the root cause of the docs website (built via Dagger) showing Feb 12 while the repo CHANGELOG (built directly on the runner) correctly showed Feb 11.
- Fix: set `TZ=America/Los_Angeles` in the Dagger container before running towncrier.

## Verified
- `docker run --rm python:3.12-slim` → `towncrier _get_date()` returns `2026-02-12` (wrong)
- `docker run --rm -e TZ=America/Los_Angeles python:3.12-slim` → returns `2026-02-11` (correct)

## Test plan
- [ ] Merge, then trigger a build-blumeops release
- [ ] Verify the CHANGELOG date on https://docs.eblu.me/CHANGELOG matches the repo

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/167
2026-02-11 21:27:35 -08:00
Forgejo Actions
b36b30ef7a Update docs release to v1.6.8
- Built changelog from towncrier fragments

[skip ci]
2026-02-11 21:12:50 -08:00
f3319c753c Update deploy-k8s-service doc with ProxyGroup ingress pattern (#166)
## Summary
- Updated the "Configure Ingress" section to use the current ProxyGroup pattern (`proxy-group: "ingress"`, `defaultBackend`, `tls.hosts`)
- Replaced the old per-ingress proxy example that used `rules:` with `host:` (which breaks ProxyGroup routing)
- Added key points explaining why `defaultBackend` is required and what each annotation does
- Updated checklist to mention ProxyGroup

## Test plan
- [ ] Review rendered doc for accuracy against existing ingress manifests

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/166
2026-02-11 21:10:42 -08:00
Forgejo Actions
0528a6f712 Update docs release to v1.6.7
- Built changelog from towncrier fragments

[skip ci]
2026-02-11 18:07:11 -08:00
1fd54dbab7 Close Dagger CI plan (Phases 1-3 complete) (#165)
## Summary

- Move `adopt-dagger-ci.md` to new `docs/how-to/plans/completed/` archive
- Update plan status to "Phases 1–3 complete" with all verification checklists checked
- Update runner description to reflect what actually stayed (Docker CLI, Node.js needed)
- Document known issue: changelog dates still show UTC
- Create completed plans index, update plans and how-to indexes
- Changelog fragment

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/165
2026-02-11 18:04:39 -08:00
Forgejo Actions
9e0487b523 Update docs release to v1.6.6
- Built changelog from towncrier fragments

[skip ci]
2026-02-11 17:57:59 -08:00
24d2a55f9e Restore Docker CLI to runner image for Dagger engine (#164)
Some checks failed
Build Container / build (push) Failing after 3s
## Summary

- Dagger shells out to the `docker` binary to provision its BuildKit engine container
- Phase 3 removed `docker-ce-cli`, breaking all `dagger call` invocations in CI
- This restores `docker-ce-cli` (without buildx/skopeo — those aren't needed)

## Test plan

- [ ] Build locally, release as v3.0.2, update manifest, sync
- [ ] Trigger docs build workflow and verify Dagger engine starts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/164
2026-02-11 17:49:33 -08:00
23fb036b92 Restore Node.js to runner image for JavaScript Actions (#163)
Some checks failed
Build Container / build (push) Failing after 2s
## Summary

- Restores Node.js 20 LTS to the Forgejo runner job image
- `actions/checkout@v4` and other JavaScript Actions require `node` in the job container
- The Phase 3 simplification (PR #162) accidentally removed it, breaking all CI runs

## Changes

- `containers/forgejo-runner/Dockerfile`: Add `gnupg` (for nodesource GPG key) and Node.js 20 via nodesource
- Changelog fragment

## Test plan

- [ ] Merge, release as `forgejo-runner-v3.0.1`
- [ ] Update runner manifest to v3.0.1, sync, restart pod
- [ ] Trigger a workflow_dispatch and verify `actions/checkout` succeeds

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/163
2026-02-11 17:35:33 -08:00
95364dcb48 Simplify runner image (Dagger Phase 3) (#162)
All checks were successful
Build Container / build (push) Successful in 1m13s
## Summary

With Phases 1 and 2 complete, the runner image no longer needs most of its bundled tools. This PR strips it down and adds what was missing.

**Removed** (now inside Dagger containers):
- Node.js 24.x
- Docker CLI + buildx plugin
- skopeo
- gnupg, lsb-release, xz-utils

**Added:**
- `tzdata` — fixes the TZ env var (#159, #160, #161) so `TZ=America/Los_Angeles` actually works
- `flyctl` — was being installed from scratch every release

**Workflow changes:**
- Remove "Ensure Dagger CLI" bootstrap steps from both workflows (Dagger is in the image)
- Remove "Install flyctl" step from build-blumeops (flyctl is in the image)
- Remove job-level `TZ` from build-blumeops (moved to runner configmap `runner.envs`)
- Set `TZ: America/Los_Angeles` in runner configmap so all job containers inherit it

## Deployment

After merge:
1. Build and release the new runner image: `mise run container-release forgejo-runner v2.0.0`
2. Sync the runner: `argocd app sync forgejo-runner`
3. Verify: `kubectl -n forgejo-runner exec deploy/forgejo-runner -c runner -- date` (but the real test is running a docs release and checking the changelog date)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/162
2026-02-11 17:24:20 -08:00
Forgejo Actions
6ce03df819 Update docs release to v1.6.4
- Built changelog from towncrier fragments

[skip ci]
2026-02-12 01:01:23 +00:00
42ebc2b122 Fix Forgejo runner timezone (UTC -> America/Los_Angeles) (#159)
## Summary

- Set `TZ=America/Los_Angeles` on the Forgejo runner container

The runner pod defaults to UTC. When releases are cut in the evening PST, towncrier stamps changelog entries with tomorrow's date (e.g., v1.6.2 shows 2026-02-12 despite being released on the evening of Feb 11 PST).

## Deployment

After merge, sync the forgejo-runner ArgoCD app:
```
argocd app sync forgejo-runner
```

The runner pod will restart with the new timezone. Note: the v1.6.2 changelog entry will remain dated 2026-02-12; future entries will use PST dates, so dates may appear non-sequential once.

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/159
2026-02-11 16:53:41 -08:00
b0bac91ca9 Fix frontmatter field name for Quartz date display (#158)
## Summary

- Rename `date-modified` -> `modified` in all 80 docs and the `docs-check-frontmatter` task

Quartz's `CreatedModifiedDate` plugin recognizes `modified`, `lastmod`, `updated`, and `last-modified` — but not `date-modified`. The wrong field name caused Quartz to ignore frontmatter dates entirely and fall through to filesystem timestamps (UTC inside Dagger), showing Feb 12 on pages built late on Feb 11 PST.

## Test plan

- [x] `mise run docs-check-frontmatter` passes
- [ ] Kick off docs release after merge — verify rendered dates match frontmatter values

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/158
2026-02-11 16:45:12 -08:00
Forgejo Actions
a75089d8ef Update docs release to v1.6.2
- Built changelog from towncrier fragments

[skip ci]
2026-02-12 00:35:02 +00:00
b197bd5f58 Adopt Dagger CI for docs build (Phase 2) (#157)
## Summary

Migrates the docs build pipeline to Dagger (Phase 2 of the Dagger CI adoption plan).

- **Backfill `date-modified` frontmatter** on all 80 docs — Dagger's `--src=.` excludes `.git`, so Quartz can't use git history for page dates. Frontmatter dates work with or without git.
- **New `docs-check-frontmatter` mise task + pre-commit hook** — validates all docs have `title`, `tags`, and `date-modified`
- **New Dagger functions** — `build_changelog` (towncrier in Python container) and `build_docs` (chains changelog → Quartz build in Node container, returns tarball)
- **Simplified CI workflow** — the ~44-line inline Quartz build (clone, npm ci, build, tar, cleanup) is replaced by `dagger call build-docs`. Changelog step remains local on the runner since towncrier needs to modify the host working tree for the git commit.

### Design decisions

- **Towncrier runs twice in CI**: once inside Dagger (for the docs tarball) and once on the runner (for the git commit). This is intentional — Dagger's directory export is additive and can't delete the consumed changelog fragments from the host.
- **Artifact hosting stays on Forgejo Releases** (not migrated to Forgejo Packages as the plan doc originally suggested). That migration can happen independently.
- **`date-modified` frontmatter** preserved even though `build_changelog` installs git — the git there is only for towncrier's `git add` call, not for history. The local iteration story (`dagger call build-docs --src=. --version=dev` with uncommitted changes) depends on frontmatter dates.

### Local iteration

```bash
dagger call build-docs --src=. --version=dev export --path=./docs-dev.tar.gz
tar tf docs-dev.tar.gz | head -20
```

## Deployment and Testing

- [x] `dagger call build-docs --src=. --version=dev` produces valid 1.1MB tarball (149 HTML pages)
- [x] Pre-commit hooks pass (including new `docs-check-frontmatter`)
- [ ] Full `workflow_dispatch` run after merge

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/157
2026-02-11 16:33:16 -08:00
738b39a321 Mark Phase 1 verification checklist items as complete
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 15:49:49 -08:00
1bc2b421a8 Adopt Dagger CI for container builds (Phase 1) (#156)
All checks were successful
Build Container / build (push) Successful in 13s
## Summary

- Add Dagger Python module (`.dagger/`) with `build` and `publish` functions for container images
- Replace Docker buildx + skopeo composite action with `dagger call publish` in `build-container.yaml`
- BuildKit's native push is compatible with Zot — **skopeo workaround eliminated**
- Add Dagger CLI (v0.19.11) to forgejo-runner Dockerfile, bump runner to v2.6.0
- Bootstrap step in workflow curl-installs dagger if not in runner (for first build on v2.5.1 runner)
- Delete old `.forgejo/actions/build-push-image/` composite action
- Add GPLv3 LICENSE

## Verified locally

- `dagger call build --src=. --container-name=nettest` — builds ✓
- `dagger call publish --src=. --container-name=nettest --version=dagger-test` — pushed to Zot ✓
- `dagger call build --src=. --container-name=forgejo-runner` — new runner image builds ✓
- Dagger CLI accessible inside built runner image ✓

## Deployment sequence (after merge)

1. `mise run container-tag-and-release forgejo-runner v2.6.0` — old runner bootstraps dagger via curl, builds new runner
2. `argocd app sync forgejo-runner` — runner restarts with v2.6.0 (dagger baked in)
3. `mise run container-tag-and-release nettest v0.13.0` — end-to-end test of new pipeline
4. `mise run container-list` — verify tags

## Not included (future phases)

- Phase 2: docs build + Forgejo packages migration
- Phase 3: runner simplification (remove skopeo, Node.js, etc.)
- Phase 4: future workflows

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/156
2026-02-11 15:38:31 -08:00
faebc98c3c Fix blumeops-tasks for Todoist API v1 migration (#155)
## Summary
- Migrate from deprecated Todoist REST API v2 (`410 Gone`) to new unified API v1
- Add cursor-based pagination for project and task listing endpoints
- Switch 1Password credential retrieval from `op item get --fields` to `op read`

## Testing
- [x] `mise run blumeops-tasks` returns all 9 tasks successfully
- [x] Pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/155
2026-02-11 14:33:37 -08:00
Forgejo Actions
362ae22ab7 Update docs release to v1.6.1
- Built changelog from towncrier fragments

[skip ci]
2026-02-11 21:37:34 +00:00
3c4b5b6c10 Add changelog fragment for cache purge BusyBox fix
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 13:36:44 -08:00
Forgejo Actions
eca01a9546 Update docs release to v1.6.0
- Built changelog from towncrier fragments

[skip ci]
2026-02-11 21:33:57 +00:00
0efcce2984 Purge Fly.io proxy cache after docs release (#154)
## Summary
- The Fly.io nginx proxy caches docs responses for 24h (`proxy_cache_valid 200 1d`)
- After a release, docs.eblu.me kept serving stale content until the cache expired
- This caused v1.5.4 to show v1.5.3 on the CHANGELOG page
- Adds `flyctl` install and `fly ssh console` cache purge steps to the build workflow, running after the ArgoCD deploy completes

## Test plan
- [ ] Next release should show the correct version on docs.eblu.me/CHANGELOG immediately
- [ ] Verify the `fly ssh console` command succeeds in the workflow logs

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/154
2026-02-11 13:33:26 -08:00
Forgejo Actions
ab6661f5dd Update docs release to v1.5.4
- Built changelog from towncrier fragments

[skip ci]
2026-02-11 20:17:12 +00:00
a59ff04249 Review security-model.md (#153)
## Summary
- Fix Ansible secret example: replaced incorrect `op item get --fields` with `op read` to match project convention
- Add new "Tailscale Operator Privileges" section documenting the operator's namespaced RBAC and OAuth client permissions
- Stamp `last-reviewed: 2026-02-11`

## Review Notes
First review of this doc (previously never reviewed). Verified:
- All wiki-links resolve
- ACL structure matches actual `pulumi/tailscale/policy.hujson`
- TruffleHog pre-commit config exists as documented
- Ansible `op read` pattern matches actual usage in playbooks/roles

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/153
2026-02-11 12:16:32 -08:00
834c9fa57b Bump Fly.io proxy VM to 512MB, fix TruffleHog scanning (#152)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m37s
## Summary
- Bump Fly.io proxy VM memory from 256MB to 512MB — Alloy was OOM-killed, causing the Grafana Fly.io dashboard to lose metrics
- Fix TruffleHog pre-commit hook to scan only staged changes (`--since-commit HEAD`) instead of full repo history
- Sanitize example credential URL in Reolink camera plan doc

## Deployment and Testing
- [ ] Fly.io deploy triggers automatically on merge (workflow watches `fly/**`)
- [ ] After deploy, verify Alloy is running: `fly ssh console -a blumeops-proxy -C "ps aux"` should show alloy process
- [ ] Grafana Fly.io dashboard should start populating within ~1 minute

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/152
2026-02-11 12:03:51 -08:00
651fed8f1a Transcribe backlog tasks into plan documents (#151)
## Summary
- **adopt-oidc-provider:** Dex-based OIDC identity provider for SSO across services (status: Planning — service dependency/recovery design needed)
- **harden-zot-registry:** OIDC + API key auth and tag immutability for zot (depends on OIDC provider + Dagger CI)
- **forgejo-actions-dashboard:** Custom textfile Prometheus exporter + Grafana dashboard for Forgejo Actions CI metrics
- **operationalize-reolink-camera:** Cloud-free Frigate NVR with ONNX detection, NFS ring buffer recording to sifaka (depends on network segmentation)
- **add-unifi-pulumi-stack:** Expanded with NFS security motivation, BlumeOps Services subnet, IoT/appliance segregation, firewall rules

## Test plan
- [x] Pre-commit hooks pass (all 3 commits)
- [x] `docs-check-links` passes
- [x] `docs-check-index` passes
- [x] `docs-check-filenames` passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/151
2026-02-11 11:47:23 -08:00