## Summary
- Add how-to guide (`docs/how-to/build-container-image.md`) covering the full container build workflow: directory layout, Dagger local builds, mise release task, and common patterns with links to existing containers
- Port navidrome from upstream `deluan/navidrome:0.60.3` to a custom three-stage build (`containers/navidrome/Dockerfile`) using Node + Go + Alpine
- Update navidrome deployment to use `registry.ops.eblu.me/blumeops/navidrome:v1.0.0`
## Deployment and Testing
- [x] `dagger call build --src=. --container-name=navidrome` builds successfully
- [ ] After merge: `mise run container-tag-and-release navidrome v1.0.0`
- [ ] After image published: `argocd app sync navidrome` and verify pod starts
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/192
## Summary
Review session covering 3 docs, plus a codebase-wide cleanup:
### Docs reviewed
- **connect-to-postgres** — verified end-to-end (psql connection tested), stamped
- **create-release-artifact-workflow** — clarified that `build-blumeops.yaml` is only a version bump example (not a packages API example)
- **deploy-k8s-service** — fixed stale repoURL (`indri:2200` → `forge.ops.eblu.me:2222`), wrong Caddy config keys (`upstream` → `backend`, added missing `host`), updated Homepage group to "Services", added Tailscale tag documentation
### Codebase cleanup
- Migrated all remaining `op item get --fields` calls to `op read` URI syntax across 7 files (docs, READMEs, YAML comments)
- Simplified the `op read` vs `op item get` guidance in CLAUDE.md
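For reviewers unfamiliar with the URI form, here is a minimal sketch of how an `op read` invocation is assembled from a secret reference (`op://<vault>/<item>/<field>`). The helper name and the vault, item, and field values are hypothetical placeholders, not entries from this repo.

```python
import shlex


def op_read_command(vault: str, item: str, field: str) -> list[str]:
    """Build an `op read` argument list for a single secret field.

    The vault/item/field values passed in below are hypothetical placeholders.
    """
    for part in (vault, item, field):
        # A slash inside a component would change the meaning of the reference.
        if not part or "/" in part:
            raise ValueError(f"invalid secret reference component: {part!r}")
    return ["op", "read", f"op://{vault}/{item}/{field}"]


cmd = op_read_command("example-vault", "example-item", "password")
print(shlex.join(cmd))  # op read op://example-vault/example-item/password
```

Building an argument list rather than a shell string keeps the reference intact when it is passed to `subprocess.run`.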
## Side findings (not addressed)
- New `immich-pg` CNPG cluster not yet documented in the postgresql reference card
## Test plan
- [x] `psql` connection to `pg.ops.eblu.me` verified
- [x] All pre-commit hooks pass
- [x] `docs-check-links`, `docs-check-index`, `docs-check-frontmatter` pass
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/191
Adds docs/index.md, explanation/explanation.md, how-to/plans/plans.md,
and how-to/plans/completed/completed.md so AI sessions get full
awareness of all doc sections and in-flight plans.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reflect actual UX7 zone-based firewall UI, correct streaming port
(8096 not 443), note indri DHCP reservation, mark plan as completed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Abandon the UniFi Pulumi IaC approach after provider bugs caused a network outage (no-op update reset undeclared properties on the default LAN network)
- Remove untracked IaC artifacts (`pulumi/unifi/`, `mise-tasks/unifi-preview`, `mise-tasks/unifi-up`) locally
- Mark `add-unifi-pulumi-stack` plan as Abandoned with explanation
- Create new `segment-home-network` plan for manual three-network segmentation (Main/IoT/Guest) via UX7 web UI
- Rewrite UniFi reference card to remove all Pulumi/IaC references
- Update plan and how-to indexes
## Test plan
- [x] `docs-check-links` passes
- [x] `docs-check-index` passes
- [x] Pre-commit hooks pass
- [ ] Review segmentation plan for completeness before executing manually
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/189
## Summary
- Add new how-to guide (`connect-to-postgres.md`) with the `psql` command using `op read` for 1Password credentials
- Add "Database" section to the how-to index linking to the new guide
- Link the new guide from the PostgreSQL reference card's Related section
## Test plan
- [x] Verified `psql` connection works from gilbert using the documented command
- [ ] Review doc formatting and content
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/188
Chart 2.3.0 mounts credentials as a file with standard k8s base64
encoding. The old double-encoding workaround (credentials-base64 in
stringData) now produces invalid JSON. Use raw JSON (credentials-file)
instead.
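A rough illustration of why the old workaround breaks once the credentials are mounted as a file. It assumes the usual Kubernetes behavior: `stringData` values are base64-encoded once into `data`, and a mounted file contains the value decoded once. The payload and helper names are hypothetical.

```python
import base64
import json

raw_json = json.dumps({"token": "example"})  # hypothetical credentials payload


def store_string_data(value: str) -> str:
    """Model a stringData value being base64-encoded once into `data`."""
    return base64.b64encode(value.encode()).decode()


def mounted_file_contents(data_value: str) -> str:
    """Model a Secret value being decoded once when mounted as a file."""
    return base64.b64decode(data_value).decode()


def is_json(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# New approach: raw JSON in stringData.
new_file = mounted_file_contents(store_string_data(raw_json))

# Old workaround: an already base64-encoded string in stringData.
pre_encoded = base64.b64encode(raw_json.encode()).decode()
old_file = mounted_file_contents(store_string_data(pre_encoded))

print(is_json(new_file))  # True
print(is_json(old_file))  # False: the file still holds base64 text
```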
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Replace `op item get --fields` with `op read` for secrets (matches playbook and CLAUDE.md guidance)
- Change `tags: [<role>]` to `tags: <role>` to match actual playbook style
- Remove redundant `listen:` from handler example, add `changed_when: true`
- Name handler after specific service (e.g. `Restart <service>`) to match real roles
- Add `last-reviewed: 2026-02-13` frontmatter
## Also noted (not fixed here)
Two other docs still use the old `op item get` pattern:
- `docs/how-to/troubleshooting.md:72` (ArgoCD login command)
- `docs/how-to/gandi-operations.md:35` (Gandi token export)
These can be fixed in their own review cycles.
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/185
## Summary
- Add `daemon.json` with `registry-mirrors` to the forgejo-runner ConfigMap, pointing DinD at `http://host.minikube.internal:5050`
- Mount `daemon.json` into the DinD sidecar at `/etc/docker/daemon.json` via `subPath`
- Docker Hub pulls during Dagger CI builds will now route through Zot's pull-through cache, reducing bandwidth and avoiding rate limits
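As a sketch of the file contents, the snippet below renders a `daemon.json` with the single `registry-mirrors` key and the mirror URL from this PR. The helper name is hypothetical, and the actual ConfigMap may carry additional keys.

```python
import json


def render_daemon_json(mirrors: list[str]) -> str:
    """Render a Docker daemon.json that only sets registry mirrors."""
    return json.dumps({"registry-mirrors": list(mirrors)}, indent=2) + "\n"


# Mirror URL taken from the PR description above.
print(render_daemon_json(["http://host.minikube.internal:5050"]))
```

Docker consults `registry-mirrors` only for Docker Hub images, which matches the scope described above.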
## Deployment and Testing
- [ ] `argocd app sync forgejo-runner`
- [ ] Exec into DinD container: `docker info` should show the registry mirror
- [ ] Trigger a workflow build and check Zot logs for cache hits
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/183
The v3.2.0 build failed (GitHub download timeout). Roll back to the
working image while it rebuilds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Move FORGEJO_URL, RUNNER_NAME, and RUNNER_LABELS from ExternalSecret template to deployment env vars
- ExternalSecret now only contains the actual secret (RUNNER_TOKEN)
- Image version changes in RUNNER_LABELS now trigger automatic pod rollouts
## Deployment
1. Merge this PR
2. `argocd app sync forgejo-runner` — the deployment spec change will auto-roll the pod
No manual restart needed — that's the whole point :)
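A simplified model of why this works: a Deployment rolls its pods when the pod template changes, and an inline env value is part of that template while the contents of a referenced Secret are not. The fingerprint below is an illustration only, not the hash Kubernetes actually computes, and the label values and Secret name are hypothetical.

```python
import hashlib
import json


def pod_template(runner_labels: str) -> dict:
    """Build a minimal pod template with RUNNER_LABELS inline and the token by reference."""
    return {
        "containers": [
            {
                "name": "runner",
                "env": [{"name": "RUNNER_LABELS", "value": runner_labels}],
                # The Secret is referenced by name only; its contents are not in the template.
                "envFrom": [{"secretRef": {"name": "example-runner-secret"}}],
            }
        ]
    }


def template_fingerprint(template: dict) -> str:
    """Illustrative stand-in for the pod-template hash."""
    encoded = json.dumps(template, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()[:10]


before = template_fingerprint(pod_template("example:docker://runner:v1"))
after = template_fingerprint(pod_template("example:docker://runner:v2"))
print(before != after)  # True: a label change alters the template
```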
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/181
## Summary
- Install yq in the forgejo-runner container image for structured YAML editing
- Replace fragile `sed` regex patterns with `yq` in `build-blumeops.yaml` and `cv-deploy.yaml` workflows
## Deployment
1. Merge this PR
2. Tag and release forgejo-runner v3.1.0: `mise run container-tag-and-release forgejo-runner v3.1.0`
3. Update runner label in `argocd/manifests/forgejo-runner/external-secret.yaml` from `v3.0.2` to `v3.1.0`
4. Sync the forgejo-runner app: `argocd app sync forgejo-runner`
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/180
Replace old Apps/Observability/Infrastructure layout entries with
Content and Misc to match the recategorized ingress annotations.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ArgoCD's tailscale ingress was missed in the recategorization (filed as
service-tailscale.yaml instead of ingress-tailscale.yaml). Fix the group
annotation and rename the file to match the convention used by all other
services.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Replace the three homepage groups (Apps, Observability, Infrastructure) with two cleaner groups
- **Content**: Immich, Kiwix, Miniflux, DJ, Grafana
- **Misc**: CV, TeslaMate, Transmission, Docs, Prometheus, PyPI
## Deployment and Testing
- [ ] Sync affected ingresses via ArgoCD (all 11 services)
- [ ] Verify homepage shows the two new groups correctly
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/179
## Summary
- Remove `match_all = true` from `flyio_nginx_cache_requests_total` in Alloy so the metric only counts requests that go through the proxy cache (excludes health checks with empty `cache_status`)
- Change dashboard queries from `rate(...[5m])` to `increase(...[$__range])` — aggregates over the full dashboard time window instead of a 5-minute sliding window, giving meaningful ratios for low-traffic static sites
- Add null/NaN value mapping to show "No traffic" in neutral color instead of blank/red
## Root cause
Health check requests from Fly.io hit the default nginx server block (no `proxy_cache`), producing entries with empty `upstream_cache_status`. With `match_all = true`, these were counted in the cache metric, diluting the Fly.io dashboard ratio. For APM dashboards, `rate(...[5m])` on low-traffic sites with 24h cache validity almost always returns either all HITs (100%) or no data (blank → red background).
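The dilution can be sketched with a toy calculation. The request sample below is made up, and the function only models the counting rule (include or exclude entries with an empty cache status); it is not the Alloy or Grafana logic itself.

```python
from typing import Optional


def cache_hit_ratio(cache_statuses: list[str], match_all: bool) -> Optional[float]:
    """Return HIT / counted requests, or None when nothing was counted.

    With match_all=True every log entry is counted, including entries
    whose cache status is empty (health checks in this scenario).
    """
    counted = [s for s in cache_statuses if match_all or s]
    if not counted:
        return None  # corresponds to the "No traffic" mapping
    hits = sum(1 for s in counted if s == "HIT")
    return hits / len(counted)


# Hypothetical window: 2 hits, 1 miss, 4 health checks with empty status.
sample = ["HIT", "HIT", "MISS", "", "", "", ""]

print(cache_hit_ratio(sample, match_all=True))   # 2/7, diluted by health checks
print(cache_hit_ratio(sample, match_all=False))  # 2/3, proxied requests only
print(cache_hit_ratio(["", ""], match_all=False))  # None
```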
## Deployment
- Fly.io proxy redeploy needed for Alloy config change
- ArgoCD sync for dashboard ConfigMap changes
## Test plan
- [ ] Redeploy Fly.io proxy
- [ ] Sync grafana-config in ArgoCD
- [ ] Verify CV APM cache hit ratio shows a real percentage (not 100%)
- [ ] Verify Docs APM shows "No traffic" in neutral color when idle, real ratio when visited
- [ ] Verify Fly.io proxy dashboard cache ratio excludes health checks
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/177
## Summary
- How-to guide for creating release artifact workflows with Forgejo packages
- Changelog fragment for the multi-repo forgejo_actions_secrets Ansible role change
- Changelog fragment for the new docs
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/170
Uses a `subelements` loop to sync secrets across repos. Adds FORGE_TOKEN
to the cv repo for package uploads.
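For readers who have not used `subelements`, this is roughly the iteration it performs: one (parent, child) pair per nested entry. The variable layout and the `EXAMPLE_SECRET` name are assumptions for illustration; the role's real variables may differ.

```python
from typing import Iterator


def subelements(items: list[dict], key: str) -> Iterator[tuple[dict, str]]:
    """Yield (item, subitem) pairs, similar in spirit to Ansible's subelements."""
    for item in items:
        for sub in item[key]:
            yield item, sub


# Hypothetical variable layout: one entry per repo, each with its own secret names.
repos = [
    {"name": "blumeops", "secrets": ["EXAMPLE_SECRET"]},
    {"name": "cv", "secrets": ["FORGE_TOKEN"]},
]

pairs = [(repo["name"], secret) for repo, secret in subelements(repos, "secrets")]
print(pairs)  # [('blumeops', 'EXAMPLE_SECRET'), ('cv', 'FORGE_TOKEN')]
```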
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- nginx container (`containers/cv/`) downloads and serves a content tarball at startup (same pattern as quartz)
- ArgoCD app + k8s manifests (deployment, service, Tailscale ingress)
- Caddy route for `cv.ops.eblu.me`
- Deploy workflow: resolves "latest" or specific version from Forgejo packages, updates deployment, syncs ArgoCD
- Content is built and released from the separate [cv repo](https://forge.ops.eblu.me/eblume/cv)
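The "latest or specific version" step could look something like the sketch below. It assumes tags of the form `vMAJOR.MINOR.PATCH` and takes the list of published versions as input; the function names are hypothetical, and how the workflow actually queries Forgejo packages is not shown.

```python
def parse_version(tag: str) -> tuple[int, ...]:
    """Turn 'v1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in tag.removeprefix("v").split("."))


def resolve_version(requested: str, available: list[str]) -> str:
    """Resolve 'latest' to the highest published version, or validate a specific one."""
    if not available:
        raise ValueError("no published versions")
    if requested == "latest":
        return max(available, key=parse_version)
    if requested not in available:
        raise ValueError(f"version {requested} is not published")
    return requested


published = ["v0.1.0", "v0.10.0", "v0.2.0"]  # hypothetical package versions
print(resolve_version("latest", published))  # v0.10.0
print(resolve_version("v0.1.0", published))  # v0.1.0
```

Comparing parsed tuples rather than strings avoids ranking `v0.2.0` above `v0.10.0`.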
## Deployment steps (after merge)
1. `mise run container-tag-and-release cv v1.0.0`
2. Run "Release CV" workflow in cv repo (SPECIFIC_VERSION `v0.1.0`)
3. Run "Deploy CV" workflow in blumeops (default: latest)
4. `mise run provision-indri -- --tags caddy`
5. Verify at `https://cv.ops.eblu.me/`
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/169
## Summary
- The `build_changelog` Dagger container (`python:3.12-slim`) defaults to UTC, causing towncrier to stamp tomorrow's date when releases are cut in the evening PST.
- This is the root cause of the docs website (built via Dagger) showing Feb 12 while the repo CHANGELOG (built directly on the runner) correctly showed Feb 11.
- Fix: set `TZ=America/Los_Angeles` in the Dagger container before running towncrier.
## Verified
- `docker run --rm python:3.12-slim` → `towncrier _get_date()` returns `2026-02-12` (wrong)
- `docker run --rm -e TZ=America/Los_Angeles python:3.12-slim` → returns `2026-02-11` (correct)
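The same off-by-one can be reproduced without Docker. This sketch uses a fixed UTC-8 offset as a stand-in for Pacific time in February (an assumption that avoids depending on tzdata) and a hypothetical release moment of 8 PM on Feb 11.

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Fixed-offset stand-in for America/Los_Angeles in winter (assumption: no DST).
PST = timezone(timedelta(hours=-8), "PST")


def stamp_date(moment: datetime, tz: tzinfo) -> str:
    """Return the calendar date a changelog would be stamped with in the given zone."""
    return moment.astimezone(tz).date().isoformat()


# Hypothetical release cut at 8 PM PST on Feb 11, which is 04:00 UTC on Feb 12.
release_moment = datetime(2026, 2, 12, 4, 0, tzinfo=timezone.utc)

print(stamp_date(release_moment, timezone.utc))  # 2026-02-12
print(stamp_date(release_moment, PST))           # 2026-02-11
```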
## Test plan
- [ ] Merge, then trigger a build-blumeops release
- [ ] Verify the CHANGELOG date on https://docs.eblu.me/CHANGELOG matches the repo
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/167
## Summary
- Updated the "Configure Ingress" section to use the current ProxyGroup pattern (`proxy-group: "ingress"`, `defaultBackend`, `tls.hosts`)
- Replaced the old per-ingress proxy example, which used `rules:` entries with `host:` (a pattern that breaks ProxyGroup routing)
- Added key points explaining why `defaultBackend` is required and what each annotation does
- Updated checklist to mention ProxyGroup
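For orientation, the shape of a ProxyGroup-style Ingress can be sketched as below. The full annotation key (`tailscale.com/proxy-group`) and `ingressClassName: tailscale` are my assumptions about the Tailscale operator, and the ingress name, service name, port, and host are hypothetical; check the existing manifests for the authoritative form.

```python
import json


def proxygroup_ingress(name: str, service: str, port: int, host: str) -> dict:
    """Build an Ingress manifest that routes via defaultBackend instead of rules."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            # Assumed full annotation key; the doc describes it as proxy-group: "ingress".
            "annotations": {"tailscale.com/proxy-group": "ingress"},
        },
        "spec": {
            "ingressClassName": "tailscale",  # assumption
            "defaultBackend": {
                "service": {"name": service, "port": {"number": port}}
            },
            # The hostname is declared under tls.hosts; there is no rules: block.
            "tls": [{"hosts": [host]}],
        },
    }


manifest = proxygroup_ingress("example-tailscale", "example", 8080, "example")
print(json.dumps(manifest, indent=2))
```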
## Test plan
- [ ] Review rendered doc for accuracy against existing ingress manifests
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/166