Removed descriptions, table formatting, and Mikado chain commentary
from the how-to index — it should be links only. Added last-reviewed
date.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Built locally to break the chicken-and-egg problem: the old runner couldn't
build its own replacement because the build needed Dagger 0.20.0.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Dagger module was upgraded to v0.20.0 in d15071a but the runner job
image still had the old CLI, causing build-blumeops to fail with a
version mismatch.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The external-secrets operator adds conversionStrategy, decodingStrategy,
and metadataPolicy defaults to the live object, causing perpetual
OutOfSync in ArgoCD. Declare them explicitly to match.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Add Authentik OAuth2 provider + application blueprint for ArgoCD (ringtail side)
- Add OIDC config to ArgoCD ConfigMap with Authentik as identity provider (indri side)
- Map Authentik `admins` group to ArgoCD `role:admin` via RBAC policy
- ExternalSecrets on both sides pull `argocd-client-secret` from 1Password
- Local admin password remains as break-glass — both login methods coexist
## Pre-deployment manual step
Add `argocd-client-secret` field to "Authentik (blumeops)" in 1Password with a random value (e.g., `openssl rand -hex 32`).
## Deployment order
1. Sync Authentik app on ringtail first (blueprint + secret + worker env var)
2. Sync ArgoCD app on indri second (cm, rbac, ExternalSecret)
## Verification
- [ ] `argocd-client-secret` field added to 1Password
- [ ] Authentik app synced on ringtail — blueprint applied, provider created
- [ ] ArgoCD app synced on indri — OIDC config applied
- [ ] SSO login works: visit `https://argocd.ops.eblu.me` → "Log in via Authentik" → admin access
- [ ] Break-glass: local admin/password login still works
Reviewed-on: #284
Minor upstream release with doc and CI fixes. Also corrects kiwix.md
to reference the actual custom registry image and torrents.txt path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Dashboard "Download/Upload Rate by Torrent" panels were querying
transmission_torrent_download_bytes (total_size * percent_done) and
transmission_torrent_upload_bytes (uploaded_ever) — cumulative byte
gauges, not rates. Added new metrics using Transmission's native
rate_download/rate_upload and updated dashboard queries.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The count-only stat wasn't actionable. New table shows pod name, container,
restart count, and memory limit for each OOMKilled container. Waiting reason
panel narrowed to make room.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
OOMKilled containers previously only appeared briefly in "Unhealthy Pods"
while dying, then vanished on restart. New panels use persistent metrics
(last_terminated_reason) and restart rate tracking.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
During finalization, all mikado frontmatter (requires, status, branch) should
be removed — cards become plain documentation linked via wiki-links. Updated
agent-change-process docs and cleaned up 10 cards from closed chains. Also
fixed ai-docs referencing deleted plans/ files.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The plans/ directory predated the mikado method approach. Deleted all
completed and abandoned plans, converted the still-relevant
migrate-forgejo-from-brew into a lean mikado chain root card under
how-to/forgejo/, cleaned up dangling wiki-links across docs, and
fixed a stale "pre-commit" reference to "prek".
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Upgrade Transmission from 4.0.6-r4 to 4.1.1-r1
- Use Alpine edge community repo for transmission packages while keeping the stable alpine:3.22 base
- Fix stale image reference in service doc (was linuxserver, now custom registry image)
- Mark transmission as reviewed in service-versions.yaml
## Context
Service review found Transmission two releases behind (4.0.6 → 4.1.1). Alpine 3.22 only packages 4.0.6, so transmission is installed from edge's community repo with an exact version pin.
4.1.0 added improved µTP performance, IPv6/dual-stack UDP tracker, JSON-RPC 2.0 API. 4.1.1 is a bugfix release (20+ fixes).
Dagger test build passed locally.
## Deployment and Testing
- [ ] Build container via Forgejo workflow (`mise run container-build-and-release transmission`)
- [ ] Update kustomization.yaml with new image tag
- [ ] `argocd app set torrent --revision feature/transmission-review && argocd app sync torrent`
- [ ] Verify web UI at https://torrent.ops.eblu.me
- [ ] Check Grafana Transmission dashboard still receives metrics
- [ ] After merge: `argocd app set torrent --revision main && argocd app sync torrent`
## Note
The transmission-exporter sidecar (OOMKilled roughly every 30 minutes, 294 restarts) is tracked separately as a future replacement project.
Reviewed-on: #282
C0 changes have no branch name, so `main.<type>.md` fragments collide.
Switch to towncrier's `+<slug>.<type>.md` orphan convention and rename
existing `main.*` fragments.
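
A minimal sketch of the rename, assuming fragments live in a flat directory
and that each one gets a hand-picked slug. The function name `orphan_name`
and the example slug are hypothetical, not part of the repository.

```shell
# Hypothetical helper: map an old "main.<type>.md" fragment name to the
# "+<slug>.<type>.md" orphan form. It only prints the new name.
orphan_name() {
  local old="$1" slug="$2"
  local base="${old##*/}"      # drop any leading directory
  local rest="${base#main.}"   # "<type>.md" if the name started with "main."
  if [ "$rest" = "$base" ]; then
    # Not a main.* fragment; leave it alone.
    return 1
  fi
  printf '+%s.%s\n' "$slug" "$rest"
}
```

For example, `orphan_name changelog.d/main.bugfix.md fix-mirror-urls` prints
`+fix-mirror-urls.bugfix.md`, which can then be passed to `git mv`.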
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The database was at /config/frigate.db (emptyDir, ephemeral) instead of
/db/frigate.db (PVC, persistent). Every pod restart wiped the database,
losing all recording history and leaving orphaned files on NFS.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ONNX detector + CUDA ffmpeg + workers consume ~1.9Gi at steady state,
causing intermittent OOMKills at the 2Gi limit.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Add two-stage Dockerfile for Loki (Go build → Alpine runtime) in `containers/loki/`
- Rewrite kustomize image to `registry.ops.eblu.me/blumeops/loki`
- Tag is `v3.6.5-placeholder` until first CI build; will be updated post-build
## Details
- UID 10001 matches existing StatefulSet `securityContext` (runAsUser/fsGroup)
- CGO_ENABLED=0, ldflags embed version via `github.com/grafana/loki/v3/pkg/util/build`
- Clones from `forge.ops.eblu.me/mirrors/loki` (mirror created this session)
- Pattern follows miniflux (two-stage Go) + prometheus (ldflags)
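
The ldflags bullet above can be sketched as follows. This is an illustration, not the Dockerfile's actual contents: the helper name `loki_ldflags` is hypothetical, and only the `Version` variable is set here, assuming the build package exposes it under that name.

```shell
# Hypothetical helper: compose the -X linker flag that embeds a version
# string into the Loki build package named in the bullet above.
loki_ldflags() {
  local version="$1"
  local pkg="github.com/grafana/loki/v3/pkg/util/build"
  printf '%s\n' "-X ${pkg}.Version=${version}"
}

# Usage, run inside a checkout of the Loki source (not executed here):
#   CGO_ENABLED=0 go build -ldflags "$(loki_ldflags v3.6.5)" -o loki ./cmd/loki
```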
## Deployment and Testing
- [ ] Trigger container build: `mise run container-build-and-release loki`
- [ ] Update kustomize tag to actual build tag
- [ ] Deploy from branch: `argocd app set loki --revision feature/loki-container && argocd app sync loki`
- [ ] Verify `/ready` endpoint and log ingestion
- [ ] After merge: update to `[main]` tag (C0 follow-up)
Reviewed-on: #280
The bash parameter expansion `${var/pat/rep}` treats `\/` in the
replacement as a literal backslash-slash, not an escaped delimiter.
This produced URLs like `https:\/\/eblume:...` instead of
`https://eblume:...`, breaking Forgejo's URL parser (500 on mirror
settings pages) and preventing mirror syncs.
Use prefix stripping (`${var#prefix}`) instead.
All 22 corrupted mirrors have been repaired on indri.
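
A minimal sketch of the prefix-stripping approach, assuming the goal is to
insert credentials after the `https://` scheme. The function name and
arguments are hypothetical and not taken from the actual script.

```shell
# Hypothetical helper: print the URL with "user:token@" inserted after the
# https:// scheme, using prefix stripping instead of ${var/pat/rep}.
add_credentials() {
  local url="$1" user="$2" token="$3"
  local rest="${url#https://}"   # no slash escaping needed in a prefix pattern
  if [ "$rest" = "$url" ]; then
    # The scheme did not match; print the URL unchanged and signal failure.
    printf '%s\n' "$url"
    return 1
  fi
  printf 'https://%s:%s@%s\n' "$user" "$token" "$rest"
}
```

Because the slashes are never written inside a `/`-delimited expansion, no
backslashes are needed and none can leak into the resulting URL.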
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The NixOS firewall was blocking pod-to-host TCP traffic because only
tailscale0 was trusted. Pods could ping the host but not reach the
API server (port 6443), breaking Tailscale Ingress TLS cert refresh
and all ringtail services (authentik, frigate, ntfy, ollama).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
A misfiled fragment from a feature/ branch created a subdirectory under
changelog.d/, which towncrier doesn't support. Move the fragment to the
correct flat location and add a changelog-check mise task + prek hook
to prevent this from happening again.
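
A rough sketch of what such a check could look like, assuming it only needs
to reject subdirectories. The function name `check_flat` is hypothetical and
the real changelog-check task may do more or work differently.

```shell
# Hypothetical check: fail if the fragment directory contains any
# subdirectory. Relies on find's -mindepth (available in GNU and BSD find).
check_flat() {
  local dir="${1:-changelog.d}"
  local nested
  nested="$(find "$dir" -mindepth 1 -type d)"
  if [ -n "$nested" ]; then
    printf 'nested directories are not allowed in %s:\n%s\n' "$dir" "$nested" >&2
    return 1
  fi
}
```

Run as `check_flat changelog.d`, it exits non-zero and lists the offending
directories on stderr, which suits a pre-commit style hook.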
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Was the only app still using https://forge.eblu.me (public proxy) for
git polling. All other apps already use the internal SSH endpoint at
forge.ops.eblu.me.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The k8s and ringtail runners were hitting forge.eblu.me (fly.io proxy)
for every FetchTask poll (~every 2s), round-tripping through the public
internet unnecessarily. Use forge.ops.eblu.me (Caddy on indri, tailnet)
for infrastructure workloads.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The doc listed a nonexistent configmap.yaml instead of the actual raw
config files (grafana.ini, datasources.yaml, provider.yaml) consumed
by kustomization.yaml's configMapGenerator. Added last-reviewed date.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>