-
BlumeOps v1.17.0 Stable
released this
2026-06-03 21:52:18 -07:00 | 13 commits to main since this release
BlumeOps release v1.17.0
What's Changed
Features
-
Deploy the Adelaide / Heidi / Addie baby shower app (guest splash, raffle
picker, and prize assignment console) on ringtail k3s with shower.eblu.me
as the public entry and shower.ops.eblu.me as the tailnet admin host. App
source: adelaide-baby-shower-app.
-
Deploy adelaide-baby-shower-app v1.1.0 to ringtail k3s. Replaces the
boolean lock with a four-phase ShowerState (pre_event → party →
prizes_locked → event_locked), adds an append-only "guest memories"
panel where guests can leave photos and comments for the baby, and
polishes the admin and QR views. Three Django migrations
(0009_shower_phase, 0010_guest_memories, 0011_book_description)
run automatically in the entrypoint against the SQLite PV. No config
or env-var changes.

Container build also gains a Forgejo-PyPI workaround: Forgejo's simple
index returns absolute file URLs hardcoded to the public ROOT_URL
(forge.eblu.me), which the Fly edge 403s on /api/packages/*. The
wheel and sdist are now both pulled via direct fetchurl against
forge.ops.eblu.me (tailnet-only) and the wheel is handed to pip as
a local path.
-
review-compliance-reports now also fetches and summarizes the weekly Prowler container-image and IaC scans (previously only the K8s CIS in-cluster scan was processed). For each scan it shows status counts, severity breakdown, week-over-week delta, and, for the high-volume image/IaC scans, top-N tables grouped by check ID and resource instead of per-finding listings.
-
runner-logs now authenticates with Forgejo API token and auto-detects the repo from git remote. Job logs are fetched via SSH to indri (reading Forgejo's on-disk zstd log files) instead of the web endpoint, which doesn't support token auth for private repos.
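As a rough illustration of the four-phase ShowerState named in the v1.1.0 entry above, the sketch below models the phases as a linear state machine. Only the four phase names come from the release note. The forward-only, single-step transition rule and the helper names `next_phase` and `can_advance` are assumptions made for illustration, not the app's actual code.

```python
from enum import Enum
from typing import Optional


class ShowerState(Enum):
    """Phase names from the release note; the ordering logic is assumed."""

    PRE_EVENT = "pre_event"
    PARTY = "party"
    PRIZES_LOCKED = "prizes_locked"
    EVENT_LOCKED = "event_locked"


# Assumed order, following the arrows in the release note.
PHASE_ORDER = list(ShowerState)


def next_phase(current: ShowerState) -> Optional[ShowerState]:
    """Return the phase after `current`, or None at the final phase."""
    index = PHASE_ORDER.index(current)
    if index + 1 < len(PHASE_ORDER):
        return PHASE_ORDER[index + 1]
    return None


def can_advance(current: ShowerState, target: ShowerState) -> bool:
    """Hypothetical rule: only a single forward step is allowed."""
    return next_phase(current) is target
```

Compared with a boolean lock, an explicit phase enum makes it possible to distinguish "prizes are frozen" from "the whole event is frozen" without adding more flags.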
Bug Fixes
-
Fix nightly borgmatic backups failing for 2 days. The shower SQLite
dump hook referenced kubectl --context=k3s-ringtail, but indri's
kubeconfig deliberately doesn't carry the ringtail credentials. The
before_backup hook's failure aborted the entire run, taking out
both the local sifaka repo and the BorgBase offsite. Replaced
the inline-shell dump with a ~/bin/borgmatic-k8s-sqlite-dump
helper deployed by the ansible role. Each dump entry now declares a
target of either local:<context> (mealie: kubectl uses indri's
kubeconfig) or ssh:<user@host> (shower: ssh into ringtail and
run k3s kubectl there, no indri-side kubeconfig needed; k3s.yaml
on ringtail is mode 644 so no sudo required). Bytes stream back via
kubectl exec ... -- cat rather than kubectl cp, since kubectl cp
requires tar inside the pod and nix-built images like shower
don't bundle it.
-
Shower app container now bakes the wheel + Python deps into the image
at build time via buildPythonPackage instead of pip-installing on
first boot. Boots are deterministic and don't depend on forge PyPI
being reachable from the pod. The wheelHash in
containers/shower/default.nix is the sha256 sourced from the
forge PyPI simple index;
bumping the version means bumping that hash too.

Borgmatic now covers the shower app: SQLite is dumped from the live
pod via kubectl exec (mirroring the existing mealie entry, with
context: k3s-ringtail), and the prize-photo media share is picked up
through /Volumes/shower (sifaka SMB mount on indri, same pattern as
/Volumes/photos).
-
Disabled adaptive sync (VRR) on ringtail's DP-1 output. The OMEN 27i IPS panel pumps brightness when its refresh rate swings into the low VRR range during low-framerate content (e.g. game cutscenes), producing a flicker that worsened over a session until a reboot. Pinning the panel to a fixed 165Hz eliminates it.
-
Fixed forge.eblu.me static assets (CSS, JS, images, fonts) not loading: the proxy's static asset cache block was missing the
Host header, so Caddy couldn't route the requests.
-
Fixed homepage container EACCES on cold start: the nix-built image now chowns
/app/config to uid 1000 at build time via fakeRootCommands, matching the
behavior of the old Dockerfile. Without this, homepage couldn't seed missing
skeleton configs (proxmox.yaml etc.) or create /app/config/logs, crashing on
its first uncached request. Caught during the ringtail cutover.
-
Fixed sway keybindings on ringtail: the home-manager
keybindings block was replacing the module's defaults entirely, leaving only explicit overrides (no workspace switching, focus, move, splits, resize mode, etc). Switched to lib.mkOptionDefault with lib.mkForce on the conflicting custom binds (Mod+Return, Mod+d, Mod+space, Mod+l) so defaults merge back in. Also added Mod+F1 to show a filterable fuzzel list of current keybindings.

Fixed fuzzel config errors on launch:
border-radius and border-width were under [main], but fuzzel expects them as radius/width under a [border] section.
-
Pin the Quartz docs build to v4.5.2. The Dagger
build_docs pipeline cloned Quartz from the default branch unpinned; Quartz v5.0.0 restructured its config layout (.quartz/plugins, ../quartz imports) and broke the docs build against our existing quartz.config.ts / quartz.layout.ts.
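The local:<context> / ssh:<user@host> target scheme from the borgmatic fix above can be sketched as a small dispatcher that turns a target string into a command line. This is a hedged illustration, not the deployed ~/bin/borgmatic-k8s-sqlite-dump helper: the function name `build_dump_command` and its `namespace`, `pod`, and `db_path` parameters are hypothetical, since the release note elides the arguments to kubectl exec.

```python
import shlex


def build_dump_command(target: str, namespace: str, pod: str, db_path: str) -> list[str]:
    """Build an argv that streams a SQLite file out of a pod via `cat`.

    `target` is either "local:<context>" or "ssh:<user@host>", as in the
    release note. All other parameters are illustrative assumptions.
    """
    kind, _, value = target.partition(":")
    if not value:
        raise ValueError(f"target must look like kind:value, got {target!r}")

    # `kubectl exec ... -- cat` avoids `kubectl cp`, which needs tar in the pod.
    exec_cat = ["exec", "-n", namespace, pod, "--", "cat", db_path]

    if kind == "local":
        # Use the local kubeconfig with an explicit context.
        return ["kubectl", f"--context={value}", *exec_cat]
    if kind == "ssh":
        # Run k3s's bundled kubectl on the remote host; no local kubeconfig needed.
        remote = shlex.join(["k3s", "kubectl", *exec_cat])
        return ["ssh", value, remote]
    raise ValueError(f"unknown target kind: {kind!r}")
```

Returning an argv list rather than a shell string keeps quoting explicit: only the remote half of the ssh form is joined into a single string, because ssh passes it to the remote shell.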
Infrastructure
-
Wire the ringtail blumeops-pg cluster (which holds the wave-1-migrated
paperless + teslamate databases) into backups and Grafana. Adds a Tailscale
LoadBalancer Service (blumeops-pg-ringtail.tail8d86e.ts.net) and a Caddy L4
route (pg.ops.eblu.me:5434), then repoints borgmatic's teslamate +
paperless postgres dumps and the mealie SQLite dump at ringtail, and the
Grafana TeslaMate datasource at the ringtail DB. Closes the backup gap that
opened at cutover (the migrated live data was still being backed up from the
now-frozen minikube copies) and unblocks the wave-1 decommission.
-
Migrated homepage dashboard from minikube (indri/arm64) to k3s (ringtail/amd64).
The container is now built via nix (containers/homepage/default.nix), adapted
from nixpkgs homepage-dashboard with the upstream Next.js cache patches and
wrapped with dockerTools.buildLayeredImage. Autodiscovery shifts: services on
minikube (ArgoCD, Immich, Kiwix, Mealie, Miniflux, Grafana, Prometheus,
Navidrome, Paperless, TeslaMate, Transmission) become explicit static entries
in services.yaml; ringtail services (Authentik, Frigate/NVR, Ntfy, Ollama)
auto-populate via Ingress annotations.
-
Migrated CV (cv.eblu.me) and Docs (docs.eblu.me) from minikube Deployments to indri-native ansible roles. Caddy now serves the extracted release tarballs directly via a new kind: static service-block in the Caddy template (no daemon, no container), replacing the prior nginx-in-a-pod layer. Removes a network hop on every request and shrinks minikube's footprint. See cv-on-indri and docs-on-indri. Part of the broader minikube wind-down.
-
Migrated devpi (PyPI mirror at pypi.ops.eblu.me) from a minikube StatefulSet to a launchd-managed service on indri. devpi-server now runs in a uv-managed venv with pinned devpi-server and devpi-web versions, listens on 127.0.0.1:3141, and is fronted by Caddy. The minikube StatefulSet was crash-looping under memory pressure (and breaking the Python toolchain everywhere); the new layout removes a layer of dependency on cluster health for critical-path tooling. See devpi-on-indri.
-
Move the entire Immich stack (server, machine-learning, valkey,
and the PostgreSQL+VectorChord cluster) off minikube-indri and
onto k3s-ringtail. Postgres data migrated zero-loss via CNPG
pg_basebackup (replica catch-up then promote); row counts on
asset, user, album, smart_search, activity, asset_face
verified equal between source and replica before cutover. The ML
pod now uses ringtail's RTX 4080 via the nvidia-device-plugin
(time-slicing bumped 2 → 4 to share with frigate + ollama). Caddy
routing at photos.ops.eblu.me is unchanged (still
photos.tail8d86e.ts.net, the device just lives on ringtail now).
Borgmatic backups continue against the same immich-pg tailnet
hostname. First concrete chain in the broader indri-k8s
decommission effort.
-
Add local nix container build for tailscale (containers/tailscale/default.nix) so ringtail's tailscale-operator ProxyClass proxy pods pull from the forge mirror instead of docker.io/tailscale/tailscale. Pinned at v1.94.2 to match service-versions.yaml. Indri's tailscale-operator continues to use upstream during the k8s-to-ringtail migration.
-
Address the 6 critical Prowler IaC findings against argocd/manifests/. Prowler's IaC provider hardcodes self._mutelist = None and delegates filtering to Trivy, but doesn't plumb --ignorefile through, so the documented "use Trivy filtering" path is actually broken. Added a shim around trivy in the Prowler image that injects --ignorefile $TRIVY_IGNOREFILE for trivy fs invocations when the env var points at a real file. The IaC cronjob now mounts mutelist/trivyignore.yaml (Trivy's per-path schema) and sets the env var, muting the external-secrets and kube-state-metrics Secret-access findings (KSV-0041, KSV-0114). Separately, grafana-clusterrole is tightened to remove secrets access entirely: the dashboard sidecar already only consumes ConfigMap-labeled dashboards, so its RESOURCE env var is now configmap instead of both.
-
Pin ringtail's wired IP to 192.168.1.21 via NixOS scripted networking; NetworkManager no longer manages enp5s0. Removes DHCP lease renewal as a failure mode after a silent lease teardown took ringtail offline. Also explicitly enables net.ipv4.ip_forward (previously set implicitly by scripted-DHCP) so k3s pod networking and Tailscale routing continue to work with static networking.
-
Ripped out the compensating-controls (CC) framework: deleted compensating-controls.yaml, the review-compensating-controls mise task, and the associated how-to / explanation docs. Prowler and Kingfisher continue to run weekly and produce reports; the Prowler mutelist YAML files remain in place but no longer carry CC: <id> prefixes, and each entry just keeps a free-form Description of why the finding is muted. The CC review cadence proved to be more overhead than this single-operator homelab needed.
-
Wire shower app for public exposure: fly nginx shower.eblu.me server
block as a guest-only surface (splash page, /prizes/<token>/, static
assets, media). Everything authenticated (/admin/, /host/,
/accounts/) returns 403 with a "tailnet only" pointer. Staff hit
shower.ops.eblu.me for the operator console + admin; the app's
v1.0.1 DJANGO_PUBLIC_URL_BASE setting makes QR codes generated on
the tailnet point back at the WAN host for guests. Plus a Caddy route
on indri, Pulumi Gandi CNAME, and a Grafana APM dashboard tracking
request rate, error rate, latency, bandwidth, and access logs.
-
Mirror Valkey 8.1 locally as registry.ops.eblu.me/blumeops/valkey. Replaces direct pulls of docker.io/valkey/valkey:8.1-alpine for paperless and immich sidecars. Built via native Dagger pipeline on Alpine 3.22. Stateless swap, no data migration. Authentik's nix-built Redis remains separate.
-
Add nix-built amd64 valkey for ringtail (containers/valkey/default.nix) so immich-ringtail can stop pulling the upstream multi-arch docker.io/valkey/valkey image. Existing container.py continues to build Alpine arm64 for paperless on indri. Both bump to valkey 8.1.7 (Alpine 3.22 8.1.7-r0 / nixpkgs 8.1.7).
-
Upgrade Grafana Alloy v1.14.0 → v1.16.0 across all four service deployments
(alloy-k8s, alloy-ringtail, alloy-tracing-ringtail on k8s; alloy native on
indri). Pulls in stable database observability (v1.15) and the OTel Collector
v0.147.0 bump. Container build also migrated from Dockerfile to native Dagger
container.py per the build-container-image migration playbook.
-
Upgraded Dagger from v0.20.1 to v0.20.6 (engine, CLI pin, and SDK regen) and migrated
runner-job-image from a Debian-based Dockerfile to a native Dagger container.py on Alpine 3.23, reusing the shared alpine_runtime helper.
-
Decommission the wave-1 services on minikube-indri now that paperless,
teslamate, and mealie run on ringtail with their data backed up. Removes the
minikube paperless / teslamate / mealie manifest dirs + ArgoCD app
definitions (pruning the parked Deployments, Services, and the redundant
minikube mealie/paperless PVCs), and drops the paperless / teslamate roles
from the minikube blumeops-pg cluster. The paperless and teslamate
databases are dropped from indri's blumeops-pg as the finalization step.
miniflux + authentik remain on the minikube cluster (later waves).
-
Upgraded the k8s Forgejo runner to the v12.8 line, switched it from first-boot registration to declarative
server.connections credentials from 1Password, and consolidated the supporting runner how-to documentation.
-
Move paperless, teslamate, and mealie off minikube-indri onto
k3s-ringtail, shedding ~1.1 GiB of resident load from the
OOM-thrashing 8 GiB minikube node (the kernel OOM killer had been
killing kube-apiserver / dockerd / argocd, flapping every
minikube-hosted service at once). paperless + teslamate databases
move into a fresh CNPG blumeops-pg cluster on ringtail via a cold
pg_dump / pg_restore from the quiesced source: row counts verified
equal before any routing flip; source DBs dropped only after the
ringtail side serves traffic. mealie's SQLite PVC is copied as-is.
paperless media stays on sifaka NFS. Downtime-tolerant cold cutover
(no streaming replication); rollback is repoint-and-scale-up with the
source untouched. Second chain in the indri-k8s decommission after
migrate-immich-to-ringtail.
-
Recurring maintenance batch:
- Ringtail flake inputs refreshed (disko, home-manager, nixpkgs).
- Tooling deps bumped: prek hooks (trufflehog v3.95.3, kingfisher v1.101.0, ruff v0.15.14, ansible-core 2.21.0); fly proxy base images (nginx 1.30.1-alpine, alloy v1.16.1); typer==0.26.2 in mise tasks.
-
Updated nixos/ringtail/flake.lock (weekly cadence): disko, home-manager, and nixpkgs inputs refreshed. nixpkgs-services skipped per overlay convention.
-
Reviewed mealie service version freshness; upstream is 5 minor versions ahead (v3.17.0 vs deployed v3.12.0). Marked reviewed; upgrade deferred.
-
Deploy shower v1.1.2 — bump container build to new app release.
-
Upgrade unpoller v2.34.0 → v3.2.0 and migrate container build from Dockerfile to native Dagger (container.py). v3.0.0 carries breaking UniFi API changes; v3.2.0 introduces a 60s background poll (cached scrapes) by default. Set interval = 0 in up.conf to restore on-demand polling.
-
Monthly tooling dependency refresh: prek hooks (trufflehog, kingfisher, ruff, shfmt, prettier, actionlint, ansible-lint), fly proxy base images (nginx 1.30.0, tailscale v1.94.2, alloy v1.16.0), normalize pyyaml lower bound in mise-tasks.
-
Add GE-Proton (pkgs.proton-ge-bin) to programs.steam.extraCompatPackages
on ringtail. Subnautica 2 hangs at Mercuna plugin init under Proton
Experimental + DXVK D3D12; GE-Proton is available as a Steam per-game
compatibility option to work around it.
-
Add sn2-prelaunch Steam launch wrapper on ringtail that removes
Subnautica 2's stale Saved/running.dat and Saved/beforelobby.dat
lockfiles before each launch. SN2 pops up an invisible (0×0-sized)
Error dialog when it detects an unclean exit, blocking GameThread
forever; this is observable only as a black screen with a spinning
loader. Use via Steam launch option: sn2-prelaunch %command%.
-
Add local nix container build for frigate-notify (containers/frigate-notify/default.nix) so the Frigate→ntfy bridge is rebuilt on ringtail from the forge mirror instead of pulled from ghcr.io/0x2142/frigate-notify.
-
Add resource limits to all ArgoCD pods to prevent unbounded resource consumption during node-wide pressure events.
-
Black-hole the /mirrors/* repositories at the Fly proxy edge (return 403 → forge.ops.eblu.me). A surprise $29.60 Fly bill traced to ~1.24 TB/30d of egress on forge.eblu.me, 99.95% of all proxy egress, of which ~71% was AI scrapers (Meta meta-externalagent, OpenAI GPTBot, Amazonbot) crawling the near-infinite git-history URL space of the public mirror repos and timing out Forgejo in the process. Mirrors exist for supply-chain control and are consumed over the tailnet, so their public web UI had no legitimate audience. robots.txt already disallowed /mirrors/, but the offending agents ignore it. Tier-2 mitigations (user-agent denylist, Anubis proof-of-work gateway) are documented in docs/explanation/ai-scraper-mitigation.md.
-
Bump paperless and immich kustomizations to the main-SHA-built valkey tag (v8.1.6-r0-fabca04). Routine post-merge follow-up to keep production manifests pointing at images built from a commit on main.
-
Bump shower container to v1.1.1 (probe FOD hash).
-
Bumped shower app to v1.1.3 (wheel/sdist + FOD hashes probed on ringtail).
-
Cap systemd-coredump on ringtail (ProcessSizeMax/ExternalSizeMax 1G, MaxUse 2G) so multi-GB Wine/Proton game crash dumps no longer thrash the disk and lock up the desktop.
-
Deploy shower v1.1.1 to ringtail (kustomize newTag bump).
-
Deployed shower v1.1.3 to ringtail (image built and pushed from ringtail; runner bypassed due to indri overload).
-
Fix three follow-ups from the wave-1 decommission: grant the local
break-glass admin account ArgoCD admin rights (g, admin, role:admin;
previously only the Authentik admins group had access, so admin was
locked out whenever its token expired), and repoint the alloy blackbox
probe for teslamate from the deleted minikube service to
https://tesla.ops.eblu.me/ (through Caddy over Tailscale). The orphaned
paperless/teslamate roles + ExternalSecrets left on the minikube
blumeops-pg are also cleaned up.
-
Moved the Immich blackbox health probe from indri's alloy to ringtail's alloy. After the immich migration to ringtail, the probe still targeted
immich-server.immich.svc.cluster.local on indri's cluster where the service no longer exists, causing a persistent ServiceProbeFailure alert.
-
Pin shower v1.1.1 FOD outputHash (probed locally on ringtail).
-
Rebuild Prowler container against main HEAD (v5.23.0-495e45d) after merging the IaC mutelist Dockerfile changes.
-
Rebuild and retag alloy v1.16.0 container images from the main-branch SHA
following the squash-merge of #345, per the build-container-image
squash-merge convention. Both images (registry.ops.eblu.me/blumeops/alloy)
now reference 9564435 rather than the branch SHA 26a3ab5, restoring
source traceability after branch cleanup.
-
Rebuild shower from the post-merge commit on main so the container's
SHA tag points at a commit that will still exist after the 30-day
branch-cleanup window. Functionally identical to the branch-tag image
already deployed, just preserves source traceability per
build-container-image#Squash-merge and container tags. -
Rebuild unpoller container from squashed main commit so the image SHA tag matches a commit in main's history (was tagged with the pre-squash branch SHA).
-
Rebuild valkey container from squashed main commit (both arm64 dagger and amd64 nix variants), and update paperless + immich-ringtail kustomizations to the main-SHA tags
v8.1.7-ecded30 and v8.1.7-ecded30-nix.
-
Retired the blumeops-tasks mise task (Todoist API) in favor of heph list --project Blumeops --json from the self-hosted hephaestus system. Updated docs to point task discovery and rotation reminders at heph, and noted that the ~/code/personal/zk zettelkasten is migrating into heph docs.
-
Switch the Fly proxy deploy strategy from bluegreen to immediate in fly/fly.toml. With a single proxy machine, bluegreen offers little benefit: the green machine routinely failed to reach "started" inside Fly's default 5-minute deploy timeout (the cold-start sequence of tailscaled → tailscale up → wait-for-MagicDNS → nginx startup eats most of the budget), and the failed deploys would roll back. immediate replaces the machine in place with a brief downtime (~5–10s) but actually completes.
-
Switch the ringtail provisioning playbook's blumeops clone URL from forge.eblu.me (public, via Fly proxy) to forge.ops.eblu.me (tailnet, direct via Caddy on indri). Ringtail is always on the tailnet, so the WAN round-trip is pure overhead; it also made provision-ringtail brittle whenever the Fly proxy was slow or down.
-
Switched Grafana's deployment strategy from RollingUpdate to Recreate. With an RWO PVC holding the SQLite database and Bleve search index, RollingUpdate reliably crashloops the new pod on the index lock until rollout timeout. Recreate terminates the old pod first so the new one acquires the lock cleanly.
-
Update tailscale-operator-ringtail ProxyClass to reference the 0108b68 main-SHA build of the tailscale container. Routine post-merge cleanup so the deployed image traces to a commit that survives PR branch cleanup.
-
Update the ringtail NixOS flake lockfile (nixos/ringtail/flake.lock): bump
nixpkgs (b77b3de → 25f5383) and disko (5ba0c95 → 115e521) to latest.
nixpkgs-services was intentionally left pinned (skipped by the
flake-update pipeline). Routine recurring maintenance per manage-lockfile.
-
Upgrade native macOS Alloy on indri to v1.16.0. Built on gilbert with Go
1.26.2 + CGO (required for the macOS native DNS resolver, which Tailscale
MagicDNS depends on), scp'd to ~/.local/bin/alloy on indri, codesigned,
and the LaunchAgent reloaded. Completes the v1.16.0 fleet upgrade started
in #345 — all four Alloy services (alloy-k8s, alloy-ringtail,
alloy-tracing-ringtail, alloy ansible) now run v1.16.0. -
Upgraded zot on indri from v2.1.15 to v2.1.16 (security fixes: TLS verification on metrics client, CORS Allow-Credentials suppression on wildcard origins, manifest/API-key body size limits).
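The Trivy shim described in the Prowler IaC entry above can be pictured as a small argument rewriter. The sketch below is an assumption-laden illustration, not the shim shipped in the Prowler image: the function name `shim_trivy_args` is hypothetical, and it assumes `fs` arrives as the first argument. Only the behavior (inject --ignorefile for trivy fs when TRIVY_IGNOREFILE points at a real file) comes from the release note.

```python
import os


def shim_trivy_args(args: list[str], env: dict[str, str]) -> list[str]:
    """Return the argument list a wrapper would pass to the real trivy.

    Injects `--ignorefile <path>` only for `fs` scans, and only when the
    TRIVY_IGNOREFILE variable names an existing file.
    """
    ignorefile = env.get("TRIVY_IGNOREFILE", "")
    is_fs_scan = bool(args) and args[0] == "fs"  # assumption: subcommand comes first
    already_set = "--ignorefile" in args

    if is_fs_scan and ignorefile and os.path.isfile(ignorefile) and not already_set:
        return ["fs", "--ignorefile", ignorefile, *args[1:]]

    # Anything else is passed through unchanged.
    return list(args)
```

Checking that the file exists keeps the wrapper transparent when the mutelist is not mounted, so a misconfigured cronjob degrades to an unfiltered scan rather than a failed one.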
Documentation
- Reviewed replicating-blumeops tutorial: fixed "BluemeOps" typos (also in contributing.md) and added last-reviewed frontmatter.
- Reviewed indri reference card: added devpi, cv, and docs to the native-services list; widened the k8s note to reflect the growing set of apps now on ringtail and the planned indri-minikube decommission; added CPU/RAM specs.
- New how-to: rotate-fly-deploy-token. Documents the 75-day rotation cadence, why we use org-scoped tokens (silences the cosmetic metrics-token warning on fly status with marginal blast-radius cost given the single-app personal org), and the procedure for rotation + Forgejo Actions secret sync.
- Add docs/explanation/ai-scraper-mitigation.md: the egress-cost / AI-crawler threat model for the public Fly proxy, the tiered mitigation plan (Tier 1: mirror black-hole, shipped; Tier 2: user-agent denylist + Anubis; Tier 3: Cloudflare, rejected on principle), and the data behind it.
- Fix manage-forgejo-mirrors verify step: sync button is on the repo settings page ("Synchronize now"), not the main repo page.
- Fixed the op item edit invocation in the zot API-key rotation procedure: the previous pbpaste | op item edit ... "field[password]=-" stdin syntax is rejected by op 2.34 as "invalid JSON" (recent op versions treat piped input as a full JSON template, not a single field value). Procedure now reads the clipboard into a local fish variable and passes it as an inline assignment.
- Fixed the export-filename step in run-1password-backup: 1Password's desktop app names the export 1PasswordExport-<account-uuid>-<timestamp>.1pux automatically rather than letting you save to a fixed name, so the procedure now points the task at that glob instead of pretending the default name is 1Password-export.1pux.
- Refresh the contributing tutorial: add last-reviewed, include the .ai.md changelog fragment type, and clarify that prek is pinned via mise.
- Review and refresh the Navidrome reference card: add last-reviewed, correct the scanner env var name, document the current image/version, and record routing and runtime details from the manifests.
- Review and refresh the Ollama reference card: add last-reviewed, bump the documented image tag to 0.20.4, and add the two qwen3.5 models now declared in models.txt.
- Reviewed 1password reference card: added the blumeops vs Personal vault split, noted that onepassword-connect runs on both indri and ringtail (not just one cluster), and pulled the op read vs op item get --fields guidance up from agent memory into the card.
- Reviewed index.md; added ringtail to the infrastructure overview and stamped last-reviewed.
- Reviewed transmission card: corrected storage layout (/config/ is emptyDir, watch dir disabled) and noted the Prometheus exporter sidecar.
- rotate-fly-deploy-token: combine mint+store into one command with both fish and bash forms; document the op item edit "Password item requires ps value" validator gotcha and the placeholder-password workaround.
AI Assistance
- Adopt AGENTS.md as the canonical agent instruction file, keep CLAUDE.md as a compatibility shim, and update docs to reference the neutral file and the correct agent-change-process path.
- CLAUDE.md now imports AGENTS.md via @AGENTS.md instead of telling agents to go read it. Claude Code only auto-loads CLAUDE.md, so the prose shim was easy to skip; the import inlines AGENTS.md into the session prompt unconditionally.
Miscellaneous
- Removed the dead minikube manifests, container builds, and tooling shims left behind after the cv + docs migration to indri-native (#342). Deletes argocd/{apps,manifests}/{cv,docs}/, containers/{cv,quartz}/, and the quartz → docs mapping in mise-tasks/container-version-check. Bumps docs.current-version to v1.16.0 (the blumeops release tag) now that the legacy nginx-base version pin is gone.
- Rebuild shower v1.1.0 container from main HEAD (3c7967e) and bump the kustomization tag to v1.1.0-3c7967e-nix. The PR was squash-merged, so the branch commit 444ff91 baked into the prior tag isn't reachable from main's history. The new tag points at a commit that exists on main; image content is byte-identical because the FOD output is content addressed and the inputs didn't change.
- Rebuild shower v1.1.2 from main HEAD (a33fa47) and retag: PR #358 was squash-merged so the branch SHA baked into the prior image tag isn't reachable from main. FOD is content-addressed, so image bytes are identical; only provenance changes.
- Remove the duplicate Homepage tiles for Mealie, Paperless, Immich, and TeslaMate. Homepage runs on ringtail and autodiscovers ringtail Ingresses via gethomepage.dev/* annotations; once these services migrated to ringtail they were discovered automatically, making their leftover static services.yaml entries (needed only while they lived on minikube) redundant.
- Removed the now-unused containers/devpi/ Dagger build artifact. Devpi runs natively on indri via uv venv; the container image is no longer referenced anywhere. Doc examples in docs/reference/tools/dagger.md updated to use miniflux as the example container name.
- container-build-and-release now prints the specific mise run runner-logs <N> command after dispatching, polling the Forgejo API to resolve the run number for the commit it just triggered.
- mise run runner-logs <run> -j <n> now reports a clear error when the log file doesn't exist on indri (e.g. a runner crash that left action_task.log_in_storage = 0). Previously it printed only the header and exited 0, because zstdcat exits 0 with a "can't stat … -- ignored" stderr message and ssh+fish on indri swallows the remote exit code.
Documentation
Download docs-v1.17.0.tar.gz directly, or bump docs_version
in ansible/roles/docs/defaults/main.yml and run:
mise run provision-indri -- --tags docs
-
BlumeOps v1.16.0 Stable
released this
2026-04-18 10:00:51 -07:00 | 134 commits to main since this release
BlumeOps release v1.16.0
What's Changed
Infrastructure
- Route Fly.io proxy through Caddy on indri with direct WireGuard peering, reducing public-facing latency from 20+ seconds (DERP relay) to sub-second. Fixed Beyla eBPF tracing on ringtail (memlock rlimit + BPF permissions). Restored trace collection to Tempo.
Documentation
Download docs-v1.16.0.tar.gz and configure the quartz container with:
DOCS_RELEASE_URL=https://forge.eblu.me/eblume/blumeops/releases/download/v1.16.0/docs-v1.16.0.tar.gz
-
BlumeOps v1.15.7 Stable
released this
2026-04-18 08:14:51 -07:00 | 141 commits to main since this release
BlumeOps release v1.15.7
What's Changed
Bug Fixes
- Fix borgmatic LaunchAgent failing silently due to macOS TCC permission dialogs. LaunchAgents now call borgmatic directly instead of routing through mise x, which triggered "wants to access Documents" dialogs that hung headless sessions. The ansible role now also manages borgmatic installation via mise install.
Infrastructure
- Automate verification of Prowler MANUAL findings (kubelet file perms, kubelet config, etcd CA, RBAC cluster-admin) in review-compliance-reports and mute them with node-config-automated-verification compensating control.
- Migrate transmission and transmission-exporter containers from Dockerfile to native Dagger builds (container.py). Updates base images to Alpine 3.23 and Python 3.14, pins uv to 0.11.6.
- Switched Fly proxy to upstream keepalive pools, reducing forge.eblu.me latency from 35s+ p50 to sub-second. Added mise run fly-reload for DNS re-resolution without redeploy.
- Upgrade Prowler from 5.22.0 to 5.23.0; remove init container workaround for broken --registry flag (upstream fix in PR #10470).
- Added robots.txt to forge.eblu.me blocking crawlers from /mirrors/ to reduce load from Facebook scraping.
- Container builds are now manual-only via mise run container-build-and-release. Removed auto-trigger on push to main: shared Dagger helpers made path-based detection unreliable.
- Migrate devpi container from Dockerfile to native Dagger build; bump devpi-server 6.19.1→6.19.3 and devpi-web 5.0.1→5.0.2.
- Migrated kiwix-serve container from Dockerfile to native Dagger build, bumping Alpine base from 3.22 to 3.23.
- Mitigated Forgejo archive endpoint DoS: redirect public archive requests to tailnet, expanded robots.txt, enabled archive cleanup cron, cached release downloads at proxy.
- Refactored Dagger container pipelines: extended go_build() helper with buildmode and extra_env params, migrated miniflux and forgejo-runner to use it, and standardized all Alpine bases from 3.22 to 3.23.
Miscellaneous
- Review compensating control sso-gated-admin-tools: tightened scope to ArgoCD only, removed Grafana reference.
- container-build-and-release now verifies the commit exists on the remote before dispatching a build.
Documentation
Download docs-v1.15.7.tar.gz and configure the quartz container with:
DOCS_RELEASE_URL=https://forge.eblu.me/eblume/blumeops/releases/download/v1.15.7/docs-v1.15.7.tar.gz
-
BlumeOps v1.15.6 Stable
released this
2026-04-14 11:46:28 -07:00 | 174 commits to main since this release
BlumeOps release v1.15.6
What's Changed
Bug Fixes
- Rotate ArgoCD workflow-bot token and admin password after DR rebuild invalidated signing keys, fixing build-blumeops workflow failures.
Documentation
Download docs-v1.15.6.tar.gz and configure the quartz container with:
DOCS_RELEASE_URL=https://forge.eblu.me/eblume/blumeops/releases/download/v1.15.6/docs-v1.15.6.tar.gz
-
BlumeOps v1.15.5 Stable
released this
2026-04-14 11:29:22 -07:00 | 176 commits to main since this release
BlumeOps release v1.15.5
What's Changed
Features
- Deploy Paperless-ngx document management system at paperless.ops.eblu.me with OCR, Authentik SSO, and NFS storage on sifaka.
- Add
ty(Astral) Python typechecker to prek hooks, configured for Dagger SDK and container.py modules. Addtype: miseto service-versions.yaml for tracking development tool versions (dagger, ansible-core, prek, pulumi, ty) through the standard service review process. - Upgrade grafana-sidecar from 1.28.0 to 2.6.0, adding health probes and porting build to native Dagger container.py.
- Upgrade Navidrome to v0.61.1 — major artwork overhaul with per-disc cover art, rebuilt search engine (SQLite FTS5), server-managed transcoding, and WebP performance fix.
- Add `mise run review-compliance-reports` task for weekly compliance report review with muted/unmuted distinction and week-over-week delta
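The week-over-week delta with a muted/unmuted split, as described for the compliance review task, might be computed along these lines. This is only a sketch: the finding records, their field names (`check`, `status`, `muted`), and the helper functions are hypothetical and are not taken from the actual task.

```python
from collections import Counter


def summarize(findings):
    """Count findings by (status, muted) pair."""
    return Counter((f["status"], f["muted"]) for f in findings)


def week_over_week(previous, current):
    """Return current count minus previous count for every key seen in either week."""
    prev, curr = summarize(previous), summarize(current)
    return {key: curr[key] - prev[key] for key in sorted(set(prev) | set(curr))}


# Hypothetical sample data standing in for two weekly scan reports.
last_week = [
    {"check": "a", "status": "FAIL", "muted": False},
    {"check": "b", "status": "FAIL", "muted": True},
    {"check": "c", "status": "PASS", "muted": False},
]
this_week = [
    {"check": "b", "status": "FAIL", "muted": True},
    {"check": "c", "status": "PASS", "muted": False},
    {"check": "d", "status": "PASS", "muted": False},
]

delta = week_over_week(last_week, this_week)
for (status, muted), change in delta.items():
    label = "muted" if muted else "unmuted"
    print(f"{status:<5} {label:<8} {change:+d}")
```

Keeping muted findings as a separate key means an accepted finding never hides a change in the unmuted counts.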
Bug Fixes
- Add paperless database to borgmatic backup configuration. Previously the only service DB not included in nightly pg_dump backups.
- Fix Fly.io proxy rate limiting to key on real client IP instead of Fly's internal proxy IP, so crawlers no longer consume the shared rate limit bucket for all clients.
- Fix UnPoller (UniFi) Grafana dashboards failing to load due to UID exceeding Grafana 12's 40-character limit.
- Fix blumeops-tasks swallowing wiki-link brackets in task descriptions (rich markup escaping)
- Fix dagger flake-update pipeline: replace nonexistent `--exclude` flag with dynamic input discovery
- Fix services-check to display all firing alerts for a given alert name, not just the first one.
- Pin Fly.io proxy Tailscale to v1.94.1: the `:stable` tag pulled v1.96.5, which has a MagicDNS regression (SERVFAIL on tailnet names), breaking all public routing through forge.eblu.me, docs.eblu.me, and cv.eblu.me.
- Rewrite `mise run runner-logs` CLI: list runs by run number (not task ID), drill into jobs per run, fetch logs via Forgejo web API instead of SSH+filesystem. Fixes broken log retrieval caused by incorrect hex path calculation and stale data directory. Added `--repo` to query any forge repo (e.g. sporks) and `--limit`/`-n` to control listing size (0 for all).
- Route Dagger build telemetry to Tempo, fixing OTEL metrics exporter warnings.
- Switch paperless redis sidecar from amd64-only nix-built `authentik-redis` image to upstream `valkey:8.1-alpine` (multi-arch). The nix image was previously running under QEMU emulation on arm64 minikube.
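To illustrate why the rate-limit key matters in the Fly.io proxy fix above, here is a minimal sketch of a fixed-window limiter. It is not the proxy's actual configuration: the limiter class, the limit of 3 requests per window, the sample IP addresses, and the choice of the `Fly-Client-IP` header as the source of the real client address are all assumptions made for illustration.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each window of `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.counts = defaultdict(int)

    def allow(self, key):
        window = int(self.clock() // self.window_seconds)
        self.counts[(key, window)] += 1
        return self.counts[(key, window)] <= self.limit


def client_key(headers, peer_ip, header="Fly-Client-IP"):
    """Prefer the forwarded client address; fall back to the TCP peer address."""
    return headers.get(header) or peer_ip


PROXY_PEER = "10.0.0.1"  # hypothetical internal proxy address seen on every request
crawler = {"Fly-Client-IP": "203.0.113.7"}
human = {"Fly-Client-IP": "198.51.100.9"}

# Keyed on the peer address: every client shares one bucket.
by_peer = FixedWindowLimiter(limit=3, window_seconds=60, clock=lambda: 0.0)
for _ in range(5):
    by_peer.allow(PROXY_PEER)
human_allowed_by_peer = by_peer.allow(PROXY_PEER)

# Keyed on the real client address: the crawler only exhausts its own bucket.
by_client = FixedWindowLimiter(limit=3, window_seconds=60, clock=lambda: 0.0)
crawler_results = [by_client.allow(client_key(crawler, PROXY_PEER)) for _ in range(5)]
human_allowed_by_client = by_client.allow(client_key(human, PROXY_PEER))
```

With the peer address as the key, the crawler's burst locks out the second client; with the forwarded address, the second client is unaffected.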
Infrastructure
- Build forgejo-runner container locally via native Dagger pipeline instead of pulling from upstream.
- Build kube-state-metrics container locally (Dockerfile + nix) from forge mirror, replacing upstream registry.k8s.io image on both indri and ringtail.
- Upgrade miniflux from 2.2.17 to 2.2.19 and migrate from Dockerfile to native Dagger container.py build (second container after navidrome). Refactor `alpine_runtime()` with `create_user` parameter to support Alpine's built-in nobody user. Pin all mise.toml tool versions to explicit versions instead of "latest".
- Migrate Dagger module from .dagger/ to repo root (src/blumeops/) and replace docker_build() with native Dagger pipelines for container builds. Navidrome is the first container migrated, with full build error visibility.
- Migrate teslamate container build from legacy Dockerfile to native Dagger container.py.
- Add seccomp RuntimeDefault profiles to alloy-k8s and immich pods, resolving 4 unmuted Prowler findings
- Full DR recovery from power loss and minikube cluster rebuild. Validated bootstrap procedure, identified circular dependencies (forge.eblu.me, Zot/Authentik OIDC), Tailscale device name collision issues, and documented recovery steps for restart-indri.
- Set Frigate preview quality to CRF 8 (from default 1) to reduce preview file sizes and improve review timeline loading over NFS.
- Track Fly.io proxy component versions (Tailscale, nginx, Alloy) in service-versions.yaml with new `fly` service type.
- Upgrade ArgoCD from v3.3.2 to v3.3.6 (bug-fix patches), SHA-pin install manifest
- Upgrade authentik 2026.2.0 → 2026.2.2 (bug-fix patch release)
- Upgrade ollama from 0.17.5 to 0.20.4 (adds Gemma 4 support, benchmark tooling, Apple Silicon perf improvements)
Documentation
- Delete outdated install-dagger-on-nix-runner card; add service-versions reference card; clean up zot.md and review-services.md links.
- Enhanced the adding-a-service tutorial with kustomization setup, corrected Tailscale ingress format, updated ArgoCD repoURL, and added a step for creating service reference cards.
- Review gandi.md: add missing forge.eblu.me CNAME, fix program description, stamp review date.
Documentation
Download docs-v1.15.5.tar.gz and configure the quartz container with:
DOCS_RELEASE_URL=https://forge.eblu.me/eblume/blumeops/releases/download/v1.15.5/docs-v1.15.5.tar.gz
Downloads
- Source code (ZIP): 0 downloads
- Source code (TAR.GZ): 0 downloads
- docs-v1.15.5.tar.gz: 2 downloads · 2026-04-14 11:29:27 -07:00 · 1.8 MiB
-
BlumeOps v1.15.4 Stable
released this 2026-04-06 07:53:51 -07:00 | 230 commits to main since this release
BlumeOps release v1.15.4
What's Changed
Infrastructure
- Migrate 1Password Connect from Helm to kustomize (1.8.1 → 1.8.2), completing the no-helm-policy migration.
Documentation
- Rewrite observability stack tutorial: replace Helm instructions with actual kustomize/ArgoCD patterns, fix typos, document Alloy as core component
Documentation
Download docs-v1.15.4.tar.gz and configure the quartz container with:
DOCS_RELEASE_URL=https://forge.eblu.me/eblume/blumeops/releases/download/v1.15.4/docs-v1.15.4.tar.gz
Downloads
- Source code (ZIP): 0 downloads
- Source code (TAR.GZ): 0 downloads
- docs-v1.15.4.tar.gz: 11 downloads · 2026-04-06 07:53:53 -07:00 · 1.8 MiB
-
BlumeOps v1.15.3 Stable
released this 2026-04-05 21:24:21 -07:00 | 234 commits to main since this release
BlumeOps release v1.15.3
What's Changed
Infrastructure
- Build Tempo container from source via forge mirror; bump 2.10.1 → 2.10.3
- Pin NixOS service versions (forgejo-runner, snowflake, k3s) via `nixpkgs-services` overlay in ringtail flake, preventing silent upgrades from `nix flake update`. Add k3s and minikube to service-versions.yaml tracking. Fix stale nix-container-builder version (was 12.6.4, actually running 12.7.2).
- Migrate Immich from Helm chart to kustomize manifests and upgrade from v2.5.6 to v2.6.3
- Upgrade Grafana from 12.3.3 to 12.4.2 — patches 7 CVEs including an unauthenticated DoS (CVE-2026-27880).
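The stale nix-container-builder entry noted above is the kind of drift a simple comparison between tracked and running versions can surface. The sketch below assumes both sides are already available as plain dictionaries; the `find_drift` helper is hypothetical, and how the real tooling reads service-versions.yaml or queries running services is not shown in these notes.

```python
def find_drift(tracked, running):
    """Return {service: (tracked_version, running_version)} for every mismatch.

    Services with no known running version are skipped rather than reported.
    """
    drift = {}
    for service, tracked_version in tracked.items():
        running_version = running.get(service)
        if running_version is not None and running_version != tracked_version:
            drift[service] = (tracked_version, running_version)
    return drift


# Versions taken from the notes above; the dictionary layout is an assumption.
tracked = {"nix-container-builder": "12.6.4", "tempo": "2.10.3"}
running = {"nix-container-builder": "12.7.2", "tempo": "2.10.3"}

drift = find_drift(tracked, running)
for service, (want, have) in drift.items():
    print(f"{service}: tracked {want}, running {have}")
```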
Documentation
- First compensating control review: verified `single-user-cluster` still in effect. Added aspirational how-to card for PCI DSS evidence collection.
- Prowler `--registry` fix merged upstream (PR #10470); initContainer workaround documented as pending release.
Documentation
Download docs-v1.15.3.tar.gz and configure the quartz container with:
DOCS_RELEASE_URL=https://forge.eblu.me/eblume/blumeops/releases/download/v1.15.3/docs-v1.15.3.tar.gz
Downloads
- Source code (ZIP): 0 downloads
- Source code (TAR.GZ): 0 downloads
- docs-v1.15.3.tar.gz: 2 downloads · 2026-04-05 21:24:24 -07:00 · 1.8 MiB
-
BlumeOps v1.15.2 Stable
released this 2026-03-30 17:48:36 -07:00 | 253 commits to main since this release
BlumeOps release v1.15.2
What's Changed
Features
- Build custom Kingfisher container from sporked deploy branch, replacing upstream image with locally-built version including --clone-url-base patch.
- Add Kingfisher secret scanner as a weekly CronJob scanning all Forgejo repos, with HTML and JSON reports written to sifaka NFS.
- Add MongoDB Kingfisher secret scanner as a prek hook alongside TruffleHog for comparative coverage evaluation.
- Add spork strategy: floating-branch soft-fork tooling (`mise run spork-create`) and documentation for maintaining local patches against upstream projects.
Infrastructure
- Add compensating controls framework: tracking file, review mise task, and how-to doc. Map all Prowler mutelist entries to named controls with CC: prefixes.
- Add Prowler mutelist to suppress expected findings from system components, operator-managed pods, and accepted operational needs. Fix missing seccomp profile on kube-state-metrics.
- Borgmatic photos backup: restrict to library/ and upload/ (skip regenerable dirs), add SSH keepalives and checkpoint interval to prevent broken pipe failures on large initial syncs.
- Upgrade forgejo-runner from 12.7.0 to 12.7.3 (bug fixes, security dep update). Add service reference card.
Documentation
- Add service reference documentation for Kingfisher secret scanner.
- Review and update Ansible reference doc: add missing roles, sibling playbooks, and clarify Ansible's role in the IaC stack.
Documentation
Download docs-v1.15.2.tar.gz and configure the quartz container with:
DOCS_RELEASE_URL=https://forge.eblu.me/eblume/blumeops/releases/download/v1.15.2/docs-v1.15.2.tar.gz
Downloads
- Source code (ZIP): 0 downloads
- Source code (TAR.GZ): 0 downloads
- docs-v1.15.2.tar.gz: 4 downloads · 2026-03-30 17:48:40 -07:00 · 1.8 MiB
-
BlumeOps v1.15.1 Stable
released this 2026-03-28 09:15:18 -07:00 | 285 commits to main since this release
BlumeOps release v1.15.1
What's Changed
Features
- Add Tor Snowflake proxy on ringtail as a systemd service to support anti-censorship efforts.
- Add offsite backup for immich photo library to BorgBase, running daily at 4 AM from indri via sifaka SMB mount.
- Add QArt Tuner: a Go tool that generates QR codes whose data modules form a recognizable image, with an interactive web UI for parameter tuning. Based on the QArt technique by Russ Cox. Lives in `utils/qart/`.
Infrastructure
- Migrate Forgejo from Homebrew to source build with mcquack LaunchAgent, matching the pattern used by zot, caddy, and alloy. Upgrades to v14.0.3 (7 security fixes including PKCE bypass and OAuth scope bypass).
- Add borgmatic pg_dump backups for authentik and immich databases. Authentik uses the existing blumeops-pg cluster on port 5432. Immich requires a new borgmatic role on the immich-pg cluster, a Tailscale service, and Caddy L4 proxy on port 5433.
- Upgrade External Secrets Operator from v1.3.2 to v2.2.0 and migrate from Helm chart to static kustomize manifests.
- Add post-deploy maintenance docs and generation pruning task for ringtail.
- Fix Immich Helm values: resource limits and probe timeouts were silently ignored due to wrong value keys. Resources now actually apply to pods, and liveness/readiness probe timeouts increased from 1s to 5s to prevent kubelet from killing pods during ML inference.
- Reduce PodNotReady alert lookback window from 5m to 60s to clear faster after rollouts.
- Tighten ArgoCDAppOutOfSync alert: reduce pending duration from 30m to 5m and lookback window from 5m to 1m so alerts clear faster after sync.
- Update ringtail flake inputs (nixpkgs, home-manager).
- Upgrade Homepage dashboard from v1.10.1 to v1.11.0
- Upgrade nvidia-device-plugin from v0.18.2 to v0.19.0
Documentation
- Review and fix CV service doc (correct URL, forge domain, container tag link) and add private forge repo review guidance to review-services process.
- Review tailscale-setup tutorial: fix macOS install steps, add `--accept-routes` tip, correct tag name, add ACL apply instructions, add `[[tailscale-operator]]` cross-reference.
Miscellaneous
- Add `preserve/*` branch prefix exclusion to `branch-cleanup` task; document Pyroscope profiling work and blockers in observability reference.
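A prefix exclusion like `preserve/*` can be expressed as a glob match over candidate branch names. The following is a sketch only: the pattern tuple, the inclusion of `main` as protected, and the sample branch names are assumptions, and the real `branch-cleanup` task may select branches differently.

```python
from fnmatch import fnmatchcase

# Hypothetical protection list; only the preserve/* pattern comes from the notes.
PROTECTED_PATTERNS = ("main", "preserve/*")


def branches_to_delete(merged_branches, protected=PROTECTED_PATTERNS):
    """Return merged branches that match none of the protected glob patterns."""
    return [
        branch
        for branch in merged_branches
        if not any(fnmatchcase(branch, pattern) for pattern in protected)
    ]


candidates = ["main", "feature/old-ui", "preserve/pyroscope-spike", "fix/typo"]
deletable = branches_to_delete(candidates)
print(deletable)
```

`fnmatchcase` is used instead of `fnmatch` so matching stays case-sensitive on every platform, which matches how git treats branch names.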
Documentation
Download docs-v1.15.1.tar.gz and configure the quartz container with:
DOCS_RELEASE_URL=https://forge.eblu.me/eblume/blumeops/releases/download/v1.15.1/docs-v1.15.1.tar.gz
Downloads
- Source code (ZIP): 0 downloads
- Source code (TAR.GZ): 0 downloads
- docs-v1.15.1.tar.gz: 30 downloads · 2026-03-28 09:15:21 -07:00 · 1.7 MiB
-
BlumeOps v1.15.0 Stable
released this 2026-03-24 19:50:58 -07:00 | 307 commits to main since this release
BlumeOps release v1.15.0
What's Changed
Features
- Deploy Prowler CIS scanner as a weekly CronJob on minikube-indri, with reports written to sifaka NFS share.
- Add Grafana "Alerts" dashboard showing currently firing alerts and recent state changes.
- Add IaC scanning via Prowler IaC provider (Saturday 2am, Dockerfiles and K8s manifests).
- Add container image vulnerability scanning via Prowler image provider (Saturday 3am, all blumeops/* images).
Bug Fixes
- Fix authentik worker OOMKill by setting AUTHENTIK_WORKER_CONCURRENCY=2 (was defaulting to 16 based on CPU count).
- Remove `group: ""` from tailscale-operator ignoreDifferences: ArgoCD normalizes away the empty string, causing permanent OutOfSync on the apps app.
Infrastructure
- Decommission JobSync service — removed ArgoCD app, k8s manifests, container build, Caddy proxy, Homepage entry, docs, and forge mirror. Replaced by datasette-based job tracking (coming soon).
- Localize authentik-redis container: replace upstream `redis:7-alpine` with nix-built image from nixpkgs (Redis 8.2.3). Introduces attached service pattern with `parent` field in service-versions.yaml and version assertion in default.nix to prevent silent version drift.
- Unified Dockerfile and Nix container build workflows into a single workflow that auto-classifies containers by build type and routes to the correct runner (k8s for Dockerfile, nix-container-builder for Nix). Removed nettest container (outgrown). Nix builds now require an explicit `version = "..."` declaration, with no implicit nixpkgs fallback.
- Monthly tooling dependency update: bump prek hooks (trufflehog 3.94.0, ruff 0.15.7, shfmt 3.13.0), Fly.io images (nginx 1.29.6, Alloy 1.14.1), actions/checkout v4.3.1→v6.0.2, tighten mise task Python lower bounds (rich 14, typer 0.24, httpx 0.28.1, pyyaml 6.0.2), and bump ansible-lint/ansible-core floors.
- Upgrade ntfy v2.17.0 → v2.19.2 (adds experimental PostgreSQL support, read replicas, web push fixes)
- Revert Tailscale operator to v1.94.2 (v1.96.3 images not yet published); keep Fly proxy `tailscale wait` improvement
- Add RuntimeDefault seccomp profiles to all managed deployments, statefulsets, and cronjobs.
- Upgrade Frigate from 0.17.0-rc2 to 0.17.1 (security fixes, bugfixes). Add motion retention tier (365 days), reduce continuous retention from 180 to 30 days.
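The unified build workflow above classifies each container by build type and routes it to a runner. A rough sketch of that decision follows. The runner names come from the notes, but detecting the build type by the presence of `default.nix` or `Dockerfile`, and preferring Nix when both exist, are assumptions; the function names are hypothetical.

```python
# Runner names as stated in the notes: k8s for Dockerfile, nix-container-builder for Nix.
RUNNER_BY_BUILD_TYPE = {
    "dockerfile": "k8s",
    "nix": "nix-container-builder",
}


def classify(filenames):
    """Guess the build type from the files in a container's directory."""
    names = set(filenames)
    if "default.nix" in names:  # assumed precedence when both files exist
        return "nix"
    if "Dockerfile" in names:
        return "dockerfile"
    raise ValueError("no recognized build definition")


def route(filenames):
    """Return the runner that should build a container with these files."""
    return RUNNER_BY_BUILD_TYPE[classify(filenames)]


print(route(["Dockerfile", "README.md"]))
print(route(["default.nix"]))
```

Raising on an unrecognized layout, rather than defaulting to one runner, mirrors the notes' preference for explicit declarations over implicit fallbacks.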
Documentation
- Review and fix ArgoCD config tutorial: correct sync policy example, fix typo, add missing cross-references and frontmatter.
- Review and update 12 reference docs: fix stale image references to point at kustomization manifests instead of hardcoded tags, correct Prometheus scrape target, expand external-secrets stub, add cross-references between backup/disaster-recovery docs, and remove misleading `.ts.net` URLs from Quick Reference tables.
Documentation
Download docs-v1.15.0.tar.gz and configure the quartz container with:
DOCS_RELEASE_URL=https://forge.eblu.me/eblume/blumeops/releases/download/v1.15.0/docs-v1.15.0.tar.gz
Downloads
- Source code (ZIP): 0 downloads
- Source code (TAR.GZ): 0 downloads
- docs-v1.15.0.tar.gz: 2 downloads · 2026-03-24 19:51:16 -07:00 · 1.6 MiB