Adds the heph-pwa redirect URIs to the Authentik `heph` OAuth2 provider so the new browser **Login with Authentik** flow (Authorization Code + PKCE, hephaestus PR #9) can redirect back and exchange the code:
- `https://heph.ops.eblu.me/` (the PWA origin)
- `http://localhost:8787/` (local dev: `hephd --web-root`)
Authentik also keys token-endpoint CORS off these origins, so they're required for the browser token exchange. Additive (the provider was `redirect_uris: []`); harmless until the PWA feature deploys.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #370
Makes indri the canonical **heph** hub for the hub-and-spoke task/context system, deployed as a self-updating LaunchAgent managed by Ansible. Other devices (gilbert) attach as offline-capable spokes.
## What's here
- **`ansible/roles/heph`** (tag `heph`) — bootstrap `cargo install hephd` (only if absent; `--self-update` keeps it current after), version-pinned `heph-pwa` checkout served via `--web-root`, launchagent `mcquack.eblume.heph`:
```
hephd --mode server --http-addr 0.0.0.0:8787 --db … --web-root …
--oidc-issuer …/o/heph/ --oidc-audience heph
--self-update --self-update-interval-secs 600
```
`~/.cargo/bin` is on the agent `PATH` so self-update's `cargo install` works.
- **Caddy** — `heph.ops.eblu.me → localhost:8787` (TLS for the PWA secure context).
- **Authentik** — new `heph` **public device-code** OIDC app + `default-device-code-flow` bound to the default brand's `flow_device_code` (verified live: brand `authentik-default`, field currently unset → additive).
- **Docs** — `services/hephaestus.md` (Path-A seeding runbook + spoke caveat), `indri.md`, changelog fragment.
## Three features requested
- **Autoupdate** — 10-min interval (`--self-update-interval-secs 600`).
- **PWA** — `--web-root` (confirmed shipped in v1.2.0).
- **Spoke** — gilbert reconfig documented (post-merge step).
## Deploy plan (not done yet — awaiting review)
1. Seed from gilbert (Path A): `heph daemon stop` → copy `heph.db` → `DELETE FROM meta WHERE key='origin'`.
2. Sync Authentik `apps`/blueprint; verify blueprint status via API (not just logs).
3. `provision-indri --tags heph,caddy` from this branch.
4. Point gilbert at the hub + `heph auth login`.
## Known follow-ups (heph-side, tracked in the Hephaestus project)
- `heph daemon` can't bake hub/spoke config or pass `--self-update-interval-secs` → worked around by the ansible plist.
- Path-A seeding lacks a clean `hephd --owner-id`/seed command → manual `meta.origin` reset for now.
- Self-update moves hephd ahead of the ansible-pinned PWA shell over time (drift; tolerated by the SW cache, revisit on next release).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #369
indri -> v2.2.0-13895bb (arm64), ringtail -> v2.2.0-13895bb-nix (amd64).
Both deployed images now trace to main commit 13895bb instead of earlier
branch builds.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Follow-up to #367. That PR localized external-secrets but the Dagger build (on indri's Apple Silicon runner) only produces an **arm64** image — and external-secrets also runs on **ringtail (amd64)** via the same shared manifest. This completes the localization so both clusters run the local binary on their native arch.
## Approach (matches the kube-state-metrics dual-build pattern)
- **`containers/external-secrets/default.nix`** (new) — builds the **amd64** image on ringtail's nix-container-builder. `buildGoModule` with Go 1.26 (v2.2.0 requires ≥1.26.1; nixpkgs default is 1.25.x) and `-tags all_providers`, faithful to upstream. Same v2.2.0 source from the forge mirror.
- **`argocd/manifests/external-secrets-ringtail/`** (new) — thin kustomize overlay that reuses the shared indri manifest as a base and overrides **only** the image to the `-nix` (amd64) tag. No manifest duplication.
- **`argocd/apps/external-secrets-ringtail.yaml`** — repointed at the new overlay.
Result: indri → `v2.2.0-…` (arm64, Dagger), ringtail → `v2.2.0-…-nix` (amd64, nix).
## Build
Run #581 built both arches at the branch commit. Verified the nix image is `linux/amd64`, entrypoint = the binary, user 65534.
## Deployed from branch & verified on ringtail (k3s, amd64)
- All 3 pods rolled to the nix amd64 image, `1/1 Running` (no exec-format error → arch correct)
- Controller logs clean
- **Live secret fetch proven:** force-synced `homepage/homepage-grafana` → `refreshTime` advanced, `Ready=True`
- **All 20** ringtail ExternalSecrets remain `SecretSynced=True`
## Post-merge
The `external-secrets-ringtail` app is temporarily pointed at this branch + overlay path (apps app left on `main`, manual-sync, untouched). After merge:
```
argocd app sync apps # picks up the new Application path on main
argocd app set external-secrets-ringtail --revision main && argocd app sync external-secrets-ringtail
```
I'll also rebuild off `main` so both clusters land on stable main-sha tags (as done for indri in #367).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #368
Repoint to the main-branch-built image so the deployed tag traces to a main
commit rather than the merged feature branch. Same v2.2.0 source, stable
provenance.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Knocks out the weekly "pick one non-local container and make it local" task by moving **external-secrets** off `ghcr.io` onto a locally-built image, under our own supply-chain control. Doubles as its overdue service review.
## What changed
- **`containers/external-secrets/container.py`** (new) — native Dagger build (the Dockerfile→container.py migration pattern). Clones the forge mirror at `v2.2.0` and builds the single `all_providers` static Go binary, faithful to upstream's `make build` (CGO off, no version ldflags upstream). ENTRYPOINT is `/bin/external-secrets` so the controller/webhook/cert-controller Deployments select their role via `args:` exactly as before.
- **`argocd/manifests/external-secrets/kustomization.yaml`** — image swapped to `registry.ops.eblu.me/blumeops/external-secrets:v2.2.0-2985007`. **Like-for-like (v2.2.0)**, not an upgrade.
- **`service-versions.yaml`** — marked reviewed (2026-06-04), noted the local build.
## Build
Built on the indri forge runner (run #579, ~4 min) → pushed to Zot. Image config verified: `Entrypoint=/bin/external-secrets`, `User=65534`, version label `v2.2.0`.
## Deployed from branch & verified
- All 3 pods (controller / webhook / cert-controller) rolled to the local image, `1/1 Running`
- Controller + webhook logs clean (no errors; webhook serving TLS)
- **End-to-end secret fetch proven:** force-synced `monitoring/grafana-admin` → `refreshTime` advanced to now, `Ready=True`
- All 10 ExternalSecrets cluster-wide remain `SecretSynced=True` — no collateral damage
- App `Healthy`
## Post-merge
`external-secrets` currently points at this branch (so `apps` reads OutOfSync — expected). After merge:
```
argocd app set external-secrets --revision main && argocd app sync external-secrets
```
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #367
Knocks out the two daily recurring review tasks (doc review + service review) in one PR.
## Doc review (4 never-reviewed reference cards, `last-reviewed: 2026-06-04`)
- **cluster.md** — Kubernetes version v1.34.0 → **v1.35.0**; refreshed the stale ringtail workload list and noted the in-progress minikube→k3s migration (points to `[[ringtail]]` as the canonical list).
- **ntfy.md / tempo.md / alloy.md** — corrected image references: these are now **locally-built `registry.ops.eblu.me/blumeops/*` nix containers** (ntfy v2.19.2, tempo v2.10.3, alloy-k8s v1.16.0), not upstream Docker Hub. Fly.io alloy binary bumped to v1.16.1.
## Service review
- **nvidia-device-plugin** (ringtail GPU): v0.19.0 → **v0.19.2**. Upstream patch releases — CDI/Tegra fixes + dependency bumps, no breaking changes for our manifest-based CDI + RuntimeClass setup (the service-account change in the notes is helm-only).
## Not in this PR (need container rebuilds, deferred)
The other stale services are locally-built nix images, so upgrading them is a forge-runner rebuild rather than a clean tag bump — left untouched (not date-bumped, so they resurface): **prometheus** (v3.10.0→v3.12.0), **loki** (3.6.7→3.7.2), **kube-state-metrics**, **homepage**. Happy to do these as a follow-up rebuild PR.
## Deploy / verify
Not yet deployed — `nvidia-device-plugin` still points at `main`. After review:
```
argocd app set nvidia-device-plugin --revision reviews-jun4 && argocd app sync nvidia-device-plugin
# after merge:
argocd app set nvidia-device-plugin --revision main && argocd app sync nvidia-device-plugin
```
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #366
The public forge.eblu.me now black-holes /mirrors/ at the Fly edge
(AI-scraper mitigation), so the in-cluster ArgoCD repo-server got a 403
fetching the upstream operator manifest — leaving tailscale-operator and
tailscale-operator-ringtail in Unknown sync. Use forge.ops.eblu.me.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Dagger build_docs pipeline cloned Quartz from the default branch
unpinned. Quartz v5.0.0 restructured its config layout (.quartz/plugins,
../quartz imports), breaking the docs build against our existing
quartz.config.ts / quartz.layout.ts. Pin the clone to the last v4
release (v4.5.2) to restore known-good behavior.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the Todoist-backed blumeops-tasks mise task with
`heph list --project Blumeops --json` (hephaestus, now at v1 prototype
on gilbert). Update task-discovery, rotation-reminder, and zk
references across docs; note the zk zettelkasten is migrating into
heph docs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mealie, Paperless, Immich, TeslaMate are now autodiscovered from their ringtail
Ingress gethomepage.dev annotations; the static services.yaml entries (from when
they were on minikube, which homepage-on-ringtail can't autodiscover) were
duplicating them.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- argocd: grant local break-glass admin the admin role (g, admin, role:admin);
previously only the Authentik admins group had access, locking out admin
once its token expired (policy.default is unset).
- alloy-k8s: repoint the teslamate blackbox probe from the deleted minikube
service to https://tesla.ops.eblu.me/ (Caddy over Tailscale), like immich.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final step of the wave-1 indri-k8s migration. paperless, teslamate, mealie run on ringtail with data migrated, verified, and backed up (local + BorgBase offsite via PR #364).
- Remove minikube paperless/teslamate/mealie manifest dirs + ArgoCD app defs (prunes the parked Deployments/Services + redundant minikube mealie/paperless PVCs)
- Drop paperless/teslamate roles + ExternalSecrets from the minikube blumeops-pg cluster
- miniflux + authentik stay on minikube (later waves)
Finalization after merge: sync apps + databases to prune, then DROP DATABASE paperless/teslamate on indri's blumeops-pg (fresh safety dump taken first).
Reviewed-on: #365
Prereq for the wave-1 decommission. The cutover moved paperless+teslamate (postgres) and mealie (SQLite) to ringtail, but borgmatic and the Grafana TeslaMate datasource still pointed at the minikube copies — the migrated live data was unbacked since cutover, and dropping the minikube DBs would break the TeslaMate dashboards.
- Tailscale Service `blumeops-pg-ringtail` + Caddy L4 route `pg.ops.eblu.me:5434`
- borgmatic: teslamate + paperless postgres → :5434; mealie SQLite → ssh:eblume@ringtail
- Grafana TeslaMate datasource → pg.ops.eblu.me:5434
Deploy: sync databases-ringtail (tailscale svc) + grafana from branch; provision-indri --tags caddy,borgmatic; verify a backup run + dashboards. Unblocks the decommission PR.
Reviewed-on: #364
Post-merge rebuild of paperless/mealie/teslamate Nix images at the main
merge commit, replacing the feature-branch -nix tags. Image content is
identical; only the commit-sha suffix changes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
GNU lives in the overhead — the X-Clacks-Overhead header — never on the
visible page. Keep the header, drop the footer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A $29.60 Fly bill traced to ~1.25 TB/30d egress on forge.eblu.me (99.95% of
all proxy egress), ~71% of it AI scrapers (Meta meta-externalagent, OpenAI
GPTBot, Amazonbot, Bytespider) crawling the public mirror repos' infinite
git-history URL space and timing out Forgejo. robots.txt already disallowed
/mirrors/ but those agents ignore it, so enforce at the edge: return 403 (^~
to beat the regex asset locations), served as a roll-of-dishonour page with an
X-Naughty-Scrapers header. Mirrors stay reachable on the tailnet via
forge.ops.eblu.me. Tier 2 (UA denylist + Anubis) and the Cloudflare rejection
are documented in docs/explanation/ai-scraper-mitigation.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Image tags from PR #362 (v8.1.7-02859c5{,-nix}) referenced a branch
SHA that no longer exists on main after squash-merge. Rebuilt both
the dagger arm64 and nix amd64 variants from the squashed commit
(ecded30) and updated paperless + immich-ringtail to the new tags.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary
Weekly "make one non-local container local" pickup: immich-ringtail still pulled `docker.io/valkey/valkey:8.1.6` because the existing `containers/valkey/container.py` build was arm64-only.
- Adds `containers/valkey/default.nix` — nix-built amd64 valkey image, packaged by the ringtail nix-container-builder runner using `pkgs.dockerTools.buildLayeredImage`. Mirrors the existing `containers/authentik-redis/default.nix` pattern.
- `containers/valkey/container.py` keeps building the Alpine arm64 image for paperless on indri. Bumped both builds to upstream valkey 8.1.7 (Alpine 3.22 now ships `8.1.7-r0`; nixpkgs has 8.1.7).
- Splits `VERSION` (upstream app) from `ALPINE_PIN` (apk pin) in `container.py` so both build files can declare the same upstream version and pass `container-version-check`.
- Updates `service-versions.yaml`: current-version 8.1.7, refreshed last-reviewed, upstream-source now points at the canonical valkey-io releases page.
- Switches kustomizations:
- `immich-ringtail/kustomization.yaml`: `docker.io/valkey/valkey:8.1.6` → `registry.ops.eblu.me/blumeops/valkey:v8.1.7-02859c5-nix`, comment updated.
- `paperless/kustomization.yaml`: `v8.1.6-r0-fabca04` → `v8.1.7-02859c5`.
## Build
build-container run #563 — both jobs succeeded after a transient runner crash on the first dispatch (#562 build-nix), which surfaced two separate bugs that landed in a separate C0 on main:
- `runner-logs` silently returned 0 with no output when the log file didn't exist on indri
- `ssh indri` swallowing remote exit codes (fish login shell), which the wrapper now works around via a stdout marker
## Test plan
- [ ] `argocd app set immich-ringtail --revision valkey-nix && argocd app sync immich-ringtail`
- [ ] `argocd app set paperless --revision valkey-nix && argocd app sync paperless`
- [ ] Both valkey pods come Ready and start serving on :6379
- [ ] Immich app + paperless can read/write their respective cache
- [ ] After merge: rebuild from squashed main commit + update kustomization tags (squash-tag follow-up)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #362
`mise run runner-logs <run> -j <n>` previously silently succeeded with
no output when forgejo had no log for the task. Two layered causes:
1. zstdcat exits 0 even when the file is missing (writes "can't stat
… -- ignored" to stderr).
2. ssh to indri runs fish, which silently drops the remote exit code so
the subprocess returncode is always 0.
Probe `test -f` over SSH and parse a stdout marker (EXISTS / MISSING) to
detect the missing-log case, then report it explicitly with the indri
path and a hint about action_task.log_in_storage = 0 so the operator
knows where to look next.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Image was previously tagged with the unpoller-v3 branch SHA (1b27242),
which doesn't exist in main's history after squash-merge. Rebuilt from
the squashed commit so the tag references a reachable commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary
- Service Review pickup: unpoller (last reviewed 73 days ago).
- Upgrades unpoller from v2.34.0 to v3.2.0 (major version bump).
- Migrates the container build from a Dockerfile to a native Dagger pipeline (`containers/unpoller/container.py`) following the navidrome / miniflux pattern.
- Refreshes `service-versions.yaml` (last-reviewed, current-version).
## Breaking changes (upstream)
- **v3.0.0** — UniFi network API shifts (later 10.x). Some metric / event / log names and labels may have changed. Worth a follow-up sweep of the unpoller Grafana dashboard for missing series.
- **v3.2.0** — defaults to a 60s background poll feeding cached Prometheus scrapes (was on-demand poll per scrape). To restore previous behavior, set `interval = 0` in `up.conf`. Leaving the new default in this PR — every-15s scrapes will simply serve from cache, which is fine for our use.
## Build
- Image: `registry.ops.eblu.me/blumeops/unpoller:v3.2.0-1b27242`
- Built by build-container workflow run #559 from this branch.
## Test plan
- [ ] `argocd app set unpoller --revision unpoller-v3 && argocd app sync unpoller`
- [ ] Pod comes Ready
- [ ] Verify metrics exported (`Site/Client/UAP/USG/USW` counts in logs, `unpoller_*` series in Prometheus)
- [ ] Spot-check unpoller Grafana dashboard for missing series after the v3 API shift
- [ ] After merge: `argocd app set unpoller --revision main && argocd app sync unpoller`
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #361
Bluegreen kept timing out — the new green machine couldn't reach
"started" within Fly's 5-minute deploy budget. The cold-start sequence
(tailscaled → tailscale up → wait-for-MagicDNS → nginx startup) eats
most of that, leaving no headroom for healthcheck propagation.
For a single-machine proxy, bluegreen offers little benefit anyway:
no warm second instance, so trading 5-10s of downtime for predictable
completion is the right call.
Switch ringtail.yml from forge.eblu.me (Fly proxy, WAN) to
forge.ops.eblu.me (Caddy on indri, tailnet). Ringtail is always
on the tailnet — the WAN round-trip was overhead and made
provision-ringtail fail any time Fly was slow or down.
The OMEN 27i IPS pumps brightness when its refresh swings into the low
VRR range during low-framerate content (game cutscenes), producing a
~20Hz flicker that compounds over a session until a reboot. GPU health
is clean (no Xid/ECC/thermal); pinning fixed 165Hz eliminates it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wine/Proton game segfaults (e.g. Diablo IV) produced multi-GB cores that
systemd-coredump spent minutes compressing to disk, pinning the CPU and
freezing the desktop. Cap ProcessSizeMax/ExternalSizeMax at 1G (oversized
cores logged but skipped) and MaxUse at 2G to bound the store.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The verify step pointed to the main repo page, but the "Synchronize now"
button is in the Mirror settings section of the settings page.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The pbpaste | op item edit ... "field[password]=-" stdin syntax is
rejected by op 2.34 as "invalid JSON" — recent op versions treat
piped input as a full JSON template, not a single field value.
Procedure now uses an inline assignment via a local fish variable.
1Password's desktop app names exports as
1PasswordExport-<uuid>-<timestamp>.1pux automatically — you can't
choose the name. Procedure now points the task at that glob.
Added vault split (blumeops vs Personal), noted onepassword-connect
runs on both indri and ringtail, and lifted op CLI guidance from
agent memory into the card. Bumped last-reviewed.
## Summary
Removes the compensating-controls (CC) framework. Prowler and Kingfisher continue to run weekly and produce reports; the Prowler mutelist YAML files stay in place but no longer carry \`CC: <id>\` prefixes — each entry now just keeps a free-form \`Description\` of why it's muted.
The CC review cadence proved to be more process overhead than this single-operator homelab needed.
## What changed
**Deleted**
- \`compensating-controls.yaml\` — the CC registry
- \`mise-tasks/review-compensating-controls\` — the staleness-review task
- \`docs/how-to/operations/review-compensating-controls.md\`
- \`docs/how-to/operations/record-review-evidence.md\` (was aspirational)
- \`docs/explanation/compliance-mute-categories.md\` (proposed-future CC/NA/RA work)
- 5 orphan \`+review-cc-*\` / \`+compliance-mute-categories\` changelog fragments
**Modified**
- 6 mutelist YAML files: stripped \`CC: <id>.\` prefix from every \`Description\` / \`statement\` field, kept the free-form text
- \`mise-tasks/review-compliance-reports\`: removed CC mentions from docstrings, panel text, and the node-verification table title. Node-verification logic itself is unchanged.
- \`docs/reference/operations/security.md\`: removed the "Compensating controls" section
- \`docs/how-to/operations/read-compliance-reports.md\`: rewrote step 3 of "Acting on findings" to point at the mutelist YAML directly
- \`docs/changelog.d/prowler-iac-mutelist.infra.md\`: rewrote to drop the "two new compensating controls" framing
## What did not change
- All Prowler manifests (cronjobs, RBAC, PVs, kustomization) — scans still run on the same schedule
- The Kingfisher deployment
- The trivy-shim in the Prowler container — that's about Trivy ignorefile plumbing, independent of the CC concept
- The mutelist entries themselves — each \`Resources\` list is unchanged; only the prose of \`Description\` was edited
- \`CHANGELOG.md\` — historical releases are left as-is
## Test plan
- [ ] Wait for human review before deploying — once merged, re-point ArgoCD: \`argocd app set prowler --revision main && argocd app sync prowler\` (no manifest changes besides the ConfigMap, so impact is limited to muted-finding descriptions in next week's report)
- [ ] Confirm next weekly Prowler K8s CIS run (Sunday 3am) still completes and produces a report on sifaka
- [ ] Confirm next weekly Prowler IaC run still honors \`trivyignore.yaml\` (the trivy shim is untouched but the ignorefile content was rewritten)
- [ ] \`mise run review-compliance-reports\` — verify node-verification block still runs and prints the renamed table title
Reviewed-on: #359
Grafana uses an RWO PVC for SQLite + Bleve search index. RollingUpdate
spawns the new pod before terminating the old one, so the new pod
crashloops on the index lock until rollout timeout. Recreate terminates
the old pod first, letting the new pod acquire the lock cleanly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Immich migrated to ringtail's k3s cluster but the probe still targeted
the in-cluster service DNS on indri's minikube, firing ServiceProbeFailure
indefinitely. Moved the target into alloy-ringtail's config so the probe
runs in the cluster where immich actually lives.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Image v1.1.3-3645098-nix was built directly on ringtail and pushed via
skopeo, bypassing the Forgejo runner: indri was severely overloaded
(load avg 24.92, minikube VM at 344% CPU) and the workflow-dispatch
endpoint timed out. The image content is identical to what the runner
would have produced — same default.nix at commit 3645098 (on main),
same NIX_PATH (current nixpkgs flake), same skopeo invocation. Tag
short-sha matches the commit that defines the recipe so we aren't
pinning to a ghost.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wheel/sdist + FOD hashes probed on ringtail. Full nix-build verified
end-to-end before commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
UE5 writes Saved/running.dat as a "session in progress" marker. If
the previous session exited uncleanly (SIGKILL, crash), it lingers,
and SN2 pops up an invisible 0×0 Error dialog at next launch that
the GameThread blocks on forever — visible only as a black screen
with a spinning loader. Wrap the Steam command to clear the marker
files before each launch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #358 was squash-merged so the branch commit b8c7783 baked into the
prior image tag isn't reachable from main's history. Rebuild from main
HEAD (a33fa47) and retag. Image content is byte-identical (FOD is
content-addressed, inputs unchanged); only the SHA in the tag changes
so future provenance tracing stays on main.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary
Deploys `adelaide-baby-shower-app` **v1.1.2** to ringtail k3s.
- Bumps `containers/shower/default.nix` `version` to 1.1.2.
- Refreshes sdist + wheel `fetchurl` hashes against the forge PyPI artifacts.
- Re-probed FOD `outputHash` on the nix-container-builder runner (ringtail) and pinned the new closure hash.
- Bumps kustomize `newTag` to `v1.1.2-b8c7783-nix` (built from this branch's tip).
- Bumps `service-versions.yaml` entry for shower to `1.1.2` / `last-reviewed: 2026-05-15`.
## Build provenance
Built by Forgejo Actions run #553 on `nix-container-builder` (ringtail) at commit `b8c7783`. After merge a C0 follow-on will rebuild from main and retag so future provenance points at main history.
## Test plan
- [ ] `argocd app set shower --revision shower-v1.1.2 && argocd app sync shower` deploys cleanly
- [ ] Pod migrates the SQLite PV and serves at `shower.ops.eblu.me` / `shower.eblu.me`
- [ ] No new errors in pod logs after `collectstatic` + gunicorn boot
Reviewed-on: #358
Lets Subnautica 2 (and any other game) opt into the GE-Proton
build via Steam's per-game compatibility tool override, as a
workaround for the Proton Experimental + DXVK D3D12 Mercuna hang.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary
Nightly borgmatic backups have been failing for 2 days. Root cause: the
shower SQLite dump `before_backup` hook (added in PR #349) referenced
`kubectl --context=k3s-ringtail`, but indri's kubeconfig deliberately
doesn't carry the ringtail credentials. The hook's failure aborted the
entire run, taking out *both* the local sifaka repo and the BorgBase
offsite. Verified the last good archive was `indri-2026-05-11T02:00`.
## Approach
ssh into ringtail and run `k3s kubectl` there — no indri-side
kubeconfig needed. `/etc/rancher/k3s/k3s.yaml` is mode 644 so no sudo
required, and the existing ssh access from indri to ringtail works.
Inline-shell quoting got hairy fast (fish on ringtail rejected `POD=...`
bash syntax; the nix shower image lacks `tar` so `kubectl cp` fails).
Pulled the dump logic into `~/bin/borgmatic-k8s-sqlite-dump`, deployed
by the ansible role. Each dump entry now declares a `target`:
- `local:<context>` — local kubectl with explicit context (mealie)
- `ssh:<user@host>` — ssh + `k3s kubectl` on the cluster host (shower)
Bytes come back via `kubectl exec ... -- cat` instead of `kubectl cp`
since `cp` needs `tar` in the pod (nix-built containers don't bundle it).
## Test plan
- [x] `mise run provision-indri -- --tags borgmatic --check --diff` shows expected diff
- [x] Apply, helper script deployed at `~/bin/borgmatic-k8s-sqlite-dump`
- [x] Helper invoked directly with `ssh:eblume@ringtail` produces a valid 288 KB SQLite file
- [x] Full `borgmatic create` completes without errors — both mealie.db (1.7 MB) and shower.db (288 KB) appear in `~/.local/share/borgmatic/k8s-dumps/`, archive `indri-2026-05-13T17:31:02` written to sifaka borg repo
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: #357