Compare commits

52 commits

Author SHA1 Message Date
bc34b601be Merge pull request 'heph Authentik: grant offline_access scope (fixes spoke sync refresh-token 400)' (#371) from heph-offline-access into main 2026-06-06 18:29:47 -07:00
50a36ff93a heph Authentik: grant offline_access scope (fixes spoke sync refresh-token 400)
The heph CLI requests scope "openid offline_access", but the Authentik
heph OAuth2 provider only mapped openid/email/profile. Without the
offline_access mapping the issued refresh token is bound to the login
session rather than the 30-day refresh-token window; once the session
lapses, hephd's refresh_token grant returns 400 Bad Request and spoke
sync silently degrades (heph sync --status -> auth_failure: true).

Add the built-in offline_access scope mapping to the provider's
property_mappings and document the requirement in the service reference.
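
A minimal sketch of the failure mode described above, assuming a standard OAuth2 token endpoint: it checks whether a granted scope string includes `offline_access`, then builds the form body of a `refresh_token` grant. The function names, the `heph` client id default, and the sample token are illustrative assumptions; hephd's actual request code is not part of this commit.

```python
from urllib.parse import urlencode


def has_offline_access(granted_scope: str) -> bool:
    """Return True if the space-delimited scope string includes offline_access."""
    return "offline_access" in granted_scope.split()


def refresh_grant_body(refresh_token: str, client_id: str = "heph") -> str:
    """Form-encode a standard OAuth2 refresh_token grant (RFC 6749, section 6).

    The client_id default is an assumption for illustration only.
    """
    return urlencode(
        {
            "grant_type": "refresh_token",
            "refresh_token": refresh_token,
            "client_id": client_id,
        }
    )
```

If the token response's scope lacks `offline_access`, a client could warn at login time instead of waiting for the refresh grant to start returning 400.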

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 18:07:13 -07:00
cf63fcb5b5 C0: track heph in service-versions (self-updating; note drift task)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 08:22:46 -07:00
3abe80523a C0: bump indri heph hub to v1.2.1 (PWA Authentik login + /config)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:40:51 -07:00
6576880b0e heph Authentik: register heph-pwa redirect URIs (PKCE login) (#370)
Adds the heph-pwa redirect URIs to the Authentik `heph` OAuth2 provider so the new browser **Login with Authentik** flow (Authorization Code + PKCE, hephaestus PR #9) can redirect back and exchange the code:

- `https://heph.ops.eblu.me/` (the PWA origin)
- `http://localhost:8787/` (local dev: `hephd --web-root`)

Authentik also keys token-endpoint CORS off these origins, so they're required for the browser token exchange. Additive (the provider was `redirect_uris: []`); harmless until the PWA feature deploys.
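
For readers unfamiliar with PKCE, the sketch below shows how a client can derive the `code_challenge` it sends with the authorization request from the `code_verifier` it later sends with the token exchange. It assumes the `S256` method from RFC 7636; whether heph-pwa uses exactly this method and verifier length is not stated in this PR, and the function names are hypothetical.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Generate a random URL-safe code verifier (43 characters from 32 bytes)."""
    return secrets.token_urlsafe(32)


def code_challenge_s256(verifier: str) -> str:
    """Derive the S256 challenge: BASE64URL(SHA256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The verifier stays in the browser until the code is exchanged, which is why the token endpoint has to accept cross-origin requests from the registered redirect origins.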

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #370
2026-06-05 07:30:31 -07:00
a2f1e06224 Add hephaestus sync hub to indri (launchagent, PWA, device-code OIDC) (#369)
Makes indri the canonical **heph** hub for the hub-and-spoke task/context system, deployed as a self-updating LaunchAgent managed by Ansible. Other devices (gilbert) attach as offline-capable spokes.

## What's here
- **`ansible/roles/heph`** (tag `heph`) — bootstrap `cargo install hephd` (only if absent; `--self-update` keeps it current after), version-pinned `heph-pwa` checkout served via `--web-root`, launchagent `mcquack.eblume.heph`:
  ```
  hephd --mode server --http-addr 0.0.0.0:8787 --db … --web-root …
        --oidc-issuer …/o/heph/ --oidc-audience heph
        --self-update --self-update-interval-secs 600
  ```
  `~/.cargo/bin` is on the agent `PATH` so self-update's `cargo install` works.
- **Caddy** — `heph.ops.eblu.me → localhost:8787` (TLS for the PWA secure context).
- **Authentik** — new `heph` **public device-code** OIDC app + `default-device-code-flow` bound to the default brand's `flow_device_code` (verified live: brand `authentik-default`, field currently unset → additive).
- **Docs** — `services/hephaestus.md` (Path-A seeding runbook + spoke caveat), `indri.md`, changelog fragment.

## Three features requested
- **Autoupdate** — 10-min interval (`--self-update-interval-secs 600`).
- **PWA** — `--web-root` (confirmed shipped in v1.2.0).
- **Spoke** — gilbert reconfig documented (post-merge step).

## Deploy plan (not done yet — awaiting review)
1. Seed from gilbert (Path A): `heph daemon stop` → copy `heph.db` → `DELETE FROM meta WHERE key='origin'`.
2. Sync Authentik `apps`/blueprint; verify blueprint status via API (not just logs).
3. `provision-indri --tags heph,caddy` from this branch.
4. Point gilbert at the hub + `heph auth login`.
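
The `meta.origin` reset in step 1 could be scripted roughly as below. This is a sketch only: the two-column `meta` table and the sample rows are assumptions made for the demo, since the real `heph.db` schema is not shown here, and the daemon should be stopped before touching a real copy.

```python
import sqlite3


def reset_origin(conn: sqlite3.Connection) -> int:
    """Delete the 'origin' row from meta and return how many rows were removed."""
    cur = conn.execute("DELETE FROM meta WHERE key = 'origin'")
    conn.commit()
    return cur.rowcount


# Demo against an in-memory database with a hypothetical two-column meta table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meta (key TEXT PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO meta (key, value) VALUES (?, ?)",
    [("origin", "gilbert"), ("schema_version", "1")],
)
removed = reset_origin(conn)
remaining = [row[0] for row in conn.execute("SELECT key FROM meta ORDER BY key")]
```

Returning the row count makes it easy to confirm the copied database actually carried an origin marker before the hub starts on it.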

## Known follow-ups (heph-side, tracked in the Hephaestus project)
- `heph daemon` can't bake hub/spoke config or pass `--self-update-interval-secs` → worked around by the ansible plist.
- Path-A seeding lacks a clean `hephd --owner-id`/seed command → manual `meta.origin` reset for now.
- Self-update moves hephd ahead of the ansible-pinned PWA shell over time (drift; tolerated by the SW cache, revisit on next release).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #369
2026-06-05 06:46:58 -07:00
f6c926f1f5 C0: rebuild external-secrets off main, repoint both clusters to stable tags
indri -> v2.2.0-13895bb (arm64), ringtail -> v2.2.0-13895bb-nix (amd64).
Both deployed images now trace to main commit 13895bb instead of earlier
branch builds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 16:19:20 -07:00
13895bb04a Localize external-secrets on ringtail (amd64 nix build) (#368)
Follow-up to #367. That PR localized external-secrets but the Dagger build (on indri's Apple Silicon runner) only produces an **arm64** image — and external-secrets also runs on **ringtail (amd64)** via the same shared manifest. This completes the localization so both clusters run the local binary on their native arch.

## Approach (matches the kube-state-metrics dual-build pattern)
- **`containers/external-secrets/default.nix`** (new) — builds the **amd64** image on ringtail's nix-container-builder. `buildGoModule` with Go 1.26 (v2.2.0 requires ≥1.26.1; nixpkgs default is 1.25.x) and `-tags all_providers`, faithful to upstream. Same v2.2.0 source from the forge mirror.
- **`argocd/manifests/external-secrets-ringtail/`** (new) — thin kustomize overlay that reuses the shared indri manifest as a base and overrides **only** the image to the `-nix` (amd64) tag. No manifest duplication.
- **`argocd/apps/external-secrets-ringtail.yaml`** — repointed at the new overlay.

Result: indri → `v2.2.0-…` (arm64, Dagger), ringtail → `v2.2.0-…-nix` (amd64, nix).

## Build
Run #581 built both arches at the branch commit. Verified the nix image is `linux/amd64`, entrypoint = the binary, user 65534.

## Deployed from branch & verified on ringtail (k3s, amd64)
- All 3 pods rolled to the nix amd64 image, `1/1 Running` (no exec-format error → arch correct)
- Controller logs clean
- **Live secret fetch proven:** force-synced `homepage/homepage-grafana` → `refreshTime` advanced, `Ready=True`
- **All 20** ringtail ExternalSecrets remain `SecretSynced=True`

## Post-merge
The `external-secrets-ringtail` app is temporarily pointed at this branch + overlay path (apps app left on `main`, manual-sync, untouched). After merge:
```
argocd app sync apps                       # picks up the new Application path on main
argocd app set external-secrets-ringtail --revision main && argocd app sync external-secrets-ringtail
```
I'll also rebuild off `main` so both clusters land on stable main-sha tags (as done for indri in #367).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #368
2026-06-04 15:37:42 -07:00
30c82079b9 C0: rebuild external-secrets image off main (v2.2.0-0e70a1b)
Repoint to the main-branch-built image so the deployed tag traces to a main
commit rather than the merged feature branch. Same v2.2.0 source, stable
provenance.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 14:59:17 -07:00
0e70a1b524 Localize external-secrets container (native container.py build) (#367)
Knocks out the weekly "pick one non-local container and make it local" task by moving **external-secrets** off `ghcr.io` onto a locally-built image, under our own supply-chain control. Doubles as its overdue service review.

## What changed
- **`containers/external-secrets/container.py`** (new) — native Dagger build (the Dockerfile→container.py migration pattern). Clones the forge mirror at `v2.2.0` and builds the single `all_providers` static Go binary, faithful to upstream's `make build` (CGO off, no version ldflags upstream). ENTRYPOINT is `/bin/external-secrets` so the controller/webhook/cert-controller Deployments select their role via `args:` exactly as before.
- **`argocd/manifests/external-secrets/kustomization.yaml`** — image swapped to `registry.ops.eblu.me/blumeops/external-secrets:v2.2.0-2985007`. **Like-for-like (v2.2.0)**, not an upgrade.
- **`service-versions.yaml`** — marked reviewed (2026-06-04), noted the local build.

## Build
Built on the indri forge runner (run #579, ~4 min) → pushed to Zot. Image config verified: `Entrypoint=/bin/external-secrets`, `User=65534`, version label `v2.2.0`.

## Deployed from branch & verified
- All 3 pods (controller / webhook / cert-controller) rolled to the local image, `1/1 Running`
- Controller + webhook logs clean (no errors; webhook serving TLS)
- **End-to-end secret fetch proven:** force-synced `monitoring/grafana-admin` → `refreshTime` advanced to now, `Ready=True`
- All 10 ExternalSecrets cluster-wide remain `SecretSynced=True` — no collateral damage
- App `Healthy`

## Post-merge
`external-secrets` currently points at this branch (so `apps` reads OutOfSync — expected). After merge:
```
argocd app set external-secrets --revision main && argocd app sync external-secrets
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #367
2026-06-04 14:55:55 -07:00
bb55fa9566 Recurring review sweep: 4 doc cards + nvidia-device-plugin v0.19.2 (#366)
Knocks out the two daily recurring review tasks (doc review + service review) in one PR.

## Doc review (4 never-reviewed reference cards, `last-reviewed: 2026-06-04`)
- **cluster.md** — Kubernetes version v1.34.0 → **v1.35.0**; refreshed the stale ringtail workload list and noted the in-progress minikube→k3s migration (points to `[[ringtail]]` as the canonical list).
- **ntfy.md / tempo.md / alloy.md** — corrected image references: these are now **locally-built `registry.ops.eblu.me/blumeops/*` nix containers** (ntfy v2.19.2, tempo v2.10.3, alloy-k8s v1.16.0), not upstream Docker Hub. Fly.io alloy binary bumped to v1.16.1.

## Service review
- **nvidia-device-plugin** (ringtail GPU): v0.19.0 → **v0.19.2**. Upstream patch releases — CDI/Tegra fixes + dependency bumps, no breaking changes for our manifest-based CDI + RuntimeClass setup (the service-account change in the notes is helm-only).

## Not in this PR (need container rebuilds, deferred)
The other stale services are locally-built nix images, so upgrading them is a forge-runner rebuild rather than a clean tag bump — left untouched (not date-bumped, so they resurface): **prometheus** (v3.10.0→v3.12.0), **loki** (3.6.7→3.7.2), **kube-state-metrics**, **homepage**. Happy to do these as a follow-up rebuild PR.

## Deploy / verify
Not yet deployed — `nvidia-device-plugin` still points at `main`. After review:
```
argocd app set nvidia-device-plugin --revision reviews-jun4 && argocd app sync nvidia-device-plugin
# after merge:
argocd app set nvidia-device-plugin --revision main && argocd app sync nvidia-device-plugin
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #366
2026-06-04 13:37:02 -07:00
02ea1cc72a C0: point tailscale-operator base mirror fetch at tailnet forge
The public forge.eblu.me now black-holes /mirrors/ at the Fly edge
(AI-scraper mitigation), so the in-cluster ArgoCD repo-server got a 403
fetching the upstream operator manifest — leaving tailscale-operator and
tailscale-operator-ringtail in Unknown sync. Use forge.ops.eblu.me.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 12:40:21 -07:00
Forgejo Actions
8f72f04d5c Update docs release to v1.17.0
- Built changelog from towncrier fragments

[skip ci]
2026-06-03 21:52:22 -07:00
29e0f012cd C0: pin Quartz docs build to v4.5.2 (v5.0.0 broke build)
The Dagger build_docs pipeline cloned Quartz from the default branch
unpinned. Quartz v5.0.0 restructured its config layout (.quartz/plugins,
../quartz imports), breaking the docs build against our existing
quartz.config.ts / quartz.layout.ts. Pin the clone to the last v4
release (v4.5.2) to restore known-good behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 21:39:41 -07:00
2148714584 C0: retire Todoist blumeops-tasks; point task discovery at heph
Replace the Todoist-backed blumeops-tasks mise task with
`heph list --project Blumeops --json` (hephaestus, now at v1 prototype
on gilbert). Update task-discovery, rotation-reminder, and zk
references across docs; note the zk zettelkasten is migrating into
heph docs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 21:32:10 -07:00
308c8e3dad C0: drop duplicate Homepage static entries for ringtail-migrated services
Mealie, Paperless, Immich, TeslaMate are now autodiscovered from their ringtail
Ingress gethomepage.dev annotations; the static services.yaml entries (from when
they were on minikube, which homepage-on-ringtail can't autodiscover) were
duplicating them.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 15:31:59 -07:00
eaa899cfc6 C0: wave-1 decommission follow-ups (argocd admin RBAC, teslamate probe)
- argocd: grant local break-glass admin the admin role (g, admin, role:admin);
  previously only the Authentik admins group had access, locking out admin
  once its token expired (policy.default is unset).
- alloy-k8s: repoint the teslamate blackbox probe from the deleted minikube
  service to https://tesla.ops.eblu.me/ (Caddy over Tailscale), like immich.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:02:05 -07:00
46f0002178 Decommission wave-1 minikube services (paperless, teslamate, mealie) (#365)
Final step of the wave-1 indri-k8s migration. paperless, teslamate, mealie run on ringtail with data migrated, verified, and backed up (local + BorgBase offsite via PR #364).

- Remove minikube paperless/teslamate/mealie manifest dirs + ArgoCD app defs (prunes the parked Deployments/Services + redundant minikube mealie/paperless PVCs)
- Drop paperless/teslamate roles + ExternalSecrets from the minikube blumeops-pg cluster
- miniflux + authentik stay on minikube (later waves)

Finalization after merge: sync apps + databases to prune, then DROP DATABASE paperless/teslamate on indri's blumeops-pg (fresh safety dump taken first).

Reviewed-on: #365
2026-06-03 12:36:06 -07:00
44798a6429 C0: mealie-ringtail image rebuilt from main (e0057b4-nix)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 12:26:55 -07:00
e0057b46e4 Wire ringtail blumeops-pg into backups + Grafana (#364)
Prereq for the wave-1 decommission. The cutover moved paperless+teslamate (postgres) and mealie (SQLite) to ringtail, but borgmatic and the Grafana TeslaMate datasource still pointed at the minikube copies: the migrated live data had not been backed up since cutover, and dropping the minikube DBs would break the TeslaMate dashboards.

- Tailscale Service `blumeops-pg-ringtail` + Caddy L4 route `pg.ops.eblu.me:5434`
- borgmatic: teslamate + paperless postgres → :5434; mealie SQLite → ssh:eblume@ringtail
- Grafana TeslaMate datasource → pg.ops.eblu.me:5434

Deploy: sync databases-ringtail (tailscale svc) + grafana from branch; provision-indri --tags caddy,borgmatic; verify a backup run + dashboards. Unblocks the decommission PR.
Reviewed-on: #364
2026-06-03 12:25:30 -07:00
92b54e7ba9 C0: ringtail wave-1 images rebuilt from main (fcac8e5-nix tags)
Post-merge rebuild of paperless/mealie/teslamate Nix images at the main
merge commit, replacing the feature-branch -nix tags. Image content is
identical; only the commit-sha suffix changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 10:36:15 -07:00
fcac8e5a72 Wave 1 indri→ringtail migration: paperless, teslamate, mealie (#363)
Migrate paperless, teslamate, and mealie off the OOM-saturated minikube-indri node onto ringtail k3s, shedding ~1.1 GiB of resident load. Second chain in the indri-k8s decommission after immich.

**Containers ported to Nix (default.nix), build-verified on ringtail:**
- paperless → wraps nixpkgs paperless-ngx 2.20.15 (pinned unstable); runs as web/worker/beat/consumer
- mealie → wraps nixpkgs mealie 3.16.0 (forward 4-minor bump, breaking-change reviewed); single gunicorn, SQLite
- teslamate → from-scratch beamPackages mixRelease (not in nixpkgs); erlang_27+elixir_1_18, npm assets, ex_cldr locales pre-fetched

**Data:** cold downtime-tolerant cutover. paperless+teslamate postgres dump/restore from quiesced source into a new ringtail blumeops-pg CNPG cluster; mealie SQLite PVC copied. Source DBs untouched until verified (rollback = repoint).

**Also:** ringtail blumeops-pg cluster + ExternalSecrets scaffold; fixes pre-existing shower version-check drift.

Runbook: docs/how-to/ringtail/migrate-wave1-ringtail.md. Deploy-from-branch + cutover happens before merge; container images rebuilt from main after merge.
Reviewed-on: #363
2026-06-03 10:34:00 -07:00
40bd929820 C0: remove visible GNU Terry Pratchett from naughty.html body
GNU lives in the overhead — the X-Clacks-Overhead header — never on the
visible page. Keep the header, drop the footer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 20:55:05 -07:00
a36a18aaa6 C0: black-hole /mirrors/* at Fly edge + name-and-shame scrapers
A $29.60 Fly bill traced to ~1.25 TB/30d egress on forge.eblu.me (99.95% of
all proxy egress), ~71% of it AI scrapers (Meta meta-externalagent, OpenAI
GPTBot, Amazonbot, Bytespider) crawling the public mirror repos' infinite
git-history URL space and timing out Forgejo. robots.txt already disallowed
/mirrors/ but those agents ignore it, so enforce at the edge: return 403 (^~
to beat the regex asset locations), served as a roll-of-dishonour page with an
X-Naughty-Scrapers header. Mirrors stay reachable on the tailnet via
forge.ops.eblu.me. Tier 2 (UA denylist + Anubis) and the Cloudflare rejection
are documented in docs/explanation/ai-scraper-mitigation.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 20:52:20 -07:00
e0064de83d C0: update ringtail flake inputs (nixpkgs, disko)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 15:52:09 -07:00
f588638331 C0: rebuild valkey from squashed main commit
Image tags from PR #362 (v8.1.7-02859c5{,-nix}) referenced a branch
SHA that no longer exists on main after squash-merge. Rebuilt both
the dagger arm64 and nix amd64 variants from the squashed commit
(ecded30) and updated paperless + immich-ringtail to the new tags.
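
The tag convention behind this rebuild can be sketched as a small parser. The pattern below is inferred from tags that appear in this log (`v<version>-<7-char sha>` with an optional `-nix` suffix) rather than taken from the build tooling, and `needs_rebuild` with its set of reachable short SHAs is a hypothetical helper.

```python
import re
from typing import NamedTuple, Optional, Set

# Inferred from tags in this log, e.g. v8.1.7-02859c5-nix or v8.1.6-r0-fabca04.
TAG_RE = re.compile(
    r"^v(?P<version>[0-9][0-9A-Za-z.\-]*?)-(?P<sha>[0-9a-f]{7})(?:-(?P<variant>nix))?$"
)


class ImageTag(NamedTuple):
    version: str
    short_sha: str
    variant: Optional[str]


def parse_image_tag(tag: str) -> ImageTag:
    """Split an image tag into upstream version, commit short SHA, and variant."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognised image tag: {tag!r}")
    return ImageTag(match["version"], match["sha"], match["variant"])


def needs_rebuild(tag: str, reachable_short_shas: Set[str]) -> bool:
    """True when the tag's commit is not among the SHAs reachable from main."""
    return parse_image_tag(tag).short_sha not in reachable_short_shas
```

In practice the reachable set would come from git history on main; here it is passed in so the check stays independent of any repository.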

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 14:53:21 -07:00
ecded30073 Make valkey local on ringtail (nix amd64) + bump to 8.1.7 (#362)
## Summary

Weekly "make one non-local container local" pickup: immich-ringtail still pulled `docker.io/valkey/valkey:8.1.6` because the existing `containers/valkey/container.py` build was arm64-only.

- Adds `containers/valkey/default.nix` — nix-built amd64 valkey image, packaged by the ringtail nix-container-builder runner using `pkgs.dockerTools.buildLayeredImage`. Mirrors the existing `containers/authentik-redis/default.nix` pattern.
- `containers/valkey/container.py` keeps building the Alpine arm64 image for paperless on indri. Bumped both builds to upstream valkey 8.1.7 (Alpine 3.22 now ships `8.1.7-r0`; nixpkgs has 8.1.7).
- Splits `VERSION` (upstream app) from `ALPINE_PIN` (apk pin) in `container.py` so both build files can declare the same upstream version and pass `container-version-check`.
- Updates `service-versions.yaml`: current-version 8.1.7, refreshed last-reviewed, upstream-source now points at the canonical valkey-io releases page.
- Switches kustomizations:
  - `immich-ringtail/kustomization.yaml`: `docker.io/valkey/valkey:8.1.6` → `registry.ops.eblu.me/blumeops/valkey:v8.1.7-02859c5-nix`, comment updated.
  - `paperless/kustomization.yaml`: `v8.1.6-r0-fabca04` → `v8.1.7-02859c5`.
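
The `VERSION` / `ALPINE_PIN` split might look roughly like this inside `container.py`. The two constant values come from this PR; `pin_matches_version` is a hypothetical guard added for illustration and is not how `container-version-check` is known to work.

```python
VERSION = "8.1.7"        # upstream valkey release, shared with default.nix
ALPINE_PIN = "8.1.7-r0"  # exact apk package revision shipped by Alpine 3.22


def pin_matches_version(version: str, alpine_pin: str) -> bool:
    """True when the apk pin is the upstream version plus an -rN revision suffix."""
    base, sep, revision = alpine_pin.rpartition("-r")
    return sep == "-r" and base == version and revision.isdigit()
```

Keeping the apk revision out of `VERSION` lets both build files report the same upstream version while the Alpine build still pins an exact package.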

## Build

build-container run #563: both jobs succeeded after a transient runner crash on the first dispatch (#562 build-nix). The crash surfaced two separate bugs, whose fixes landed in a separate C0 on main:

- `runner-logs` silently returned 0 with no output when the log file didn't exist on indri
- `ssh indri` swallowed remote exit codes (fish login shell), which the wrapper now works around via a stdout marker

## Test plan

- [ ] `argocd app set immich-ringtail --revision valkey-nix && argocd app sync immich-ringtail`
- [ ] `argocd app set paperless --revision valkey-nix && argocd app sync paperless`
- [ ] Both valkey pods come Ready and start serving on :6379
- [ ] Immich app + paperless can read/write their respective cache
- [ ] After merge: rebuild from squashed main commit + update kustomization tags (squash-tag follow-up)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #362
2026-05-28 14:51:09 -07:00
1ce381cb6e C0: surface missing-log failures in runner-logs
`mise run runner-logs <run> -j <n>` previously silently succeeded with
no output when forgejo had no log for the task. Two layered causes:

1. zstdcat exits 0 even when the file is missing (writes "can't stat
   … -- ignored" to stderr).
2. ssh to indri runs fish, which silently drops the remote exit code so
   the subprocess returncode is always 0.

Probe `test -f` over SSH and parse a stdout marker (EXISTS / MISSING) to
detect the missing-log case, then report it explicitly with the indri
path and a hint about action_task.log_in_storage = 0 so the operator
knows where to look next.
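
The marker approach can be sketched as follows, assuming the remote shell accepts `&&` / `||` chaining. The helper names are hypothetical, the quoting is POSIX-style and only assumed adequate for simple paths, and the actual mise task may be structured differently.

```python
import shlex


def build_probe_command(log_path: str) -> str:
    """Remote shell snippet that always prints a marker, whatever the exit code."""
    quoted = shlex.quote(log_path)
    return f"test -f {quoted} && echo EXISTS || echo MISSING"


def log_exists(probe_stdout: str) -> bool:
    """Interpret the marker; anything other than a clear EXISTS counts as missing."""
    lines = [line.strip() for line in probe_stdout.splitlines() if line.strip()]
    return bool(lines) and lines[-1] == "EXISTS"
```

Because the result travels on stdout rather than in the exit status, it survives a login shell that drops the remote return code; empty or unexpected output is treated as missing so the failure is never silent.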

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 14:36:33 -07:00
e703d25efe C0: rebuild unpoller container from squashed main commit
Image was previously tagged with the unpoller-v3 branch SHA (1b27242),
which doesn't exist in main's history after squash-merge. Rebuilt from
the squashed commit so the tag references a reachable commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 10:10:21 -07:00
4d1f4af25b Upgrade unpoller v2.34.0 → v3.2.0, migrate to container.py (#361)
## Summary

- Service Review pickup: unpoller (last reviewed 73 days ago).
- Upgrades unpoller from v2.34.0 to v3.2.0 (major version bump).
- Migrates the container build from a Dockerfile to a native Dagger pipeline (`containers/unpoller/container.py`) following the navidrome / miniflux pattern.
- Refreshes `service-versions.yaml` (last-reviewed, current-version).

## Breaking changes (upstream)

- **v3.0.0** — UniFi network API shifts (later 10.x). Some metric / event / log names and labels may have changed. Worth a follow-up sweep of the unpoller Grafana dashboard for missing series.
- **v3.2.0** — defaults to a 60s background poll feeding cached Prometheus scrapes (was on-demand poll per scrape). To restore previous behavior, set `interval = 0` in `up.conf`. Leaving the new default in this PR — every-15s scrapes will simply serve from cache, which is fine for our use.

## Build

- Image: `registry.ops.eblu.me/blumeops/unpoller:v3.2.0-1b27242`
- Built by build-container workflow run #559 from this branch.

## Test plan

- [ ] `argocd app set unpoller --revision unpoller-v3 && argocd app sync unpoller`
- [ ] Pod comes Ready
- [ ] Verify metrics exported (`Site/Client/UAP/USG/USW` counts in logs, `unpoller_*` series in Prometheus)
- [ ] Spot-check unpoller Grafana dashboard for missing series after the v3 API shift
- [ ] After merge: `argocd app set unpoller --revision main && argocd app sync unpoller`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #361
2026-05-28 09:59:46 -07:00
f6febb1f77 C0: switch fly proxy deploy strategy to immediate
Bluegreen kept timing out — the new green machine couldn't reach
"started" within Fly's 5-minute deploy budget. The cold-start sequence
(tailscaled → tailscale up → wait-for-MagicDNS → nginx startup) eats
most of that, leaving no headroom for healthcheck propagation.

For a single-machine proxy, bluegreen offers little benefit anyway:
no warm second instance, so trading 5-10s of downtime for predictable
completion is the right call.
2026-05-28 07:59:22 -07:00
4e25180b0a C0: clone blumeops via tailnet on ringtail provision
Switch ringtail.yml from forge.eblu.me (Fly proxy, WAN) to
forge.ops.eblu.me (Caddy on indri, tailnet). Ringtail is always
on the tailnet — the WAN round-trip was overhead and made
provision-ringtail fail any time Fly was slow or down.
2026-05-28 07:13:40 -07:00
c00d7db507 Recurring maintenance batch (2026-05-27) (#360)
Some checks failed
Deploy Fly.io Proxy / deploy (push) Failing after 14m10s
Bundle of recurring overdue tasks:

- Ringtail flake update
- Security & compliance report review
- Tooling deps bump (prek, fly, mise, forgejo workflows)
- Top stale doc review
- Top stale service review (if trivial)

Larger items (service version bumps requiring upgrades, non-local container migration) split out as separate PRs.

Reviewed-on: #360
2026-05-28 06:01:57 -07:00
Erich Blume
753fa9cb63 C0: disable VRR on ringtail DP-1 to stop OMEN panel flicker
The OMEN 27i IPS pumps brightness when its refresh swings into the low
VRR range during low-framerate content (game cutscenes), producing a
~20Hz flicker that compounds over a session until a reboot. GPU health
is clean (no Xid/ECC/thermal); pinning fixed 165Hz eliminates it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 12:59:29 -07:00
Erich Blume
c09bd5b612 C0: cap systemd-coredump on ringtail to stop game-crash lockups
Wine/Proton game segfaults (e.g. Diablo IV) produced multi-GB cores that
systemd-coredump spent minutes compressing to disk, pinning the CPU and
freezing the desktop. Cap ProcessSizeMax/ExternalSizeMax at 1G (oversized
cores logged but skipped) and MaxUse at 2G to bound the store.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 11:54:32 -07:00
35ae171783 C0: fix sync button location in manage-forgejo-mirrors
The verify step pointed to the main repo page, but the "Synchronize now"
button is in the Mirror settings section of the settings page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 07:15:07 -07:00
57fd88b269 C0: fix op item edit syntax in zot key rotation
The pbpaste | op item edit ... "field[password]=-" stdin syntax is
rejected by op 2.34 as "invalid JSON" — recent op versions treat
piped input as a full JSON template, not a single field value.
Procedure now uses an inline assignment via a local fish variable.
2026-05-22 21:50:43 -07:00
08a1cb164a C0: fix 1password export filename in backup how-to
1Password's desktop app names exports as
1PasswordExport-<uuid>-<timestamp>.1pux automatically — you can't
choose the name. Procedure now points the task at that glob.
2026-05-22 21:36:13 -07:00
d02bf062af C0: review 1password reference card
Added vault split (blumeops vs Personal), noted onepassword-connect
runs on both indri and ringtail, and lifted op CLI guidance from
agent memory into the card. Bumped last-reviewed.
2026-05-22 21:29:11 -07:00
ee51bcafb4 Rip out compensating-controls framework (#359)
## Summary

Removes the compensating-controls (CC) framework. Prowler and Kingfisher continue to run weekly and produce reports; the Prowler mutelist YAML files stay in place but no longer carry `CC: <id>` prefixes. Each entry now just keeps a free-form `Description` of why it's muted.

The CC review cadence proved to be more process overhead than this single-operator homelab needed.

## What changed

**Deleted**
- `compensating-controls.yaml`: the CC registry
- `mise-tasks/review-compensating-controls`: the staleness-review task
- `docs/how-to/operations/review-compensating-controls.md`
- `docs/how-to/operations/record-review-evidence.md` (was aspirational)
- `docs/explanation/compliance-mute-categories.md` (proposed-future CC/NA/RA work)
- 5 orphan `+review-cc-*` / `+compliance-mute-categories` changelog fragments

**Modified**
- 6 mutelist YAML files: stripped `CC: <id>.` prefix from every `Description` / `statement` field, kept the free-form text
- `mise-tasks/review-compliance-reports`: removed CC mentions from docstrings, panel text, and the node-verification table title. Node-verification logic itself is unchanged.
- `docs/reference/operations/security.md`: removed the "Compensating controls" section
- `docs/how-to/operations/read-compliance-reports.md`: rewrote step 3 of "Acting on findings" to point at the mutelist YAML directly
- `docs/changelog.d/prowler-iac-mutelist.infra.md`: rewrote to drop the "two new compensating controls" framing

## What did not change

- All Prowler manifests (cronjobs, RBAC, PVs, kustomization) — scans still run on the same schedule
- The Kingfisher deployment
- The trivy-shim in the Prowler container — that's about Trivy ignorefile plumbing, independent of the CC concept
- The mutelist entries themselves: each `Resources` list is unchanged; only the prose of `Description` was edited
- `CHANGELOG.md`: historical releases are left as-is

## Test plan

- [ ] Wait for human review before deploying; once merged, re-point ArgoCD: `argocd app set prowler --revision main && argocd app sync prowler` (no manifest changes besides the ConfigMap, so impact is limited to muted-finding descriptions in next week's report)
- [ ] Confirm next weekly Prowler K8s CIS run (Sunday 3am) still completes and produces a report on sifaka
- [ ] Confirm next weekly Prowler IaC run still honors `trivyignore.yaml` (the trivy shim is untouched but the ignorefile content was rewritten)
- [ ] `mise run review-compliance-reports`: verify node-verification block still runs and prints the renamed table title

Reviewed-on: #359
2026-05-22 21:08:53 -07:00
2fae0f7161 C0: switch grafana deployment to Recreate strategy
Grafana uses an RWO PVC for SQLite + Bleve search index. RollingUpdate
spawns the new pod before terminating the old one, so the new pod
crashloops on the index lock until rollout timeout. Recreate terminates
the old pod first, letting the new pod acquire the lock cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:33:26 -07:00
1897eb1c5b C0: move immich blackbox probe to ringtail alloy
Immich migrated to ringtail's k3s cluster but the probe still targeted
the in-cluster service DNS on indri's minikube, firing ServiceProbeFailure
indefinitely. Moved the target into alloy-ringtail's config so the probe
runs in the cluster where immich actually lives.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 08:46:22 -07:00
e222d47d45 C0: deploy shower v1.1.3 (kustomize newTag bump)
Image v1.1.3-3645098-nix was built directly on ringtail and pushed via
skopeo, bypassing the Forgejo runner: indri was severely overloaded
(load avg 24.92, minikube VM at 344% CPU) and the workflow-dispatch
endpoint timed out. The image content is identical to what the runner
would have produced — same default.nix at commit 3645098 (on main),
same NIX_PATH (current nixpkgs flake), same skopeo invocation. Tag
short-sha matches the commit that defines the recipe so we aren't
pinning to a ghost.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 20:09:54 -07:00
3645098bf1 C0: bump shower to v1.1.3
Wheel/sdist + FOD hashes probed on ringtail. Full nix-build verified
end-to-end before commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 19:57:37 -07:00
Erich Blume
96dbbb3cbe C0: add sn2-prelaunch wrapper to clear SN2 stale lockfiles
UE5 writes Saved/running.dat as a "session in progress" marker. If
the previous session exited uncleanly (SIGKILL, crash), it lingers,
and SN2 pops up an invisible 0×0 Error dialog at next launch that
the GameThread blocks on forever — visible only as a black screen
with a spinning loader. Wrap the Steam command to clear the marker
files before each launch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 12:26:10 -07:00
815a0cc6e6 C0: shower — rebuild from main SHA (post-merge retag)
PR #358 was squash-merged so the branch commit b8c7783 baked into the
prior image tag isn't reachable from main's history. Rebuild from main
HEAD (a33fa47) and retag. Image content is byte-identical (FOD is
content-addressed, inputs unchanged); only the SHA in the tag changes
so future provenance tracing stays on main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 06:57:24 -07:00
a33fa47b80 C1: deploy shower v1.1.2 (#358)
## Summary

Deploys `adelaide-baby-shower-app` **v1.1.2** to ringtail k3s.

- Bumps `containers/shower/default.nix` `version` to 1.1.2.
- Refreshes sdist + wheel `fetchurl` hashes against the forge PyPI artifacts.
- Re-probed FOD `outputHash` on the nix-container-builder runner (ringtail) and pinned the new closure hash.
- Bumps kustomize `newTag` to `v1.1.2-b8c7783-nix` (built from this branch's tip).
- Bumps `service-versions.yaml` entry for shower to `1.1.2` / `last-reviewed: 2026-05-15`.

## Build provenance

Built by Forgejo Actions run #553 on `nix-container-builder` (ringtail) at commit `b8c7783`. After merge a C0 follow-on will rebuild from main and retag so future provenance points at main history.

## Test plan

- [ ] `argocd app set shower --revision shower-v1.1.2 && argocd app sync shower` deploys cleanly
- [ ] Pod migrates the SQLite PV and serves at `shower.ops.eblu.me` / `shower.eblu.me`
- [ ] No new errors in pod logs after `collectstatic` + gunicorn boot

Reviewed-on: #358
2026-05-15 06:50:46 -07:00
Erich Blume
12314857d8 C0: add GE-Proton to ringtail Steam extraCompatPackages
Lets Subnautica 2 (and any other game) opt into the GE-Proton
build via Steam's per-game compatibility tool override, as a
workaround for the Proton Experimental + DXVK D3D12 Mercuna hang.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 06:27:43 -07:00
4d2bc9975f C0: deploy shower v1.1.1 (kustomize newTag bump)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 20:51:10 -07:00
4e117dc921 C0: pin shower v1.1.1 FOD outputHash (probed on ringtail)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 20:40:22 -07:00
6e90c4c363 C0: bump shower to v1.1.1 (probe FOD hash)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 20:12:00 -07:00
dc69b8c68b C1: fix borgmatic shower SQLite dump (ssh to ringtail) (#357)
## Summary

Nightly borgmatic backups have been failing for 2 days. Root cause: the
shower SQLite dump `before_backup` hook (added in PR #349) referenced
`kubectl --context=k3s-ringtail`, but indri's kubeconfig deliberately
doesn't carry the ringtail credentials. The hook's failure aborted the
entire run, taking out *both* the local sifaka repo and the BorgBase
offsite. Verified the last good archive was `indri-2026-05-11T02:00`.

## Approach

ssh into ringtail and run `k3s kubectl` there — no indri-side
kubeconfig needed. `/etc/rancher/k3s/k3s.yaml` is mode 644 so no sudo
required, and the existing ssh access from indri to ringtail works.

Inline-shell quoting got hairy fast (fish on ringtail rejected `POD=...`
bash syntax; the nix shower image lacks `tar` so `kubectl cp` fails).
Pulled the dump logic into `~/bin/borgmatic-k8s-sqlite-dump`, deployed
by the ansible role. Each dump entry now declares a `target`:

- `local:<context>` — local kubectl with explicit context (mealie)
- `ssh:<user@host>` — ssh + `k3s kubectl` on the cluster host (shower)

Bytes come back via `kubectl exec ... -- cat` instead of `kubectl cp`
since `cp` needs `tar` in the pod (nix-built containers don't bundle it).
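
The two `target` forms above could be dispatched roughly as follows. This is a minimal POSIX-shell sketch, not the contents of `~/bin/borgmatic-k8s-sqlite-dump`: the function name `kubectl_prefix` and its output format are hypothetical, and it only prints the command prefix instead of running anything.

```sh
# Hypothetical sketch: map a dump entry's `target` to the kubectl command
# prefix that would be used. Names here are illustrative assumptions.
kubectl_prefix() {
    target=$1
    case "$target" in
        local:*)
            # local:<context> -> local kubectl with an explicit context
            printf 'kubectl --context=%s\n' "${target#local:}"
            ;;
        ssh:*)
            # ssh:<user@host> -> ssh to the cluster host and use k3s kubectl there
            printf 'ssh %s k3s kubectl\n' "${target#ssh:}"
            ;;
        *)
            # Anything else is rejected so a typo cannot silently skip a dump
            printf 'unknown target: %s\n' "$target" >&2
            return 1
            ;;
    esac
}
```

With this sketch, `kubectl_prefix ssh:eblume@ringtail` prints `ssh eblume@ringtail k3s kubectl`; a real helper would then append the `exec ... -- cat` invocation to stream the database bytes back.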

## Test plan

- [x] `mise run provision-indri -- --tags borgmatic --check --diff` shows expected diff
- [x] Apply, helper script deployed at `~/bin/borgmatic-k8s-sqlite-dump`
- [x] Helper invoked directly with `ssh:eblume@ringtail` produces a valid 288 KB SQLite file
- [x] Full `borgmatic create` completes without errors — both mealie.db (1.7 MB) and shower.db (288 KB) appear in `~/.local/share/borgmatic/k8s-dumps/`, archive `indri-2026-05-13T17:31:02` written to sifaka borg repo

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #357
2026-05-13 18:55:50 -07:00
203 changed files with 2435 additions and 1962 deletions

@@ -65,7 +65,7 @@ See [[agent-change-process]] for the full methodology.
./pulumi/ # Pulumi IaC (tailnet ACLs, dns, cloud)
~/.config/{nvim,fish} # user's shell config, managed by chezmoi
~/code/personal/ # user's projects
~/code/personal/zk # user's Obsidian-sync managed zettelkasten. Potential source for reference data.
~/code/personal/zk # user's zettelkasten (Obsidian-sync). Reference-data source; migrating into heph docs (hephaestus).
~/code/3rd/ # mirrored external projects
~/code/work # FORBIDDEN
```
@@ -147,10 +147,16 @@ Create a new spork: `mise run spork-create <mirror-name>`
## Task Discovery
BlumeOps tasks live in [hephaestus](https://github.com/eblume/hephaestus) (`heph`),
the user's self-hosted context/task system. Fetch them with the CLI:
```fish
mise run blumeops-tasks # fetch from Todoist, sorted by priority
heph list --project Blumeops --json # outstanding Blumeops tasks as JSON
```
Most tasks are stored in `./mise-tasks/`. For scripts with any logic or
(This replaced the retired `blumeops-tasks` mise task, which read from Todoist.)
Most operational scripts are stored in `./mise-tasks/`. For scripts with any logic or
complexity, use uv run --script 's with explicit dependencies. Complex
workflows with artifacts should become dagger pipelines. Mise tasks are for
development processes and operations - tools for the user or the agent.

@@ -12,6 +12,259 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
<!-- towncrier release notes start -->
## [v1.17.0] - 2026-06-03
### Features
- Deploy the Adelaide / Heidi / Addie baby shower app — guest splash, raffle
picker, and prize assignment console — on ringtail k3s with `shower.eblu.me`
as the public entry and `shower.ops.eblu.me` as the tailnet admin host. App
source: [`adelaide-baby-shower-app`](https://forge.eblu.me/eblume/adelaide-baby-shower-app).
- Deploy adelaide-baby-shower-app v1.1.0 to ringtail k3s. Replaces the
boolean lock with a four-phase `ShowerState` (`pre_event` → `party` →
`prizes_locked` → `event_locked`), adds an append-only "guest memories"
panel where guests can leave photos and comments for the baby, and
polishes the admin and QR views. Three Django migrations
(`0009_shower_phase`, `0010_guest_memories`, `0011_book_description`)
run automatically in the entrypoint against the SQLite PV. No config
or env-var changes.
Container build also gains a Forgejo-PyPI workaround: Forgejo's simple
index returns absolute file URLs hardcoded to the public ROOT_URL
(`forge.eblu.me`), which the Fly edge 403s on `/api/packages/*`. The
wheel and sdist are now both pulled via direct `fetchurl` against
`forge.ops.eblu.me` (tailnet-only) and the wheel is handed to pip as
a local path.
- `review-compliance-reports` now also fetches and summarizes the weekly Prowler container-image and IaC scans (previously only the K8s CIS in-cluster scan was processed). For each scan it shows status counts, severity breakdown, week-over-week delta, and — for the high-volume image/IaC scans — top-N tables grouped by check ID and resource instead of per-finding listings.
- runner-logs now authenticates with a Forgejo API token and auto-detects the repo from the git remote. Job logs are fetched via SSH to indri (reading Forgejo's on-disk zstd log files) instead of the web endpoint, which doesn't support token auth for private repos.
### Bug Fixes
- Fix nightly borgmatic backups failing for 2 days. The shower SQLite
dump hook referenced `kubectl --context=k3s-ringtail`, but indri's
kubeconfig deliberately doesn't carry the ringtail credentials. The
`before_backup` hook's failure aborted the entire run, taking out
*both* the local sifaka repo and the BorgBase offsite. Replaced
the inline-shell dump with a `~/bin/borgmatic-k8s-sqlite-dump`
helper deployed by the ansible role. Each dump entry now declares a
`target` of either `local:<context>` (mealie — kubectl uses indri's
kubeconfig) or `ssh:<user@host>` (shower — ssh into ringtail and
run `k3s kubectl` there, no indri-side kubeconfig needed; k3s.yaml
on ringtail is mode 644 so no sudo required). Bytes stream back via
`kubectl exec ... -- cat` rather than `kubectl cp`, since `kubectl
cp` requires `tar` inside the pod and nix-built images like shower
don't bundle it.
- Shower app container now bakes the wheel + Python deps into the image
at build time via `buildPythonPackage` instead of pip-installing on
first boot. Boots are deterministic and don't depend on forge PyPI
being reachable from the pod. The `wheelHash` in
`containers/shower/default.nix` is the sha256 sourced from the
[forge PyPI simple index](https://forge.eblu.me/api/packages/eblume/pypi/simple/adelaide-baby-shower-app/);
bumping the version means bumping that hash too.
Borgmatic now covers the shower app: SQLite is dumped from the live
pod via `kubectl exec` (mirroring the existing mealie entry, with
`context: k3s-ringtail`), and the prize-photo media share is picked up
through `/Volumes/shower` (sifaka SMB mount on indri, same pattern as
`/Volumes/photos`).
- Disabled adaptive sync (VRR) on ringtail's DP-1 output. The OMEN 27i IPS panel pumps brightness when its refresh rate swings into the low VRR range during low-framerate content (e.g. game cutscenes), producing a flicker that worsened over a session until a reboot. Pinning the panel to a fixed 165Hz eliminates it.
- Fixed forge.eblu.me static assets (CSS, JS, images, fonts) not loading — the proxy's static asset cache block was missing the `Host` header, so Caddy couldn't route the requests.
- Fixed homepage container EACCES on cold start: the nix-built image now chowns
`/app/config` to uid 1000 at build time via `fakeRootCommands`, matching the
behavior of the old Dockerfile. Without this, homepage couldn't seed missing
skeleton configs (proxmox.yaml etc.) or create `/app/config/logs`, crashing on
its first uncached request. Caught during the ringtail cutover.
- Fixed sway keybindings on ringtail — the home-manager `keybindings` block was replacing the module's defaults entirely, leaving only explicit overrides (no workspace switching, focus, move, splits, resize mode, etc). Switched to `lib.mkOptionDefault` with `lib.mkForce` on the conflicting custom binds (`Mod+Return`, `Mod+d`, `Mod+space`, `Mod+l`) so defaults merge back in. Also added `Mod+F1` to show a filterable fuzzel list of current keybindings.
Fixed fuzzel config errors on launch — `border-radius` and `border-width` were under `[main]`, but fuzzel expects them as `radius`/`width` under a `[border]` section.
- Pin the Quartz docs build to v4.5.2. The Dagger `build_docs` pipeline cloned Quartz from the default branch unpinned; Quartz v5.0.0 restructured its config layout (`.quartz/plugins`, `../quartz` imports) and broke the docs build against our existing `quartz.config.ts`/`quartz.layout.ts`.
### Infrastructure
- Wire the ringtail `blumeops-pg` cluster (which holds the wave-1-migrated
paperless + teslamate databases) into backups and Grafana. Adds a Tailscale
LoadBalancer Service (`blumeops-pg-ringtail.tail8d86e.ts.net`) and a Caddy L4
route (`pg.ops.eblu.me:5434`), then repoints borgmatic's `teslamate` +
`paperless` postgres dumps and the `mealie` SQLite dump at ringtail, and the
Grafana TeslaMate datasource at the ringtail DB. Closes the backup gap that
opened at cutover (the migrated live data was still being backed up from the
now-frozen minikube copies) and unblocks the wave-1 decommission.
- Migrated homepage dashboard from minikube (indri/arm64) to k3s (ringtail/amd64).
The container is now built via nix (`containers/homepage/default.nix`), adapted
from nixpkgs `homepage-dashboard` with the upstream Next.js cache patches and
wrapped with `dockerTools.buildLayeredImage`. Autodiscovery shifts: services on
minikube (ArgoCD, Immich, Kiwix, Mealie, Miniflux, Grafana, Prometheus,
Navidrome, Paperless, TeslaMate, Transmission) become explicit static entries
in `services.yaml`; ringtail services (Authentik, Frigate/NVR, Ntfy, Ollama)
auto-populate via Ingress annotations.
- Migrated CV (`cv.eblu.me`) and Docs (`docs.eblu.me`) from minikube Deployments to indri-native ansible roles. Caddy now serves the extracted release tarballs directly via a new `kind: static` service-block in the Caddy template — no daemon, no container — replacing the prior nginx-in-a-pod layer. Removes a network hop on every request and shrinks minikube's footprint. See [[cv-on-indri]] and [[docs-on-indri]]. Part of the broader minikube wind-down.
- Migrated devpi (PyPI mirror at `pypi.ops.eblu.me`) from a minikube StatefulSet to a launchd-managed service on indri. devpi-server now runs in a uv-managed venv with pinned `devpi-server` and `devpi-web` versions, listens on `127.0.0.1:3141`, and is fronted by Caddy. The minikube StatefulSet was crash-looping under memory pressure (and breaking the Python toolchain everywhere); the new layout removes a layer of dependency on cluster health for critical-path tooling. See [[devpi-on-indri]].
- Move the entire Immich stack — server, machine-learning, valkey,
and the PostgreSQL+VectorChord cluster — off `minikube-indri` and
onto `k3s-ringtail`. Postgres data migrated zero-loss via CNPG
`pg_basebackup` (replica catch-up then promote); row counts on
`asset`, `user`, `album`, `smart_search`, `activity`, `asset_face`
verified equal between source and replica before cutover. The ML
pod now uses ringtail's RTX 4080 via the nvidia-device-plugin
(time-slicing bumped 2 → 4 to share with frigate + ollama). Caddy
routing at `photos.ops.eblu.me` is unchanged (still
`photos.tail8d86e.ts.net`, the device just lives on ringtail now).
Borgmatic backups continue against the same `immich-pg` tailnet
hostname. First concrete chain in the broader indri-k8s
decommission effort.
- Add local nix container build for `tailscale` (`containers/tailscale/default.nix`) so ringtail's tailscale-operator ProxyClass proxy pods pull from the forge mirror instead of `docker.io/tailscale/tailscale`. Pinned at v1.94.2 to match `service-versions.yaml`. Indri's tailscale-operator continues to use upstream during the k8s-to-ringtail migration.
- Address the 6 critical Prowler IaC findings against `argocd/manifests/`. Prowler's IaC provider hardcodes `self._mutelist = None` and delegates filtering to Trivy, but doesn't plumb `--ignorefile` through — so the documented "use Trivy filtering" path is actually broken. Added a shim around `trivy` in the Prowler image that injects `--ignorefile $TRIVY_IGNOREFILE` for `trivy fs` invocations when the env var points at a real file. The IaC cronjob now mounts `mutelist/trivyignore.yaml` (Trivy's per-path schema) and sets the env var, muting the `external-secrets` and `kube-state-metrics` Secret-access findings (KSV-0041, KSV-0114). Separately, `grafana-clusterrole` is tightened to remove `secrets` access entirely: the dashboard sidecar already only consumes ConfigMap-labeled dashboards, so its `RESOURCE` env var is now `configmap` instead of `both`.
- Pin ringtail's wired IP to `192.168.1.21` via NixOS scripted networking; NetworkManager no longer manages `enp5s0`. Removes DHCP lease renewal as a failure mode after a silent lease teardown took ringtail offline. Also explicitly enables `net.ipv4.ip_forward` (previously set implicitly by scripted-DHCP) so k3s pod networking and Tailscale routing continue to work with static networking.
- Ripped out the compensating-controls (CC) framework: deleted `compensating-controls.yaml`, the `review-compensating-controls` mise task, and the associated how-to / explanation docs. Prowler and Kingfisher continue to run weekly and produce reports; the Prowler mutelist YAML files remain in place but no longer carry `CC: <id>` prefixes — each entry just keeps a free-form `Description` of why the finding is muted. The CC review cadence proved to be more overhead than this single-operator homelab needed.
- Wire shower app for public exposure: fly nginx `shower.eblu.me` server
block as a guest-only surface — splash page, `/prizes/<token>/`, static
assets, media. Everything authenticated (`/admin/`, `/host/`,
`/accounts/`) returns 403 with a "tailnet only" pointer. Staff hit
`shower.ops.eblu.me` for the operator console + admin; the app's
v1.0.1 `DJANGO_PUBLIC_URL_BASE` setting makes QR codes generated on
the tailnet point back at the WAN host for guests. Plus a Caddy route
on indri, Pulumi Gandi CNAME, and a Grafana APM dashboard tracking
request rate, error rate, latency, bandwidth, and access logs.
- Mirror Valkey 8.1 locally as `registry.ops.eblu.me/blumeops/valkey`. Replaces direct pulls of `docker.io/valkey/valkey:8.1-alpine` for paperless and immich sidecars. Built via native Dagger pipeline on Alpine 3.22. Stateless swap — no data migration. Authentik's nix-built Redis remains separate.
- Add nix-built amd64 valkey for ringtail (`containers/valkey/default.nix`) so immich-ringtail can stop pulling the upstream multi-arch `docker.io/valkey/valkey` image. Existing `container.py` continues to build Alpine arm64 for paperless on indri. Both bump to valkey 8.1.7 (Alpine 3.22 8.1.7-r0 / nixpkgs 8.1.7).
- Upgrade Grafana Alloy v1.14.0 → v1.16.0 across all four service deployments
(alloy-k8s, alloy-ringtail, alloy-tracing-ringtail on k8s; alloy native on
indri). Pulls in stable database observability (v1.15) and the OTel Collector
v0.147.0 bump. Container build also migrated from Dockerfile to native Dagger
`container.py` per the build-container-image migration playbook.
- Upgraded Dagger from v0.20.1 to v0.20.6 (engine, CLI pin, and SDK regen) and migrated `runner-job-image` from a Debian-based Dockerfile to a native Dagger `container.py` on Alpine 3.23, reusing the shared `alpine_runtime` helper.
- Decommission the wave-1 services on minikube-indri now that paperless,
teslamate, and mealie run on ringtail with their data backed up. Removes the
minikube `paperless`/`teslamate`/`mealie` manifest dirs + ArgoCD app
definitions (pruning the parked Deployments, Services, and the redundant
minikube mealie/paperless PVCs), and drops the `paperless`/`teslamate` roles
from the minikube `blumeops-pg` cluster. The `paperless` and `teslamate`
databases are dropped from indri's blumeops-pg as the finalization step.
miniflux + authentik remain on the minikube cluster (later waves).
- Upgraded the k8s Forgejo runner to the v12.8 line, switched it from first-boot registration to declarative `server.connections` credentials from 1Password, and consolidated the supporting runner how-to documentation.
- Move paperless, teslamate, and mealie off `minikube-indri` onto
`k3s-ringtail`, shedding ~1.1 GiB of resident load from the
OOM-thrashing 8 GiB minikube node (the kernel OOM killer had been
killing `kube-apiserver`/`dockerd`/argocd, flapping every
minikube-hosted service at once). paperless + teslamate databases
move into a fresh CNPG `blumeops-pg` cluster on ringtail via a cold
`pg_dump`/`pg_restore` from the quiesced source — row counts verified
equal before any routing flip; source DBs dropped only after the
ringtail side serves traffic. mealie's SQLite PVC is copied as-is.
paperless media stays on sifaka NFS. Downtime-tolerant cold cutover
(no streaming replication); rollback is repoint-and-scale-up with the
source untouched. Second chain in the indri-k8s decommission after
[[migrate-immich-to-ringtail]].
- Recurring maintenance batch:
- Ringtail flake inputs refreshed (`disko`, `home-manager`, `nixpkgs`).
- Tooling deps bumped: prek hooks (trufflehog v3.95.3, kingfisher v1.101.0, ruff v0.15.14, `ansible-core` 2.21.0); fly proxy base images (nginx 1.30.1-alpine, alloy v1.16.1); `typer==0.26.2` in mise tasks.
- Updated `nixos/ringtail/flake.lock` (weekly cadence): `disko`, `home-manager`, and `nixpkgs` inputs refreshed. `nixpkgs-services` skipped per overlay convention.
- Reviewed `mealie` service version freshness; upstream is 5 minor versions ahead (v3.17.0 vs deployed v3.12.0). Marked reviewed; upgrade deferred.
- Deploy shower v1.1.2 — bump container build to new app release.
- Upgrade unpoller v2.34.0 → v3.2.0 and migrate container build from Dockerfile to native Dagger (container.py). v3.0.0 carries breaking UniFi API changes; v3.2.0 introduces a 60s background poll (cached scrapes) by default — set `interval = 0` in `up.conf` to restore on-demand polling.
- Monthly tooling dependency refresh: prek hooks (trufflehog, kingfisher, ruff, shfmt, prettier, actionlint, ansible-lint), fly proxy base images (nginx 1.30.0, tailscale v1.94.2, alloy v1.16.0), normalize pyyaml lower bound in mise-tasks.
- Add GE-Proton (`pkgs.proton-ge-bin`) to `programs.steam.extraCompatPackages`
on ringtail. Subnautica 2 hangs at Mercuna plugin init under Proton
Experimental + DXVK D3D12; GE-Proton is available as a Steam per-game
compatibility option to work around it.
- Add `sn2-prelaunch` Steam launch wrapper on ringtail that removes
Subnautica 2's stale `Saved/running.dat` and `Saved/beforelobby.dat`
lockfiles before each launch. SN2 pops up an invisible (0×0-sized)
Error dialog when it detects an unclean exit, blocking GameThread
forever; this is observable only as a black screen with a spinning
loader. Use via Steam launch option: `sn2-prelaunch %command%`.
- Add local nix container build for `frigate-notify` (`containers/frigate-notify/default.nix`) so the Frigate→ntfy bridge is rebuilt on ringtail from the forge mirror instead of pulled from `ghcr.io/0x2142/frigate-notify`.
- Add resource limits to all ArgoCD pods to prevent unbounded resource consumption during node-wide pressure events.
- Black-hole the `/mirrors/*` repositories at the Fly proxy edge (`return 403` → `forge.ops.eblu.me`). A surprise $29.60 Fly bill traced to ~1.24 TB/30d of egress on `forge.eblu.me`, 99.95% of all proxy egress, of which ~71% was AI scrapers (Meta `meta-externalagent`, OpenAI `GPTBot`, Amazonbot) crawling the near-infinite git-history URL space of the public mirror repos and timing out Forgejo in the process. Mirrors exist for supply-chain control and are consumed over the tailnet, so their public web UI had no legitimate audience. `robots.txt` already disallowed `/mirrors/`, but the offending agents ignore it. Tier-2 mitigations (user-agent denylist, Anubis proof-of-work gateway) are documented in `docs/explanation/ai-scraper-mitigation.md`.
- Bump paperless and immich kustomizations to the main-SHA-built valkey tag (`v8.1.6-r0-fabca04`). Routine post-merge follow-up to keep production manifests pointing at images built from a commit on main.
- Bump shower container to v1.1.1 (probe FOD hash).
- Bumped shower app to v1.1.3 (wheel/sdist + FOD hashes probed on ringtail).
- Cap systemd-coredump on ringtail (ProcessSizeMax/ExternalSizeMax 1G, MaxUse 2G) so multi-GB Wine/Proton game crash dumps no longer thrash the disk and lock up the desktop.
- Deploy shower v1.1.1 to ringtail (kustomize newTag bump).
- Deployed shower v1.1.3 to ringtail (image built and pushed from ringtail; runner bypassed due to indri overload).
- Fix three follow-ups from the wave-1 decommission: grant the local
break-glass `admin` account ArgoCD admin rights (`g, admin, role:admin`;
previously only the Authentik `admins` group had access, so admin was
locked out whenever its token expired), and repoint the alloy blackbox
probe for teslamate from the deleted minikube service to
`https://tesla.ops.eblu.me/` (through Caddy over Tailscale). The orphaned
paperless/teslamate roles + ExternalSecrets left on the minikube
blumeops-pg are also cleaned up.
- Moved the Immich blackbox health probe from indri's alloy to ringtail's alloy. After the immich migration to ringtail, the probe still targeted `immich-server.immich.svc.cluster.local` on indri's cluster where the service no longer exists, causing a persistent `ServiceProbeFailure` alert.
- Pin shower v1.1.1 FOD outputHash (probed locally on ringtail).
- Rebuild Prowler container against main HEAD (v5.23.0-495e45d) after merging the IaC mutelist Dockerfile changes.
- Rebuild and retag alloy v1.16.0 container images from the main-branch SHA
following the squash-merge of #345, per the build-container-image
squash-merge convention. Both images (`registry.ops.eblu.me/blumeops/alloy`)
now reference `9564435` rather than the branch SHA `26a3ab5`, restoring
source traceability after branch cleanup.
- Rebuild shower from the post-merge commit on main so the container's
SHA tag points at a commit that will still exist after the 30-day
branch-cleanup window. Functionally identical to the branch-tag image
already deployed, just preserves source traceability per
[[build-container-image#Squash-merge and container tags]].
- Rebuild unpoller container from squashed main commit so the image SHA tag matches a commit in main's history (was tagged with the pre-squash branch SHA).
- Rebuild valkey container from squashed main commit (both arm64 dagger and amd64 nix variants), and update paperless + immich-ringtail kustomizations to the main-SHA tags `v8.1.7-ecded30` and `v8.1.7-ecded30-nix`.
- Retired the `blumeops-tasks` mise task (Todoist API) in favor of `heph list --project Blumeops --json` from the self-hosted [hephaestus](https://github.com/eblume/hephaestus) system. Updated docs to point task discovery and rotation reminders at heph, and noted that the `~/code/personal/zk` zettelkasten is migrating into heph docs.
- Switch the Fly proxy deploy strategy from `bluegreen` to `immediate` in `fly/fly.toml`. With a single proxy machine, bluegreen offers little benefit: the green machine routinely failed to reach "started" inside Fly's default 5-minute deploy timeout (the cold-start sequence of `tailscaled` → `tailscale up` → wait-for-MagicDNS → nginx startup eats most of the budget), and the failed deploys would roll back. `immediate` replaces the machine in place with a brief downtime (~5-10s) but actually completes.
- Switch the ringtail provisioning playbook's blumeops clone URL from `forge.eblu.me` (public, via Fly proxy) to `forge.ops.eblu.me` (tailnet, direct via Caddy on indri). Ringtail is always on the tailnet, so the WAN round-trip is pure overhead — it also made `provision-ringtail` brittle whenever the Fly proxy was slow or down.
- Switched Grafana's deployment strategy from `RollingUpdate` to `Recreate`. With an RWO PVC holding the SQLite database and Bleve search index, `RollingUpdate` reliably crashloops the new pod on the index lock until rollout timeout. `Recreate` terminates the old pod first so the new one acquires the lock cleanly.
- Update `tailscale-operator-ringtail` ProxyClass to reference the `0108b68` main-SHA build of the tailscale container. Routine post-merge cleanup so the deployed image traces to a commit that survives PR branch cleanup.
- Update the ringtail NixOS flake lockfile (`nixos/ringtail/flake.lock`): bump
`nixpkgs` (b77b3de → 25f5383) and `disko` (5ba0c95 → 115e521) to latest.
`nixpkgs-services` was intentionally left pinned (skipped by the
`flake-update` pipeline). Routine recurring maintenance per [[manage-lockfile]].
- Upgrade native macOS Alloy on indri to v1.16.0. Built on gilbert with Go
1.26.2 + CGO (required for the macOS native DNS resolver, which Tailscale
MagicDNS depends on), scp'd to `~/.local/bin/alloy` on indri, codesigned,
and the LaunchAgent reloaded. Completes the v1.16.0 fleet upgrade started
in #345 — all four Alloy services (alloy-k8s, alloy-ringtail,
alloy-tracing-ringtail, alloy ansible) now run v1.16.0.
- Upgraded zot on indri from v2.1.15 to v2.1.16 (security fixes: TLS verification on metrics client, CORS Allow-Credentials suppression on wildcard origins, manifest/API-key body size limits).
### Documentation
- Reviewed `replicating-blumeops` tutorial: fixed "BluemeOps" typos (also in `contributing.md`) and added `last-reviewed` frontmatter.
- Reviewed [[indri]] reference card: added `devpi`, `cv`, and `docs` to the native-services list; widened the k8s note to reflect the growing set of apps now on ringtail and the planned indri-minikube decommission; added CPU/RAM specs.
- New how-to: rotate-fly-deploy-token. Documents the 75-day rotation cadence, why we use `org`-scoped tokens (silences the cosmetic metrics-token warning on `fly status` with marginal blast-radius cost given the single-app personal org), and the procedure for rotation + Forgejo Actions secret sync.
- Add `docs/explanation/ai-scraper-mitigation.md` — the egress-cost / AI-crawler threat model for the public Fly proxy, the tiered mitigation plan (Tier 1: mirror black-hole, shipped; Tier 2: user-agent denylist + Anubis; Tier 3: Cloudflare, rejected on principle), and the data behind it.
- Fix manage-forgejo-mirrors verify step — sync button is on the repo settings page ("Synchronize now"), not the main repo page.
- Fixed the `op item edit` invocation in the [[zot]] API-key rotation procedure: the previous `pbpaste | op item edit ... "field[password]=-"` stdin syntax is rejected by op 2.34 as "invalid JSON" (recent op versions treat piped input as a full JSON template, not a single field value). Procedure now reads the clipboard into a local fish variable and passes it as an inline assignment.
- Fixed the export-filename step in [[run-1password-backup]]: 1Password's desktop app names the export `1PasswordExport-<account-uuid>-<timestamp>.1pux` automatically rather than letting you save to a fixed name, so the procedure now points the task at that glob instead of pretending the default name is `1Password-export.1pux`.
- Refresh the contributing tutorial: add `last-reviewed`, include the `.ai.md` changelog fragment type, and clarify that `prek` is pinned via `mise`.
- Review and refresh the Navidrome reference card: add `last-reviewed`, correct the scanner env var name, document the current image/version, and record routing and runtime details from the manifests.
- Review and refresh the Ollama reference card: add `last-reviewed`, bump the documented image tag to 0.20.4, and add the two `qwen3.5` models now declared in `models.txt`.
- Reviewed [[1password]] reference card: added the `blumeops` vs `Personal` vault split, noted that `onepassword-connect` runs on both indri and ringtail (not just one cluster), and pulled the `op read` vs `op item get --fields` guidance up from agent memory into the card.
- Reviewed `index.md`; added ringtail to the infrastructure overview and stamped `last-reviewed`.
- Reviewed transmission card: corrected storage layout (`/config/` is emptyDir, watch dir disabled) and noted the Prometheus exporter sidecar.
- rotate-fly-deploy-token: combine mint+store into one command with both fish and bash forms; document the `op item edit` "Password item requires ps value" validator gotcha and the placeholder-password workaround.
### AI Assistance
- Adopt `AGENTS.md` as the canonical agent instruction file, keep `CLAUDE.md` as a compatibility shim, and update docs to reference the neutral file and the correct agent-change-process path.
- CLAUDE.md now imports AGENTS.md via `@AGENTS.md` instead of telling agents to go read it. Claude Code only auto-loads CLAUDE.md, so the prose shim was easy to skip; the import inlines AGENTS.md into the session prompt unconditionally.
### Miscellaneous
- Removed the dead minikube manifests, container builds, and tooling shims left behind after the cv + docs migration to indri-native (#342). Deletes `argocd/{apps,manifests}/{cv,docs}/`, `containers/{cv,quartz}/`, and the `quartz` → `docs` mapping in `mise-tasks/container-version-check`. Bumps `docs.current-version` to `v1.16.0` (the blumeops release tag) now that the legacy nginx-base version pin is gone.
- Rebuild shower v1.1.0 container from main HEAD (`3c7967e`) and bump the
kustomization tag to `v1.1.0-3c7967e-nix`. The PR was squash-merged, so
the branch commit `444ff91` baked into the prior tag isn't reachable
from main's history. The new tag points at a commit that exists on
main; image content is byte-identical because the FOD output is content
addressed and the inputs didn't change.
- Rebuild shower v1.1.2 from main HEAD (a33fa47) and retag — PR #358 was squash-merged so the branch SHA baked into the prior image tag isn't reachable from main. FOD is content-addressed, so image bytes are identical; only provenance changes.
- Remove the duplicate Homepage tiles for Mealie, Paperless, Immich, and
TeslaMate. Homepage runs on ringtail and autodiscovers ringtail Ingresses via
`gethomepage.dev/*` annotations; once these services migrated to ringtail they
were discovered automatically, making their leftover static `services.yaml`
entries (needed only while they lived on minikube) redundant.
- Removed the now-unused `containers/devpi/` Dagger build artifact. Devpi runs natively on indri via uv venv; the container image is no longer referenced anywhere. Doc examples in `docs/reference/tools/dagger.md` updated to use `miniflux` as the example container name.
- `container-build-and-release` now prints the specific `mise run runner-logs <N>` command after dispatching, polling the Forgejo API to resolve the run number for the commit it just triggered.
- `mise run runner-logs <run> -j <n>` now reports a clear error when the log file doesn't exist on indri (e.g. a runner crash that left `action_task.log_in_storage = 0`). Previously it printed only the header and exited 0, because `zstdcat` exits 0 with a "can't stat … -- ignored" stderr message and ssh+fish on indri swallows the remote exit code.
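
The `runner-logs` failure mode above comes from relying on the exit code of `zstdcat`, which is 0 even when the file is missing. A minimal sketch of the guard, assuming a hypothetical `show_log` helper (the real task's function names and exit codes are not shown in this changelog): test for the file explicitly and return a non-zero status before decompressing.

```shell
# Hypothetical helper, not the actual mise task: print a zstd-compressed
# log, failing loudly when the file is absent instead of trusting zstdcat.
show_log() {
    log_path=$1
    if [ ! -f "$log_path" ]; then
        echo "error: log file not found: $log_path" >&2
        return 3   # arbitrary non-zero code chosen for this sketch
    fi
    zstdcat -- "$log_path"
}
```

When run over ssh, the caller would also need to propagate this status itself, since the changelog notes that ssh+fish on indri swallows the remote exit code.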
## [v1.16.0] - 2026-04-18
### Infrastructure

@ -260,5 +260,7 @@
tags: cv
- role: docs
tags: docs
- role: heph
tags: heph
- role: caddy
tags: caddy

@ -57,7 +57,7 @@
tasks:
- name: Ensure blumeops repo is present
ansible.builtin.git:
repo: "https://forge.eblu.me/eblume/blumeops.git"
repo: "https://forge.ops.eblu.me/eblume/blumeops.git"
dest: /etc/blumeops
version: "{{ ringtail_commit | default('main') }}"
force: true

@ -56,8 +56,9 @@ borgmatic_k8s_sqlite_dumps:
namespace: mealie
label_selector: app=mealie
db_path: /app/data/mealie.db
# local kubectl, --context=minikube (indri's only configured ctx)
target: local:minikube
# migrated to ringtail (wave-1); ssh to ringtail and run k3s kubectl
# there, same as shower below.
target: ssh:eblume@ringtail
- name: shower
namespace: shower
label_selector: app=shower
@ -102,17 +103,18 @@ borgmatic_postgresql_databases:
hostname: pg.ops.eblu.me
port: 5432
username: borgmatic
- name: teslamate
hostname: pg.ops.eblu.me
port: 5432
username: borgmatic
- name: authentik
hostname: pg.ops.eblu.me
port: 5432
username: borgmatic
# migrated to ringtail blumeops-pg (wave-1); port 5434 = Caddy L4 route
- name: teslamate
hostname: pg.ops.eblu.me
port: 5434
username: borgmatic
- name: paperless
hostname: pg.ops.eblu.me
port: 5432
port: 5434
username: borgmatic
# immich-pg cluster (VectorChord) via Caddy L4 on port 5433
- name: immich

@ -19,8 +19,10 @@
ansible.builtin.copy:
content: |
# Managed by ansible (borgmatic role) - k8s PostgreSQL backup credentials
# 5432 = minikube blumeops-pg, 5433 = immich-pg, 5434 = ringtail blumeops-pg
pg.ops.eblu.me:5432:*:borgmatic:{{ borgmatic_db_password }}
pg.ops.eblu.me:5433:*:borgmatic:{{ borgmatic_db_password }}
pg.ops.eblu.me:5434:*:borgmatic:{{ borgmatic_db_password }}
dest: ~/.pgpass
mode: '0600'
no_log: true
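
Each `.pgpass` line has the form `hostname:port:database:username:password`, so the three entries above differ only in the port that selects the cluster. A rough sketch of that lookup, assuming a hypothetical `pgpass_match` helper that handles only literal and `*` fields (no backslash escapes) and redacts the password:

```shell
# Hypothetical helper: list the .pgpass entries that apply to a host:port
# pair. Simplified matching: literal or "*" only, comments skipped.
pgpass_match() {
    host=$1
    port=$2
    file=$3
    awk -F: -v h="$host" -v p="$port" '
        !/^#/ && ($1 == h || $1 == "*") && ($2 == p || $2 == "*") {
            print $1 ":" $2 ":" $3 ":" $4 ":<redacted>"
        }' "$file"
}
```

Calling `pgpass_match pg.ops.eblu.me 5434 ~/.pgpass` would then show only the entry used for the ringtail blumeops-pg route.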

@ -28,7 +28,9 @@ db_path=${4:?missing db path}
name=${5:?missing name}
dump_target=${6:?missing dump target}
pod_tmp="/tmp/${name}-backup.db"
# Stage the backup next to the source DB (a guaranteed-writable volume);
# minimal nix images (e.g. mealie) have no /tmp.
pod_tmp="$(dirname "$db_path")/.borgmatic-backup-${name}.db"
python_backup='import sqlite3; sqlite3.connect("'"$db_path"'").backup(sqlite3.connect("'"$pod_tmp"'"))'
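
To make the new staging location concrete, here is a small sketch that evaluates the same expression outside the script. The `staging_path` wrapper is hypothetical; the inputs are the mealie values from the borgmatic defaults earlier in this diff.

```shell
# Hypothetical wrapper around the staging-path expression from the hunk:
# the backup copy is placed beside the source DB rather than in /tmp.
staging_path() {
    db_path=$1
    name=$2
    printf '%s\n' "$(dirname "$db_path")/.borgmatic-backup-${name}.db"
}

staging_path /app/data/mealie.db mealie
# prints /app/data/.borgmatic-backup-mealie.db
```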

@ -52,6 +52,9 @@ caddy_services:
- name: devpi
host: "pypi.{{ caddy_domain }}"
backend: "http://localhost:3141"
- name: heph
host: "heph.{{ caddy_domain }}"
backend: "http://localhost:8787" # hephaestus hub (server mode) + PWA shell
- name: kiwix
host: "kiwix.{{ caddy_domain }}"
backend: "https://kiwix.tail8d86e.ts.net"
@ -117,6 +120,8 @@ caddy_tcp_services:
backend: "pg.tail8d86e.ts.net:5432" # PostgreSQL (blumeops-pg)
- port: 5433
backend: "immich-pg.tail8d86e.ts.net:5432" # PostgreSQL (immich-pg)
- port: 5434
backend: "blumeops-pg-ringtail.tail8d86e.ts.net:5432" # PostgreSQL (blumeops-pg on ringtail)
- port: "{{ sifaka_node_exporter_port }}"
backend: "sifaka:{{ sifaka_node_exporter_port }}" # Sifaka node_exporter
- port: "{{ sifaka_smartctl_exporter_port }}"

@ -3,9 +3,8 @@
# Caddy serves docs_content_dir directly via the static-kind service block,
# with Quartz-style try_files (path → path/ → path.html → 404).
docs_version: "v1.16.0"
docs_version: "v1.17.0"
docs_release_url: "https://forge.eblu.me/eblume/blumeops/releases/download/{{ docs_version }}/docs-{{ docs_version }}.tar.gz"
docs_home: /Users/erichblume/blumeops/docs
docs_content_dir: "{{ docs_home }}/content"
docs_version_sentinel: "{{ docs_home }}/.installed-version"

@ -0,0 +1,49 @@
---
# hephaestus hub — the canonical heph replica (server mode) on indri.
# Other devices (e.g. gilbert) are spokes that sync against this hub.
# See [[set-up-sync-hub]] and [[host-heph-pwa]] in the hephaestus repo.
# Pinned release used for the initial `cargo install` and the PWA shell.
# After bootstrap, hephd's own --self-update keeps the binary current; this
# pin only governs the first install and the bundled PWA shell version.
heph_version: v1.2.1
# Anonymous public HTTPS clone — matches hephd's INSTALL_GIT_URL so the initial
# install and unattended self-update build from the same source (no ssh-agent).
heph_repo_url: https://forge.eblu.me/eblume/hephaestus.git
heph_bin_dir: /Users/erichblume/.cargo/bin
heph_binary: "{{ heph_bin_dir }}/hephd"
# rustc/cargo here are rustup shims. The bare (non-mise) environment that the
# launchagent and ansible run in falls back to rustup's *default* toolchain,
# which can lag behind heph's rust-version floor (Cargo.toml: 1.89). Pin the
# channel explicitly so both the bootstrap build and unattended self-update
# always use a current toolchain regardless of the host's rustup default.
heph_rust_toolchain: stable
heph_data_dir: /Users/erichblume/.local/share/heph
heph_db: "{{ heph_data_dir }}/heph.db"
heph_socket: "{{ heph_data_dir }}/hephd.sock"
heph_log_dir: /Users/erichblume/Library/Logs
# Version-pinned source checkout; the PWA static shell is served directly from
# its heph-pwa/ subdir (no copy), keeping shell and hub in lockstep at heph_version.
heph_pwa_src_dir: /Users/erichblume/.cache/heph-pwa-src
heph_web_root: "{{ heph_pwa_src_dir }}/heph-pwa"
# Hub listens on all interfaces so tailnet spokes can reach it directly
# (http://indri.tail8d86e.ts.net:8787) and Caddy can proxy heph.ops.eblu.me.
# Access is gated by Authentik OIDC regardless — tailnet reachability is not
# enough (this is the owner's most sensitive data).
heph_http_addr: 0.0.0.0:8787
heph_port: 8787
heph_external_url: https://heph.ops.eblu.me
# Authentik OIDC — issuer + audience together turn hub auth on. The audience is
# the device-code client id (see argocd/manifests/authentik heph blueprint).
heph_oidc_issuer: https://authentik.ops.eblu.me/application/o/heph/
heph_oidc_audience: heph
# Self-update poll interval (seconds). 10 minutes.
heph_self_update_interval_secs: 600

@ -0,0 +1,6 @@
---
- name: Restart heph
ansible.builtin.shell: |
launchctl unload ~/Library/LaunchAgents/mcquack.eblume.heph.plist 2>/dev/null || true
launchctl load ~/Library/LaunchAgents/mcquack.eblume.heph.plist
changed_when: true

@ -0,0 +1,82 @@
---
# hephaestus hub (server mode) on indri.
#
# DATA SEEDING (one-time, Path A — do this BEFORE the first provision so the hub
# adopts gilbert's existing data instead of being born empty):
#
# 1. On the seed device (gilbert): heph daemon stop
# 2. Copy its store to indri: scp ~/.local/share/heph/heph.db \
# indri:~/.local/share/heph/heph.db
# 3. On indri, give the hub its OWN device origin (keeps gilbert's owner_id +
# data; hephd regenerates a fresh origin on next start when it is missing):
# sqlite3 ~/.local/share/heph/heph.db "DELETE FROM meta WHERE key='origin';"
# 4. Run this role (installs hephd, stages the PWA, loads the launchagent).
#
# hephd auto-creates an empty store on first start if none exists, so seeding is
# optional — skip it only if you intend a fresh, empty hub.
- name: Ensure heph data directory exists
ansible.builtin.file:
path: "{{ heph_data_dir }}"
state: directory
mode: '0700'
- name: Check for installed hephd binary
ansible.builtin.stat:
path: "{{ heph_binary }}"
register: heph_binary_stat
# Bootstrap install only when hephd is absent. Thereafter hephd's own
# --self-update keeps it current; ansible must not fight (or downgrade) it.
# This builds from source and can take several minutes on a cold cargo cache.
- name: Bootstrap-install heph + hephd from the forge ({{ heph_version }})
ansible.builtin.command:
cmd: >-
{{ heph_bin_dir }}/cargo install --locked
--git {{ heph_repo_url }}
--tag {{ heph_version }}
heph hephd
environment:
PATH: "{{ heph_bin_dir }}:/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin"
RUSTUP_TOOLCHAIN: "{{ heph_rust_toolchain }}"
when: not heph_binary_stat.stat.exists
changed_when: true
notify: Restart heph
# Checkout provides the PWA shell at {{ heph_web_root }} (heph-pwa/ subdir),
# served directly by hephd. Static files are read from disk per request, so a
# version bump needs no restart; the service worker (CACHE = "heph-pwa-vN")
# evicts stale assets on next load.
- name: Ensure heph cache parent directory exists
ansible.builtin.file:
path: "{{ heph_pwa_src_dir | dirname }}"
state: directory
mode: '0755'
- name: Stage heph-pwa source at {{ heph_version }}
ansible.builtin.git:
repo: "{{ heph_repo_url }}"
dest: "{{ heph_pwa_src_dir }}"
version: "{{ heph_version }}"
depth: 1
single_branch: true
force: true
- name: Deploy heph LaunchAgent plist
ansible.builtin.template:
src: heph.plist.j2
dest: ~/Library/LaunchAgents/mcquack.eblume.heph.plist
mode: '0644'
notify: Restart heph
- name: Check if heph LaunchAgent is loaded
ansible.builtin.command: launchctl list mcquack.eblume.heph
register: heph_launchctl_check
changed_when: false
failed_when: false
- name: Load heph LaunchAgent if not loaded
ansible.builtin.command: launchctl load ~/Library/LaunchAgents/mcquack.eblume.heph.plist
when: heph_launchctl_check.rc != 0
changed_when: true
failed_when: false

@ -0,0 +1,50 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- {{ ansible_managed }} -->
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>mcquack.eblume.heph</string>
<key>ProgramArguments</key>
<array>
<string>{{ heph_binary }}</string>
<string>--mode</string>
<string>server</string>
<string>--http-addr</string>
<string>{{ heph_http_addr }}</string>
<string>--db</string>
<string>{{ heph_db }}</string>
<string>--socket</string>
<string>{{ heph_socket }}</string>
<string>--web-root</string>
<string>{{ heph_web_root }}</string>
<string>--oidc-issuer</string>
<string>{{ heph_oidc_issuer }}</string>
<string>--oidc-audience</string>
<string>{{ heph_oidc_audience }}</string>
<string>--self-update</string>
<string>--self-update-interval-secs</string>
<string>{{ heph_self_update_interval_secs }}</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>EnvironmentVariables</key>
<dict>
<!-- cargo + toolchain on PATH so --self-update can run `cargo install`. -->
<key>PATH</key>
<string>{{ heph_bin_dir }}:/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
<key>HOME</key>
<string>/Users/erichblume</string>
<!-- Pin the rustup channel: the launchagent runs without mise, so a bare
cargo shim would otherwise use rustup's (stale) default toolchain. -->
<key>RUSTUP_TOOLCHAIN</key>
<string>{{ heph_rust_toolchain }}</string>
</dict>
<key>StandardOutPath</key>
<string>{{ heph_log_dir }}/mcquack.heph.out.log</string>
<key>StandardErrorPath</key>
<string>{{ heph_log_dir }}/mcquack.heph.err.log</string>
</dict>
</plist>

@ -15,7 +15,7 @@ spec:
source:
repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
targetRevision: main
path: argocd/manifests/external-secrets
path: argocd/manifests/external-secrets-ringtail
destination:
server: https://ringtail.tail8d86e.ts.net:6443
namespace: external-secrets

@ -0,0 +1,26 @@
# Mealie on ringtail k3s.
#
# Wave-1 indri-k8s decommission. Staging deployment; the minikube `mealie`
# app stays in parallel until cutover (copy SQLite PVC, drop the minikube
# tailscale ingress, flip Caddy). See [[migrate-wave1-ringtail]].
#
# Prerequisites:
# - external-secrets-ringtail (onepassword-blumeops ClusterSecretStore)
# - mealie-data PVC contents copied from minikube at cutover
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: mealie-ringtail
namespace: argocd
spec:
project: default
source:
repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
targetRevision: main
path: argocd/manifests/mealie-ringtail
destination:
server: https://ringtail.tail8d86e.ts.net:6443
namespace: mealie
syncPolicy:
syncOptions:
- CreateNamespace=true

@ -1,17 +0,0 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: mealie
namespace: argocd
spec:
project: default
source:
repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
targetRevision: main
path: argocd/manifests/mealie
destination:
server: https://kubernetes.default.svc
namespace: mealie
syncPolicy:
syncOptions:
- CreateNamespace=true

@ -0,0 +1,28 @@
# Paperless-ngx on ringtail k3s.
#
# Wave-1 indri-k8s decommission. Staging deployment; the minikube
# `paperless` app stays in parallel until cutover (drop the minikube
# tailscale ingress to free the name, then flip Caddy). See
# [[migrate-wave1-ringtail]].
#
# Prerequisites:
# - databases-ringtail blumeops-pg (paperless database + role)
# - external-secrets-ringtail (onepassword-blumeops ClusterSecretStore)
# - sifaka NFS rule granting ringtail access to /volume1/paperless
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: paperless-ringtail
namespace: argocd
spec:
project: default
source:
repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
targetRevision: main
path: argocd/manifests/paperless-ringtail
destination:
server: https://ringtail.tail8d86e.ts.net:6443
namespace: paperless
syncPolicy:
syncOptions:
- CreateNamespace=true

@ -1,17 +0,0 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: paperless
namespace: argocd
spec:
project: default
source:
repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
targetRevision: main
path: argocd/manifests/paperless
destination:
server: https://kubernetes.default.svc
namespace: paperless
syncPolicy:
syncOptions:
- CreateNamespace=true

@ -0,0 +1,28 @@
# TeslaMate on ringtail k3s.
#
# Wave-1 indri-k8s decommission. Staging deployment; the minikube
# `teslamate` app stays in parallel until cutover (migrate the teslamate
# database, drop the minikube tailscale ingress, flip Caddy). See
# [[migrate-wave1-ringtail]].
#
# Prerequisites:
# - databases-ringtail blumeops-pg (teslamate database + role; cube +
# earthdistance extensions created by superuser at cutover)
# - external-secrets-ringtail (onepassword-blumeops ClusterSecretStore)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: teslamate-ringtail
namespace: argocd
spec:
project: default
source:
repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
targetRevision: main
path: argocd/manifests/teslamate-ringtail
destination:
server: https://ringtail.tail8d86e.ts.net:6443
namespace: teslamate
syncPolicy:
syncOptions:
- CreateNamespace=true

@ -1,32 +0,0 @@
# TeslaMate Tesla Data Logger
# Requires: CloudNativePG PostgreSQL cluster and manual secret setup
#
# Before syncing, create the namespace and secrets:
# kubectl create namespace teslamate
# op inject -i argocd/manifests/databases/secret-teslamate.yaml.tpl | kubectl apply -f -
# op inject -i argocd/manifests/teslamate/secret-encryption-key.yaml.tpl | kubectl apply -f -
# op inject -i argocd/manifests/teslamate/secret-db.yaml.tpl | kubectl apply -f -
#
# Then create the database:
# PGPASSWORD=$(op read "op://blumeops/postgres/password") \
# psql -h pg.ops.eblu.me -U eblume -c "CREATE DATABASE teslamate OWNER teslamate;"
#
# After syncing, access the TeslaMate UI at https://tesla.tail8d86e.ts.net to complete
# Tesla API authentication via OAuth flow.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: teslamate
namespace: argocd
spec:
project: default
source:
repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git
targetRevision: main
path: argocd/manifests/teslamate
destination:
server: https://kubernetes.default.svc
namespace: teslamate
syncPolicy:
syncOptions:
- CreateNamespace=true

@ -191,14 +191,9 @@ prometheus.exporter.blackbox "services" {
}
target {
// Migrated to ringtail (wave-1); probe through Caddy over Tailscale.
name = "teslamate"
address = "http://teslamate.teslamate.svc.cluster.local:4000/"
module = "http_2xx"
}
target {
name = "immich"
address = "http://immich-server.immich.svc.cluster.local:2283/api/server/ping"
address = "https://tesla.ops.eblu.me/"
module = "http_2xx"
}

@ -45,6 +45,26 @@ prometheus.scrape "kube_state_metrics" {
forward_to = [prometheus.remote_write.prometheus.receiver]
}
// ============== SERVICE HEALTH PROBES ==============
// Blackbox-style HTTP probes for in-cluster services on ringtail
prometheus.exporter.blackbox "services" {
config = "{ modules: { http_2xx: { prober: http, timeout: 5s } } }"
target {
name = "immich"
address = "http://immich-server.immich.svc.cluster.local:2283/api/server/ping"
module = "http_2xx"
}
}
// Scrape blackbox probe results
prometheus.scrape "blackbox" {
targets = prometheus.exporter.blackbox.services.targets
scrape_interval = "30s"
forward_to = [prometheus.remote_write.prometheus.receiver]
}
// Push metrics to indri Prometheus
prometheus.remote_write "prometheus" {
external_labels = { cluster = "ringtail" }

@ -2,6 +2,9 @@
#
# - workflow-bot: minimal CI/CD permissions (sync, get)
# - admins: Authentik admins group mapped to ArgoCD admin role
# - admin: local break-glass account — keeps ArgoCD admin rights for when
# Authentik SSO is unavailable (without this it has no permissions, since
# policy.default is unset)
#
apiVersion: v1
kind: ConfigMap
@ -14,3 +17,4 @@ data:
p, role:workflow-bot, applications, get, *, allow
g, workflow-bot, role:workflow-bot
g, admins, role:admin
g, admin, role:admin

@ -434,3 +434,93 @@ data:
provider: !KeyOf mealie-provider
meta_launch_url: https://meals.ops.eblu.me
policy_engine_mode: all
heph.yaml: |
version: 1
metadata:
name: BlumeOps Heph SSO
labels:
blueprints.goauthentik.io/description: "Hephaestus hub OIDC (device-code) provider, application, and device-code flow"
entries:
# Device-code flow (RFC 8628). authentik ships no default for this, so we
# create one and bind it to the brand below. An empty stage_configuration
# flow is sufficient: the already-authenticated user just confirms the code.
- model: authentik_flows.flow
id: device-code-flow
identifiers:
slug: default-device-code-flow
attrs:
name: Device code flow
title: Device code flow
slug: default-device-code-flow
designation: stage_configuration
authentication: require_authenticated
# Enable the device-code grant globally by binding the flow to the default
# brand (domain authentik-default). Partial update — only sets this field.
- model: authentik_brands.brand
identifiers:
domain: authentik-default
attrs:
flow_device_code: !KeyOf device-code-flow
# OAuth2 provider for heph — PUBLIC client (device-code + PKCE, no secret).
# client_id doubles as the token audience the hub verifies (--oidc-audience heph),
# and the app slug 'heph' is the issuer path (/application/o/heph/).
- model: authentik_providers_oauth2.oauth2provider
id: heph-provider
identifiers:
name: Heph
attrs:
name: Heph
authorization_flow: !Find [authentik_flows.flow, [slug, default-provider-authorization-implicit-consent]]
invalidation_flow: !Find [authentik_flows.flow, [slug, default-provider-invalidation-flow]]
client_type: public
client_id: heph
# CLI/TUI use the device-code grant (no redirect). The heph-pwa browser
# login uses Authorization Code + PKCE, which DOES redirect back to the
# app's origin — register those here (Authentik also keys token-endpoint
# CORS off these origins). Trailing slash matters: the PWA's redirect_uri
# is its base dir, e.g. https://heph.ops.eblu.me/.
redirect_uris:
- matching_mode: strict
url: https://heph.ops.eblu.me/
- matching_mode: strict
url: http://localhost:8787/ # local dev (hephd --web-root)
signing_key: !Find [authentik_crypto.certificatekeypair, [name, authentik Self-signed Certificate]]
property_mappings:
- !Find [authentik_providers_oauth2.scopemapping, [scope_name, openid]]
- !Find [authentik_providers_oauth2.scopemapping, [scope_name, email]]
- !Find [authentik_providers_oauth2.scopemapping, [scope_name, profile]]
# offline_access: heph CLI requests "openid offline_access"; without
# this mapping the refresh token is session-bound and hephd's
# refresh_token grant 400s once the session lapses (spoke sync dies).
- !Find [authentik_providers_oauth2.scopemapping, [scope_name, offline_access]]
sub_mode: hashed_user_id
include_claims_in_id_token: true
# Heph application — linked to the OAuth2 provider
- model: authentik_core.application
id: heph-app
identifiers:
slug: heph
attrs:
name: Hephaestus
slug: heph
provider: !KeyOf heph-provider
meta_launch_url: https://heph.ops.eblu.me
policy_engine_mode: any
# Policy binding — restrict heph to admins group (single-owner, sensitive data)
- model: authentik_policies.policybinding
identifiers:
order: 0
target: !KeyOf heph-app
group: !Find [authentik_core.group, [name, admins]]
attrs:
target: !KeyOf heph-app
group: !Find [authentik_core.group, [name, admins]]
order: 0
enabled: true
negate: false
timeout: 30

@ -0,0 +1,97 @@
# PostgreSQL Cluster for blumeops services on ringtail k3s.
#
# Wave-1 indri-k8s decommission target (see [[migrate-wave1-ringtail]]).
# Holds the paperless and teslamate databases migrated off the minikube
# blumeops-pg via cold pg_dump/pg_restore at cutover. miniflux + authentik
# stay where they are for now (later waves), so this cluster only carries
# the wave-1 roles.
#
# Apps reach this in-cluster at blumeops-pg-rw.databases.svc.cluster.local
# — the same name they used on minikube, so teslamate's DATABASE_HOST is
# unchanged.
#
# Database creation is deferred to cutover, mirroring the minikube cluster
# (where only the bootstrap database is declared and the rest were created
# out-of-band):
# - paperless: the bootstrap database below (restored into at cutover).
# - teslamate: created at its cutover by the eblume superuser, because the
# dump's `earthdistance` extension is untrusted and CREATE EXTENSION
# needs superuser. (cube + earthdistance ownership then transferred to
# the teslamate role so it can ALTER EXTENSION UPDATE.)
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: blumeops-pg
namespace: databases
spec:
instances: 1
imageName: ghcr.io/cloudnative-pg/postgresql:18.3
storage:
size: 10Gi
storageClass: local-path
bootstrap:
initdb:
database: paperless
owner: paperless
managed:
roles:
# eblume superuser for admin + privileged restore steps (extensions)
- name: eblume
login: true
superuser: true
createdb: true
createrole: true
connectionLimit: -1
ensure: present
inherit: true
passwordSecret:
name: blumeops-pg-eblume
# borgmatic read-only user for backups
- name: borgmatic
login: true
connectionLimit: -1
ensure: present
inherit: true
inRoles:
- pg_read_all_data
passwordSecret:
name: blumeops-pg-borgmatic
# paperless user (also the bootstrap database owner above; the
# managed role sets its password from the 1Password-backed secret)
- name: paperless
login: true
connectionLimit: -1
ensure: present
inherit: true
passwordSecret:
name: blumeops-pg-paperless
# teslamate user. Extension ownership (cube, earthdistance) is
# transferred to this role at cutover so it can ALTER EXTENSION UPDATE.
- name: teslamate
login: true
connectionLimit: -1
ensure: present
inherit: true
passwordSecret:
name: blumeops-pg-teslamate
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "1Gi"
cpu: "500m"
postgresql:
parameters:
max_connections: "50"
shared_buffers: "128MB"
password_encryption: "scram-sha-256"
pg_hba:
# Password auth from anywhere; network security is via Tailscale.
- host all all 0.0.0.0/0 scram-sha-256
- host all all ::/0 scram-sha-256

@ -0,0 +1,30 @@
# ExternalSecret for borgmatic backup user password
#
# Replaces the manual op inject workflow from secret-borgmatic.yaml.tpl
#
# 1Password item: "borgmatic" in blumeops vault
# Field: "db-password"
#
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
name: blumeops-pg-borgmatic
namespace: databases
spec:
refreshInterval: 1h
secretStoreRef:
kind: ClusterSecretStore
name: onepassword-blumeops
target:
name: blumeops-pg-borgmatic
creationPolicy: Owner
template:
type: kubernetes.io/basic-auth
data:
username: borgmatic
password: "{{ .password }}"
data:
- secretKey: password
remoteRef:
key: borgmatic
property: db-password

@ -0,0 +1,30 @@
# ExternalSecret for eblume superuser password
#
# Replaces the manual op inject workflow from secret-eblume.yaml.tpl
#
# 1Password item: "postgres" in blumeops vault
# Field: "password"
#
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
name: blumeops-pg-eblume
namespace: databases
spec:
refreshInterval: 1h
secretStoreRef:
kind: ClusterSecretStore
name: onepassword-blumeops
target:
name: blumeops-pg-eblume
creationPolicy: Owner
template:
type: kubernetes.io/basic-auth
data:
username: eblume
password: "{{ .password }}"
data:
- secretKey: password
remoteRef:
key: postgres
property: password

@ -7,3 +7,10 @@ resources:
- immich-pg.yaml
- external-secret-immich-borgmatic.yaml
- service-immich-pg-tailscale.yaml
# wave-1 indri-k8s decommission: blumeops-pg (paperless + teslamate)
- blumeops-pg.yaml
- service-blumeops-pg-tailscale.yaml
- external-secret-eblume.yaml
- external-secret-borgmatic.yaml
- external-secret-paperless.yaml
- external-secret-teslamate.yaml

@ -0,0 +1,24 @@
# Tailscale LoadBalancer for the ringtail blumeops-pg cluster.
# Canonical hostname: blumeops-pg-ringtail.tail8d86e.ts.net (distinct from
# the minikube blumeops-pg, which still owns pg.tail8d86e.ts.net until the
# wave-1 decommission). Borgmatic on indri and the Grafana TeslaMate
# datasource reach it via the Caddy L4 route pg.ops.eblu.me:5434.
apiVersion: v1
kind: Service
metadata:
name: blumeops-pg-tailscale
namespace: databases
annotations:
tailscale.com/hostname: "blumeops-pg-ringtail"
tailscale.com/proxy-class: "default"
spec:
type: LoadBalancer
loadBalancerClass: tailscale
selector:
cnpg.io/cluster: blumeops-pg
role: primary
ports:
- name: postgresql
port: 5432
targetPort: 5432
protocol: TCP

@ -44,18 +44,9 @@ spec:
- pg_read_all_data
passwordSecret:
name: blumeops-pg-borgmatic
# teslamate user for TeslaMate Tesla data logger
# Superuser removed. Extension ownership (cube, earthdistance)
# transferred manually so teslamate can ALTER EXTENSION UPDATE.
# earthdistance is untrusted — DROP+CREATE needs temporary
# superuser escalation during upgrades.
- name: teslamate
login: true
connectionLimit: -1
ensure: present
inherit: true
passwordSecret:
name: blumeops-pg-teslamate
# teslamate + paperless roles removed: migrated to ringtail blumeops-pg
# (wave-1 decommission). Their databases were dropped from this cluster
# after the cutover was verified and backed up.
# authentik user for Authentik identity provider (runs on ringtail)
- name: authentik
login: true
@ -65,14 +56,6 @@ spec:
createdb: true
passwordSecret:
name: blumeops-pg-authentik
# paperless user for Paperless-ngx document management
- name: paperless
login: true
connectionLimit: -1
ensure: present
inherit: true
passwordSecret:
name: blumeops-pg-paperless
# Resource limits for minikube environment
resources:

@ -9,6 +9,4 @@ resources:
- service-metrics-tailscale.yaml
- external-secret-eblume.yaml
- external-secret-borgmatic.yaml
- external-secret-teslamate.yaml
- external-secret-authentik.yaml
- external-secret-paperless.yaml

@ -0,0 +1,16 @@
# Ringtail (amd64) overlay for external-secrets.
#
# Reuses the shared indri manifest as a base and only overrides the controller
# image to the nix-built amd64 variant (`-nix` tag). The base sets the arm64
# image (built via containers/external-secrets/container.py on indri's Dagger
# runner); ringtail's k3s is amd64 and needs the image built by
# containers/external-secrets/default.nix on the nix-container-builder.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../external-secrets
images:
- name: registry.ops.eblu.me/blumeops/external-secrets
newTag: v2.2.0-13895bb-nix

@ -12,4 +12,5 @@ resources:
images:
- name: ghcr.io/external-secrets/external-secrets
newTag: v2.2.0
newName: registry.ops.eblu.me/blumeops/external-secrets
newTag: v2.2.0-13895bb

@ -63,5 +63,7 @@ datasources:
password: $TESLAMATE_DB_PASSWORD
type: postgres
uid: TeslaMate
url: blumeops-pg-rw.databases.svc.cluster.local:5432
# teslamate DB migrated to ringtail blumeops-pg (wave-1); reached via the
# Caddy L4 route on indri (pg.ops.eblu.me:5434 -> blumeops-pg-ringtail).
url: pg.ops.eblu.me:5434
user: teslamate

@ -14,7 +14,9 @@ spec:
app.kubernetes.io/name: grafana
app.kubernetes.io/instance: grafana
strategy:
type: RollingUpdate
# RWO PVC for SQLite + Bleve index — RollingUpdate spawns the new pod
# before the old one terminates, and it crashloops on the index lock.
type: Recreate
template:
metadata:
labels:

@ -71,10 +71,6 @@
enableBlocks: true
enableNowPlaying: false
fields: ["movies", "series", "episodes"]
- Mealie:
href: https://meals.ops.eblu.me
icon: mealie.png
description: Recipe manager
- DJ:
href: https://dj.ops.eblu.me
icon: navidrome.png
@ -85,15 +81,7 @@
user: "{{HOMEPAGE_VAR_NAVIDROME_USER}}"
token: "{{HOMEPAGE_VAR_NAVIDROME_TOKEN}}"
salt: "{{HOMEPAGE_VAR_NAVIDROME_SALT}}"
- Paperless:
href: https://paperless.ops.eblu.me
icon: paperless-ngx.png
description: Document management
- Content:
- Immich:
href: https://photos.ops.eblu.me
icon: immich.png
description: Photo management
- Kiwix:
href: https://kiwix.ops.eblu.me
icon: kiwix.png
@ -138,10 +126,6 @@
href: https://docs.eblu.me
icon: mdi-book-open-page-variant
description: BlumeOps Documentation
- TeslaMate:
href: https://tesla.ops.eblu.me
icon: teslamate.png
description: Tesla data logger
- Transmission:
href: https://torrent.ops.eblu.me
icon: transmission.png

@ -21,8 +21,9 @@ images:
- name: ghcr.io/immich-app/immich-machine-learning
# CUDA variant of the same release — ringtail has an RTX 4080
newTag: v2.6.3-cuda
# Using upstream multi-arch valkey image directly; the
# registry.ops.eblu.me/blumeops/valkey mirror is arm64-only (built
# on indri) and would crashloop on ringtail.
# amd64 valkey built via nix on the ringtail nix-container-builder
# (see containers/valkey/default.nix). The Alpine container.py build
# is arm64-only and serves paperless on indri.
- name: docker.io/valkey/valkey
newTag: "8.1.6"
newName: registry.ops.eblu.me/blumeops/valkey
newTag: v8.1.7-ecded30-nix

@ -1,3 +1,9 @@
# Mealie on ringtail k3s — Nix image.
#
# Single gunicorn process (the Nix image's default `mealie-run` entrypoint
# runs init_db then gunicorn), serving the prebuilt frontend. DB is SQLite
# on the mealie-data PVC; its contents are copied from the minikube PVC at
# cutover. See [[migrate-wave1-ringtail]].
apiVersion: apps/v1
kind: Deployment
metadata:
@ -5,6 +11,8 @@ metadata:
namespace: mealie
spec:
replicas: 1
strategy:
type: Recreate
selector:
matchLabels:
app: mealie

@ -12,4 +12,4 @@ resources:
images:
- name: registry.ops.eblu.me/blumeops/mealie
newTag: v3.12.0-613f05d
newTag: v3.16.0-e0057b4-nix

@ -1,4 +1,5 @@
---
# SQLite data volume for Mealie on ringtail. Contents copied from the
# minikube mealie-data PVC at cutover (recipes, meal plans, uploaded media).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
@ -7,7 +8,7 @@ metadata:
spec:
accessModes:
- ReadWriteOnce
storageClassName: standard
storageClassName: local-path
resources:
requests:
storage: 2Gi

@ -10,4 +10,4 @@ resources:
images:
- name: nvcr.io/nvidia/k8s-device-plugin
newTag: v0.19.0
newTag: v0.19.2

@ -1,3 +1,17 @@
# Paperless-ngx on ringtail k3s — Nix image, multi-process.
#
# The upstream s6 image ran web + worker + scheduler + consumer (and DB
# migrations) in one container. The Nix image (containers/paperless/
# default.nix) ships the binaries but no supervisor, so we run those as
# four containers in one pod, sharing the local data/consume dirs
# (emptyDir) and the NFS media volume; redis is colocated so
# PAPERLESS_REDIS=localhost works for all. A migrate initContainer runs
# DB migrations once before the app containers start.
#
# DB points in-cluster at the ringtail blumeops-pg (was pg.ops.eblu.me on
# indri). PAPERLESS_{DATA_DIR,MEDIA_ROOT,CONSUMPTION_DIR} are set
# explicitly because the Nix package does not default to the upstream
# /usr/src/paperless paths.
apiVersion: apps/v1
kind: Deployment
metadata:
@ -5,6 +19,8 @@ metadata:
namespace: paperless
spec:
replicas: 1
strategy:
type: Recreate
selector:
matchLabels:
app: paperless
@ -16,27 +32,38 @@ spec:
securityContext:
seccompProfile:
type: RuntimeDefault
containers:
- name: paperless
image: registry.ops.eblu.me/blumeops/paperless:kustomized
initContainers:
# redis as a native sidecar (restartPolicy: Always): starts before
# the migrate init and stays running for the app containers, so all
# of them reach PAPERLESS_REDIS=localhost:6379.
- name: redis
image: docker.io/library/redis:kustomized
restartPolicy: Always
ports:
- containerPort: 8000
name: http
env:
- containerPort: 6379
volumeMounts:
- name: redis-data
mountPath: /data
resources:
requests:
memory: "32Mi"
cpu: "10m"
limits:
memory: "128Mi"
- name: migrate
image: registry.ops.eblu.me/blumeops/paperless:kustomized
command: ["paperless-ngx", "migrate", "--no-input"]
env: &paperless-env
- name: PAPERLESS_URL
value: "https://paperless.ops.eblu.me"
- name: PAPERLESS_REDIS
value: "redis://localhost:6379"
- name: PAPERLESS_DBHOST
value: "pg.ops.eblu.me"
value: "blumeops-pg-rw.databases.svc.cluster.local"
- name: PAPERLESS_DBPORT
value: "5432"
- name: PAPERLESS_DBNAME
value: "paperless"
# Explicit port to override k8s-injected PAPERLESS_PORT env var
# (k8s sets PAPERLESS_PORT=tcp://... for a service named 'paperless')
- name: PAPERLESS_PORT
value: "8000"
- name: PAPERLESS_DBUSER
value: "paperless"
- name: PAPERLESS_DBPASS
@ -44,6 +71,16 @@ spec:
secretKeyRef:
name: paperless-secrets
key: db-password
# Explicit port to override the k8s-injected PAPERLESS_PORT
# (service named 'paperless' would set PAPERLESS_PORT=tcp://...)
- name: PAPERLESS_PORT
value: "8000"
- name: PAPERLESS_DATA_DIR
value: "/usr/src/paperless/data"
- name: PAPERLESS_MEDIA_ROOT
value: "/usr/src/paperless/media"
- name: PAPERLESS_CONSUMPTION_DIR
value: "/usr/src/paperless/consume"
- name: PAPERLESS_SECRET_KEY
valueFrom:
secretKeyRef:
@ -55,7 +92,6 @@ spec:
value: "eng"
- name: PAPERLESS_TASK_WORKERS
value: "1"
# Admin account (created on first startup)
- name: PAPERLESS_ADMIN_USER
value: "eblume"
- name: PAPERLESS_ADMIN_PASSWORD
@ -65,8 +101,6 @@ spec:
key: admin-password
- name: PAPERLESS_ADMIN_MAIL
value: "blume.erich@gmail.com"
# OIDC via Authentik
# Full JSON blob pulled from 1Password (includes client secret)
- name: PAPERLESS_APPS
value: "allauth.socialaccount.providers.openid_connect"
- name: PAPERLESS_SOCIALACCOUNT_PROVIDERS
@ -82,19 +116,27 @@ spec:
value: "false"
- name: PAPERLESS_REDIRECT_LOGIN_TO_SSO
value: "false"
volumeMounts:
volumeMounts: &paperless-mounts
- name: data
mountPath: /usr/src/paperless/data
- name: media
mountPath: /usr/src/paperless/media
- name: consume
mountPath: /usr/src/paperless/consume
containers:
- name: web
image: registry.ops.eblu.me/blumeops/paperless:kustomized
ports:
- containerPort: 8000
name: http
env: *paperless-env
volumeMounts: *paperless-mounts
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "2Gi"
memory: "1Gi"
cpu: "1000m"
livenessProbe:
httpGet:
@ -109,16 +151,42 @@ spec:
initialDelaySeconds: 30
periodSeconds: 10
- name: redis
image: docker.io/library/redis:kustomized
ports:
- containerPort: 6379
- name: worker
image: registry.ops.eblu.me/blumeops/paperless:kustomized
command: ["celery", "--app", "paperless", "worker", "--loglevel", "INFO"]
env: *paperless-env
volumeMounts: *paperless-mounts
resources:
requests:
memory: "32Mi"
cpu: "10m"
memory: "256Mi"
cpu: "100m"
limits:
memory: "1Gi"
cpu: "1000m"
- name: beat
image: registry.ops.eblu.me/blumeops/paperless:kustomized
command: ["celery", "--app", "paperless", "beat", "--loglevel", "INFO"]
env: *paperless-env
volumeMounts: *paperless-mounts
resources:
requests:
memory: "64Mi"
cpu: "20m"
limits:
memory: "256Mi"
- name: consumer
image: registry.ops.eblu.me/blumeops/paperless:kustomized
command: ["paperless-ngx", "document_consumer"]
env: *paperless-env
volumeMounts: *paperless-mounts
resources:
requests:
memory: "128Mi"
cpu: "50m"
limits:
memory: "512Mi"
volumes:
- name: data
@ -128,3 +196,6 @@ spec:
claimName: paperless-media
- name: consume
emptyDir: {}
- name: redis-data
emptyDir:
sizeLimit: 1Gi

@ -13,7 +13,9 @@ resources:
images:
- name: registry.ops.eblu.me/blumeops/paperless
newTag: v2.20.13-07f52e9
newTag: v2.20.15-fcac8e5-nix
# amd64 valkey built via nix (the v8.1.7-ecded30 tag without -nix is the
# arm64 Alpine build for indri and fails on ringtail with exec format error)
- name: docker.io/library/redis
newName: registry.ops.eblu.me/blumeops/valkey
newTag: v8.1.6-r0-fabca04
newTag: v8.1.7-ecded30-nix

@ -0,0 +1,22 @@
# NFS PersistentVolume for the Paperless document library, mounted from
# ringtail. Same sifaka export (/volume1/paperless) as the minikube PV,
# but a distinct PV name so both clusters can declare it during the
# parallel-run before cutover.
#
# Prerequisite: sifaka must have an NFS rule granting ringtail Read/Write
# (Squash=No mapping) on the paperless share — the same step done for
# immich. See [[sifaka-nfs-from-ringtail]].
apiVersion: v1
kind: PersistentVolume
metadata:
name: paperless-media-nfs-pv-ringtail
spec:
capacity:
storage: 500Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: ""
nfs:
server: sifaka
path: /volume1/paperless

@ -1,5 +1,5 @@
# PersistentVolumeClaim for Paperless document library
# Binds to the NFS PV for sifaka:/volume1/paperless
# PersistentVolumeClaim for the Paperless document library on ringtail.
# Binds the NFS PV for sifaka:/volume1/paperless.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
@ -9,7 +9,7 @@ spec:
accessModes:
- ReadWriteMany
storageClassName: ""
volumeName: paperless-media-nfs-pv
volumeName: paperless-media-nfs-pv-ringtail
resources:
requests:
storage: 500Gi

@ -1,22 +0,0 @@
# NFS PersistentVolume for Paperless document library
# Requires: NFS share on sifaka at /volume1/paperless with NFS permissions for indri
#
# To create on Synology:
# 1. Control Panel > Shared Folder > Create
# 2. Name: paperless, Location: Volume 1
# 3. Control Panel > File Services > NFS > NFS Rules
# 4. Add rule for "paperless" share: Hostname=indri, Privilege=Read/Write, Squash=No mapping
apiVersion: v1
kind: PersistentVolume
metadata:
name: paperless-media-nfs-pv
spec:
capacity:
storage: 500Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: ""
nfs:
server: sifaka
path: /volume1/paperless

@ -6,48 +6,48 @@ Mutelist:
"apiserver_always_pull_images_plugin":
Regions: ["*"]
Resources: ["^kube-apiserver-minikube$"]
Description: "CC: single-user-cluster, local-registry. Only the operator has cluster access; all images pulled from private zot registry."
Description: "Only the operator has cluster access; all images pulled from private zot registry."
"apiserver_audit_log_maxage_set":
Regions: ["*"]
Resources: ["^kube-apiserver-minikube$"]
Description: "CC: observability-stack-audit. Alloy/Loki provides pod-level audit trail."
Description: "Alloy/Loki provides pod-level audit trail."
"apiserver_audit_log_maxbackup_set":
Regions: ["*"]
Resources: ["^kube-apiserver-minikube$"]
Description: "CC: observability-stack-audit. Alloy/Loki provides pod-level audit trail."
Description: "Alloy/Loki provides pod-level audit trail."
"apiserver_audit_log_maxsize_set":
Regions: ["*"]
Resources: ["^kube-apiserver-minikube$"]
Description: "CC: observability-stack-audit. Alloy/Loki provides pod-level audit trail."
Description: "Alloy/Loki provides pod-level audit trail."
"apiserver_audit_log_path_set":
Regions: ["*"]
Resources: ["^kube-apiserver-minikube$"]
Description: "CC: observability-stack-audit. Alloy/Loki provides pod-level audit trail."
Description: "Alloy/Loki provides pod-level audit trail."
"apiserver_deny_service_external_ips":
Regions: ["*"]
Resources: ["^kube-apiserver-minikube$"]
Description: "CC: tailscale-network-isolation. No external IPs routable; cluster only reachable via tailnet."
Description: "No external IPs routable; cluster only reachable via tailnet."
"apiserver_disable_profiling":
Regions: ["*"]
Resources: ["^kube-apiserver-minikube$"]
Description: "CC: tailscale-network-isolation. Profiling endpoint unreachable from public internet."
Description: "Profiling endpoint unreachable from public internet."
"apiserver_encryption_provider_config_set":
Regions: ["*"]
Resources: ["^kube-apiserver-minikube$"]
Description: "CC: tailscale-network-isolation, single-user-cluster. Etcd not network-exposed; only operator has node access."
Description: "Etcd not network-exposed; only operator has node access."
"apiserver_kubelet_cert_auth":
Regions: ["*"]
Resources: ["^kube-apiserver-minikube$"]
Description: "CC: tailscale-network-isolation. Kubelet API not exposed outside the node; minikube auto-generates certificates."
Description: "Kubelet API not exposed outside the node; minikube auto-generates certificates."
"apiserver_request_timeout_set":
Regions: ["*"]
Resources: ["^kube-apiserver-minikube$"]
Description: "CC: tailscale-network-isolation. API server only reachable via tailnet; DoS risk limited to trusted clients."
Description: "API server only reachable via tailnet; DoS risk limited to trusted clients."
"apiserver_service_account_lookup_true":
Regions: ["*"]
Resources: ["^kube-apiserver-minikube$"]
Description: "CC: single-user-cluster. Only operator manages service accounts; no revoked tokens in circulation."
Description: "Only operator manages service accounts; no revoked tokens in circulation."
"apiserver_strong_ciphers_only":
Regions: ["*"]
Resources: ["^kube-apiserver-minikube$"]
Description: "CC: tailscale-network-isolation. API server traffic encrypted by WireGuard at the network layer."
Description: "API server traffic encrypted by WireGuard at the network layer."

@ -6,12 +6,12 @@ Mutelist:
"controllermanager_disable_profiling":
Regions: ["*"]
Resources: ["^kube-controller-manager-minikube$"]
Description: "CC: tailscale-network-isolation. Profiling endpoint unreachable from public internet."
Description: "Profiling endpoint unreachable from public internet."
"scheduler_profiling":
Regions: ["*"]
Resources: ["^kube-scheduler-minikube$"]
Description: "CC: tailscale-network-isolation. Profiling endpoint unreachable from public internet."
Description: "Profiling endpoint unreachable from public internet."
"kubelet_tls_cert_and_key":
Regions: ["*"]
Resources: ["^kubelet-config$"]
Description: "CC: tailscale-network-isolation, single-user-cluster. Kubelet API not exposed outside node; minikube auto-generates certificates."
Description: "Kubelet API not exposed outside node; minikube auto-generates certificates."

@ -17,9 +17,8 @@ Mutelist:
- "^kindnet-"
- "^storage-provisioner$"
Description: >-
CC: tailscale-network-isolation. Control-plane and networking
pods require hostNetwork by design. Host network itself is
only reachable via tailnet.
Control-plane and networking pods require hostNetwork by design.
Host network itself is only reachable via tailnet.
"core_minimize_privileged_containers":
Regions: ["*"]
Resources:
@ -31,7 +30,6 @@ Mutelist:
# Forgejo runner
- "^forgejo-runner-"
Description: >-
CC: single-user-cluster, operator-managed-pods, trusted-ci-only.
kube-proxy: system pod, single-user cluster. ts-*/ingress-*:
Tailscale operator-managed. forgejo-runner: DinD limited to
trusted private forge repos.
@ -49,25 +47,24 @@ Mutelist:
- "^nameserver-"
- "^ingress-"
Description: >-
CC: single-user-cluster, operator-managed-pods. System pods
managed by minikube and Tailscale operator; seccomp profiles
set by upstream. Single-user cluster limits exploit surface.
System pods managed by minikube and Tailscale operator;
seccomp profiles set by upstream. Single-user cluster limits
exploit surface.
"core_minimize_hostPID_containers":
Regions: ["*"]
Resources:
- "^prowler-"
Description: >-
CC: ephemeral-privileged-jobs. Prowler CIS scanner requires
hostPID for file permission checks. Runs as CronJob with
7-day TTL, not a persistent workload.
Prowler CIS scanner requires hostPID for file permission
checks. Runs as CronJob with 7-day TTL, not a persistent
workload.
"core_minimize_root_containers_admission":
Regions: ["*"]
Resources:
- "^grafana-"
Description: >-
CC: init-container-isolation. Root limited to init-chown-data
container; all runtime containers run as UID 472 with caps
dropped.
Root limited to init-chown-data container; all runtime
containers run as UID 472 with caps dropped.
"core_minimize_containers_added_capabilities":
Regions: ["*"]
Resources:
@ -77,10 +74,9 @@ Mutelist:
# Grafana init-chown-data
- "^grafana-"
Description: >-
CC: single-user-cluster, init-container-isolation. System
pods: capabilities required by function (minikube-managed).
Grafana: CHOWN limited to init phase; runtime containers
drop ALL.
System pods: capabilities required by function
(minikube-managed). Grafana: CHOWN limited to init phase;
runtime containers drop ALL.
"core_minimize_containers_capabilities_assigned":
Regions: ["*"]
Resources:
@ -88,5 +84,4 @@ Mutelist:
- "^kindnet-"
- "^grafana-"
Description: >-
CC: single-user-cluster, init-container-isolation. See
core_minimize_containers_added_capabilities.
See core_minimize_containers_added_capabilities.

@ -1,7 +1,7 @@
# Node-level and RBAC checks that Prowler reports as MANUAL because it
# cannot evaluate them from inside a pod. Compensated by automated
# verification in `mise run review-compliance-reports`, which SSHes into
# the minikube node and checks each condition directly every week.
# cannot evaluate them from inside a pod. Verified out-of-band by the
# node-verification block in `mise run review-compliance-reports`, which
# SSHes into the minikube node and checks each condition directly.
Mutelist:
Accounts:
"*":
@ -9,51 +9,51 @@ Mutelist:
"etcd_unique_ca":
Regions: ["*"]
Resources: ["^etcd-minikube$"]
Description: "CC: node-config-automated-verification. Etcd CA fingerprint verified different from cluster CA by review-compliance-reports."
Description: "Etcd CA fingerprint verified different from cluster CA by review-compliance-reports."
"kubelet_conf_file_ownership":
Regions: ["*"]
Resources: ["^kubelet-config$"]
Description: "CC: node-config-automated-verification. File ownership verified root:root by review-compliance-reports."
Description: "File ownership verified root:root by review-compliance-reports."
"kubelet_conf_file_permissions":
Regions: ["*"]
Resources: ["^kubelet-config$"]
Description: "CC: node-config-automated-verification. File permissions verified 600 by review-compliance-reports."
Description: "File permissions verified 600 by review-compliance-reports."
"kubelet_config_yaml_ownership":
Regions: ["*"]
Resources: ["^kubelet-config$"]
Description: "CC: node-config-automated-verification. File ownership verified root:root by review-compliance-reports."
Description: "File ownership verified root:root by review-compliance-reports."
"kubelet_config_yaml_permissions":
Regions: ["*"]
Resources: ["^kubelet-config$"]
Description: "CC: node-config-automated-verification. File permissions verified 644 by review-compliance-reports."
Description: "File permissions verified 644 by review-compliance-reports."
"kubelet_service_file_ownership_root":
Regions: ["*"]
Resources: ["^kubelet-config$"]
Description: "CC: node-config-automated-verification. File ownership verified root:root by review-compliance-reports."
Description: "File ownership verified root:root by review-compliance-reports."
"kubelet_service_file_permissions":
Regions: ["*"]
Resources: ["^kubelet-config$"]
Description: "CC: node-config-automated-verification. File permissions verified 644 by review-compliance-reports."
Description: "File permissions verified 644 by review-compliance-reports."
"kubelet_disable_read_only_port":
Regions: ["*"]
Resources: ["^kubelet-config$"]
Description: "CC: node-config-automated-verification. readOnlyPort absence (defaults to 0) verified by review-compliance-reports."
Description: "readOnlyPort absence (defaults to 0) verified by review-compliance-reports."
"kubelet_event_record_qps":
Regions: ["*"]
Resources: ["^kubelet-config$"]
Description: "CC: node-config-automated-verification. eventRecordQPS absence (defaults to 5) verified by review-compliance-reports."
Description: "eventRecordQPS absence (defaults to 5) verified by review-compliance-reports."
"kubelet_manage_iptables":
Regions: ["*"]
Resources: ["^kubelet-config$"]
Description: "CC: node-config-automated-verification. makeIPTablesUtilChains absence (defaults to true) verified by review-compliance-reports."
Description: "makeIPTablesUtilChains absence (defaults to true) verified by review-compliance-reports."
"kubelet_strong_ciphers_only":
Regions: ["*"]
Resources: ["^kubelet-config$"]
Description: "CC: node-config-automated-verification, tailscale-network-isolation. Go default ciphers used; all traffic WireGuard-encrypted via tailnet."
Description: "Go default ciphers used; all traffic WireGuard-encrypted via tailnet."
"rbac_cluster_admin_usage":
Regions: ["*"]
Resources:
- "^cluster-admin$"
- "^kubeadm:cluster-admins$"
- "^minikube-rbac$"
Description: "CC: node-config-automated-verification, single-user-cluster. Only built-in/minikube cluster-admin bindings present; verified by review-compliance-reports."
Description: "Only built-in/minikube cluster-admin bindings present; verified by review-compliance-reports."

@ -13,9 +13,8 @@ Mutelist:
# ArgoCD
- "^argocd-"
Description: >-
CC: single-user-cluster, sso-gated-admin-tools. Built-in
K8s roles: only operator can bind them. ArgoCD: requires
broad access but is SSO-gated via Authentik OIDC.
Built-in K8s roles: only operator can bind them. ArgoCD:
requires broad access but is SSO-gated via Authentik OIDC.
"rbac_minimize_pod_creation_access":
Regions: ["*"]
Resources:
@ -26,14 +25,12 @@ Mutelist:
# CloudNativePG operator
- "^cnpg-manager$"
Description: >-
CC: single-user-cluster. Built-in K8s roles and CNPG
operator. Only the operator can assign these roles; no
untrusted users have cluster access.
Built-in K8s roles and CNPG operator. Only the operator can
assign these roles; no untrusted users have cluster access.
"rbac_minimize_service_account_token_creation":
Regions: ["*"]
Resources:
- "^system:"
Description: >-
CC: single-user-cluster. kube-controller-manager requires
token creation for SA management. Only operator manages
service accounts.
kube-controller-manager requires token creation for SA
management. Only operator manages service accounts.

@ -14,26 +14,24 @@ misconfigurations:
paths:
- "argocd/manifests/external-secrets/rbac.yaml"
statement: >-
CC: operator-purpose-bound-rbac. external-secrets-operator's entire
function is to read and synthesize Secret objects; ClusterRole over
secrets is its purpose. Both the controller and cert-controller are
external-secrets-operator's entire function is to read and
synthesize Secret objects; ClusterRole over secrets is its
purpose. Both the controller and cert-controller are
upstream-defined.
- id: KSV-0041
paths:
- "argocd/manifests/kube-state-metrics/rbac.yaml"
- "argocd/manifests/kube-state-metrics-ringtail/rbac.yaml"
statement: >-
CC: kube-state-metrics-metadata-only. KSM exposes only Secret
metadata (name, namespace, type, labels), never the data field.
list/watch on secrets is required for kube_secret_info /
kube_secret_labels metrics.
KSM exposes only Secret metadata (name, namespace, type, labels),
never the data field. list/watch on secrets is required for
kube_secret_info / kube_secret_labels metrics.
- id: KSV-0114
paths:
- "argocd/manifests/external-secrets/rbac.yaml"
statement: >-
CC: operator-purpose-bound-rbac. cert-controller manages the
external-secrets validating webhook configurations to inject its
own rotating CA bundle. RBAC is scoped to two named webhooks
(secretstore-validate, externalsecret-validate) via resourceNames;
KSV-0114 doesn't see the resourceNames restriction so reports the
full ClusterRole.
cert-controller manages the external-secrets validating webhook
configurations to inject its own rotating CA bundle. RBAC is
scoped to two named webhooks (secretstore-validate,
externalsecret-validate) via resourceNames; KSV-0114 doesn't see
the resourceNames restriction so reports the full ClusterRole.

@ -14,4 +14,4 @@ resources:
images:
- name: registry.ops.eblu.me/blumeops/shower
newTag: v1.1.0-3c7967e-nix
newTag: v1.1.3-3645098-nix

@ -6,8 +6,11 @@ namespace: tailscale
# Upstream Tailscale operator manifest from forge mirror.
# To upgrade: update the ref in the URL AND the newTag below.
# Must use the tailnet host forge.ops.eblu.me — the public forge.eblu.me
# black-holes /mirrors/ at the Fly edge (AI-scraper mitigation), which the
# in-cluster ArgoCD repo-server would otherwise hit and fail with a 403.
resources:
- https://forge.eblu.me/mirrors/tailscale/raw/tag/v1.94.2/cmd/k8s-operator/deploy/manifests/operator.yaml
- https://forge.ops.eblu.me/mirrors/tailscale/raw/tag/v1.94.2/cmd/k8s-operator/deploy/manifests/operator.yaml
- proxyclass.yaml
- dnsconfig.yaml

@ -1,3 +1,10 @@
# TeslaMate on ringtail k3s — Nix image.
#
# The Nix image's Entrypoint waits for postgres, runs migrations
# (TeslaMate.Release.migrate), then starts the release — so no command
# override is needed. Stateless; all data lives in the teslamate database
# on the ringtail blumeops-pg (DATABASE_HOST already an in-cluster name,
# unchanged from minikube). See [[migrate-wave1-ringtail]].
apiVersion: apps/v1
kind: Deployment
metadata:

@ -12,4 +12,4 @@ resources:
images:
- name: registry.ops.eblu.me/blumeops/teslamate
newTag: v3.0.0-08c698e
newTag: v3.0.0-fcac8e5-nix

@ -1,69 +0,0 @@
# TeslaMate

TeslaMate is a self-hosted Tesla data logger that collects and visualizes vehicle data.

## Prerequisites

### 1. Create 1Password Secrets

Create two items in the blumeops 1Password vault:

1. **TeslaMate DB Password**
   - Generate a secure password for the teslamate PostgreSQL user
   - Add a field named `password` with the generated value

2. **TeslaMate Encryption Key**
   - Generate with: `openssl rand -base64 32`
   - Add a field named `key` with the generated value
   - This encrypts Tesla API tokens at rest in the database
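
As a quick sanity check on the key format, the sketch below generates a key with the same `openssl rand -base64 32` command and confirms that it decodes back to 32 bytes. The variable names are illustrative and not part of the original setup, and it assumes `openssl` is available on the PATH.

```bash
# Sketch only: variable names are illustrative, not from the original README.

# Same command the README gives for the encryption key.
key="$(openssl rand -base64 32)"

# 32 random bytes encode to 44 base64 characters (including one '=' pad).
key_len="${#key}"

# Decode the key again and count the bytes it carries.
decoded_len="$(printf '%s\n' "$key" | openssl base64 -d | wc -c | tr -d ' ')"

echo "key length: ${key_len} chars, decoded: ${decoded_len} bytes"
```

If both numbers look right, the value of `key` is what would go into the `key` field of the 1Password item.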

### 2. Apply Kubernetes Secrets

```bash
# Create namespace
kubectl create namespace teslamate

# Apply database user secret (for CNPG)
op inject -i argocd/manifests/databases/secret-teslamate.yaml.tpl | kubectl apply -f -

# Apply teslamate secrets
op inject -i argocd/manifests/teslamate/secret-encryption-key.yaml.tpl | kubectl apply -f -
op inject -i argocd/manifests/teslamate/secret-db.yaml.tpl | kubectl apply -f -
```

### 3. Create Database

After the teslamate user exists in PostgreSQL (sync blumeops-pg first):

```bash
PGPASSWORD=$(op read "op://blumeops/postgres/password") \
  psql -h pg.ops.eblu.me -U eblume -c "CREATE DATABASE teslamate OWNER teslamate;"
```

## Deployment

```bash
# Sync ArgoCD apps
argocd app sync apps
argocd app sync blumeops-pg teslamate grafana grafana-config
```

## Tesla API Setup

1. Access TeslaMate UI at https://tesla.tail8d86e.ts.net
2. Click "Sign in with Tesla"
3. Complete OAuth flow in browser
4. Tokens are encrypted and stored in database
5. Verify vehicle appears and data collection starts

## Grafana Dashboards

TeslaMate dashboards are available in Grafana at https://grafana.tail8d86e.ts.net

They use the "TeslaMate" PostgreSQL datasource (not Prometheus).

## Notes

- MQTT is disabled (can be enabled later for Home Assistant integration)
- Timezone is set to America/Los_Angeles
- Encryption key protects Tesla API tokens at rest

@ -10,7 +10,7 @@ resources:
images:
- name: registry.ops.eblu.me/blumeops/unpoller
newTag: v2.34.0-613f05d
newTag: v3.2.0-4d1f4af
configMapGenerator:
- name: unpoller-config

@ -1,210 +0,0 @@
# Compensating Controls
#
# Documents controls that mitigate risks from suppressed or accepted security
# findings. Referenced by security tools (Prowler mutelist, Kingfisher config,
# etc.) via "CC: <id>" in finding descriptions or suppression notes.
#
# Used by `mise run review-compensating-controls` to surface stale controls.
#
# Fields:
# id - kebab-case unique identifier, referenced from tool configs
# description - what the control actually does to mitigate risk
# created - date (YYYY-MM-DD) the control was documented
# last-reviewed - date (YYYY-MM-DD) or null
# notes - optional context
controls:
- id: single-user-cluster
description: >-
Only the cluster operator (eblume) has kubectl access. No untrusted
users can create pods, access cached images, or bind RBAC roles.
created: 2026-03-30
last-reviewed: 2026-04-01
notes: >-
Verify by checking kubeconfig distribution and Tailscale ACLs.
If additional users gain cluster access, re-evaluate all findings
muted under this control.
- id: tailscale-network-isolation
description: >-
Cluster is not internet-exposed. All access requires Tailscale
identity with ACL enforcement. Profiling endpoints, debug ports,
and control-plane APIs are unreachable from the public internet.
created: 2026-03-30
last-reviewed: 2026-04-06
notes: >-
Verify with 'tailscale serve status --json' on indri and review
Tailscale ACLs in pulumi/tailscale/. Only tag:flyio-target services
are publicly routable.
- id: local-registry
description: >-
Operator-built services use a private zot registry
(registry.ops.eblu.me) for supply-chain control. Remaining
images are pulled from public registries without stored
credentials. No shared registry secrets are cached on cluster
nodes.
created: 2026-03-30
last-reviewed: 2026-04-12
notes: >-
Verify by checking image prefixes in kustomization.yaml files.
Known external-image categories: (1) upstream apps not yet
mirrored — immich, ollama, frigate, frigate-notify, valkey;
(2) infrastructure components — tailscale operator/proxy,
external-secrets, 1password-connect, forgejo-runner, docker
DinD, nvidia-device-plugin; (3) utility base images — busybox,
alpine (grafana init containers). Track upstream versions in
service-versions.yaml. Goal is to progressively mirror these
into zot.
- id: sso-gated-admin-tools
description: >-
ArgoCD requires SSO authentication via Authentik OIDC. Wildcard
RBAC roles are mitigated by requiring authenticated identity
before any API access.
created: 2026-03-30
last-reviewed: 2026-04-14
notes: >-
Verify Authentik OIDC provider config for ArgoCD and that
anonymous access is disabled. Check ArgoCD --auth-token isn't
leaked. The workflow-bot API key account is scoped to sync/get
only.
- id: operator-managed-pods
description: >-
Tailscale operator manages proxy pod specs (ts-*, ingress-*,
operator-*, nameserver-*). Pod security settings are set by the
operator, not user manifests. Operator is tracked in
service-versions.yaml and regularly updated.
created: 2026-03-30
last-reviewed: 2026-04-21
notes: >-
Verify operator version is current via 'mise run service-review'.
Check Tailscale changelog for security fixes. If operator adds
seccomp support, remove these mutes. As of 2026-04-21: still no
default seccomp on operator-generated pods (upstream issue #7359
open). A ProxyClass + generic device plugin can downgrade proxies
from privileged to NET_ADMIN+NET_RAW and set seccompProfile —
potential future remediation to remove the seccomp mute without
waiting for upstream defaults.
- id: ephemeral-privileged-jobs
description: >-
Prowler CIS scanner runs as a CronJob with 7-day TTL
auto-deletion, not as a persistent privileged workload. hostPID
exposure is time-bounded to scan duration (~20s).
created: 2026-03-30
last-reviewed: 2026-04-29
notes: >-
Verify TTL is set in cronjob.yaml. Check that no persistent
pods run with hostPID on the scanned cluster (indri). The
alloy-tracing DaemonSet on ringtail also uses hostPID but is
out of scope — Prowler only scans indri. Tracked in Todoist:
"prowler scan against ringtail" — once that lands, the
DaemonSet's hostPID+privileged posture will surface as a CIS
finding and need its own CC or remediation.
- id: trusted-ci-only
description: >-
Forgejo runner only executes workflows from repos on the private
forge (forge.ops.eblu.me). No external or untrusted repos can
trigger privileged CI jobs.
created: 2026-03-30
last-reviewed: 2026-05-01
notes: >-
Verification: (1) Runner config (argocd/manifests/forgejo-runner/
config.yaml) connects only to https://forge.ops.eblu.me/. (2) Forge
app.ini has DISABLE_REGISTRATION=true and ALLOW_ONLY_EXTERNAL_REGISTRATION
=true (ansible/roles/forgejo/defaults/main.yml) — no untrusted users
can sign up or create repos. The runner registers at instance scope
(repo_id=0/owner_id=0 in action_runner table), but the instance itself
is closed, so no per-repo allow-list is needed. Re-evaluate if the
forge ever opens to additional users or if the runner is repointed
to an external forge.
- id: init-container-isolation
description: >-
Root privileges and added capabilities (CHOWN) are limited to
init containers that run once at pod startup. All runtime
containers run as non-root (UID 472) with all capabilities
dropped.
created: 2026-03-30
last-reviewed: 2026-05-04
notes: >-
Verify by inspecting grafana deployment.yaml securityContext
for both init and runtime containers. If fsGroup alone can
handle PVC ownership, remove init-chown-data and this control.
Retirement deferred until grafana lands on ringtail's k3s
(see [[indri-k8s-migration]]) — storage backend will change,
and removing init-chown-data right before that migration
trades a real safety net for marginal cleanup. Revisit
post-migration.
- id: node-config-automated-verification
description: >-
Prowler reports certain node-level checks as MANUAL because it runs
inside a pod and cannot evaluate kubelet file permissions, kubelet
config arguments, etcd CA separation, or cluster-admin RBAC bindings.
The review-compliance-reports script SSHes into the minikube node
weekly and programmatically verifies each condition, failing loudly
if any check deviates from expected values.
created: 2026-04-14
last-reviewed: 2026-04-14
notes: >-
Verification runs as part of 'mise run review-compliance-reports'.
If minikube node is unreachable, all checks report as FAIL. If new
MANUAL findings appear in Prowler, add corresponding verification
logic to the script and update the mutelist.
- id: operator-purpose-bound-rbac
description: >-
Operators whose entire function is to manage a sensitive resource
legitimately need RBAC over that resource. external-secrets-operator
manages Secret objects (its purpose) and the cert-controller mutates
its own ValidatingWebhookConfigurations to inject rotating CA bundles.
Risk is bounded by: (1) the operator code being upstream open-source
and reviewed; (2) RBAC scoped to specific named webhooks where
possible; (3) supply chain controls on the operator image (mirrored
to local registry, version tracked in service-versions.yaml).
created: 2026-04-27
last-reviewed: 2026-04-27
notes: >-
Verify by checking that the operators in question still match their
stated purpose (i.e. external-secrets is still the only consumer of
these ClusterRoles) and that upstream hasn't published advisories
for credential-handling bugs. Re-evaluate if a non-secrets-managing
ClusterRole appears under this control.
- id: kube-state-metrics-metadata-only
description: >-
kube-state-metrics holds list/watch on Secrets cluster-wide but only
exposes Secret object *metadata* (name, namespace, type, creation
timestamp, labels) via the kube_secret_info / kube_secret_labels
metrics. Secret data fields are never read into KSM's exposed
metrics by upstream design. Mitigation rests on KSM's metric
schema, the version pin in service-versions.yaml, and the metrics
endpoint being reachable only on the cluster network.
created: 2026-04-27
last-reviewed: 2026-04-27
notes: >-
Verify by inspecting the /metrics endpoint output for any series
that include secret data (only *_info and *_labels metrics should
reference secrets, and labels should be limited to user-applied
labels — never the data:). Re-evaluate on KSM version bumps.
- id: observability-stack-audit
description: >-
Alloy collects pod logs and ships them to Loki, providing an
audit trail for cluster activity. Compensates for missing
apiserver audit logging which neither minikube (indri) nor
k3s (ringtail) configures by default.
created: 2026-03-30
last-reviewed: 2026-05-11
notes: >-
Verify Alloy DaemonSet is running on each cluster (alloy-k8s on
minikube, alloy-ringtail on k3s) and Loki is receiving logs.
Note this is weaker than native apiserver audit logs — it
captures pod stdout/stderr, not API request-level auditing.
Consider enabling apiserver audit logging on k3s post-migration
(`--audit-log-path` / `--audit-policy-file`) — minikube made it
hard, k3s makes it straightforward.
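
Review note on kube-state-metrics-metadata-only: the manual check in its notes (inspect /metrics for secret-related series outside *_info and *_labels) could be scripted. The following is a minimal sketch, not the repo's tooling. `unexpected_secret_series`, the allow-list default, and the sample metric `kube_secret_data_dump` are hypothetical names invented for illustration, and the sketch assumes the /metrics body has already been fetched as Prometheus text exposition.

```python
import re

# Hypothetical allow-list taken from the control's notes: only *_info and
# *_labels series are expected to reference Secrets.
ALLOWED_SUFFIXES = ("_info", "_labels")

# Metric name at the start of a sample line, up to '{' or whitespace.
_METRIC_NAME = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)")


def unexpected_secret_series(metrics_text, allowed_suffixes=ALLOWED_SUFFIXES):
    """Return sorted kube_secret_* metric names outside the allow-list."""
    allowed = tuple(allowed_suffixes)
    found = set()
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _METRIC_NAME.match(line)
        if not match:
            continue
        name = match.group(1)
        if name.startswith("kube_secret_") and not name.endswith(allowed):
            found.add(name)
    return sorted(found)


# Made-up sample input; kube_secret_data_dump is not a real KSM metric.
SAMPLE = """\
# HELP kube_secret_info Information about secret.
kube_secret_info{namespace="default",secret="db"} 1
kube_secret_labels{namespace="default",secret="db"} 1
kube_secret_data_dump{namespace="default",secret="db"} 1
kube_pod_info{namespace="default",pod="web"} 1
"""

if __name__ == "__main__":
    print(unexpected_secret_series(SAMPLE))
```

The sketch only inspects metric names, not label values, so it complements rather than replaces the manual inspection. Real KSM releases may emit other metadata-only series, which is why the allow-list is a parameter rather than a constant baked into the function.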

@ -0,0 +1,51 @@
"""External Secrets Operator — native Dagger build.
Two-stage build: Go binary (all providers), Alpine runtime.
Source cloned from forge mirror.
A single binary serves as the controller, webhook, and cert-controller; the
Deployments select the role via a subcommand passed in `args:`, so the image
ENTRYPOINT must be the binary itself (matching upstream's distroless image).
"""

import dagger

from blumeops.containers import (
    alpine_runtime,
    clone_from_forge,
    go_build,
    oci_labels,
)

VERSION = "v2.2.0"


async def build(src: dagger.Directory) -> dagger.Container:
    source = clone_from_forge("external-secrets", VERSION)

    # Upstream `make build` compiles every secret provider into a single
    # static binary (`-tags all_providers`, CGO disabled). Mirror that so the
    # local image is functionally identical to ghcr.io/.../external-secrets.
    backend = go_build(
        source,
        "/external-secrets",
        tags="all_providers",
    )

    runtime = alpine_runtime(
        extra_apk=["ca-certificates"],
        create_user=False,
    )
    runtime = oci_labels(
        runtime,
        title="External Secrets Operator",
        description=(
            "Kubernetes operator that integrates external secret management systems"
        ),
        version=VERSION,
    )

    return (
        runtime.with_file("/bin/external-secrets", backend.file("/external-secrets"))
        .with_user("65534")
        .with_entrypoint(["/bin/external-secrets"])
    )

@ -0,0 +1,56 @@
# Nix-built External Secrets Operator (amd64, for ringtail k3s).
# Builds v2.2.0 from the forge mirror with all secret providers compiled in,
# faithful to upstream's `make build` (-tags all_providers). The container.py
# sibling builds the arm64 image for indri's minikube; this default.nix builds
# the amd64 image on ringtail's nix-container-builder.
{ pkgs ? import <nixpkgs> { } }:
let
version = "2.2.0";
src = pkgs.fetchgit {
url = "https://forge.ops.eblu.me/mirrors/external-secrets.git";
rev = "v${version}";
hash = "sha256-eAocOAp5s4CFRrpKfQr2lf3Ji+6nQQ1A5/eTw5B7v9U=";
};
# external-secrets v2.2.0 requires Go >= 1.26.1; nixpkgs default go is 1.25.x.
external-secrets = (pkgs.buildGoModule.override { go = pkgs.go_1_26; }) {
inherit src version;
pname = "external-secrets";
vendorHash = "sha256-0xuBK3fjAplPLAElHvKB6d+2lDz+De/s91fV4dPZwjE=";
doCheck = false;
subPackages = [ "." ];
tags = [ "all_providers" ];
ldflags = [ "-s" "-w" ];
meta = with pkgs.lib; {
description = "Kubernetes operator that integrates external secret management systems";
homepage = "https://github.com/external-secrets/external-secrets";
license = licenses.asl20;
mainProgram = "external-secrets";
};
};
in
pkgs.dockerTools.buildLayeredImage {
name = "blumeops/external-secrets";
contents = [
external-secrets
pkgs.cacert
pkgs.tzdata
];
config = {
Entrypoint = [ "${external-secrets}/bin/external-secrets" ];
Env = [
"SSL_CERT_FILE=${pkgs.cacert}/etc/ssl/certs/ca-bundle.crt"
"TZDIR=${pkgs.tzdata}/share/zoneinfo"
];
User = "65534";
};
}

@ -1,145 +0,0 @@
# Mealie — self-hosted recipe manager
# Built from source via forge mirror of mealie-recipes/mealie
# Based on upstream docker/Dockerfile (multi-stage: Node frontend + Python backend)
ARG CONTAINER_APP_VERSION=v3.12.0
###############################################
# Frontend Build
###############################################
FROM node:24-slim AS frontend-builder
ARG CONTAINER_APP_VERSION
RUN apt-get update && apt-get install --no-install-recommends -y git ca-certificates && rm -rf /var/lib/apt/lists/*
RUN git clone --depth 1 --branch ${CONTAINER_APP_VERSION} \
https://forge.ops.eblu.me/mirrors/mealie.git /src
WORKDIR /src/frontend
RUN yarn install \
--prefer-offline \
--frozen-lockfile \
--non-interactive \
--production=false \
--network-timeout 1000000
RUN yarn generate
###############################################
# Python Base
###############################################
FROM python:3.12-slim AS python-base
ENV MEALIE_HOME="/app"
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=off \
PIP_DISABLE_PIP_VERSION_CHECK=on \
PIP_DEFAULT_TIMEOUT=100 \
VENV_PATH="/opt/mealie"
ENV PATH="$VENV_PATH/bin:$PATH"
RUN useradd -u 911 -U -d $MEALIE_HOME -s /bin/bash abc \
&& usermod -G users abc \
&& mkdir $MEALIE_HOME
###############################################
# Backend Package Build
###############################################
FROM python-base AS backend-builder
ARG CONTAINER_APP_VERSION
RUN apt-get update \
&& apt-get install --no-install-recommends -y curl git ca-certificates \
&& rm -rf /var/lib/apt/lists/*
RUN pip install uv
RUN git clone --depth 1 --branch ${CONTAINER_APP_VERSION} \
https://forge.ops.eblu.me/mirrors/mealie.git /src
WORKDIR /src
COPY --from=frontend-builder /src/frontend/dist ./mealie/frontend
RUN uv build --out-dir dist
RUN uv export --no-editable --no-emit-project --extra pgsql --format requirements-txt --output-file dist/requirements.txt \
&& MEALIE_VERSION=$(python -c "import tomllib; print(tomllib.load(open('pyproject.toml', 'rb'))['project']['version'])") \
&& echo "mealie[pgsql]==${MEALIE_VERSION} \\" >> dist/requirements.txt \
&& pip hash dist/mealie-${MEALIE_VERSION}-py3-none-any.whl | tail -n1 | tr -d '\n' >> dist/requirements.txt \
&& echo " \\" >> dist/requirements.txt \
&& pip hash dist/mealie-${MEALIE_VERSION}.tar.gz | tail -n1 >> dist/requirements.txt
###############################################
# Python Venv Build
###############################################
FROM python-base AS venv-builder
RUN apt-get update \
&& apt-get install --no-install-recommends -y \
build-essential \
libpq-dev \
libwebp-dev \
ffmpeg \
libsasl2-dev libldap2-dev libssl-dev \
gnupg gnupg2 gnupg1 \
&& rm -rf /var/lib/apt/lists/*
RUN python3 -m venv --upgrade-deps $VENV_PATH
COPY --from=backend-builder /src/dist /dist
RUN . $VENV_PATH/bin/activate \
&& pip install --require-hashes -r /dist/requirements.txt --find-links /dist
###############################################
# Production Image
###############################################
FROM python-base AS production
ENV PRODUCTION=true
ENV TESTING=false
RUN apt-get update \
&& apt-get install --no-install-recommends -y \
curl \
ffmpeg \
gosu \
iproute2 \
libldap-common \
libldap2 \
&& rm -rf /var/lib/apt/lists/*
RUN mkdir -p /run/secrets
COPY --from=venv-builder $VENV_PATH $VENV_PATH
ENV NLTK_DATA="/nltk_data/"
RUN mkdir -p $NLTK_DATA
RUN python -m nltk.downloader -d $NLTK_DATA averaged_perceptron_tagger_eng
VOLUME ["$MEALIE_HOME/data/"]
ENV APP_PORT=9000
EXPOSE ${APP_PORT}
COPY --from=backend-builder /src/docker/healthcheck.sh $MEALIE_HOME/healthcheck.sh
RUN chmod +x $MEALIE_HOME/healthcheck.sh
HEALTHCHECK CMD $MEALIE_HOME/healthcheck.sh
ENV HOST=0.0.0.0
COPY --from=backend-builder /src/docker/entry.sh $MEALIE_HOME/run.sh
RUN chmod +x $MEALIE_HOME/run.sh
ARG CONTAINER_APP_VERSION
LABEL org.opencontainers.image.title="Mealie"
LABEL org.opencontainers.image.description="Self-hosted recipe manager"
LABEL org.opencontainers.image.version="${CONTAINER_APP_VERSION}"
LABEL org.opencontainers.image.source="https://forge.eblu.me/eblume/blumeops"
LABEL org.opencontainers.image.vendor="blumeops"
ENTRYPOINT ["/app/run.sh"]

@ -0,0 +1,69 @@
# Nix-built Mealie for ringtail (amd64).
#
# Replaces the from-source Dockerfile build (Node frontend + Python venv)
# with nixpkgs' mealie, which ships a single `mealie` gunicorn entrypoint
# serving the prebuilt frontend + backend — so this is a clean single-
# process wrap (unlike paperless, which is multi-process).
#
# Mealie stores its DB as SQLite under DATA_DIR (the mealie-data PVC at
# /app/data); there is no postgres. The run wrapper mirrors the nixpkgs
# mealie NixOS module: run `libexec/init_db` (Alembic migrations) first,
# then exec gunicorn.
#
# Self-pins nixos-unstable: stable nixpkgs lags at 3.9.2, unstable carries
# 3.16.0. This is a forward 4-minor bump from the v3.12.0 Dockerfile build
# (the deferred upgrade) — mealie auto-migrates the SQLite DB forward on
# startup via init_db; the source PVC is retained for rollback. The version
# assertion makes nix-build fail if a pin bump changes the version.
let
nixpkgs = fetchTarball {
url = "https://github.com/NixOS/nixpkgs/archive/331800de5053fcebacf6813adb5db9c9dca22a0c.tar.gz";
sha256 = "1p54fm6dkbq62kpi55cr4wyx7b1nsajpsnjgs64cmp073fwi15f7";
};
pkgs = import nixpkgs { system = "x86_64-linux"; };
version = "3.16.0";
app = pkgs.mealie;
# Mirror the NixOS module's mealie service: init_db (Alembic) then
# gunicorn bound to the app port. DATA_DIR/env come from the image +
# k8s manifest.
mealie-run = pkgs.writeShellScriptBin "mealie-run" ''
set -e
${app}/libexec/init_db
exec ${pkgs.lib.getExe app} -b 0.0.0.0:9000
'';
in
assert app.version == version;
pkgs.dockerTools.buildLayeredImage {
name = "blumeops/mealie";
contents = [
app
mealie-run
pkgs.bashInteractive
pkgs.coreutils
pkgs.cacert
pkgs.tzdata
# python3 (stdlib sqlite3) for the borgmatic k8s-sqlite-dump helper,
# which runs `python3 -c "...sqlite3...backup..."` inside the pod.
# Same nixpkgs python mealie is built against, so ~no added closure.
pkgs.python3
];
config = {
Cmd = [ "${mealie-run}/bin/mealie-run" ];
Env = [
"DATA_DIR=/app/data"
"SSL_CERT_FILE=${pkgs.cacert}/etc/ssl/certs/ca-bundle.crt"
"PYTHONUNBUFFERED=1"
"PRODUCTION=true"
];
ExposedPorts = {
"9000/tcp" = { };
};
};
}

@ -1,156 +0,0 @@
# syntax=docker/dockerfile:1
# Paperless-ngx — self-hosted document management
# Built from source via forge mirror of paperless-ngx/paperless-ngx
# Closely follows upstream Dockerfile structure with git clone instead of COPY
ARG CONTAINER_APP_VERSION=v2.20.13
###############################################
# Stage 1: Clone source (reused by later stages)
###############################################
FROM docker.io/library/alpine:3.22 AS source
ARG CONTAINER_APP_VERSION
RUN apk add --no-cache git
RUN git clone --depth 1 --branch ${CONTAINER_APP_VERSION} \
https://forge.ops.eblu.me/mirrors/paperless-ngx.git /src
###############################################
# Stage 2: Compile frontend
###############################################
FROM --platform=$BUILDPLATFORM docker.io/node:20-trixie-slim AS compile-frontend
COPY --from=source /src/src-ui /src/src-ui
WORKDIR /src/src-ui
RUN set -eux \
&& npm update -g pnpm \
&& npm install -g corepack@latest \
&& corepack enable \
&& pnpm install
RUN set -eux \
&& ./node_modules/.bin/ng build --configuration production
###############################################
# Stage 3: s6-overlay base
###############################################
FROM ghcr.io/astral-sh/uv:0.9.15-python3.12-trixie-slim AS s6-overlay-base
WORKDIR /usr/src/s6
ENV S6_BEHAVIOUR_IF_STAGE2_FAILS=2 \
S6_CMD_WAIT_FOR_SERVICES_MAXTIME=0 \
S6_VERBOSITY=1 \
PATH=/command:$PATH
ARG TARGETARCH
ARG TARGETVARIANT
ARG S6_OVERLAY_VERSION=3.2.1.0
RUN set -eux \
&& apt-get update \
&& apt-get install --yes --quiet --no-install-recommends curl xz-utils \
&& S6_ARCH="" \
&& if [ "${TARGETARCH}${TARGETVARIANT}" = "amd64" ]; then S6_ARCH="x86_64"; \
elif [ "${TARGETARCH}${TARGETVARIANT}" = "arm64" ]; then S6_ARCH="aarch64"; fi \
&& if [ -z "${S6_ARCH}" ]; then echo "Error: Cannot determine arch"; exit 1; fi \
&& curl --fail --silent --show-error --location --remote-name-all --parallel \
"https://github.com/just-containers/s6-overlay/releases/download/v${S6_OVERLAY_VERSION}/s6-overlay-noarch.tar.xz" \
"https://github.com/just-containers/s6-overlay/releases/download/v${S6_OVERLAY_VERSION}/s6-overlay-noarch.tar.xz.sha256" \
"https://github.com/just-containers/s6-overlay/releases/download/v${S6_OVERLAY_VERSION}/s6-overlay-${S6_ARCH}.tar.xz" \
"https://github.com/just-containers/s6-overlay/releases/download/v${S6_OVERLAY_VERSION}/s6-overlay-${S6_ARCH}.tar.xz.sha256" \
&& sha256sum --check ./*.sha256 \
&& tar --directory / -Jxpf s6-overlay-noarch.tar.xz \
&& tar --directory / -Jxpf s6-overlay-${S6_ARCH}.tar.xz \
&& rm ./*.tar.xz ./*.sha256 \
&& apt-get --yes purge curl xz-utils \
&& apt-get --yes autoremove --purge \
&& rm -rf /var/lib/apt/lists/*
# Copy rootfs (s6 service definitions, init scripts)
COPY --from=source /src/docker/rootfs /
###############################################
# Stage 4: Main application
###############################################
FROM s6-overlay-base AS main-app
ARG CONTAINER_APP_VERSION
ARG DEBIAN_FRONTEND=noninteractive
ARG TARGETARCH
ARG JBIG2ENC_VERSION=0.30
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
PYTHONWARNINGS="ignore:::django.http.response:517" \
PNGX_CONTAINERIZED=1 \
UV_LINK_MODE=copy \
UV_CACHE_DIR=/cache/uv/
# Runtime packages
RUN set -eux \
&& apt-get update \
&& apt-get install --yes --quiet --no-install-recommends \
curl gosu tzdata fonts-liberation gettext ghostscript gnupg \
icc-profiles-free imagemagick postgresql-client \
tesseract-ocr tesseract-ocr-eng tesseract-ocr-deu tesseract-ocr-fra \
tesseract-ocr-ita tesseract-ocr-spa unpaper pngquant jbig2dec \
libxml2 libxslt1.1 qpdf file libmagic1 media-types zlib1g \
libzbar0 poppler-utils \
&& curl --fail --silent --show-error --location --remote-name-all \
"https://github.com/paperless-ngx/builder/releases/download/jbig2enc-trixie-v${JBIG2ENC_VERSION}/jbig2enc_${JBIG2ENC_VERSION}-1_${TARGETARCH}.deb" \
&& dpkg --install ./jbig2enc_${JBIG2ENC_VERSION}-1_${TARGETARCH}.deb \
&& cp /etc/ImageMagick-6/paperless-policy.xml /etc/ImageMagick-6/policy.xml \
&& rm --force *.deb \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /usr/src/paperless/src/
# Python dependencies
COPY --from=source /src/pyproject.toml /src/uv.lock /usr/src/paperless/src/
RUN --mount=type=cache,target=${UV_CACHE_DIR},id=python-cache \
set -eux \
&& apt-get update \
&& apt-get install --yes --quiet --no-install-recommends \
build-essential default-libmysqlclient-dev pkg-config \
&& uv export --quiet --no-dev --all-extras --format requirements-txt --output-file requirements.txt \
&& uv pip install --system --no-python-downloads --python-preference system --requirements requirements.txt \
&& python3 -W ignore::RuntimeWarning -m nltk.downloader -d "/usr/share/nltk_data" snowball_data \
&& python3 -W ignore::RuntimeWarning -m nltk.downloader -d "/usr/share/nltk_data" stopwords \
&& python3 -W ignore::RuntimeWarning -m nltk.downloader -d "/usr/share/nltk_data" punkt_tab \
&& apt-get --yes purge build-essential default-libmysqlclient-dev pkg-config \
&& apt-get --yes autoremove --purge \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
# Copy backend source
COPY --from=source /src/src ./
# Copy compiled frontend
COPY --from=compile-frontend /src/src/documents/static/frontend/ ./documents/static/frontend/
# Create user and finalize
RUN set -eux \
&& addgroup --gid 1000 paperless \
&& useradd --uid 1000 --gid paperless --home-dir /usr/src/paperless paperless \
&& mkdir -p /usr/src/paperless/data /usr/src/paperless/media \
/usr/src/paperless/consume /usr/src/paperless/export \
&& chown -R paperless:paperless /usr/src/paperless \
&& s6-setuidgid paperless python3 manage.py collectstatic --clear --no-input --link \
&& s6-setuidgid paperless python3 manage.py compilemessages
VOLUME ["/usr/src/paperless/data", "/usr/src/paperless/media", \
"/usr/src/paperless/consume", "/usr/src/paperless/export"]
ENTRYPOINT ["/init"]
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=10s --retries=5 \
CMD [ "curl", "-fs", "-S", "-L", "--max-time", "2", "http://localhost:8000" ]
LABEL org.opencontainers.image.title="Paperless-ngx"
LABEL org.opencontainers.image.description="Self-hosted document management system"
LABEL org.opencontainers.image.version="${CONTAINER_APP_VERSION}"
LABEL org.opencontainers.image.source="https://forge.eblu.me/eblume/blumeops"
LABEL org.opencontainers.image.vendor="blumeops"

@ -0,0 +1,77 @@
# Nix-built Paperless-ngx for ringtail (amd64).
#
# Replaces the from-source Dockerfile build (s6-overlay) with nixpkgs'
# paperless-ngx, which already bundles the full OCR/imaging closure
# (tesseract, ghostscript, imagemagick, qpdf, poppler, jbig2enc) and the
# NLTK data via wrappers — so the image stays lean.
#
# Unlike the upstream s6 image, this image does NOT run all processes
# itself. Paperless is multi-process; on ringtail it runs as four
# containers sharing this one image, each with a different command:
# web -> paperless-web (granian, the wrapper below)
# worker -> celery --app paperless worker
# beat -> celery --app paperless beat
# consumer -> paperless-ngx document_consumer
# plus a redis/valkey sidecar. The PYTHONPATH/granian invocation mirrors
# the nixpkgs paperless NixOS module's paperless-web service exactly.
#
# Self-pins nixos-unstable: stable nixpkgs lags at 2.19.6, while unstable
# carries 2.20.15 — a same-minor forward patch bump from the previous
# Dockerfile build (v2.20.13). The version assertion makes nix-build fail
# if a pin bump changes the version, forcing an explicit acknowledgment
# here and in service-versions.yaml (enforced by container-version-check).
let
nixpkgs = fetchTarball {
url = "https://github.com/NixOS/nixpkgs/archive/331800de5053fcebacf6813adb5db9c9dca22a0c.tar.gz";
sha256 = "1p54fm6dkbq62kpi55cr4wyx7b1nsajpsnjgs64cmp073fwi15f7";
};
pkgs = import nixpkgs { system = "x86_64-linux"; };
version = "2.20.15";
app = pkgs.paperless-ngx;
# Mirror the NixOS module's paperless-web service: granian serving the
# ASGI app with the package's propagated deps + src on PYTHONPATH.
pythonPath =
"${app.python.pkgs.makePythonPath app.propagatedBuildInputs}:${app}/lib/paperless-ngx/src";
paperless-web = pkgs.writeShellScriptBin "paperless-web" ''
export PYTHONPATH="${pythonPath}"
export PAPERLESS_NLTK_DIR="${app.nltkDataDir}"
exec ${app.python.pkgs.granian}/bin/granian \
--interface asginl --ws \
--host 0.0.0.0 --port 8000 \
"paperless.asgi:application"
'';
in
assert app.version == version;
pkgs.dockerTools.buildLayeredImage {
name = "blumeops/paperless";
contents = [
app
paperless-web
pkgs.bashInteractive
pkgs.coreutils
pkgs.cacert
pkgs.tzdata
];
config = {
# Default command is the web server; worker/beat/consumer containers
# override `command` in their k8s manifests.
Cmd = [ "${paperless-web}/bin/paperless-web" ];
Env = [
"PAPERLESS_NLTK_DIR=${app.nltkDataDir}"
"SSL_CERT_FILE=${pkgs.cacert}/etc/ssl/certs/ca-bundle.crt"
"PYTHONUNBUFFERED=1"
"PNGX_CONTAINERIZED=1"
];
ExposedPorts = {
"8000/tcp" = { };
};
};
}
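
Review note: the header comment says the in-file `version` assertion is paired with an entry in service-versions.yaml, enforced by container-version-check. That checker is not part of this diff, so the following is only a sketch of the kind of comparison it might perform. `pinned_version` and `versions_agree` are hypothetical helpers, and the tracked version is passed in as a plain string because the layout of service-versions.yaml is not shown here.

```python
import re

# Matches a Nix binding such as:  version = "2.20.15";
_VERSION_BINDING = re.compile(r'^\s*version\s*=\s*"([^"]+)"\s*;', re.MULTILINE)


def pinned_version(nix_source):
    """Return the first `version = "...";` value in a default.nix, or None."""
    match = _VERSION_BINDING.search(nix_source)
    return match.group(1) if match else None


def versions_agree(nix_source, tracked_version):
    """Compare the Nix pin with a tracked version, ignoring one leading 'v'."""
    pinned = pinned_version(nix_source)
    if pinned is None:
        return False  # no pin found: treat as a mismatch, not a pass
    return pinned.removeprefix("v") == tracked_version.removeprefix("v")


if __name__ == "__main__":
    example = 'let\n  version = "2.20.15";\n  app = pkgs.paperless-ngx;\nin app\n'
    print(pinned_version(example), versions_agree(example, "v2.20.15"))
```

Tolerating the leading `v` reflects that the Dagger files in this diff write versions as `v2.2.0` while the Nix files write `2.2.0`. Whether the real check normalizes this way is an assumption.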

@ -25,7 +25,7 @@
{ pkgs ? import <nixpkgs> { } }:
let
-version = "1.1.0";
+version = "1.1.3";
python = pkgs.python314;
@ -43,7 +43,7 @@ let
showerSdist = pkgs.fetchurl {
name = "adelaide_baby_shower_app-${version}.tar.gz";
url = "https://forge.ops.eblu.me/api/packages/eblume/pypi/files/adelaide-baby-shower-app/${version}/adelaide_baby_shower_app-${version}.tar.gz";
-hash = "sha256-5dp+0u4metOIC6s6/nPlT4cdpFBCV6S3+Z/3RO0sX5U=";
+hash = "sha256-a3rCwEdOB+rnYXqsWDifyltpyKUgkOj0ikWB+WGQYKE=";
};
# Wheel pulled from forge.ops.eblu.me (tailnet) for the same reason the
@ -53,7 +53,7 @@ let
showerWheel = pkgs.fetchurl {
name = "adelaide_baby_shower_app-${version}-py3-none-any.whl";
url = "https://forge.ops.eblu.me/api/packages/eblume/pypi/files/adelaide-baby-shower-app/${version}/adelaide_baby_shower_app-${version}-py3-none-any.whl";
-hash = "sha256-7orFbycON9dQxEIb6q45Xx2rFlEZ8xXSrC2tnrO5uug=";
+hash = "sha256-a6j91gBigG4IzE2DVTBntnZ46Yrx9b5PgHn+Uro98Tk=";
};
staticAssets = pkgs.runCommand "shower-static-assets-${version}" { } ''
@ -148,7 +148,7 @@ let
outputHashAlgo = "sha256";
# Pinned dep closure — reproducible until version bumps. To recompute,
# set to pkgs.lib.fakeHash and read the failure.
-outputHash = "sha256-kTNOswobtkgyQmmqbQM8XO4vvaGg57nCuuZGbNXb0NM=";
+outputHash = "sha256-1xx2qWAIwherklHIPXo6IOKkKHML1KUrUx6pbkMxffc=";
dontFixup = true;
};
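
Review note: the `hash` values bumped for the sdist and wheel are SRI strings. For a flat-file fetch such as `pkgs.fetchurl`, that string is `sha256-` followed by the base64 encoding of the SHA-256 digest of the file's bytes, so a downloaded artifact can be cross-checked outside Nix. A minimal sketch, with `sri_sha256` as a hypothetical helper name. It does not apply to the `outputHash` of the dependency closure, which covers a whole output tree rather than one file.

```python
import base64
import hashlib


def sri_sha256(data):
    """Return the SRI string (sha256-<base64 digest>) for a bytes payload."""
    digest = hashlib.sha256(data).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


def sri_sha256_file(path, chunk_size=1 << 20):
    """Same as sri_sha256, but streams a file from disk in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return "sha256-" + base64.b64encode(hasher.digest()).decode("ascii")


if __name__ == "__main__":
    print(sri_sha256(b"example payload"))
```

Comparing the result for a downloaded wheel against the `hash` in default.nix is a quick sanity check before resorting to the `pkgs.lib.fakeHash` round trip mentioned in the file's comments.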

@ -1,104 +0,0 @@
"""TeslaMate — Tesla data logger.
Two-stage build: Elixir+Node (builder), Debian slim (runtime).
Source cloned from forge mirror.
"""

import dagger
from dagger import dag

from blumeops.containers import clone_from_forge, oci_labels

VERSION = "v3.0.0"


async def build(src: dagger.Directory) -> dagger.Container:
    source = clone_from_forge("teslamate", VERSION)

    # Stage 1: Build Elixir release with Node.js assets
    builder = (
        dag.container()
        .from_("elixir:1.19.5-otp-26")
        .with_exec(
            [
                "bash",
                "-c",
                "apt-get update"
                " && apt-get install -y ca-certificates curl gnupg git zstd brotli"
                " && mkdir -p /etc/apt/keyrings"
                " && curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key"
                " | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg"
                ' && echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg]'
                ' https://deb.nodesource.com/node_22.x nodistro main"'
                " > /etc/apt/sources.list.d/nodesource.list"
                " && apt-get update"
                " && apt-get install -y nodejs"
                " && apt-get clean"
                " && rm -rf /var/lib/apt/lists/*",
            ]
        )
        .with_exec(["mix", "local.rebar", "--force"])
        .with_exec(["mix", "local.hex", "--force"])
        .with_directory("/opt/app", source)
        .with_workdir("/opt/app")
        .with_env_variable("MIX_ENV", "prod")
        .with_exec(["mix", "deps.get", "--only", "prod"])
        .with_exec(["mix", "deps.compile"])
        .with_exec(
            [
                "npm",
                "ci",
                "--prefix",
                "./assets",
                "--progress=false",
                "--no-audit",
                "--loglevel=error",
            ]
        )
        .with_exec(["mix", "assets.deploy"])
        .with_exec(["mix", "compile"])
        .with_exec(
            ["bash", "-c", "SKIP_LOCALE_DOWNLOAD=true mix release --path /opt/built"]
        )
    )

    # Stage 2: Debian slim runtime
    entrypoint = src.file("containers/teslamate/entrypoint.sh")
    runtime = (
        dag.container()
        .from_("debian:trixie-slim")
        .with_exec(
            [
                "bash",
                "-c",
                "apt-get update && apt-get install -y --no-install-recommends"
                " libodbc2 libsctp1 libssl3t64 libstdc++6"
                " netcat-openbsd tini tzdata"
                " && apt-get clean"
                " && rm -rf /var/lib/apt/lists/*"
                " && groupadd --gid 10001 --system nonroot"
                " && useradd --uid 10000 --system --gid nonroot"
                " --home-dir /home/nonroot --shell /sbin/nologin nonroot",
            ]
        )
    )
    runtime = oci_labels(
        runtime,
        title="TeslaMate",
        description="Tesla data logger and visualization",
        version=VERSION,
    )

    return (
        runtime.with_env_variable("LANG", "C.UTF-8")
        .with_env_variable("SRTM_CACHE", "/opt/app/.srtm_cache")
        .with_env_variable("HOME", "/opt/app")
        .with_workdir("/opt/app")
        .with_directory("/opt/app", builder.directory("/opt/built"), owner="nonroot")
        .with_exec(["mkdir", "-p", "/opt/app/.srtm_cache"])
        .with_file("/entrypoint.sh", entrypoint, permissions=0o555, owner="nonroot")
        .with_user("nonroot")
        .with_exposed_port(4000)
        .with_entrypoint(["tini", "--", "/bin/dash", "/entrypoint.sh"])
        .with_default_args(args=["bin/teslamate", "start"])
    )

@ -0,0 +1,122 @@
# Nix-built TeslaMate for ringtail (amd64).
#
# Replaces the Dagger container.py (Elixir+Node builder -> Debian slim).
# TeslaMate is NOT in nixpkgs, so this is a from-scratch beamPackages
# mixRelease: an Elixir/Phoenix release with npm-built assets.
#
# Pinned to the same nixos-unstable rev as paperless/mealie for a
# consistent toolchain. The BEAM combo is pinned to erlang_27 + elixir_1_18
# (teslamate requires elixir ~> 1.17; upstream's image uses OTP 26, so we
# stay off the default OTP 28 which elixir 1.18 does not target).
#
# Source comes from the forge mirror (supply-chain control), pinned by the
# v3.0.0 tag's commit so builtins.fetchGit needs no hash.
let
nixpkgs = fetchTarball {
url = "https://github.com/NixOS/nixpkgs/archive/331800de5053fcebacf6813adb5db9c9dca22a0c.tar.gz";
sha256 = "1p54fm6dkbq62kpi55cr4wyx7b1nsajpsnjgs64cmp073fwi15f7";
};
pkgs = import nixpkgs { system = "x86_64-linux"; };
lib = pkgs.lib;
version = "3.0.0";
beamPackages = pkgs.beam.packages.erlang_27;
elixir = beamPackages.elixir_1_18;
src = builtins.fetchGit {
url = "https://forge.ops.eblu.me/mirrors/teslamate.git";
ref = "refs/tags/v${version}";
rev = "3281154d42330786a182c1bbe094ecda0b1c5578";
};
# ex_cldr downloads locale JSON from GitHub at compile time, which the
# build sandbox blocks. teslamate's cldr.ex reads the data dir from the
# LOCALES env var; point it at the pre-fetched elixir-cldr data so no
# download is attempted (with SKIP_LOCALE_DOWNLOAD=true disabling the
# forced refresh). CLDR data version matches the compile-time errors.
cldrData = pkgs.fetchFromGitHub {
owner = "elixir-cldr";
repo = "cldr";
rev = "v2.46.0";
sha256 = "1iwzk9dc754l72vpf8vsisdjncnjx26pz509552b6vnm49xbxyji";
};
teslamate = beamPackages.mixRelease {
pname = "teslamate";
inherit version src elixir;
# Keep the build-generated Erlang cookie in the release. mixRelease
# strips it by default (expecting RELEASE_COOKIE at runtime), but the
# start script reads releases/COOKIE. teslamate is single-node (no
# distributed Erlang exposed), so a baked-in cookie is fine.
removeCookie = false;
mixFodDeps = beamPackages.fetchMixDeps {
pname = "mix-deps-teslamate";
inherit src version elixir;
hash = "sha256-DDrREiM1BIMgD2qFPTK8QyjOYlnfE3XlnaH/jk7G2go=";
};
# Frontend assets. esbuild + sass are devDeps and the esbuild platform
# binary is an optional dep, so npm ci must include both. We run npm ci
# here (not a separate derivation) because assets/package.json has
# file:../deps/phoenix references that only resolve once mixFodDeps has
# populated deps/. npmConfigHook wires up the offline cache from npmDeps;
# then `node scripts/build.js` (custom esbuild) + `mix phx.digest`.
nativeBuildInputs = [ pkgs.nodejs pkgs.npmHooks.npmConfigHook ];
npmDeps = pkgs.fetchNpmDeps {
name = "teslamate-npm-deps";
src = src + "/assets";
hash = "sha256-XyiaUkT/c4rZnNxmxhVLb+vEXnc64A1hjOrnR5fhaEk=";
};
npmRoot = "assets";
preBuild = ''
export SKIP_LOCALE_DOWNLOAD=true
export LOCALES=${cldrData}/priv/cldr
( cd assets && npm ci --include=dev --include=optional && node scripts/build.js )
mix phx.digest --no-deps-check
'';
};
in
pkgs.dockerTools.buildLayeredImage {
name = "blumeops/teslamate";
contents = [
teslamate
pkgs.bashInteractive
pkgs.coreutils
pkgs.dash
pkgs.netcat-openbsd
pkgs.cacert
pkgs.tzdata
];
config = {
# Mirror entrypoint.sh: wait for postgres, run migrations, then start.
Entrypoint = [
"${pkgs.dash}/bin/dash"
"-c"
''
: "''${DATABASE_HOST:=127.0.0.1}"
: "''${DATABASE_PORT:=5432}"
while ! ${pkgs.netcat-openbsd}/bin/nc -z "$DATABASE_HOST" "$DATABASE_PORT" 2>/dev/null; do
echo "waiting for postgres at $DATABASE_HOST:$DATABASE_PORT"; sleep 1
done
${teslamate}/bin/teslamate eval "TeslaMate.Release.migrate"
exec ${teslamate}/bin/teslamate start
''
];
Env = [
"HOME=/opt/app"
"SRTM_CACHE=/opt/app/.srtm_cache"
"LANG=C.UTF-8"
"SSL_CERT_FILE=${pkgs.cacert}/etc/ssl/certs/ca-bundle.crt"
];
ExposedPorts = {
"4000/tcp" = { };
};
};
}
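
Review note: the image Entrypoint above waits for postgres with an unbounded `nc -z` loop before running migrations. The same readiness probe can be sketched in Python for anyone who wants to reason about or test the behavior outside the container. `wait_for_tcp` is a hypothetical helper, and unlike the shell loop it takes a timeout so a test cannot hang. That timeout is an addition for illustration, not something the image does.

```python
import socket
import time


def wait_for_tcp(host, port, timeout=30.0, interval=1.0):
    """Poll host:port until a TCP connect succeeds or `timeout` seconds pass.

    Returns True once the port accepts a connection, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Same idea as `nc -z`: open a connection, then close it at once.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            print(f"waiting for postgres at {host}:{port}")
            time.sleep(interval)


if __name__ == "__main__":
    # Defaults mirror the Entrypoint's DATABASE_HOST / DATABASE_PORT fallbacks.
    print(wait_for_tcp("127.0.0.1", 5432, timeout=3.0))
```

Like `nc -z`, a successful connect only shows that something is listening on the port. It does not prove postgres is ready to accept queries, which is presumably why the migration step follows as a separate command.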

@ -1,23 +0,0 @@
#!/usr/bin/env dash
set -e
: "${DATABASE_HOST:="127.0.0.1"}"
: "${DATABASE_PORT:=5432}"
: "${ULIMIT_MAX_NOFILE:=65536}"
# prevent memory bloat in some misconfigured versions of Docker/containerd
# where the nofiles limit is very large. 0 means don't set it.
if test "${ULIMIT_MAX_NOFILE}" != 0 && test "$(ulimit -n)" -gt "${ULIMIT_MAX_NOFILE}"; then
ulimit -n "${ULIMIT_MAX_NOFILE}"
fi
# wait until Postgres is ready
while ! nc -z "${DATABASE_HOST}" "${DATABASE_PORT}" 2>/dev/null; do
echo waiting for postgres at "${DATABASE_HOST}":"${DATABASE_PORT}"
sleep 1s
done
# apply migrations
bin/teslamate eval "TeslaMate.Release.migrate"
exec "$@"

@ -1,43 +0,0 @@
# UnPoller — UniFi metrics exporter for Prometheus
# Two-stage build: Go compilation, then minimal Alpine runtime
ARG CONTAINER_APP_VERSION=v2.34.0
FROM golang:alpine3.22 AS build
ARG CONTAINER_APP_VERSION
RUN apk add --no-cache git
RUN git clone --depth 1 --branch ${CONTAINER_APP_VERSION} \
https://forge.ops.eblu.me/mirrors/unpoller.git /app
WORKDIR /app
ENV CGO_ENABLED=0
RUN go build -ldflags="-s -w \
-X main.version=${CONTAINER_APP_VERSION} \
-X main.builtBy=blumeops \
-X golift.io/version.Version=${CONTAINER_APP_VERSION} \
-X golift.io/version.Branch=HEAD \
-X golift.io/version.BuildUser=blumeops \
-X golift.io/version.Revision=blumeops-build" \
-o /bin/unpoller .
FROM alpine:3.22
ARG CONTAINER_APP_VERSION
LABEL org.opencontainers.image.title="UnPoller"
LABEL org.opencontainers.image.description="UniFi metrics exporter for Prometheus"
LABEL org.opencontainers.image.version="${CONTAINER_APP_VERSION}"
LABEL org.opencontainers.image.source="https://forge.eblu.me/eblume/blumeops"
LABEL org.opencontainers.image.vendor="blumeops"
RUN apk add --no-cache ca-certificates tzdata
COPY --from=build /bin/unpoller /usr/bin/unpoller
EXPOSE 9130
USER 65534:65534
ENTRYPOINT ["/usr/bin/unpoller"]
CMD ["--config", "/etc/unpoller/up.conf"]

@ -0,0 +1,53 @@
"""UnPoller — UniFi metrics exporter for Prometheus.
Two-stage build: Go backend, Alpine runtime.
Source cloned from forge mirror.
"""

import dagger

from blumeops.containers import (
    alpine_runtime,
    clone_from_forge,
    go_build,
    oci_labels,
)

VERSION = "v3.2.0"


async def build(src: dagger.Directory) -> dagger.Container:
    source = clone_from_forge("unpoller", VERSION)

    backend = go_build(
        source,
        "/unpoller",
        ldflags=(
            f"-s -w "
            f"-X main.version={VERSION} "
            f"-X main.builtBy=blumeops "
            f"-X golift.io/version.Version={VERSION} "
            f"-X golift.io/version.Branch=HEAD "
            f"-X golift.io/version.BuildUser=blumeops "
            f"-X golift.io/version.Revision=blumeops-build"
        ),
    )

    runtime = alpine_runtime(
        extra_apk=["ca-certificates", "tzdata"],
        create_user=False,
    )
    runtime = oci_labels(
        runtime,
        title="UnPoller",
        description="UniFi metrics exporter for Prometheus",
        version=VERSION,
    )

    return (
        runtime.with_file("/usr/bin/unpoller", backend.file("/unpoller"))
        .with_exposed_port(9130)
        .with_user("65534")
        .with_default_args(
            args=["/usr/bin/unpoller", "--config", "/etc/unpoller/up.conf"]
        )
    )

View file

@ -1,8 +1,8 @@
-"""Valkey — native Dagger build.
+"""Valkey — native Dagger build (arm64, indri).
 Alpine 3.22 base with the `valkey` apk package (8.1.x Redis-compatible).
-Mirrors `docker.io/valkey/valkey:8.1-alpine`, used by paperless and immich
-as a cache/queue sidecar.
+Used by paperless (sidecar) on indri. immich on ringtail uses the
+nix-built amd64 variant from `default.nix` in this directory.
 """
 import dagger
@ -10,9 +10,10 @@ from dagger import dag
 from blumeops.containers import oci_labels
-# Alpine 3.22 ships valkey 8.1.6-r0. Alpine 3.23 jumps to 9.0 — hold on 3.22
-# to keep this a 1:1 swap for the upstream `valkey:8.1-alpine` image.
-VERSION = "8.1.6-r0"
+# Alpine 3.22 currently ships valkey 8.1.7-r0. Alpine 3.23 jumps to 9.0 —
+# hold on 3.22 to keep this aligned with the 8.1 line.
+VERSION = "8.1.7"
+ALPINE_PIN = "8.1.7-r0"
 ALPINE_BASE = "alpine:3.22"
@ -21,7 +22,7 @@ async def build(src: dagger.Directory) -> dagger.Container:
     ctr = (
         dag.container()
         .from_(ALPINE_BASE)
-        .with_exec(["apk", "add", "--no-cache", f"valkey={VERSION}"])
+        .with_exec(["apk", "add", "--no-cache", f"valkey={ALPINE_PIN}"])
         .with_exec(["mkdir", "-p", "/data"])
         .with_exec(["chown", "valkey:valkey", "/data"])
         .with_workdir("/data")

View file

@ -0,0 +1,30 @@
# Nix-built Valkey for ringtail (amd64)
# Companion to container.py (Alpine 3.22, arm64 on indri).
# Used by immich-ringtail which needs an amd64 image; paperless on indri
# continues to use the Alpine container.py build.
#
# The version assertion ensures nix-build fails if a flake.lock update
# changes the Valkey version — forcing an explicit version acknowledgment
# here and in service-versions.yaml (enforced by container-version-check).
{ pkgs ? import <nixpkgs> { } }:
let
version = "8.1.7";
in
assert pkgs.valkey.version == version;
pkgs.dockerTools.buildLayeredImage {
name = "blumeops/valkey";
contents = [
pkgs.valkey
];
config = {
Entrypoint = [ "${pkgs.valkey}/bin/valkey-server" ];
Cmd = [ "--bind" "0.0.0.0" "--protected-mode" "no" "--dir" "/data" ];
ExposedPorts = {
"6379/tcp" = { };
};
};
}
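
The header comment above says the version must be acknowledged both here and in the Alpine build. A minimal sketch of that kind of cross-file check, assuming the two files carry `VERSION = "..."` and `version = "...";` lines as shown in this diff. The regexes and function names are hypothetical, and this is not the real `container-version-check`, whose implementation is not shown here.

```python
import re
from typing import Optional

# Assumption: container.py declares a top-level `VERSION = "..."` line.
PY_VERSION = re.compile(r'^VERSION\s*=\s*"([^"]+)"', re.MULTILINE)
# Assumption: default.nix declares `version = "...";` inside its let block.
NIX_VERSION = re.compile(r'^\s*version\s*=\s*"([^"]+)";', re.MULTILINE)


def extract(pattern: re.Pattern, text: str) -> Optional[str]:
    """Return the first captured version string, or None if absent."""
    match = pattern.search(text)
    return match.group(1) if match else None


def versions_agree(container_py: str, default_nix: str) -> bool:
    """True only when both files declare a version and the two are equal."""
    py_version = extract(PY_VERSION, container_py)
    nix_version = extract(NIX_VERSION, default_nix)
    return py_version is not None and py_version == nix_version
```

Keeping `VERSION` free of the Alpine `-r0` revision suffix (which lives in `ALPINE_PIN`) is what lets a plain string comparison like this work across both builds.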

View file

@ -1 +0,0 @@
Adopt `AGENTS.md` as the canonical agent instruction file, keep `CLAUDE.md` as a compatibility shim, and update docs to reference the neutral file and the correct agent-change-process path.

View file

@ -1,5 +0,0 @@
Rebuild and retag alloy v1.16.0 container images from the main-branch SHA
following the squash-merge of #345, per the build-container-image
squash-merge convention. Both images (`registry.ops.eblu.me/blumeops/alloy`)
now reference `9564435` rather than the branch SHA `26a3ab5`, restoring
source traceability after branch cleanup.

View file

@ -1,6 +0,0 @@
Upgrade native macOS Alloy on indri to v1.16.0. Built on gilbert with Go
1.26.2 + CGO (required for the macOS native DNS resolver, which Tailscale
MagicDNS depends on), scp'd to `~/.local/bin/alloy` on indri, codesigned,
and the LaunchAgent reloaded. Completes the v1.16.0 fleet upgrade started
in #345 — all four Alloy services (alloy-k8s, alloy-ringtail,
alloy-tracing-ringtail, alloy ansible) now run v1.16.0.

View file

@ -1 +0,0 @@
Add resource limits to all ArgoCD pods to prevent unbounded resource consumption during node-wide pressure events.

View file

@ -1 +0,0 @@
`blumeops-tasks` now annotates each task with a human-readable due offset (`5d overdue` / `due in 2d` / `due today`) and a `↻ <recurrence>` marker for recurring tasks, and sorts by overdue-ness (most overdue first, no-due-date last) with priority as tiebreaker.
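
A minimal sketch of the annotation and ordering described above, assuming tasks are plain records with an optional due date and a numeric priority where a lower number is more urgent. The `Task` fields, `due_label`, and `sort_key` are hypothetical names for illustration, not the actual `blumeops-tasks` code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Task:
    title: str
    due: Optional[date] = None
    priority: int = 3  # assumption: lower number means higher priority
    recurrence: Optional[str] = None


def due_label(task: Task, today: date) -> str:
    """Build the human-readable due offset plus the recurrence marker."""
    if task.due is None:
        label = ""
    else:
        days = (task.due - today).days
        if days < 0:
            label = f"{-days}d overdue"
        elif days == 0:
            label = "due today"
        else:
            label = f"due in {days}d"
    if task.recurrence:
        label = f"{label} ↻ {task.recurrence}".strip()
    return label


def sort_key(task: Task, today: date) -> Tuple[int, int, int]:
    """Most overdue first, undated tasks last, priority as tiebreaker."""
    if task.due is None:
        return (1, 0, task.priority)
    return (0, (task.due - today).days, task.priority)
```

Usage would look like `sorted(tasks, key=lambda t: sort_key(t, date.today()))`. Sorting on the signed day offset puts the most negative (most overdue) values first without a separate overdue flag.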

View file

@ -1 +0,0 @@
CLAUDE.md now imports AGENTS.md via `@AGENTS.md` instead of telling agents to go read it. Claude Code only auto-loads CLAUDE.md, so the prose shim was easy to skip; the import inlines AGENTS.md into the session prompt unconditionally.

View file

@ -1 +0,0 @@
New explanation article [[compliance-mute-categories]] documenting the gap between current `CC:`-only mute tagging and the three structurally distinct categories (compensating control, not-applicable, risk-accepted) needed for real PCI DSS / SOC2 practice. Captures the current image-scan mutelist gap (`cronjob-image-scan.yaml` doesn't pass `--mutelist-file`) and proposes an order-of-operations for wiring it up alongside the new tag conventions. Triggered by CVE-2026-31789, an OpenSSL 32-bit-only finding that surfaced the need for an NA category.

View file

@ -1 +0,0 @@
`container-build-and-release` now prints the specific `mise run runner-logs <N>` command after dispatching, polling the Forgejo API to resolve the run number for the commit it just triggered.
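
A minimal sketch of the poll-then-print step, assuming a caller-supplied function that lists recent workflow runs as dicts. The `list_runs` callable and the `commit_sha` / `run_number` field names are hypothetical stand-ins, since the Forgejo API calls the task actually makes are not shown here.

```python
import time
from typing import Any, Callable, Iterable, Mapping, Optional


def resolve_run_number(
    list_runs: Callable[[], Iterable[Mapping[str, Any]]],
    commit_sha: str,
    attempts: int = 10,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Poll until a run for commit_sha appears; None if it never does."""
    for attempt in range(attempts):
        for run in list_runs():
            if run.get("commit_sha") == commit_sha:
                return int(run["run_number"])
        # Skip the wait after the final attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return None


def runner_logs_hint(run_number: int) -> str:
    """Format the follow-up command shown to the user."""
    return f"mise run runner-logs {run_number}"
```

Injecting `list_runs` and `sleep` keeps the retry logic testable without a network connection or real delays.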

View file

@ -0,0 +1 @@
Rebuilt the locally-built external-secrets image from the `main` branch so the deployed tag (`v2.2.0-0e70a1b`) traces to a `main` commit rather than the now-merged feature branch, giving a stable provenance reference.

View file

@ -0,0 +1 @@
Rebuilt the external-secrets images off `main` and repointed both clusters to the stable main-sha tags (`v2.2.0-13895bb` arm64 / `v2.2.0-13895bb-nix` amd64), so the deployed images on indri and ringtail trace to the same `main` commit rather than earlier feature-branch builds.

View file

@ -1 +0,0 @@
Fixed forge.eblu.me static assets (CSS, JS, images, fonts) not loading — the proxy's static asset cache block was missing the `Host` header, so Caddy couldn't route the requests.

View file

@ -1 +0,0 @@
Add local nix container build for `frigate-notify` (`containers/frigate-notify/default.nix`) so the Frigate→ntfy bridge is rebuilt on ringtail from the forge mirror instead of pulled from `ghcr.io/0x2142/frigate-notify`.

View file

@ -0,0 +1 @@
Bumped the indri heph hub to v1.2.1, which adds the hub `GET /config` endpoint and ships the heph-pwa **Login with Authentik** flow (Authorization Code + PKCE). Pairs with the Authentik `heph` provider redirect URIs registered earlier.

View file

@ -1,5 +0,0 @@
Fixed homepage container EACCES on cold start: the nix-built image now chowns
`/app/config` to uid 1000 at build time via `fakeRootCommands`, matching the
behavior of the old Dockerfile. Without this, homepage couldn't seed missing
skeleton configs (proxmox.yaml etc.) or create `/app/config/logs`, crashing on
its first uncached request. Caught during the ringtail cutover.

View file

@ -1 +0,0 @@
Rebuild Prowler container against main HEAD (v5.23.0-495e45d) after merging the IaC mutelist Dockerfile changes.
