Default `general` zone (10r/s burst=20) is tuned for internet drive-by
traffic. At the party, 30 guests scanning the splash QR from one
venue-wifi NAT'd public IP would each fetch HTML + ~5 static assets
within a few seconds — easily clearing burst=20, and the second-wave
guests would see 503 with no auto-retry.
New shower_general zone (50r/s burst=200) absorbs that simultaneous-load
spike. Sustained exploit scanners still trip it, though short probes can
slip under: the 45.88.138.44 burst we already saw in Loki fired ~30 req
in 2s, i.e. ~15r/s, which is below the new sustained 50r/s. burst=200 is
still a hard cap on instantaneous spikes.
Self-healing: `limit_req` is a leaky bucket, so there is no persistent ban and
nothing to manually flush. A guest who trips it auto-recovers within
~1s; tuning here is about not tripping it on legit traffic in the
first place.
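
A rough way to sanity-check these numbers is to model the limiter in a
few lines of Python. This is a simplified sketch of `limit_req`
accounting with immediate rejection, not nginx's actual code. The
arrival pattern (30 guests x 6 requests spread evenly over 3s) and the
name `count_rejected` are assumptions made for illustration.

```python
def count_rejected(arrivals, rate, burst):
    """Count requests rejected by a simplified limit_req-style limiter.

    arrivals: sorted request timestamps in seconds (one shared client IP).
    rate:     sustained requests per second the zone drains.
    burst:    how many requests may sit above the sustained rate.
    """
    excess = 0.0      # requests currently "above" the sustained rate
    last = None       # timestamp of the last accepted request
    rejected = 0
    for t in arrivals:
        if last is None:
            candidate = 0.0
        else:
            # Drain at `rate` since the last accepted request, then add this one.
            candidate = max(0.0, excess - rate * (t - last) + 1.0)
        if candidate > burst:
            rejected += 1          # over the cap: the client would see an error
            continue               # rejected requests do not advance the state
        excess, last = candidate, t
    return rejected


# Assumed party load: 30 guests, HTML + 5 assets each, spread over 3 seconds.
requests = 30 * 6
arrivals = [i * 3.0 / requests for i in range(requests)]

general = count_rejected(arrivals, rate=10, burst=20)
shower = count_rejected(arrivals, rate=50, burst=200)
print(f"general: {general} rejected, shower_general: {shower} rejected")
```

Under this model the old zone rejects a large share of the 180 requests
while the new zone rejects none, even if all 180 arrive at the same
instant.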
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The wheel ships config/ and shower/ only (per pyproject hatchling
config), leaving the repo's top-level static/ dir — Sortable.min.js,
cropper.min.js, cropper.min.css, prize-placeholder.svg — behind. At
runtime, host_dashboard.html's {% static 'css/cropper.min.css' %}
looks up the manifest, CompressedManifestStaticFilesStorage raises
ValueError on the missing entry, and /host/ returns 500.
Fix on the deploy side: fetch the sdist via fetchurl (pinned SRI hash
from forge PyPI), extract its top-level static/ subtree into a
non-FOD derivation, lay it down at /app/static in the image. The
local_settings shim adds /app/static to STATICFILES_DIRS so
collectstatic at boot picks the vendored assets up alongside the
Django admin's own static files.
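
The STATICFILES_DIRS part of the shim could look roughly like the
sketch below. The helper name `with_vendored_static` is hypothetical,
and the real shim lives in containers/shower/default.nix and may be
written differently. Only the /app/static path comes from this commit.

```python
from pathlib import Path

# Location where the image lays down the sdist's static/ subtree.
VENDORED_STATIC = Path("/app/static")


def with_vendored_static(staticfiles_dirs, vendored=VENDORED_STATIC):
    """Return a STATICFILES_DIRS list that includes the vendored dir exactly once."""
    dirs = [Path(d) for d in staticfiles_dirs]
    if vendored not in dirs:
        dirs.append(vendored)
    return dirs


# In a settings shim this would wrap whatever the app already defines.
STATICFILES_DIRS = with_vendored_static([])
print(STATICFILES_DIRS)
```

Appending only when missing keeps the shim safe to apply twice, e.g. if
the app later adds the same directory itself.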
Sdist URL is forge.ops.eblu.me/api/packages/... (tailnet) — matches
the just-landed edge block on forge.eblu.me/api/packages/*. The
nix-container-builder runner on ringtail is on the tailnet, so the
FOD fetch works.
App doesn't change. v1.0.3 is no longer needed for the static gap —
the wheel's "packages = [config, shower]" pattern stays as-is, and we
treat the sdist as the canonical bundle for the assets the wheel
intentionally omits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
forge.eblu.me's package registry (/api/packages/* and /api/v1/packages/*)
served anonymous reads to the world even for private-repo releases —
Forgejo's per-user visibility treats packages as world-readable when
the owner's Visibility is Public, and we keep eblume Public so the
profile page stays open. The sdist downloads include full source
trees of private repos; that's the leak.
The fix is to keep the user public but block /api/packages/* and
/api/v1/packages/* at the proxy edge. forge.ops.eblu.me (tailnet) is
untouched, so CI workflows + gilbert's uv + the nix-container-builder
still work — they just need to use the tailnet hostname.
Three consumers updated to forge.ops.eblu.me:
- containers/shower/default.nix (the FOD pip --extra-index-url)
- ansible/roles/cv/defaults/main.yml (cv_release_url for generic package)
- chezmoi-tracked fish dotfiles (devpi.fish + conf.d/pypi.fish) —
edited in chezmoi source, user will apply separately
The blumeops repo had no other forge-pypi consumers (audited: workers,
runner-job-image, ansible roles, container builds). Doc references in
changelog fragments + comments left as-is — they describe history.
The proper long-term fix is to move private packages to a Limited-
visibility Forgejo org instead of relying on a proxy-side block (see
queued Todoist for the migration plan). Edge block stays as
defense in depth.
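
For reviewers, the intended edge behaviour can be written down as a
small predicate. This is a Python model of the rule, not the proxy
config itself. The function name `blocked_at_edge` and the sample paths
in the usage lines are illustrative assumptions.

```python
# Path prefixes this commit blocks on the public hostname.
BLOCKED_PREFIXES = ("/api/packages/", "/api/v1/packages/")
PUBLIC_HOST = "forge.eblu.me"


def blocked_at_edge(host: str, path: str) -> bool:
    """True if a request would be refused by the (modelled) proxy-edge rule."""
    return host == PUBLIC_HOST and path.startswith(BLOCKED_PREFIXES)


# Public hostname: registry paths are refused, everything else passes.
print(blocked_at_edge("forge.eblu.me", "/api/packages/eblume/pypi/simple/"))
print(blocked_at_edge("forge.eblu.me", "/eblume"))
# Tailnet hostname: untouched by the edge block.
print(blocked_at_edge("forge.ops.eblu.me", "/api/packages/eblume/pypi/simple/"))
```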
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
App v1.0.2 ships WhiteNoise for /static/ and /media/, so the
blumeops-side workaround is no longer needed:
- containers/shower/default.nix: drop the WhiteNoise pip dep + the
middleware-injection block from local_settings. The shim is back
to just path overrides (DATABASES.NAME, MEDIA_ROOT, STATIC_ROOT).
- version → 1.0.2, outputHash → fakeHash for re-pinning.
- service-versions.yaml mirrored.
fly/nginx.conf: cache /static/ (1y) and /media/ (1d) per location for
shower.eblu.me. /static/ filenames are content-hashed thanks to
CompressedManifestStaticFilesStorage so a year is safe and invalidation
is automatic on the next collectstatic.
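
Why content-hashed names make the long /static/ TTL safe, as a minimal
sketch. It is modelled on how Django's manifest storage names files (a
truncated content hash inserted before the extension), and details of
the real implementation may differ. `hashed_name` is a hypothetical
helper.

```python
import hashlib
import posixpath


def hashed_name(name: str, content: bytes) -> str:
    """Insert a 12-hex-digit content hash before the file extension."""
    digest = hashlib.md5(content).hexdigest()[:12]
    root, ext = posixpath.splitext(name)
    return f"{root}.{digest}{ext}"


old = hashed_name("css/cropper.min.css", b"body { color: red }")
new = hashed_name("css/cropper.min.css", b"body { color: blue }")
print(old)
print(new)
print("URL changed:", old != new)
```

Any change to a file's bytes yields a new URL, so a cached copy of the
old URL never shadows the new content.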
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Doc said "Store the auth key in 1Password as well for the `fly-setup`
mise task" right next to the description of fly-setup, which reads
the key from Pulumi state, not 1Password. No code path anywhere reads
this key from 1P — the instruction is vestigial from an earlier
design and confused us during the v1.0.1 rotation when the
flyio-proxy-key expired.
Rewrite the section to:
- point at `mise run fly-setup` as the canonical path
- state explicitly that Pulumi state is the only source of truth
- document the rotation recipe (tailnet-up --replace=<urn> +
fly-setup + fly-deploy) for the next time this 90-day key lapses
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two complementary fixes for the deploy that just landed:
1. Pod was 0/1 Running because the readiness probe sends
`Host: shower.ops.eblu.me` and the app's hardcoded ALLOWED_HOSTS
only includes `shower.eblu.me`. settings.py exposes a
DJANGO_ALLOWED_HOSTS env-var extras hook for exactly this case —
wired into the configmap.
2. `kubectl exec deploy/shower -- python -m django <cmd>` returned
"No module named django" because PYTHONPATH lived only inside the
entrypoint script. Moved PYTHONPATH, DJANGO_SETTINGS_MODULE, PATH,
and HOME into the image's Env block so exec'd shells inherit them.
The entrypoint now just runs the boot sequence; the exports are
redundant (image Env covers them) and gone.
FOD inputs are unchanged so outputHash stays valid; no fakeHash dance.
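
For context, an env-var extras hook of the kind fix 1 relies on might
look like the sketch below. This is not the app's settings.py: the
comma-separated format and the function name `allowed_hosts` are
assumptions. Only the DJANGO_ALLOWED_HOSTS name and the two hostnames
come from this commit.

```python
import os

# Hosts the app hardcodes, per this commit message.
BASE_ALLOWED_HOSTS = ["shower.eblu.me"]


def allowed_hosts(environ=os.environ):
    """Hardcoded hosts plus any extras from DJANGO_ALLOWED_HOSTS (assumed comma-separated)."""
    extras = environ.get("DJANGO_ALLOWED_HOSTS", "")
    return BASE_ALLOWED_HOSTS + [h.strip() for h in extras.split(",") if h.strip()]


# With the configmap value in place, the readiness probe's Host header is accepted.
print(allowed_hosts({"DJANGO_ALLOWED_HOSTS": "shower.ops.eblu.me"}))
```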
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR review caught that we didn't need an admin login surface on WAN.
App v1.0.1 adds DJANGO_PUBLIC_URL_BASE so QR codes generated from
/host/ (now tailnet-only) still point at shower.eblu.me for guest
phones — that closes the loop and lets us strip the WAN admin surface
entirely.
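
A minimal sketch of how a QR target could be derived from
DJANGO_PUBLIC_URL_BASE, independent of the host the admin is browsing
from. The function name `guest_url` and the `/splash/` path are
hypothetical, and the app's real implementation is not shown here.

```python
import os


def guest_url(path: str, environ=os.environ) -> str:
    """Build an absolute guest-facing URL from the configured public base."""
    base = environ.get("DJANGO_PUBLIC_URL_BASE", "")
    return f"{base.rstrip('/')}/{path.lstrip('/')}"


env = {"DJANGO_PUBLIC_URL_BASE": "https://shower.eblu.me"}
# Even when /host/ is served on the tailnet hostname, the QR points at the public one.
print(guest_url("/splash/", env))
```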
Container:
- bump version to 1.0.1
- outputHash → fakeHash (build will print the real one)
- entrypoint still does migrate + collectstatic before gunicorn —
the app is small enough that auto-migration is fine
Manifests:
- configmap adds DJANGO_PUBLIC_URL_BASE=https://shower.eblu.me
Fly nginx (shower.eblu.me):
- drop the /admin/(login|logout) carveout
- 403 anything under /admin/ AND /host/ with a "tailnet only" pointer
- drop the shower_auth limit_req zone and $shower_banned geo
- drop the shower-admin-login fail2ban filter + jail
- drop the shower-deny.conf touch from start.sh
Docs:
- rename how-to docs/how-to/operations/shower-app.md →
shower-on-ringtail.md (mirrors cv-on-indri / docs-on-indri)
- new reference card docs/reference/services/shower-app.md per PR
review comment 2 (≈30s read; quick facts + cross-links)
- rewrite Defense layers section: collapses to general rate limit +
django-axes on the tailnet-side login (the only credential surface)
- rewrite the .infra.md changelog fragment to match
- add a 'Create the admin user' step (kubectl exec createsuperuser)
so first-time deploys aren't locked out
The nginx-deny action's per-jail `nginx_deny_file` generalization
stays — harmless future-proofing for the next public service.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lets a re-run of `mise run fly-setup` (e.g. after a fly-app rebuild or
when bootstrapping fresh) re-issue the cert without remembering the
ad-hoc `fly certs add` we did during this deployment.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Build 536 finished cleanly with the strip-refs FOD + autopatchelf
wrapper. The [branch] tag is fine for ArgoCD branch-revision testing;
a follow-up C0 will rebuild from main and re-pin to the [main] SHA tag
after merge, per docs/how-to/deployment/build-container-image.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Run 534 failed with 'fixed-output derivations must not reference store
paths: ... gcc-14.3.0-lib' because pip-installed wheels pulled stdenv
into the venv (Python's setup, gcc-lib runtime references).
Adapts authentik's two-stage pattern:
- pyDepsFOD: pip-installs into the venv, then strips every nix store
ref it can find (find+remove-references-to). Output is fully
self-contained — pinned by outputHash.
- pyDeps (non-FOD wrapper): copies the FOD output and runs
autoPatchelfHook against runtime buildInputs (libstdc++, zlib, image
libs for pillow). This restores RPATHs on the .so files that pillow
and scipy ship, against the real on-image library locations.
outputHash still fakeHash — next build prints the real one.
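
The ref-stripping idea, sketched in Python for illustration. The real
build uses remove-references-to against specific store paths; this
simplified model rewrites every store-path hash it sees to a same-length
placeholder, and the sample hash and function name are made up.

```python
import re

# A store reference looks like /nix/store/<32-char hash>-<name>.
STORE_REF = re.compile(rb"/nix/store/[0-9a-z]{32}-")
PLACEHOLDER = b"/nix/store/" + b"e" * 32 + b"-"


def strip_store_refs(blob: bytes) -> bytes:
    """Replace every store-path hash with a fixed placeholder of equal length."""
    return STORE_REF.sub(PLACEHOLDER, blob)


fake_hash = b"a1b2c3d4" * 4  # made-up 32-character hash
blob = b"RPATH=/nix/store/" + fake_hash + b"-gcc-14.3.0-lib/lib\x00"
stripped = strip_store_refs(blob)
print(stripped)
print("same length:", len(stripped) == len(blob))
```

Keeping the length unchanged matters for binaries: offsets inside the
.so files stay valid, and the non-FOD wrapper can then restore working
RPATHs.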
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The buildPythonPackage approach with
`propagatedBuildInputs = [ python.pkgs.django ... ]` doesn't work:
1. nixpkgs python314Packages.django still aliases to Django 4.2 LTS,
which doesn't support Python 3.14.
2. django-axes from nixpkgs pulls selenium + browser fonts into its
check phase, and the nix sandbox can't provide those (fontconfig
errors, then the build dep tree collapses).
Switching to authentik's FOD pattern instead: a single fixed-output
derivation that pip-installs the adelaide-baby-shower-app wheel + every
transitive dep from forge PyPI into a target dir. FODs get network
access in exchange for a pinned output hash, so the closure stays
reproducible.
outputHash is set to fakeHash for the first build — the runner will
print the real hash on failure; a follow-up commit will pin it.
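
For reference, the SRI string format used for pinned hashes can be
reproduced as below. This sketch hashes a flat byte string; for a
directory output like this FOD, Nix hashes a NAR serialization of the
tree instead, which is not reproduced here. `sri_sha256` is a
hypothetical helper name.

```python
import base64
import hashlib


def sri_sha256(data: bytes) -> str:
    """Format a SHA-256 digest as an SRI string: 'sha256-' + base64(digest)."""
    digest = hashlib.sha256(data).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


print(sri_sha256(b"example wheel bytes"))
```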
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three follow-ups on the shower deployment branch:
1. containers/shower/default.nix now uses buildPythonPackage to install
the adelaide-baby-shower-app wheel + its deps at nix build time. The
wheel comes from the forge PyPI index with a pinned SRI hash. The
entrypoint no longer does pip-at-boot — it just runs migrations,
collectstatic, and execs gunicorn.
2. ansible/roles/borgmatic/defaults/main.yml:
- Adds shower to borgmatic_k8s_sqlite_dumps (context k3s-ringtail)
so /app/data/db.sqlite3 is dumped via kubectl exec on every run.
- Adds /Volumes/shower (sifaka SMB mount on indri) to
borgmatic_source_directories so prize-photo media gets archived.
3. NFS share docs corrected to match the real on-sifaka pattern:
exports allowlist 192.168.1.0/24 + 100.64.0.0/10 with all_squash to
admin (matching frigate/paperless/etc.), not "Squash=No mapping".
The pod's runAsUser doesn't need to match an on-disk uid because
all_squash rewrites every write to admin:users.
Also adds a missing service-versions entry for the tailscale container
introduced in PR #347 — pre-existing gap surfaced by the
container-version-check hook on this commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the Adelaide / Heidi / Addie baby shower app — a Django guest
splash, raffle picker, and prize-assignment console — on ringtail k3s.
Public landing at shower.eblu.me (via fly proxy), tailnet admin at
shower.ops.eblu.me. App source: forge.eblu.me/eblume/adelaide-baby-shower-app,
wheel-published to the Forgejo Packages PyPI index.
Manifests under argocd/manifests/shower/: NFS-backed PVC for /app/media,
local-path PVC for SQLite, ExternalSecret pulling DJANGO_SECRET_KEY from
1Password (item "Shower (blumeops)"), Tailscale ProxyGroup ingress.
Defense-in-depth for the public surface:
- /admin/ blocked at the fly edge except /admin/login/ and /admin/logout/
- shower_auth rate limit on the login path
- new fail2ban filter+jail with a per-service shower-deny.conf
(nginx-deny action generalized to accept nginx_deny_file)
- django-axes (5 / 1h) keyed on (username, ip_address)
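
As a settings-level sketch, the django-axes bullet could translate to
something like the following. Reading "5 / 1h" as a failure limit of 5
with a one-hour cool-off is an assumption, as is the use of the
AXES_LOCKOUT_PARAMETERS form from recent django-axes releases. The
app's actual settings are not shown in this commit.

```python
from datetime import timedelta

# Assumed reading of "5 / 1h": lock out after 5 failures, cool off for 1 hour.
AXES_FAILURE_LIMIT = 5
AXES_COOLOFF_TIME = timedelta(hours=1)

# A nested list means the parameters are combined, so failures are counted
# per (username, ip_address) pair rather than per username or per IP alone.
AXES_LOCKOUT_PARAMETERS = [["username", "ip_address"]]
```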
Plus: Caddy route on indri, Pulumi gandi CNAME, Grafana APM dashboard
mirroring docs-apm.json, runbook at how-to/operations/shower-app.md,
and a service-versions entry. X-Clacks-Overhead set on the new server
block — GNU Terry Pratchett.
Build: containers/shower/default.nix uses dockerTools to ship a
nixpkgs Python plus a startup wrapper that installs the wheel into
/app/data/.venv on first boot and execs gunicorn. Lets the wheel come
from forge PyPI without pinning hashes for every transitive dep.
Prerequisites tracked in the runbook (not yet executed):
- NFS share sifaka:/volume1/shower (manual Synology step)
- 1Password item "Shower (blumeops)" with secret-key field
- container build via `mise run container-build-and-release shower`
- Pulumi dns-up after merge
- fly certs add shower.eblu.me
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>