Prowler's IaC provider hardcodes self._mutelist = None and delegates
filtering to Trivy, but doesn't plumb --ignorefile through. The original
attempt with --mutelist-file silently no-op'd. Add a wrapper around
trivy in our image that injects --ignorefile $TRIVY_IGNOREFILE on `fs`
subcommands; switch the IaC cronjob to mount a Trivy-format
trivyignore.yaml and set the env var.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six critical IaC findings against argocd/manifests/ broke into two
patterns: legitimate-by-design RBAC (mute) and over-broad RBAC (fix).
Plumbing:
- cronjob-iac-scan.yaml now passes --mutelist-file (previously
unused, which is why all IaC findings reported as unmuted)
- new mutelist/iac.yaml is bundled into the prowler-mutelist
ConfigMap and mounted into the IaC cronjob via items: selector
Compensating controls (in compensating-controls.yaml):
- operator-purpose-bound-rbac — external-secrets-operator's whole
function is to manage Secret objects; ClusterRole over secrets
matches its purpose. cert-controller mutates its own validating
webhooks to inject a rotating CA bundle.
- kube-state-metrics-metadata-only — KSM exposes only Secret
metadata via kube_secret_info / kube_secret_labels; the data
field is never read into exposed metrics.
Mutes (mutelist/iac.yaml):
- KSV-0041 for external-secrets/rbac.yaml,
kube-state-metrics/rbac.yaml,
kube-state-metrics-ringtail/rbac.yaml
- KSV-0114 for external-secrets/rbac.yaml
Real fix:
- grafana-clusterrole no longer reads secrets. The dashboard sidecar
(RESOURCE=both → configmap, both init and watch instances) only
needs ConfigMap-labeled dashboards; no Secrets are labeled
grafana_dashboard.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously only the K8s CIS in-cluster scan was processed; the weekly
container-image and IaC Prowler scans were running on schedule but never
reviewed. Now each scan gets its own status / severity / week-over-week
delta, with top-N grouped tables (by check ID and resource) for the
high-volume image and IaC outputs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
No version bump; build deps (jinja2, pyyaml) still loose-pinned and fine.
Known issue: deployed v1.0.3 package predates phone-hide commit; tracked
separately in Todoist by user.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The signed offset format read as "due in 5 days" rather than
"5 days overdue", causing misreads. Switch to self-explanatory
text: "5d overdue" / "due in 2d" / "due today".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude Code only auto-loads CLAUDE.md. The prose shim told agents to go
read AGENTS.md, which is easy to skip. Replacing the shim with
`@AGENTS.md` inlines AGENTS.md content into the session prompt, so the
startup rules (ai-docs, blumeops-tasks, change classification) land in
context unconditionally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Splits the nebulous gandi-operations how-to into two single-topic cards
(manage-eblu-me-dns, rotate-gandi-pat) and adds a mise task for the
recurring _acme-challenge TXT cleanup needed due to a value-comparison
bug in libdns/gandi v1.1.0 that prevents certmagic's cleanup phase from
removing presented TXT values.
The gandi reference card is updated to drop the false "different
credential from Pulumi PAT" claim — verified during the 2026-04-27
incident that Caddy and Pulumi share a single PAT.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the procedure used to restore mealie's SQLite DB from a borgmatic
archive after the post-DR wipe: extract from borg, snapshot the wiped DB,
swap via a helper pod on the ReadWriteOnce PVC, fix UID 911 ownership.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extend (not replace) home-manager's default sway keybindings via
lib.mkOptionDefault, with lib.mkForce on the custom overrides that
conflict with defaults. Add Mod+F1 cheatsheet binding (fuzzel-filterable).
Move fuzzel's border-radius/border-width out of [main] into a proper
[border] section with the expected short names.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Now that argocd's Authentik OAuth2 client is public, `argocd login --sso`
works for day-to-day use. Promote it to the default in AGENTS.md,
argocd-cli reference, and troubleshooting; keep the admin/password flow
documented as a break-glass fallback for when Authentik is unavailable.
Also drops --grpc-web from every interactive login command — confirmed
extraneous (login succeeds without it). Left in CI workflows and
`argocd cluster add` untouched; those are different contexts that I
didn't re-test.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Now that argocd's Authentik OAuth2 client is public (PKCE-only), the
client_secret plumbing is dead code:
- delete argocd-oidc-authentik ExternalSecret and drop it from kustomization
- remove AUTHENTIK_ARGOCD_CLIENT_SECRET env from authentik-worker
- remove argocd-client-secret mapping from authentik-config ExternalSecret
The argocd-client-secret field in the 1Password "Authentik (blumeops)"
item is now unreferenced and can be deleted there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Changes argocd's Authentik OAuth2 client from confidential to public and
drops the clientSecret from argocd-cm. Public + PKCE works for both the
web UI (argocd-server backend) and the argocd CLI (`argocd login --sso`)
without a shared secret, matching OAuth 2.1 guidance.
Confidential → public was needed because the CLI can't hold a client
secret; Authentik's per-app issuer model made the alternative
("cliClientID" pattern with separate public client) awkward since it
requires a shared issuer across apps which Authentik doesn't serve.
Follow-up: deadcode AUTHENTIK_ARGOCD_CLIENT_SECRET env wiring and the
argocd-oidc-authentik ExternalSecret once verified.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds http://localhost:8085/auth/callback to the ArgoCD OAuth2 provider's
redirect_uris so `argocd login --sso` works. Loopback redirect is the
RFC 8252 pattern for native CLI apps; PKCE (already enabled) covers the
code-interception risk.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After dispatching, poll the Forgejo API for the run matching our
head_sha and print `mise run runner-logs <N>` so the suggested monitor
command is one copy-paste away. Falls back to the bare command if the
poll times out.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tailscale operator still defaults to privileged proxy pods with no
seccomp profile (issue #7359 open upstream). Control remains valid.
Added note about ProxyClass + device plugin remediation path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The upstream binary expects CWD=/app (relative config.yml lookup,
lumberjack logfile at ./log/app.log). Without this, the pod crashed on
startup — the ConfigMap-mounted /app/config.yml wasn't found and zerolog
spammed "mkdir log: permission denied" as it tried to create ./log at
/ as nonroot.
Creates /app as 1777 (tmp-style) so nonroot can write logs; WorkingDir
set to /app so the default config path resolves correctly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Built from main in run #516 after #339 merged. Follows the navidrome
kustomization convention (deployment image = local ref + :kustomized,
kustomization override = newTag only).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary
- Mirrors `github.com/0x2142/frigate-notify` at `v0.5.4` to `forge.ops.eblu.me/mirrors/frigate-notify`.
- Adds `containers/frigate-notify/default.nix` — `buildGoModule` + `dockerTools.buildLayeredImage`, following the `ntfy` pattern.
- Uses `-tags goolm` to avoid the libolm CGO dependency (matrix notifier is imported unconditionally in the upstream but we only use ntfy alerts).
- Runs as nonroot (UID 65534), exposes port 8000, bundles `cacert`/`tzdata`.
## Why
Move `ghcr.io/0x2142/frigate-notify:v0.5.4` (ringtail-deployed) under local control. Aligns with the [[indri → ringtail migration plan]] and the `default.nix` convention for ringtail-targeted containers documented in [[build-container-image]].
## Verification
- `dagger call build-nix --src=. --container-name=frigate-notify export --path=./out.tar.gz` produces a valid 20MB docker archive (10 layers) with `blumeops/frigate-notify` tag locally.
- Hashes pinned for `fetchgit` (src) and `vendorHash` (go modules).
## Follow-up (post-merge)
1. `mise run container-build-and-release frigate-notify` — release from main SHA.
2. C0 follow-up: update `argocd/manifests/frigate/kustomization.yaml` image ref to `registry.ops.eblu.me/blumeops/frigate-notify:v0.5.4-<sha>-nix`.
3. ArgoCD auto-syncs the deployment.
## Test plan
- [ ] `dagger call build-nix` succeeds from a clean checkout.
- [ ] `mise run container-build-and-release frigate-notify --dry-run` looks correct.
- [ ] After release + kustomization swap: frigate-notify pod comes up healthy on ringtail; ntfy alerts still fire on Frigate events.
Reviewed-on: #339
Swaps the k8s runner label from the local bootstrap tag (v0.20.6-9b6be09)
to the equivalent image rebuilt by CI from main. Functionally identical;
closes the bootstrap loop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Points the k8s Forgejo runner label at the locally-bootstrapped
runner-job-image built from the Alpine container.py on this branch.
Once merged, CI will rebuild the same image from the same SHA.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bumps the Dagger engine/CLI from v0.20.1 to v0.20.6 (mise pin, dagger.json
engineVersion, SDK regen) and rewrites the runner-job-image container as a
native Dagger pipeline on Alpine 3.23 using the shared alpine_runtime helper,
replacing the Debian-based Dockerfile. All Forgejo Actions in this repo use
actions/checkout (a JS action), so musl is not a compatibility concern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary
- consolidate forgejo-runner how-to docs into current cards
- upgrade the k8s forgejo-runner deployment to the latest v12.8.x runner image
- switch the k8s runner from first-boot register flow to declarative server.connections config
- keep the runner image on the native Dagger build path and update the surrounding manifests/secrets
## Notes
- PR opened early for C1 review
- implementation and deployment verification will follow in subsequent commits
Reviewed-on: #338
Historical one-shot fix from the zot hardening chain — knowledge is
self-evident in containers/ntfy/default.nix and container-version-check
regex. Should have been removed at mikado finalization. Scrubbed the two
wiki-link references in add-container-version-sync-check.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Forgejo's web action routes don't support API token auth for private
repos (only session cookies or public access). Switch log fetching to
read the zstd-compressed log files directly from indri via SSH —
Forgejo stores all runner logs on disk regardless of which runner
executed the job.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
runner-logs now always authenticates with the Forgejo API token
(via --token flag, FORGEJO_TOKEN env, or 1Password) so it works on
private repos. The --repo default is auto-detected from the git
remote origin URL instead of being hardcoded.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The static asset cache block (css/js/png/etc) was missing
proxy_set_header Host, so Caddy received "forge.eblu.me" instead of
"forge.ops.eblu.me" and couldn't route the request. HTML loaded fine
because the main location / block had the header.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All 7 ArgoCD containers had no resource limits, allowing them to consume
unlimited CPU/memory during node pressure events. This contributed to
cluster-wide probe timeout cascades on minikube-indri.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comprehensive docs pass reflecting the new Fly proxy architecture:
- Fly proxy routes through Caddy on indri (not per-service TS Ingress)
- Direct WireGuard peering via --port=41641 pinning
- DERP relay performance lesson in Tailscale docs
- Caddy now in public traffic path
- indri tagged as flyio-target
- Removed fly-reload references
- Updated architecture diagrams and per-service setup guide
- Added changelog fragment
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tailscale Ingress pods in k8s can't establish direct WireGuard
connections (stuck behind pod-network NAT → DERP relay → 20s latency).
Indri's host-level Tailscale CAN peer directly with Fly.
Change all nginx upstreams to route through Caddy on indri instead of
per-service Tailscale Ingress endpoints. Tag indri as flyio-target in
the Tailscale ACL so the Fly proxy can reach it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Enable direct peer-to-peer WireGuard connections by pinning tailscaled
to port 41641 and exposing it as a UDP service. Without this, all
traffic routes through Tailscale DERP relays causing 20+ second
latency. Requires dedicated IPv4 (allocated: 168.220.82.221).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The nix-built Alloy image sets User=65534 (nobody). Even with
privileged: true, a non-root user gets no effective capabilities
(CapEff=0). Override with runAsUser: 0 so Beyla gets CAP_BPF and
CAP_SYS_ADMIN needed for eBPF instrumentation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
NixOS defaults kernel.unprivileged_bpf_disabled=2, which blocks BPF
syscalls outside the init namespace even with CAP_BPF. Set to 1 so
privileged containers (Beyla/Alloy tracing) can create BPF maps.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Beyla (alloy-tracing) has been failing since April 13 with
"failed to set memlock rlimit: operation not permitted" because k3s
inherits the default 8MB memlock limit. Set LimitMEMLOCK=infinity on
the k3s systemd service so privileged containers can use eBPF.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The earthdistance extension (depends on cube) must be created before
restoring the teslamate database — discovered missing after 2026-04-13 DR.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Queries the Forgejo API to verify the target commit exists on the remote
before dispatching a build, preventing wasted CI runs on unpushed commits.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace Dockerfile with container.py for native Dagger builds.
Bump devpi-server 6.19.1→6.19.3, devpi-web 5.0.1→5.0.2.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>