Compare commits


8 commits

Author SHA1 Message Date
b24fd147ac C0: fix 1Password export menu wording in backup how-to
The desktop app's menu is File > Export > <account name> (e.g.
Blume/Davis), not "All Vaults". Verified an account-level 1PUX export
contains all four vaults (Private, blumeops, Payrix, Shared). Updated
the op-backup script's prompt text to match.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-09 16:27:08 -07:00
ccbc2ff0a9 C0: service-review automounter (1.13.0, healthy); fix tracking-file path in script
AutoMounter on indri auto-updated to 1.13.0 via the App Store, matching
the latest upstream release; all seven sifaka SMB mounts are live and
the app + helper are running. The service-review script's guidance text
pointed at docs/reference/services/service-versions.yaml, but the file
lives at the repo root (where the script actually reads it).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-09 16:08:39 -07:00
db0512b5d4 Doc review: 5 stalest cards; scale back ai-docs rule; document heph CLI (#373)
## Doc review (5 stalest, all never-reviewed)

Each card was verified against live state (ArgoCD app list/health, manifests, 1Password item fields, Mealie API probe) and stamped `last-reviewed: 2026-06-09`.

| Card | Findings fixed |
|------|----------------|
| `reference/services/argocd.md` | Added Authentik SSO (public PKCE client, `--sso` login, admins→role:admin RBAC); documented dual-cluster management (minikube + ringtail k3s at `ringtail.tail8d86e.ts.net:6443`); corrected sync policy — the `apps` root is **manual**, not automated |
| `reference/services/authentik.md` | Blueprint list grown from 5 to 10 files; OIDC client table now lists all 8 clients with types; secrets table updated to `postgresql-*` fields and per-client secrets |
| `reference/services/grafana.md` | TeslaMate datasource moved to `pg.ops.eblu.me:5434` (ringtail); dashboard inventory refreshed (20 provisioned ConfigMaps); TeslaMate dashboards documented as init-container fetch from forge mirror at pinned tag; SSO role mapping wording corrected (Admin only for `admins` group) |
| `reference/infrastructure/unifi.md` | UnPoller image is now locally built (`registry.ops.eblu.me/blumeops/unpoller`); verified namespace/port |
| `how-to/mealie/plan-a-meal.md` | Procedure verified; **found the stored API token (`op://blumeops/mealie/credential`) returns 401** — operational fix in progress, doc content unchanged |

## AGENTS.md

- **Scaled back the ai-docs rule** (per discussion): agents now start by finding and reading relevant docs; `mise run ai-docs` (~130K tokens now) and `ai-sources` become opt-in bulk loads. `agent-change-process.md` updated to match. The `ai-docs` mise task itself is kept for now — happy to retire it in a follow-up if desired.
- **Documented the heph CLI** task workflow (list/show/context/log read paths; done/drop/skip/log/edit/task write paths) so future sessions can read and manipulate Blumeops tasks without rediscovery.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #373
2026-06-09 16:05:01 -07:00
1c41cca903 Retire Prowler image + IaC scans (keep K8s CIS only) (#372)
## Why

Weekly compliance review (2026-06-07) surfaced the toil problem head-on:

| Report | Unmuted findings | Muted | Acted on |
|--------|------------------|-------|----------|
| **K8s CIS (In-Cluster)** | 0 | 65 | clean  |
| **Container Images** | 20,005 (+713 WoW) | 0 | never |
| **IaC (manifests)** | 654 (+31/−30 WoW) | 0 | never |

The image and IaC scans generate tens of thousands of un-actioned, un-muted findings every week:

- **Image scan** — overwhelmingly unpatchable *upstream* base-image CVEs, and it re-scans every historical tag still in the registry (2× paperless, 3× mealie, 4× prowler tags in the latest report), multiplying the count.
- **IaC scan** — systemic Trivy KSV pod-security warnings against our own manifests; real but homelab-acceptable, never muted, so re-surfaced indefinitely.

The K8s CIS scan is the only one with realized value (fully mutelisted, 0 unmuted WoW) and is retained. Matches the broader scaling-back of the reporting system as minikube heads toward retirement.

## Changes

- Delete `cronjob-image-scan.yaml` and `cronjob-iac-scan.yaml` + remove from kustomization
- Drop the now-unused `mutelist/trivyignore.yaml` (only the IaC scan consumed it)
- `review-compliance-reports`: drop the two retired scans (and the grouped-findings rendering that existed solely for them)
- Docs: deploy-prowler (new 'Why only the K8s CIS scan' section), read-compliance-reports, security reference, prowler reference

## Deploy (after review)

```fish
argocd app set prowler --revision retire-prowler-image-iac-scans
argocd app sync prowler   # prune removes the two CronJobs
# after merge: argocd app set prowler --revision main && argocd app sync prowler
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #372
2026-06-08 09:30:09 -07:00
e592ecfca4 C0: update ringtail flake inputs (nixpkgs, disko)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 07:17:21 -07:00
6370d2bddb C0: doc-review tailscale-operator (dual indri/ringtail, host caveat)
Add last-reviewed; document the operator now running on both indri's
minikube and ringtail's k3s; correct the ArgoCD apps row; pin upstream
v1.94.2; add the ProxyGroup Ingress 'host: *' requirement.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 07:00:48 -07:00
8072cd21d7 C0: review jellyfin, upgrade indri to 10.11.11 (security fixes)
Jellyfin was 5 patch releases behind (10.11.6 -> 10.11.11). 10.11.7 and
10.11.10 contain disclosed CVE/GHSA security fixes. Upgraded via
brew upgrade --cask jellyfin on indri; service verified healthy and
externally reachable (HTTPS 200).

Documented the recurring Gatekeeper gotcha: cask upgrades re-quarantine
the .app and the launchd service hangs silently until the first-launch
dialog is approved on indri's GUI console (xattr removal over SSH is
blocked by macOS TCC).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 06:35:23 -07:00
bc34b601be Merge pull request 'heph Authentik: grant offline_access scope (fixes spoke sync refresh-token 400)' (#371) from heph-offline-access into main
2026-06-06 18:29:47 -07:00
35 changed files with 239 additions and 399 deletions

View file

@ -12,10 +12,9 @@ blumeops is Erich Blume's GitOps repository for personal infrastructure, orchest
## Rules ## Rules
1. **Always run `mise run ai-docs` at session start** 1. **Start every task by finding and reading the relevant docs**
This will refresh your context with important information you will be assumed to know and follow. Search `docs/` for cards related to the change area (grep for titles/tags, follow `[[wiki-links]]`) and read what you find before acting. Wiki-links refer to cards under `docs/` by filename stem.
**Read the full output** — never truncate, pipe to `head`/`tail`, or skip sections. For problems with a very large surface area, `mise run ai-sources` concatenates all non-doc source files (~270K tokens) — opt-in only, confirm with the user before loading it wholesale; targeted reading is usually better.
For problems with a large surface area, ask the user if `mise run ai-sources` should also be run — it concatenates all non-doc source files (~270K tokens) for deep codebase context.
2. **Always use `--context=minikube-indri` with kubectl** (or `--context=k3s-ringtail` for ringtail services) - work contexts must never be touched 2. **Always use `--context=minikube-indri` with kubectl** (or `--context=k3s-ringtail` for ringtail services) - work contexts must never be touched
**NEVER run `minikube delete`** — it destroys all PVs, etcd, and cluster state. Use `minikube stop`/`minikube start` for restarts. If minikube is stuck, see [[restart-indri]]. Full rebuild from scratch requires the DR procedure in [[rebuild-minikube-cluster]]. **NEVER run `minikube delete`** — it destroys all PVs, etcd, and cluster state. Use `minikube stop`/`minikube start` for restarts. If minikube is stuck, see [[restart-indri]]. Full rebuild from scratch requires the DR procedure in [[rebuild-minikube-cluster]].
3. **Classify the change as C0/C1/C2 before starting** (see below) — this determines branching and PR requirements 3. **Classify the change as C0/C1/C2 before starting** (see below) — this determines branching and PR requirements
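The kubectl context rule above can be enforced mechanically. The following is a minimal sketch, assuming a hypothetical helper named `kubectl_cmd` (not part of the repository) that refuses to build a command for any context other than the two named in the rule:

```python
# Hypothetical guard, not an existing blumeops script.
# The two allowed contexts are taken from the rule above.
ALLOWED_CONTEXTS = {"minikube-indri", "k3s-ringtail"}


def kubectl_cmd(context, *args):
    """Build a kubectl argv list, rejecting any context outside the allow-list."""
    if context not in ALLOWED_CONTEXTS:
        raise ValueError(f"refusing to use kubectl context {context!r}")
    if not args:
        raise ValueError("no kubectl arguments given")
    # --context is placed first so it is never dropped by accident.
    return ["kubectl", f"--context={context}", *args]


if __name__ == "__main__":
    print(kubectl_cmd("minikube-indri", "get", "pods", "-n", "prowler"))
```

The returned list can be passed to `subprocess.run` as-is. Building an argv list rather than a shell string avoids quoting problems.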
@ -69,7 +68,7 @@ See [[agent-change-process]] for the full methodology.
~/code/3rd/ # mirrored external projects ~/code/3rd/ # mirrored external projects
~/code/work # FORBIDDEN ~/code/work # FORBIDDEN
``` ```
Other code paths will be listed via ai-docs, this is just an overview. When you This is just an overview — explore `docs/` for the rest. When you
encounter wiki-links (`[[like-this]]`) it is referring to docs/ cards. encounter wiki-links (`[[like-this]]`) it is referring to docs/ cards.
## Service Deployment ## Service Deployment
@ -148,13 +147,42 @@ Create a new spork: `mise run spork-create <mirror-name>`
## Task Discovery ## Task Discovery
BlumeOps tasks live in [hephaestus](https://github.com/eblume/hephaestus) (`heph`), BlumeOps tasks live in [hephaestus](https://github.com/eblume/hephaestus) (`heph`),
the user's self-hosted context/task system. Fetch them with the CLI: the user's self-hosted context/task system. The CLI is a thin client of the
local `hephd` daemon. (This replaced the retired `blumeops-tasks` mise task,
which read from Todoist.)
### Reading tasks
```fish ```fish
heph list --project Blumeops --json # outstanding Blumeops tasks as JSON heph list --project Blumeops --json # outstanding Blumeops tasks as JSON
heph next # tactical "what is next?" ranking
heph show <node_id> # one task with its scalars
heph context <node_id> # print the task's canonical-context doc
heph log <node_id> # print the task's latest log entries
``` ```
(This replaced the retired `blumeops-tasks` mise task, which read from Todoist.)
JSON rows carry `node_id` (use this as `<ID>` in all commands below), `title`,
`state`, `do_date`/`late_on` (epoch ms), `recurrence` (RFC-5545), and
`attention` (red|orange|white|blue: a1-a4 urgency tiers; blue = on-deck).
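A minimal sketch of consuming those rows in a script. The sample payload, the top-level array-of-objects shape, the `"open"` state value, and the reading of the attention list as most-to-least urgent are all assumptions made for illustration. Only the field names and the epoch-millisecond convention come from the text above.

```python
import json
from datetime import datetime, timezone

# Hypothetical sample of `heph list --project Blumeops --json` output.
SAMPLE = """[
  {"node_id": "n1", "title": "BlumeOps doc review", "state": "open",
   "do_date": 1781000000000, "late_on": null, "attention": "orange"},
  {"node_id": "n2", "title": "Rotate mealie token", "state": "open",
   "do_date": null, "late_on": null, "attention": "red"}
]"""

# Assumption: tiers are listed from most to least urgent.
ATTENTION_ORDER = ["red", "orange", "white", "blue"]


def epoch_ms_to_utc(value):
    """Convert epoch milliseconds to an aware UTC datetime (None passes through)."""
    if value is None:
        return None
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)


def summarize(raw):
    """Return (node_id, attention, do_date) tuples sorted by attention tier."""
    rows = json.loads(raw)
    rows.sort(key=lambda r: ATTENTION_ORDER.index(r["attention"]))
    return [
        (r["node_id"], r["attention"], epoch_ms_to_utc(r.get("do_date")))
        for r in rows
    ]


if __name__ == "__main__":
    for node_id, attention, do_date in summarize(SAMPLE):
        print(node_id, attention, do_date)
```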
### Manipulating tasks
```fish
heph done <node_id> # mark done (recurring tasks roll forward)
heph drop <node_id> # mark dropped
heph skip <node_id> # skip a recurring task's current occurrence
heph log <node_id> "text" # append a log entry
heph context <node_id> --append "…" # append to the canonical-context doc (--body replaces; `-` reads stdin)
heph edit <node_id> --do-date +3d # reschedule; also --late-on/--recur/--attention/--project (`none` clears)
heph task "Title" --project Blumeops --do-date fri --attention white # create a task
```
Date forms: `today|tomorrow|+3d|fri|YYYY-MM-DD`. Recurrence: presets
(`daily|weekly|monthly|yearly|weekdays`) or natural language (`"every 3 days"`).
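To make the date forms concrete, here is a rough sketch of one way they could be resolved to calendar dates. This is not heph's parser. The function name `resolve_date`, the weekday abbreviations other than `fri`, and the choice that a bare weekday means its next occurrence strictly after today are all assumptions.

```python
import re
from datetime import date, timedelta

# Assumption: three-letter abbreviations; only "fri" appears in the docs.
WEEKDAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def resolve_date(form, today):
    """Resolve today|tomorrow|+Nd|<weekday>|YYYY-MM-DD relative to `today`."""
    form = form.strip().lower()
    if form == "today":
        return today
    if form == "tomorrow":
        return today + timedelta(days=1)
    match = re.fullmatch(r"\+(\d+)d", form)
    if match:
        return today + timedelta(days=int(match.group(1)))
    if form in WEEKDAYS:
        # Assumption: next occurrence strictly after today (1 to 7 days ahead).
        delta = (WEEKDAYS.index(form) - today.weekday() - 1) % 7 + 1
        return today + timedelta(days=delta)
    # Anything else must be an ISO date; raises ValueError otherwise.
    return date.fromisoformat(form)


if __name__ == "__main__":
    print(resolve_date("+3d", date(2026, 6, 9)))
```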
Conventions: don't save TODOs to agent memory — file them as heph tasks under
the Blumeops project. When completing a recurring chore (e.g. "BlumeOps doc
review"), `heph log` a short note of what was done, then `heph done` it.
Most operational scripts are stored in `./mise-tasks/`. For scripts with any logic or Most operational scripts are stored in `./mise-tasks/`. For scripts with any logic or
complexity, use uv run --script 's with explicit dependencies. Complex complexity, use uv run --script 's with explicit dependencies. Complex

View file

@ -1,54 +0,0 @@
---
apiVersion: batch/v1
kind: CronJob
metadata:
name: prowler-iac-scan
namespace: prowler
spec:
schedule: "0 2 * * 6" # Saturday 2am
concurrencyPolicy: Forbid
jobTemplate:
spec:
ttlSecondsAfterFinished: 604800 # Auto-delete after 7 days
template:
spec:
securityContext:
seccompProfile:
type: RuntimeDefault
containers:
- name: prowler
image: registry.ops.eblu.me/blumeops/prowler:kustomized
command: ["/bin/sh", "-c"]
# Prowler's --mutelist-file is a no-op for the IaC provider
# (it delegates to Trivy). The Prowler image's trivy shim
# injects --ignorefile $TRIVY_IGNOREFILE when set; see
# containers/prowler/Dockerfile.
env:
- name: TRIVY_IGNOREFILE
value: /mutelist/trivyignore.yaml
args:
- |
DATEDIR=/reports/prowler-iac/$(date +%Y-%m-%d)
mkdir -p "$DATEDIR"
prowler iac \
--scan-repository-url https://forge.ops.eblu.me/eblume/blumeops.git \
-z \
--output-formats html csv json-ocsf \
--output-directory "$DATEDIR"
volumeMounts:
- name: reports
mountPath: /reports
- name: mutelist
mountPath: /mutelist
readOnly: true
restartPolicy: OnFailure
volumes:
- name: reports
persistentVolumeClaim:
claimName: prowler-reports
- name: mutelist
configMap:
name: prowler-mutelist
items:
- key: trivyignore.yaml
path: trivyignore.yaml

View file

@ -1,39 +0,0 @@
---
apiVersion: batch/v1
kind: CronJob
metadata:
name: prowler-image-scan
namespace: prowler
spec:
schedule: "0 3 * * 6" # Saturday 3am
concurrencyPolicy: Forbid
jobTemplate:
spec:
ttlSecondsAfterFinished: 604800 # Auto-delete after 7 days
template:
spec:
securityContext:
seccompProfile:
type: RuntimeDefault
containers:
- name: prowler
image: registry.ops.eblu.me/blumeops/prowler:kustomized
command: ["/bin/sh", "-c"]
args:
- |
DATEDIR=/reports/prowler-images/$(date +%Y-%m-%d)
mkdir -p "$DATEDIR"
prowler image \
--registry https://registry.ops.eblu.me \
--image-filter "^blumeops/" \
-z \
--output-formats html csv json-ocsf \
--output-directory "$DATEDIR"
volumeMounts:
- name: reports
mountPath: /reports
restartPolicy: OnFailure
volumes:
- name: reports
persistentVolumeClaim:
claimName: prowler-reports

View file

@ -10,8 +10,6 @@ resources:
- pv-nfs.yaml - pv-nfs.yaml
- pvc.yaml - pvc.yaml
- cronjob.yaml - cronjob.yaml
- cronjob-image-scan.yaml
- cronjob-iac-scan.yaml
configMapGenerator: configMapGenerator:
- name: prowler-mutelist - name: prowler-mutelist
@ -23,7 +21,6 @@ configMapGenerator:
- mutelist/core-pod-security.yaml - mutelist/core-pod-security.yaml
- mutelist/manual-node-checks.yaml - mutelist/manual-node-checks.yaml
- mutelist/rbac.yaml - mutelist/rbac.yaml
- mutelist/trivyignore.yaml
images: images:
- name: registry.ops.eblu.me/blumeops/prowler - name: registry.ops.eblu.me/blumeops/prowler

View file

@ -1,37 +0,0 @@
# Trivy ignorefile for Prowler IaC scan.
#
# Prowler's `--mutelist-file` flag is a no-op for the IaC provider
# (iac_provider.py sets self._mutelist = None and delegates to Trivy).
# Trivy in turn does not auto-discover this YAML form from cwd, so the
# Prowler image ships a shim wrapper around `trivy` that injects
# --ignorefile $TRIVY_IGNOREFILE when the env var is set. The cronjob
# mounts this file and sets TRIVY_IGNOREFILE accordingly.
#
# Schema: https://trivy.dev/latest/docs/configuration/filtering/
# IDs use the hyphenated form Trivy displays (KSV-0041, not KSV0041).
misconfigurations:
- id: KSV-0041
paths:
- "argocd/manifests/external-secrets/rbac.yaml"
statement: >-
external-secrets-operator's entire function is to read and
synthesize Secret objects; ClusterRole over secrets is its
purpose. Both the controller and cert-controller are
upstream-defined.
- id: KSV-0041
paths:
- "argocd/manifests/kube-state-metrics/rbac.yaml"
- "argocd/manifests/kube-state-metrics-ringtail/rbac.yaml"
statement: >-
KSM exposes only Secret metadata (name, namespace, type, labels),
never the data field. list/watch on secrets is required for
kube_secret_info / kube_secret_labels metrics.
- id: KSV-0114
paths:
- "argocd/manifests/external-secrets/rbac.yaml"
statement: >-
cert-controller manages the external-secrets validating webhook
configurations to inject its own rotating CA bundle. RBAC is
scoped to two named webhooks (secretstore-validate,
externalsecret-validate) via resourceNames; KSV-0114 doesn't see
the resourceNames restriction so reports the full ClusterRole.

View file

@ -0,0 +1 @@
Corrected the 1Password backup how-to: the desktop app's export menu item is named after the account ("File > Export > Blume/Davis"), not "All Vaults". Verified an account export contains all four vaults (Private, blumeops, Payrix, Shared).

View file

@ -0,0 +1 @@
Upgraded Jellyfin on indri from 10.11.6 to 10.11.11, picking up the security fixes in 10.11.7 (disclosed CVEs/GHSAs, flagged "upgrade immediately") and 10.11.10 (three further GHSAs). Noted the recurring gotcha in the service-versions tracking: after a `brew upgrade --cask jellyfin`, the re-quarantined `.app` makes the launchd-spawned process hang silently until the Gatekeeper first-launch dialog is approved on indri's GUI console — removing the quarantine xattr over SSH is blocked by macOS TCC.

View file

@ -0,0 +1 @@
Updated ringtail NixOS flake inputs (nixpkgs `nixos-25.11`, disko) to latest via `dagger call flake-update`.

View file

@ -0,0 +1 @@
Service review: AutoMounter on indri is current at 1.13.0 (App Store auto-updated from the tracked 1.11.0); all sifaka SMB mounts verified healthy. Fixed the stale tracking-file path shown by `mise run service-review`.

View file

@ -0,0 +1 @@
Reviewed the tailscale-operator reference card: documented the dual indri/ringtail deployment, corrected the ArgoCD apps list, pinned the upstream version, and added the ProxyGroup Ingress `host:` caveat.

View file

@ -0,0 +1 @@
Scaled back the mandatory `ai-docs` session-start rule: the concatenated docs corpus had grown to ~130K tokens, too large to ingest wholesale. Agents now start tasks by finding and reading the relevant docs (grep + wiki-links); the `ai-docs` and `ai-sources` mise tasks remain available as opt-in bulk loads. Also documented the full `heph` CLI task workflow (read, log, complete, create) in AGENTS.md.

View file

@ -0,0 +1 @@
Reviewed the five stalest documentation cards (argocd, authentik, grafana, unifi, plan-a-meal): brought ArgoCD's SSO/dual-cluster/sync-policy story up to date, expanded Authentik's blueprint and OIDC client inventory to all eight clients, fixed Grafana's TeslaMate datasource target and dashboard list, and noted UnPoller's locally-built image.

View file

@ -0,0 +1 @@
Retired the Prowler container-image CVE scan and IaC scan, keeping only the K8s CIS benchmark scan. The two retired scans generated tens of thousands of un-actioned, un-muted findings every week (~20,000 image findings and growing, mostly unpatchable upstream-image CVEs; ~650 systemic Trivy KSV pod-security warnings) — the weekly `mise run review-compliance-reports` re-surfaced them all as "action needed" though none were ever triaged. The K8s CIS scan is fully mutelisted and runs clean, so it stays. Removed the two CronJobs, the now-unused `trivyignore.yaml` mutelist, and the grouped-findings rendering in the review tool that existed solely for the high-volume scans.

View file

@ -1,6 +1,6 @@
--- ---
title: Agent Change Process title: Agent Change Process
modified: 2026-03-15 modified: 2026-06-09
last-reviewed: 2026-02-23 last-reviewed: 2026-02-23
tags: tags:
- explanation - explanation
@ -25,13 +25,13 @@ Before starting work, classify the change:
When in doubt, start at C1. Upgrade to C2 if complexity spirals or the user requests it. When in doubt, start at C1. Upgrade to C2 if complexity spirals or the user requests it.
**Context loading:** All change classes start with `mise run ai-docs` (~85K tokens of documentation). For problems with a large surface area, ask the user if `mise run ai-sources` should also be run — it concatenates all non-doc source files (~270K tokens). Together they cover the full codebase without overlap. **Context loading:** All change classes start by finding and reading the docs relevant to the change area — grep `docs/` and follow wiki-links. For problems with a very large surface area, `mise run ai-sources` concatenates all non-doc source files (~270K tokens); confirm with the user before loading it wholesale.
## C0 — Quick Fix ## C0 — Quick Fix
A change where the risk is low enough that problems can be quickly fixed forward. A change where the risk is low enough that problems can be quickly fixed forward.
1. Run `mise run ai-docs` to load context 1. Find and read the docs relevant to the change area
2. Implement the change directly on main 2. Implement the change directly on main
3. Add a changelog fragment if the change is user-visible or noteworthy (`docs/changelog.d/+<descriptive-slug>.<type>.md`) 3. Add a changelog fragment if the change is user-visible or noteworthy (`docs/changelog.d/+<descriptive-slug>.<type>.md`)
4. Commit and push 4. Commit and push
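The fragment naming pattern in step 3 can be illustrated with a small sketch. The helper name `fragment_path` and the `doc` type used in the example are hypothetical. The set of valid `<type>` values is not stated here, so the type is passed through unchecked.

```python
import re


def fragment_path(summary, change_type):
    """Build docs/changelog.d/+<descriptive-slug>.<type>.md from a summary."""
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", summary.lower()).strip("-")
    if not slug:
        raise ValueError("summary produced an empty slug")
    return f"docs/changelog.d/+{slug}.{change_type}.md"


if __name__ == "__main__":
    print(fragment_path("Fix 1Password export menu wording", "doc"))
```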
@ -46,7 +46,7 @@ A change with enough complexity or risk that a human should review it, but not s
### Process ### Process
1. Run `mise run ai-docs` to load context 1. Find and read the docs relevant to the change area
2. **Search related docs** — read existing documentation and reference cards related to the change area 2. **Search related docs** — read existing documentation and reference cards related to the change area
3. **Create a feature branch** and open a PR early (draft is fine) 3. **Create a feature branch** and open a PR early (draft is fine)
4. **Documentation first** — commit doc changes reflecting the desired end state before writing code. This helps the reviewer understand intent and catches design issues early 4. **Documentation first** — commit doc changes reflecting the desired end state before writing code. This helps the reviewer understand intent and catches design issues early
@ -77,7 +77,7 @@ A complex, multi-session change managed through the [Mikado method](https://mika
Before writing any code, invest in understanding the problem: Before writing any code, invest in understanding the problem:
1. Run `mise run ai-docs` to load context 1. Find and read the docs relevant to the change area
2. Search related docs, reference cards, and existing how-to guides for the change area 2. Search related docs, reference cards, and existing how-to guides for the change area
3. Think through the dependency graph — what prerequisites exist? What could go wrong? 3. Think through the dependency graph — what prerequisites exist? What could go wrong?
4. Create Mikado cards for everything you can anticipate (you'll discover more later — that's the point of the method) 4. Create Mikado cards for everything you can anticipate (you'll discover more later — that's the point of the method)
@ -220,7 +220,7 @@ When the final leaf node is closed and no `status: active` cards remain:
When starting a new session to continue C2 work: When starting a new session to continue C2 work:
1. Run `mise run ai-docs` to load context 1. Find and read the docs relevant to the change area
2. Run `mise run docs-mikado --resume` — this will: 2. Run `mise run docs-mikado --resume` — this will:
- Detect the current branch and match it to an active chain - Detect the current branch and match it to an active chain
- Show the chain state, ready leaf nodes, and current position in the invariant - Show the chain state, ready leaf nodes, and current position in the invariant

View file

@ -1,6 +1,7 @@
--- ---
title: Plan a Meal title: Plan a Meal
modified: 2026-03-17 modified: 2026-06-09
last-reviewed: 2026-06-09
tags: tags:
- how-to - how-to
- mealie - mealie

View file

@ -1,6 +1,6 @@
--- ---
title: Deploy Prowler CIS Scanner title: Deploy Prowler CIS Scanner
modified: 2026-03-24 modified: 2026-06-08
last-reviewed: 2026-03-24 last-reviewed: 2026-03-24
tags: tags:
- how-to - how-to
@ -11,7 +11,20 @@ tags:
# Deploy Prowler CIS Scanner # Deploy Prowler CIS Scanner
Prowler runs weekly CIS Kubernetes Benchmark scans against minikube-indri and writes HTML/CSV/JSON reports to the NFS share on sifaka. Prowler runs a weekly CIS Kubernetes Benchmark scan against minikube-indri and writes HTML/CSV/JSON reports to the NFS share on sifaka.
## Why only the K8s CIS scan
Prowler originally ran three CronJobs: K8s CIS, container-image CVE scanning, and IaC scanning. The image and IaC scans were **retired in 2026-06**.
Both were pure toil with no realized value:
- **Image scan** produced ~20,000 unmuted findings per run and growing, none ever triaged or muted. They were overwhelmingly CVEs in *upstream* base images we don't control and can't patch, and the job re-scanned every historical tag still in the registry, multiplying the count.
- **IaC scan** produced ~650 Trivy KSV findings (`runAsNonRoot`, `readOnlyRootFilesystem`, drop-capabilities, …) against our own manifests — real but systemic, homelab-acceptable, and likewise never muted, so the weekly review re-surfaced all of them indefinitely.
The K8s CIS scan, by contrast, is fully mutelisted and runs clean (0 unmuted findings week over week), so it stays. The guiding principle matches [[ai-scraper-mitigation]]: don't keep generating a firehose of output that has no audience. If image-CVE signal is wanted later, the right shape is critical-severity-only, currently-deployed-tags-only, alert-on-new — a rebuild, not a revival (tracked as the "Trivy for image/IaC scanning" task).
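The "critical-severity-only, currently-deployed-tags-only, alert-on-new" shape can be sketched as a three-stage filter. This is an illustration only. The `Finding` fields, the placeholder CVE identifiers, and the function name are hypothetical and do not reflect Prowler's or Trivy's report schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    image: str     # e.g. "blumeops/mealie"
    tag: str       # image tag the finding was reported against
    cve: str       # placeholder identifier in this sketch
    severity: str  # e.g. "CRITICAL", "HIGH"


def new_critical_findings(findings, deployed, previously_seen):
    """Keep findings that are critical, on a deployed tag, and not seen before.

    deployed: set of (image, tag) pairs currently running.
    previously_seen: set of (image, cve) pairs from earlier runs.
    """
    result = []
    for f in findings:
        if f.severity.upper() != "CRITICAL":
            continue  # critical-severity-only
        if (f.image, f.tag) not in deployed:
            continue  # currently-deployed-tags-only
        if (f.image, f.cve) in previously_seen:
            continue  # alert-on-new
        result.append(f)
    return result
```

Keying `previously_seen` on `(image, cve)` rather than `(image, tag, cve)` means a tag bump does not re-alert on a CVE that was already reported for the same image.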
Note that the K8s CIS scan itself is tied to minikube-indri, which is slated for retirement; on k3s only ~22 of 70 checks produce results (no static pods). Re-pointing a lean posture check at ringtail is tracked separately ("prowler scan against ringtail").
## What it checks ## What it checks
@ -33,38 +46,6 @@ Prowler's Kubernetes provider runs ~70 checks from the CIS Kubernetes Benchmark
**k3s note:** k3s embeds the control plane in a single binary — no static pods exist. Only core + RBAC checks (~22 of 70) produce results. Consider `kube-bench` for k3s control plane checks. **k3s note:** k3s embeds the control plane in a single binary — no static pods exist. Only core + RBAC checks (~22 of 70) produce results. Consider `kube-bench` for k3s control plane checks.
### Image vulnerability scanning (Saturday 3am)
Prowler's image provider scans all `blumeops/*` container images in `registry.ops.eblu.me` for:
- **CVEs** — known vulnerabilities from NVD, Alpine SecDB, Debian Security Tracker, and other sources
- **Embedded secrets** — credentials or API keys baked into image layers
- **Misconfigurations** — Dockerfile best practices (running as root, missing HEALTHCHECK, etc.)
Uses Trivy under the hood. Reports are written to `sifaka:/volume1/reports/prowler-images/`.
To run an ad-hoc image scan:
```fish
kubectl create job --from=cronjob/prowler-image-scan prowler-image-manual -n prowler --context=minikube-indri
```
### IaC scanning (Saturday 2am)
Prowler's IaC provider scans the blumeops repository (cloned at scan time) for misconfigurations in:
- **Dockerfiles** — running as root, using `latest` tags, missing `HEALTHCHECK`
- **Kubernetes manifests** — missing resource limits, privileged containers, insecure settings
- **Other IaC files** — Terraform, CloudFormation, etc. if present
Uses Trivy under the hood. Reports are written to `sifaka:/volume1/reports/prowler-iac/`.
To run an ad-hoc IaC scan:
```fish
kubectl create job --from=cronjob/prowler-iac-scan prowler-iac-manual -n prowler --context=minikube-indri
```
## Reports ## Reports
Reports are written to `sifaka:/volume1/reports/prowler/` with timestamped filenames. See [[read-compliance-reports]] for how to access and interpret them. Reports are written to `sifaka:/volume1/reports/prowler/` with timestamped filenames. See [[read-compliance-reports]] for how to access and interpret them.

View file

@ -1,6 +1,6 @@
--- ---
title: Read Compliance Reports title: Read Compliance Reports
modified: 2026-04-06 modified: 2026-06-08
last-reviewed: 2026-04-06 last-reviewed: 2026-04-06
tags: tags:
- how-to - how-to
@ -27,8 +27,13 @@ Reports are stored on sifaka at `/volume1/reports/`. Each scanner writes to its
| Scanner | Path | Schedule | | Scanner | Path | Schedule |
|---------|------|----------| |---------|------|----------|
| [[prowler]] K8s CIS | `sifaka:/volume1/reports/prowler/` | Weekly (Sunday 3am) | | [[prowler]] K8s CIS | `sifaka:/volume1/reports/prowler/` | Weekly (Sunday 3am) |
| [[prowler]] Image | `sifaka:/volume1/reports/prowler-images/` | Weekly (Saturday 3am) |
| [[prowler]] IaC | `sifaka:/volume1/reports/prowler-iac/` | Weekly (Saturday 2am) |
> **Retired (2026-06):** the Prowler **image** (`prowler-images/`) and **IaC**
> (`prowler-iac/`) scans were retired. They produced tens of thousands of
> un-actioned, un-muted findings every week — mostly unpatchable upstream-image
> CVEs and systemic pod-security KSV warnings — and nobody triaged them. See
> [[deploy-prowler#Why only the K8s CIS scan]] for the rationale. Their stale
> report directories may linger on sifaka until manually removed.
Copy reports to your local machine (remember `scp -O` for sifaka): Copy reports to your local machine (remember `scp -O` for sifaka):

View file

@ -1,7 +1,7 @@
--- ---
title: Run 1Password Backup title: Run 1Password Backup
modified: 2026-03-11 modified: 2026-06-09
last-reviewed: 2026-03-16 last-reviewed: 2026-06-09
tags: tags:
- how-to - how-to
- operations - operations
@ -24,7 +24,7 @@ How to export and encrypt your 1Password vaults for inclusion in [[borgmatic]] b
### 1. Export Vaults From 1Password ### 1. Export Vaults From 1Password
1. Open the 1Password desktop app 1. Open the 1Password desktop app
2. **File > Export > All Vaults** 2. **File > Export > Blume/Davis** (the menu item is named after the account, not "All Vaults" — exporting the account covers all vaults: Private, blumeops, Payrix, and Shared)
3. Choose **1PUX** format 3. Choose **1PUX** format
4. Save to `~/Documents/` — 1Password names the file `1PasswordExport-<account-uuid>-<timestamp>.1pux` automatically; don't bother renaming it, pass the path to the task in the next step 4. Save to `~/Documents/` — 1Password names the file `1PasswordExport-<account-uuid>-<timestamp>.1pux` automatically; don't bother renaming it, pass the path to the task in the next step
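Because 1Password generates the filename, a script can locate the export by pattern. This is a minimal sketch, assuming a hypothetical helper `newest_export` that picks the most recently modified match. It is not part of the backup task itself.

```python
from pathlib import Path


def newest_export(directory="~/Documents"):
    """Return the most recently modified 1PUX export in `directory`, or None."""
    folder = Path(directory).expanduser()
    exports = sorted(
        folder.glob("1PasswordExport-*.1pux"),
        key=lambda p: p.stat().st_mtime,
    )
    return exports[-1] if exports else None


if __name__ == "__main__":
    print(newest_export())
```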

View file

@ -1,6 +1,7 @@
--- ---
title: UniFi title: UniFi
modified: 2026-03-16 modified: 2026-06-09
last-reviewed: 2026-06-09
tags: tags:
- infrastructure - infrastructure
- networking - networking
@ -71,7 +72,7 @@ Attempted Feb 2026 with the `ubiquiti-community/unifi` Terraform provider via Pu
## Monitoring ## Monitoring
UniFi metrics are exported to Prometheus via [UnPoller](https://github.com/unpoller/unpoller), running as a k8s deployment in the `monitoring` namespace on indri. UnPoller polls the UX7 controller API using an API key and exposes metrics on port 9130. UniFi metrics are exported to Prometheus via [UnPoller](https://github.com/unpoller/unpoller), running as a k8s deployment in the `monitoring` namespace on indri's minikube (`argocd/manifests/unpoller/`, locally-built image `registry.ops.eblu.me/blumeops/unpoller`). UnPoller polls the UX7 controller API using an API key and exposes metrics on port 9130.
- **Prometheus job:** `unpoller` - **Prometheus job:** `unpoller`
- **Metrics prefix:** `unifi_` - **Metrics prefix:** `unifi_`

View file

@ -1,6 +1,7 @@
--- ---
title: Tailscale Operator title: Tailscale Operator
modified: 2026-02-08 modified: 2026-06-08
last-reviewed: 2026-06-08
tags: tags:
- kubernetes - kubernetes
- tailscale - tailscale
@ -15,8 +16,16 @@ The Tailscale operator enables Kubernetes services to be exposed directly on the
| Property | Value | | Property | Value |
|----------|-------| |----------|-------|
| **Namespace** | `tailscale` | | **Namespace** | `tailscale` |
| **Upstream** | `mirrors/tailscale` on forge (static manifest) | | **Upstream** | `mirrors/tailscale` on forge (static manifest, pinned `v1.94.2`) |
| **ArgoCD Apps** | `tailscale-operator-base` (upstream), `tailscale-operator` (config) | | **ArgoCD Apps** | `tailscale-operator` (indri/minikube), `tailscale-operator-ringtail` (ringtail/k3s) |
The operator runs on **both** clusters — indri's minikube and ringtail's k3s.
Both apps layer on the shared `tailscale-operator-base` kustomize directory
(operator manifest, `ProxyClass`, `dnsconfig`); each cluster supplies its own
`ProxyGroup` (indri: 2 replicas, ringtail: 1) and OAuth `ExternalSecret`. The
ringtail overlay additionally rewrites the proxy image to a locally nix-built
mirror. See [[ringtail]] and [[migrate-wave1-ringtail]] for the ongoing
migration of k8s workloads onto ringtail.
## How It Works ## How It Works
@ -27,7 +36,13 @@ Ingresses use a shared ProxyGroup (`ingress`) rather than per-service Tailscale
3. Service becomes accessible at `<hostname>.tail8d86e.ts.net` 3. Service becomes accessible at `<hostname>.tail8d86e.ts.net`
4. TLS is handled automatically via Tailscale 4. TLS is handled automatically via Tailscale
Tailnet clients must have `--accept-routes` enabled to route to VIP addresses. Two requirements for VIP routing to work:
1. Tailnet clients must have `--accept-routes` enabled to route to VIP addresses.
2. Ingress rules must **not** set an explicit `host:` field. The ProxyGroup
proxy receives the FQDN as the `Host` header (e.g.
`prometheus.tail8d86e.ts.net`), which won't match a short name. Use
`host: "*"` or omit `host:` entirely.
Services can be individually tagged (e.g., `tag:flyio-target`) via Ingress annotations to control which ACL grants apply. See [[expose-service-publicly]] for the tagging workflow. Services can be individually tagged (e.g., `tag:flyio-target`) via Ingress annotations to control which ACL grants apply. See [[expose-service-publicly]] for the tagging workflow.
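The `host:` requirement above can be illustrated with a minimal sketch. This is not the Tailscale operator's code or any ingress controller's implementation; `rule_matches` is a hypothetical helper that assumes exact, case-insensitive host comparison and, following the text above, treats an omitted host or `"*"` as match-all.

```python
def rule_matches(rule_host, host_header):
    """Hypothetical helper: does an Ingress rule's host accept this Host header?"""
    # An omitted host (None) or "*" accepts every request, per the doc text.
    if rule_host is None or rule_host == "*":
        return True
    # Assumption: otherwise the comparison is exact and case-insensitive.
    return rule_host.lower() == host_header.lower()


# The ProxyGroup proxy forwards the full tailnet FQDN as the Host header.
fqdn = "prometheus.tail8d86e.ts.net"

short_name_ok = rule_matches("prometheus", fqdn)  # explicit short name
omitted_ok = rule_matches(None, fqdn)             # host: omitted
```

Under these assumptions the short name never matches the FQDN, which is why the rule is written without an explicit `host:`.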


@ -1,6 +1,6 @@
--- ---
title: Security & Compliance title: Security & Compliance
modified: 2026-03-24 modified: 2026-06-08
last-reviewed: 2026-03-24 last-reviewed: 2026-03-24
tags: tags:
- operations - operations
@ -21,7 +21,7 @@ Security posture and compliance scanning for BlumeOps infrastructure.
## Scanning tools ## Scanning tools
- [[prowler]] — CIS Kubernetes Benchmark scanner (weekly CronJob) - [[prowler]] — CIS Kubernetes Benchmark scanner (weekly CronJob). The container-image CVE scan and IaC scan were retired in 2026-06 (un-actioned noise — see [[deploy-prowler#Why only the K8s CIS scan]]); only the K8s CIS scan remains.
- [[deploy-prowler]] — deployment and ad-hoc scan how-to - [[deploy-prowler]] — deployment and ad-hoc scan how-to
- [[read-compliance-reports]] — accessing and interpreting reports - [[read-compliance-reports]] — accessing and interpreting reports
- [[kingfisher]] — Secret detection and live validation for Forgejo repos (weekly CronJob + prek hook) - [[kingfisher]] — Secret detection and live validation for Forgejo repos (weekly CronJob + prek hook)
@ -52,5 +52,5 @@ Suppressed findings are kept in Prowler mutelist YAML under `argocd/manifests/pr
- No SOC 2 compliance mapping for Kubernetes (Prowler only maps SOC 2 for AWS/Azure/GCP) - No SOC 2 compliance mapping for Kubernetes (Prowler only maps SOC 2 for AWS/Azure/GCP)
- k3s control plane checks produce no results (embedded binary, no static pods) — consider kube-bench - k3s control plane checks produce no results (embedded binary, no static pods) — consider kube-bench
- Container image scanning covers `blumeops/*` images only — upstream images (ollama, immich, etc.) are not scanned - No container-image CVE scanning (the Prowler image scan was retired 2026-06 as un-actioned noise). If reintroduced, scope it to critical-severity, currently-deployed tags, alert-on-new
- IaC scanning covers the blumeops repo only — no scanning of third-party Helm charts or vendored manifests - No automated IaC misconfiguration scanning (the Prowler IaC scan was retired 2026-06). Manifest pod-security hardening is now an accept-and-document decision rather than a weekly report


@ -1,6 +1,7 @@
--- ---
title: ArgoCD title: ArgoCD
modified: 2026-02-07 modified: 2026-06-09
last-reviewed: 2026-06-09
tags: tags:
- service - service
- gitops - gitops
@ -18,22 +19,38 @@ GitOps continuous delivery platform for the [[cluster|Kubernetes cluster]].
| **Tailscale URL** | https://argocd.tail8d86e.ts.net | | **Tailscale URL** | https://argocd.tail8d86e.ts.net |
| **Namespace** | `argocd` | | **Namespace** | `argocd` |
| **Git Source** | `ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git` | | **Git Source** | `ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git` |
| **Manifests Path** | `argocd/` | | **Manifests Path** | `argocd/apps/` (Applications), `argocd/manifests/` (workloads) |
## Clusters
A single ArgoCD instance (on indri's minikube) manages both clusters:
| Cluster | Destination | Apps |
|---------|-------------|------|
| minikube (indri) | `https://kubernetes.default.svc` | Most services |
| k3s ([[ringtail]]) | `https://ringtail.tail8d86e.ts.net:6443` | GPU workloads and `*-ringtail` apps |
## Sync Policy ## Sync Policy
| Application | Sync Policy | Rationale | All applications use **manual sync** — including the `apps` app-of-apps root. To pick up newly added Application manifests, sync `apps` explicitly:
|-------------|-------------|-----------|
| `apps` | Automated | Picks up new Application manifests |
| All workloads | Manual | Explicit control over deployments |
## Credentials ```bash
argocd app sync apps
```
- Admin password: 1Password (blumeops vault) This gives explicit control over every deployment; nothing rolls out on push alone.
- Git deploy key (SSH): 1Password
## Authentication
- **SSO via [[authentik]]** — OIDC with a public PKCE client (`argocd`), shared by the web UI and CLI: `argocd login argocd.ops.eblu.me --sso`. The Authentik `admins` group maps to `role:admin` via the RBAC ConfigMap; the default policy grants no access.
- **Local admin** — break-glass password in 1Password (blumeops vault), for when Authentik is down.
The git deploy key (SSH) is injected via [[external-secrets]].
## Related ## Related
- [[argocd-cli]] - CLI usage and deployment workflows - [[argocd-cli]] - CLI usage and deployment workflows
- [[apps|Apps]] - Full application registry - [[apps|Apps]] - Full application registry
- [[forgejo]] - Git source - [[forgejo]] - Git source
- [[authentik]] - OIDC identity provider for SSO
- [[federated-login]] - How authentication works across BlumeOps


@ -1,6 +1,7 @@
--- ---
title: Authentik title: Authentik
modified: 2026-02-20 modified: 2026-06-09
last-reviewed: 2026-06-09
tags: tags:
- service - service
- security - security
@ -42,9 +43,7 @@ Authentik configuration is managed via Blueprints (YAML) stored as a ConfigMap m
- **`common.yaml`** — shared identity resources (`admins` group) - **`common.yaml`** — shared identity resources (`admins` group)
- **`mfa.yaml`** — MFA enforcement on the default authentication flow (`not_configured_action: configure`) - **`mfa.yaml`** — MFA enforcement on the default authentication flow (`not_configured_action: configure`)
- **`grafana.yaml`** — Grafana OAuth2 provider, application, and policy binding - One blueprint per OIDC client (provider, application, and policy binding): `grafana.yaml`, `forgejo.yaml`, `zot.yaml`, `argocd.yaml`, `jellyfin.yaml`, `mealie.yaml`, `paperless.yaml`, `heph.yaml`
- **`forgejo.yaml`** — Forgejo OAuth2 provider, application, and policy binding
- **`zot.yaml`** — Zot registry OAuth2 provider, application, and policy binding
Group membership is included in the `profile` scope claim (Authentik built-in). Services use `--group-claim-name groups` to read it. Group membership is included in the `profile` scope claim (Authentik built-in). Services use `--group-claim-name groups` to read it.
@ -52,13 +51,18 @@ Blueprint file: `argocd/manifests/authentik/configmap-blueprint.yaml`
## OIDC Clients ## OIDC Clients
| Client | Status | | Client | Type |
|--------|--------| |--------|------|
| [[grafana]] | Active | | [[grafana]] | Confidential |
| [[forgejo]] | Active | | [[forgejo]] | Confidential |
| [[zot]] | Active | | [[zot]] | Confidential |
| [[argocd]] | Public (PKCE, shared by web UI and CLI) |
| [[jellyfin]] | Confidential |
| [[mealie]] | Confidential |
| [[paperless]] | Confidential |
| heph | Public (PKCE, with `offline_access` for spoke sync refresh tokens) |
Future clients: [[argocd]], [[miniflux]] Future clients: [[miniflux]]
## Secrets ## Secrets
@ -67,11 +71,10 @@ Injected via [[external-secrets]] from the "Authentik (blumeops)" 1Password item
| 1Password Field | Purpose | | 1Password Field | Purpose |
|-----------------|---------| |-----------------|---------|
| `secret-key` | Authentik secret key | | `secret-key` | Authentik secret key |
| `db-password` | PostgreSQL password | | `postgresql-host` / `-port` / `-name` / `-user` / `-password` | PostgreSQL connection |
| `grafana-client-secret` | OIDC client secret for Grafana | | `<client>-client-secret` | OIDC client secret, one per confidential client (grafana, forgejo, zot, jellyfin, mealie, paperless) |
| `forgejo-client-secret` | OIDC client secret for Forgejo |
| `zot-client-secret` | OIDC client secret for Zot | The item also holds an `api-token` field (Authentik API access for admin scripting); it is not synced into the cluster.
| `api-token` | Authentik API token |
## Container Image ## Container Image


@ -1,6 +1,7 @@
--- ---
title: Grafana title: Grafana
modified: 2026-02-28 modified: 2026-06-09
last-reviewed: 2026-06-09
tags: tags:
- service - service
- observability - observability
@ -25,7 +26,7 @@ Dashboards and visualization for BlumeOps observability.
Grafana supports two login methods: Grafana supports two login methods:
- **SSO via [[authentik]]** — OIDC login through Authentik (`auth.generic_oauth`). Users click "Sign in with Authentik", authenticate at Authentik, and are redirected back as Admin. - **SSO via [[authentik]]** — OIDC login through Authentik (`auth.generic_oauth`). Members of the Authentik `admins` group get the Admin role; everyone else gets Viewer (`role_attribute_path` in `grafana.ini`).
- **Local admin** — break-glass login using the password from 1Password ("Grafana (blumeops)"). Always available if Authentik is down. - **Local admin** — break-glass login using the password from 1Password ("Grafana (blumeops)"). Always available if Authentik is down.
The OIDC client secret is injected via [[external-secrets]] (`grafana-authentik-oauth` secret in monitoring namespace). The OIDC client secret is injected via [[external-secrets]] (`grafana-authentik-oauth` secret in monitoring namespace).
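The SSO role mapping described above can be sketched in a few lines. The real mapping is an expression in `role_attribute_path` in `grafana.ini`, which this diff does not show; `grafana_role` below is a hypothetical Python stand-in for the described behavior only.

```python
def grafana_role(groups):
    """Hypothetical stand-in for the mapping described in the doc text."""
    # Members of the Authentik `admins` group get Admin; everyone else Viewer.
    return "Admin" if "admins" in groups else "Viewer"


admin_role = grafana_role(["admins", "users"])
other_role = grafana_role(["users"])
```

A user with no groups at all also falls through to Viewer in this sketch, matching the "everyone else" wording.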
@ -37,7 +38,7 @@ The OIDC client secret is injected via [[external-secrets]] (`grafana-authentik-
| Prometheus | prometheus | `prometheus.monitoring.svc.cluster.local:9090` | | Prometheus | prometheus | `prometheus.monitoring.svc.cluster.local:9090` |
| Loki | loki | `loki.monitoring.svc.cluster.local:3100` | | Loki | loki | `loki.monitoring.svc.cluster.local:3100` |
| Tempo | tempo | `tempo.monitoring.svc.cluster.local:3200` | | Tempo | tempo | `tempo.monitoring.svc.cluster.local:3200` |
| TeslaMate | postgres | `blumeops-pg-rw.databases.svc.cluster.local:5432` | | TeslaMate | postgres | `pg.ops.eblu.me:5434` (TeslaMate's database on [[ringtail]], via Caddy L4) |
## Dashboard Provisioning ## Dashboard Provisioning
@ -49,13 +50,9 @@ Optional annotation: `grafana_folder: "FolderName"`
## Key Dashboards ## Key Dashboards
- macOS System - Host metrics for indri Provisioned dashboards live in `argocd/manifests/grafana-config/dashboards/` (one ConfigMap per dashboard). Coverage as of 2026-06: alerts, borgmatic, CV APM, devpi, docs APM, fly.io proxy, forgejo, frigate, jellyfin, kubernetes, loki, macOS (indri host), postgresql, ringtail, shower APM, sifaka disks, snowflake proxy, tempo, transmission, zot.
- Minikube - Kubernetes cluster overview
- Borgmatic Backups - Backup status and trends TeslaMate's dashboards are not in the repo — an init container fetches them from the forge mirror at a pinned tag (`TESLAMATE_VERSION` in `argocd/manifests/grafana/deployment.yaml`).
- Services Health - HTTP probe results
- Docs APM - Request rate, latency, cache for docs.eblu.me
- Fly.io Proxy Health - Aggregate proxy health across all upstream services
- TeslaMate (18 dashboards) - Vehicle data
## Related ## Related


@ -1,7 +1,7 @@
--- ---
title: Jellyfin title: Jellyfin
modified: 2026-02-07 modified: 2026-06-08
last-reviewed: 2026-03-23 last-reviewed: 2026-06-08
tags: tags:
- service - service
- media - media
@ -41,6 +41,24 @@ Dashboard > Playback:
2. Allow hardware encoding: Enabled 2. Allow hardware encoding: Enabled
3. VPP Tone mapping: Enabled 3. VPP Tone mapping: Enabled
## Upgrades
Installed via Homebrew cask (`state: present`, unpinned), so the Ansible role
won't bump an already-installed cask. To upgrade, run on indri:
```bash
brew upgrade --cask jellyfin
```
**Gatekeeper gotcha:** a cask upgrade replaces `/Applications/Jellyfin.app` and
re-applies the `com.apple.quarantine` xattr. When launchd respawns the service,
the new binary hangs silently — process alive but ~0 CPU, no logs, no listening
socket — because Gatekeeper is holding the first launch pending approval.
Removing the xattr over SSH fails (`xattr -dr com.apple.quarantine ...`
"Operation not permitted", blocked by macOS TCC). Approve the first-launch
dialog on indri's GUI console (or run the `xattr` removal from a local Terminal
with Full Disk Access), then reload the LaunchAgent.
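A quick way to confirm the quarantine state after an upgrade might look like the sketch below. It assumes macOS's `xattr -p` exits non-zero when the attribute is absent; `is_quarantined` and its injectable `run` parameter are hypothetical names introduced here. Whether reading the attribute over SSH hits the same TCC restriction as removing it is not stated in the source.

```python
import subprocess


def is_quarantined(app_path, run=subprocess.run):
    """Return True if the quarantine xattr is present on app_path.

    Assumption: `xattr -p <name> <path>` exits 0 only when the attribute exists.
    """
    result = run(
        ["xattr", "-p", "com.apple.quarantine", app_path],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Run on indri itself (macOS); the path comes from the doc text above.
    print(is_quarantined("/Applications/Jellyfin.app"))
```

The `run` parameter exists only so the check can be exercised without a macOS host.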
## Observability ## Observability
- Metrics: `jellyfin_metrics` ansible role - Metrics: `jellyfin_metrics` ansible role


@ -1,6 +1,6 @@
--- ---
title: Prowler title: Prowler
modified: 2026-03-24 modified: 2026-06-08
last-reviewed: 2026-03-24 last-reviewed: 2026-03-24
tags: tags:
- service - service
@ -17,20 +17,20 @@ CIS Kubernetes Benchmark scanner for compliance posture reporting.
|----------|-------| |----------|-------|
| **Namespace** | `prowler` | | **Namespace** | `prowler` |
| **Image** | `registry.ops.eblu.me/blumeops/prowler` (see `argocd/manifests/prowler/kustomization.yaml` for current tag) | | **Image** | `registry.ops.eblu.me/blumeops/prowler` (see `argocd/manifests/prowler/kustomization.yaml` for current tag) |
| **Schedule** | K8s CIS: Sunday 3am / Image: Saturday 3am / IaC: Saturday 2am | | **Schedule** | K8s CIS: Sunday 3am |
| **Reports** | `sifaka:/volume1/reports/prowler/`, `prowler-images/`, `prowler-iac/` (NFS) | | **Reports** | `sifaka:/volume1/reports/prowler/` (NFS) |
| **Manifests** | `argocd/manifests/prowler/` | | **Manifests** | `argocd/manifests/prowler/` |
## What it does ## What it does
Runs Prowler 5 as two CronJobs: Runs Prowler 5 as a single CronJob:
- **K8s CIS scan** (Sunday) — CIS Kubernetes Benchmark v1.11 checks across pod security, RBAC, apiserver, etcd, kubelet, controller-manager, and scheduler - **K8s CIS scan** (Sunday) — CIS Kubernetes Benchmark v1.11 checks across pod security, RBAC, apiserver, etcd, kubelet, controller-manager, and scheduler
- **Image scan** (Saturday) — CVE, secret, and misconfiguration scanning of all `blumeops/*` container images in the registry via Trivy
- **IaC scan** (Saturday) — static analysis of Dockerfiles, K8s manifests, and other IaC files in the repo via Trivy
Reports are written in HTML, CSV, and JSON-OCSF to the NFS share on sifaka. Reports are written in HTML, CSV, and JSON-OCSF to the NFS share on sifaka.
The **image** and **IaC** scans (formerly Saturday CronJobs) were retired in 2026-06 — they generated tens of thousands of un-actioned findings weekly. See [[deploy-prowler#Why only the K8s CIS scan]].
## See also ## See also
- [[security]] — security & compliance posture overview - [[security]] — security & compliance posture overview


@ -1,6 +1,6 @@
--- ---
title: Mise Tasks title: Mise Tasks
modified: 2026-04-11 modified: 2026-06-09
tags: tags:
- reference - reference
- tools - tools
@ -17,7 +17,6 @@ Run `mise tasks --sort name` for the live list with descriptions.
| Task | Description | | Task | Description |
|------|-------------| |------|-------------|
| `ai-docs` | All documentation concatenated for AI context (~85K tokens) |
| `ai-sources` | All non-doc source files for deep AI context (~270K tokens) | | `ai-sources` | All non-doc source files for deep AI context (~270K tokens) |
| `docs-check-frontmatter` | Check required frontmatter fields | | `docs-check-frontmatter` | Check required frontmatter fields |
| `docs-check-links` | Validate wiki-links resolve correctly (supports path-based links) | | `docs-check-links` | Validate wiki-links resolve correctly (supports path-based links) |


@ -1,6 +1,6 @@
--- ---
title: AI Assistance Guide title: AI Assistance Guide
modified: 2026-02-23 modified: 2026-06-09
tags: tags:
- tutorials - tutorials
- ai - ai
@ -17,7 +17,7 @@ This guide provides context for AI agents assisting with BlumeOps operations, an
These are non-negotiable for AI agents working in this repo: These are non-negotiable for AI agents working in this repo:
1. **Always use `--context=minikube-indri` with kubectl** - Work contexts exist that must never be touched 1. **Always use `--context=minikube-indri` with kubectl** - Work contexts exist that must never be touched
2. **Run `mise run ai-docs` at session start** - Review current infrastructure state 2. **Start every task by finding and reading the relevant docs** - Grep `docs/` and follow wiki-links
3. **Never commit secrets** - The repo is public at github.com/eblume/blumeops 3. **Never commit secrets** - The repo is public at github.com/eblume/blumeops
4. **Wait for user review before deploying** - Create PRs, don't auto-deploy 4. **Wait for user review before deploying** - Create PRs, don't auto-deploy
5. **Never merge PRs without explicit request** - The user merges after review 5. **Never merge PRs without explicit request** - The user merges after review
@ -91,8 +91,7 @@ BlumeOps operations are driven by mise tasks. Run `mise tasks` to list all avail
| Task | When to Use | | Task | When to Use |
|------|-------------| |------|-------------|
| `ai-docs` | At session start - all documentation concatenated for AI context (~85K tokens, see [[mise-tasks]]) | | `ai-sources` | Deep context - all non-doc source files (~270K tokens). Ask user before running; useful for problems with a large surface area (see [[mise-tasks]]) |
| `ai-sources` | Deep context - all non-doc source files (~270K tokens). Ask user before running; useful for problems with a large surface area |
| `docs-mikado` | View active Mikado dependency chains for C2 changes | | `docs-mikado` | View active Mikado dependency chains for C2 changes |
| `docs-mikado --resume` | Resume a C2 chain: detect branch, show state and next steps | | `docs-mikado --resume` | Resume a C2 chain: detect branch, show state and next steps |
| `provision-indri` | Deploy changes to [[indri]]-hosted services via Ansible | | `provision-indri` | Deploy changes to [[indri]]-hosted services via Ansible |


@ -1,6 +1,6 @@
--- ---
title: Exploring the Docs title: Exploring the Docs
modified: 2026-02-10 modified: 2026-06-09
tags: tags:
- tutorials - tutorials
- getting-started - getting-started
@ -31,7 +31,6 @@ You probably want quick access to operational details:
- [How-to](/how-to/) guides for common operations (deploy, troubleshoot, update ACLs) - [How-to](/how-to/) guides for common operations (deploy, troubleshoot, update ACLs)
- [Reference](/reference/) has service URLs, commands, and config locations - [Reference](/reference/) has service URLs, commands, and config locations
- [[ai-assistance-guide]] explains how to work effectively with AI agents - [[ai-assistance-guide]] explains how to work effectively with AI agents
- Run `mise run ai-docs` to prime AI context with key documentation
### For AI Agents ### For AI Agents
@ -75,13 +74,7 @@ Prek hooks validate that all wiki-links resolve to existing files and flag ambig
## AI Context Priming ## AI Context Priming
The `ai-docs` mise task concatenates key documentation files for AI context: AI agents prime themselves by searching `docs/` for cards relevant to the task at hand and following wiki-links from there. (The retired `ai-docs` mise task used to concatenate every doc for this purpose, but the corpus outgrew a context window.) For deep codebase questions, `mise run ai-sources` concatenates all non-doc source files.
```bash
mise run ai-docs
```
This outputs key documentation files and a full tree listing of all docs, providing an agent with essential context for BlumeOps operations.
## Related ## Related


@ -1,13 +0,0 @@
#!/usr/bin/env bash
#MISE description="Prime AI context with all BlumeOps documentation"
set -euo pipefail
DOCS_DIR="$(cd "$(dirname "$0")/.." && pwd)/docs"
# Concatenate all docs (excluding changelog fragments)
find "$DOCS_DIR" -name '*.md' -not -path '*/changelog.d/*' | sort | while read -r f; do
printf '=== %s ===\n' "${f#"$DOCS_DIR/"}"
cat "$f"
printf '\n'
done


@ -86,7 +86,7 @@ def get_export_path(argv_path: str | None) -> Path | None:
else: else:
console.print("Export your vaults from the 1Password desktop app:") console.print("Export your vaults from the 1Password desktop app:")
console.print(" 1. Open 1Password") console.print(" 1. Open 1Password")
console.print(" 2. File > Export > All Vaults (or select specific vaults)") console.print(" 2. File > Export > <account name> (exports all vaults in the account)")
console.print(f" 3. Save as 1PUX format to: [cyan]{EXPORT_DIR}[/cyan]") console.print(f" 3. Save as 1PUX format to: [cyan]{EXPORT_DIR}[/cyan]")
console.print() console.print()
raw = console.input("Path to .1pux file: ").strip() raw = console.input("Path to .1pux file: ").strip()


@ -10,19 +10,19 @@
Covers: Covers:
- Prowler K8s CIS (in-cluster): per-finding detail - Prowler K8s CIS (in-cluster): per-finding detail
- Prowler container image scans: grouped by check + resource
- Prowler IaC manifest scans: grouped by check + resource
- Kingfisher secret scanning: TODO — pending upstream JSON/CSV output - Kingfisher secret scanning: TODO — pending upstream JSON/CSV output
support (currently HTML-only; contribute from spork) support (currently HTML-only; contribute from spork)
For each Prowler scan, copies the two most recent CSV reports, parses The Prowler container-image CVE scan and IaC scan were retired in 2026-06
(see docs/how-to/operations/deploy-prowler.md) — they produced tens of
thousands of un-actioned findings weekly. Only the K8s CIS scan remains.
For the Prowler scan, copies the two most recent CSV reports, parses
them, and displays: them, and displays:
1. Overall status (pass/fail/manual/muted counts) 1. Overall status (pass/fail/manual/muted counts)
2. Unmuted failures by severity 2. Unmuted failures by severity
3. Delta from the previous report (new vs resolved) 3. Delta from the previous report (new vs resolved)
4. Actionable unmuted failures (per-finding for in-cluster; grouped 4. Actionable unmuted failures (per-finding detail)
by check ID and resource for image/IaC because they have far too
many findings to list individually)
This is the primary tool for the weekly compliance report review. This is the primary tool for the weekly compliance report review.
""" """
@ -39,11 +39,9 @@ from rich.console import Console
from rich.panel import Panel from rich.panel import Panel
from rich.table import Table from rich.table import Table
PROWLER_SCANS: list[tuple[str, str, bool]] = [ PROWLER_SCANS: list[tuple[str, str]] = [
# (label, sifaka base path, group_findings) # (label, sifaka base path)
("K8s CIS (In-Cluster)", "/volume1/reports/prowler", False), ("K8s CIS (In-Cluster)", "/volume1/reports/prowler"),
("Container Images", "/volume1/reports/prowler-images", True),
("IaC (manifests)", "/volume1/reports/prowler-iac", True),
] ]
console = Console() console = Console()
@ -334,14 +332,8 @@ def summarize_report(
tmpdir: str, tmpdir: str,
*, *,
show_muted: bool = False, show_muted: bool = False,
group_findings: bool = False,
) -> None: ) -> None:
"""Fetch and summarize the latest Prowler report under `base`. """Fetch and summarize the latest Prowler report under `base`."""
When `group_findings` is True, top-N CHECK_ID and RESOURCE_NAME tables
are shown instead of a per-finding detail table — appropriate for
image and IaC scans that produce thousands of findings.
"""
console.rule(f"[bold]{label}[/bold]") console.rule(f"[bold]{label}[/bold]")
csvs = list_reports(base) csvs = list_reports(base)
if not csvs: if not csvs:
@ -458,36 +450,29 @@ def summarize_report(
) )
console.print() console.print()
# For grouped scans the new/resolved listings are too noisy if new_keys:
# (potentially thousands of lines). Skip the listings; the count console.print("[bold red]New Unmuted Failures:[/bold red]")
# is in the panel above and detail is in the grouped tables. for k in sorted(new_keys):
if not group_findings: r = curr_keys[k]
if new_keys: console.print(
console.print("[bold red]New Unmuted Failures:[/bold red]") f" [{r['SEVERITY']}] {r['CHECK_ID']}: "
for k in sorted(new_keys): f"{r['STATUS_EXTENDED'][:120]}"
r = curr_keys[k] )
console.print( console.print()
f" [{r['SEVERITY']}] {r['CHECK_ID']}: "
f"{r['STATUS_EXTENDED'][:120]}"
)
console.print()
if resolved_keys: if resolved_keys:
console.print("[bold green]Resolved:[/bold green]") console.print("[bold green]Resolved:[/bold green]")
for k in sorted(resolved_keys): for k in sorted(resolved_keys):
r = prev_keys[k] r = prev_keys[k]
console.print( console.print(
f" [dim][{r['SEVERITY']}] {r['CHECK_ID']}: " f" [dim][{r['SEVERITY']}] {r['CHECK_ID']}: "
f"{r['STATUS_EXTENDED'][:120]}[/dim]" f"{r['STATUS_EXTENDED'][:120]}[/dim]"
) )
console.print() console.print()
# --- Unmuted failure details (grouped or per-finding) --- # --- Unmuted failure details ---
if latest["unmuted"]: if latest["unmuted"]:
if group_findings: _print_findings_detail(latest["unmuted"])
_print_grouped_findings(latest["unmuted"])
else:
_print_findings_detail(latest["unmuted"])
# --- Muted findings summary --- # --- Muted findings summary ---
if show_muted and latest["muted"]: if show_muted and latest["muted"]:
@ -566,75 +551,6 @@ def _print_findings_detail(unmuted: list[dict]) -> None:
console.print() console.print()
def _worst_severity(rows: list[dict]) -> str:
"""Return the most severe severity label across `rows`."""
if not rows:
return ""
return min(
(r["SEVERITY"] for r in rows),
key=lambda s: severity_sort({"SEVERITY": s}),
)
def _print_grouped_findings(unmuted: list[dict], top_n: int = 15) -> None:
"""Top-N tables grouped by CHECK_ID and RESOURCE_NAME.
Used for image and IaC scans where per-finding tables would be too
large to be useful. Shows count and worst severity for each group.
"""
by_check: dict[str, list[dict]] = {}
by_resource: dict[str, list[dict]] = {}
for r in unmuted:
by_check.setdefault(r["CHECK_ID"], []).append(r)
by_resource.setdefault(r.get("RESOURCE_NAME", "") or "(no resource)", []).append(r)
check_table = Table(
show_header=True,
header_style="bold",
title=f"Top {top_n} Checks by Unmuted Finding Count",
)
check_table.add_column("Worst Sev")
check_table.add_column("Check ID")
check_table.add_column("Count", justify="right")
for check, rows in sorted(
by_check.items(), key=lambda kv: -len(kv[1])
)[:top_n]:
worst = _worst_severity(rows)
style = _sev_style(worst)
check_table.add_row(
f"[{style}]{worst}[/{style}]" if style else worst,
check,
str(len(rows)),
)
console.print(check_table)
console.print()
res_table = Table(
show_header=True,
header_style="bold",
title=f"Top {top_n} Resources by Unmuted Finding Count",
)
res_table.add_column("Worst Sev")
res_table.add_column("Resource")
res_table.add_column("Count", justify="right")
for resource, rows in sorted(
by_resource.items(), key=lambda kv: -len(kv[1])
)[:top_n]:
worst = _worst_severity(rows)
style = _sev_style(worst)
res_table.add_row(
f"[{style}]{worst}[/{style}]" if style else worst,
resource[:80],
str(len(rows)),
)
console.print(res_table)
console.print()
def main( def main(
full: Annotated[ full: Annotated[
bool, typer.Option(help="(reserved) currently a no-op; all unmuted failures already shown") bool, typer.Option(help="(reserved) currently a no-op; all unmuted failures already shown")
@ -646,13 +562,12 @@ def main(
del full # historical flag, kept for backwards compatibility del full # historical flag, kept for backwards compatibility
with tempfile.TemporaryDirectory() as tmpdir: with tempfile.TemporaryDirectory() as tmpdir:
for label, base, group in PROWLER_SCANS: for label, base in PROWLER_SCANS:
summarize_report( summarize_report(
label, label,
base, base,
tmpdir, tmpdir,
show_muted=show_muted, show_muted=show_muted,
group_findings=group,
) )
# --- Node-level MANUAL check verification --- # --- Node-level MANUAL check verification ---
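The "delta from the previous report (new vs resolved)" step in the script's docstring comes down to two set differences. The sketch below reuses the `prev_keys`, `curr_keys`, `new_keys` and `resolved_keys` names visible in the hunk, but `finding_key` and `delta` are hypothetical, and keying on `CHECK_ID` plus `RESOURCE_NAME` is an assumption: the real key construction is outside this diff.

```python
def finding_key(row):
    # Assumed key shape; the script's actual key fields are not shown in the diff.
    return (row["CHECK_ID"], row.get("RESOURCE_NAME", ""))


def delta(prev_rows, curr_rows):
    """Return (new_keys, resolved_keys) between two lists of unmuted failures."""
    prev_keys = {finding_key(r): r for r in prev_rows}
    curr_keys = {finding_key(r): r for r in curr_rows}
    # New: present in the latest report but not the previous one.
    new_keys = curr_keys.keys() - prev_keys.keys()
    # Resolved: present in the previous report but gone from the latest.
    resolved_keys = prev_keys.keys() - curr_keys.keys()
    return new_keys, resolved_keys
```

Keeping the row dictionaries as values lets the caller print severity and status text for each new or resolved key, as the listing code in the hunk does.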


@ -8,7 +8,7 @@
#USAGE flag "--type <type>" help="Filter by service type (argocd, ansible, nixos, fly, mise)" #USAGE flag "--type <type>" help="Filter by service type (argocd, ansible, nixos, fly, mise)"
"""Review the most stale service for version freshness. """Review the most stale service for version freshness.
Reads ``docs/reference/services/service-versions.yaml`` and sorts services Reads ``service-versions.yaml`` (repo root) and sorts services
by the ``last-reviewed`` field. Services without the field (or null) are by the ``last-reviewed`` field. Services without the field (or null) are
treated as never-reviewed and float to the top. Displays a staleness table treated as never-reviewed and float to the top. Displays a staleness table
and then shows the most stale service with a review checklist. and then shows the most stale service with a review checklist.
@ -210,7 +210,7 @@ def main(
"• Verify the service is running and healthy\n", "• Verify the service is running and healthy\n",
"• Check logs for errors or warnings\n", "• Check logs for errors or warnings\n",
"\n[bold]After Review:[/bold]\n", "\n[bold]After Review:[/bold]\n",
"• Update the tracking file: [cyan]docs/reference/services/service-versions.yaml[/cyan]\n", "• Update the tracking file: [cyan]service-versions.yaml[/cyan] (repo root)\n",
f"• Set [cyan]last-reviewed: {today}[/cyan] and [cyan]current-version[/cyan]\n", f"• Set [cyan]last-reviewed: {today}[/cyan] and [cyan]current-version[/cyan]\n",
"• Commit the change (along with any upgrades)", "• Commit the change (along with any upgrades)",
] ]
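The ordering rule in the script's docstring (missing or null `last-reviewed` floats to the top, then oldest first) can be sketched as a sort key. `staleness_key` and the two `example-*` entries are hypothetical; only the `jellyfin` and `automounter` dates come from the tracking file in this diff, and the real script may implement the ordering differently.

```python
from datetime import date


def staleness_key(service):
    reviewed = service.get("last-reviewed")
    # False sorts before True, so never-reviewed entries come first;
    # reviewed entries then sort by date, oldest first.
    return (reviewed is not None, reviewed or date.min)


services = [
    {"name": "jellyfin", "last-reviewed": date(2026, 6, 8)},
    {"name": "example-unreviewed"},                   # hypothetical: field missing
    {"name": "automounter", "last-reviewed": date(2026, 6, 9)},
    {"name": "example-null", "last-reviewed": None},  # hypothetical: explicit null
]

ordered = [s["name"] for s in sorted(services, key=staleness_key)]
```

Using a tuple key avoids comparing `None` with `date` objects, which Python 3 would reject.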


@ -7,11 +7,11 @@
] ]
}, },
"locked": { "locked": {
"lastModified": 1780290312, "lastModified": 1780894562,
"narHash": "sha256-eTAlX0CwgB84Ts3GaBd944A3DRXVMzgA0EqroZBISUo=", "narHash": "sha256-c3430xwxwhHipl3jigUGMMBfpaMylDqytW/kdmB3ZGs=",
"owner": "nix-community", "owner": "nix-community",
"repo": "disko", "repo": "disko",
"rev": "115e5211780054d8a890b41f0b7734cafad54dfe", "rev": "24fed06cac83bcc44ac8efbb57cab1a82fa0bedc",
"type": "github" "type": "github"
}, },
"original": { "original": {
@ -43,11 +43,11 @@
}, },
"nixpkgs": { "nixpkgs": {
"locked": { "locked": {
"lastModified": 1779796641, "lastModified": 1780511130,
"narHash": "sha256-ZsIrKmhp4vbBXoXXmR/tBXA/UCsAQiJL9vsgZEduhVY=", "narHash": "sha256-2v9lT4ya59Lh1FqPeLnz1MoX9y/wz2huqfe9RtQZITk=",
"owner": "NixOS", "owner": "NixOS",
"repo": "nixpkgs", "repo": "nixpkgs",
"rev": "25f538306313eae3927264466c70d7001dcea1df", "rev": "535f3e6942cb1cead3929c604320d3db54b542b9",
"type": "github" "type": "github"
}, },
"original": { "original": {


@ -440,14 +440,20 @@ services:
- name: jellyfin - name: jellyfin
type: ansible type: ansible
last-reviewed: 2026-03-17 last-reviewed: 2026-06-08
current-version: "10.11.6" current-version: "10.11.11"
upstream-source: https://github.com/jellyfin/jellyfin/releases upstream-source: https://github.com/jellyfin/jellyfin/releases
notes: >-
Homebrew cask (state: present, unpinned). Upgrade with
`brew upgrade --cask jellyfin` on indri. After upgrade the .app is
re-quarantined; launchd-spawned launch hangs silently until the
Gatekeeper first-launch dialog is approved on indri's GUI console
(xattr removal over SSH is blocked by TCC).
- name: automounter - name: automounter
type: ansible type: ansible
last-reviewed: 2026-03-17 last-reviewed: 2026-06-09
current-version: "1.11.0" current-version: "1.13.0"
upstream-source: https://www.pixeleyes.co.nz/automounter/ upstream-source: https://www.pixeleyes.co.nz/automounter/
notes: Mac App Store app, no Ansible role. Updates via App Store. notes: Mac App Store app, no Ansible role. Updates via App Store.