Commit graph

554 commits

Author SHA1 Message Date
86a0dee000 Remove ollama LAN NodePort service
The sanctioned ingress is ollama.ops.eblu.me via tailnet.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 10:00:05 -08:00
3af346f1cd Move ollama LAN NodePort to port 80
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 09:37:50 -08:00
2d5b78a4d7 Add Forge (public) to services-check
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 08:48:15 -08:00
a87c997ee1 Expose Forgejo publicly at forge.eblu.me (#278)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m28s
## Summary

Expose Forgejo publicly at `forge.eblu.me` via the Fly.io reverse proxy — the first dynamic, authenticated public-facing service.

- **Forgejo hardening:** Domain changed to forge.eblu.me, SSH stays on forge.ops.eblu.me, reverse proxy trust headers configured, local registration locked to external-only (Authentik SSO)
- **Tailscale Ingress:** ExternalName Service + Ingress in tailscale-operator creates forge.tail8d86e.ts.net endpoint
- **Fly.io proxy:** nginx server block with rate-limited auth endpoints (3r/s), fail2ban with custom nginx-deny action, security headers, /swagger blocked, WebSocket support, 512m body limit
- **Authentik:** OAuth callback updated to forge.eblu.me
- **DNS/TLS:** CNAME record in Pulumi, cert in fly-setup
- **Rename:** ~29 files updated from forge.ops.eblu.me to forge.eblu.me (HTTPS refs only; SSH, container builds, and Caddy table kept as-is)

## Deployment Order

1. `mise run provision-indri -- --tags forgejo` (config changes)
2. Verify forge.ops.eblu.me still works
3. `argocd app set tailscale-operator --revision feature/forge-public && argocd app sync tailscale-operator`
4. Verify `curl https://forge.tail8d86e.ts.net`
5. `cd fly && fly deploy`
6. Verify pre-DNS: `curl -H "Host: forge.eblu.me" https://blumeops-proxy.fly.dev/`
7. `fly certs add forge.eblu.me -a blumeops-proxy`
8. `argocd app set authentik --revision feature/forge-public && argocd app sync authentik`
9. `mise run dns-preview && mise run dns-up`
10. Full verification (see below)
11. Rehearse `mise run fly-shutoff`
12. After merge: reset ArgoCD revisions to main, re-sync

## Verification Checklist

- [ ] forge.eblu.me loads, shows public repos
- [ ] forge.ops.eblu.me still works from tailnet
- [ ] SSH clone via forge.ops.eblu.me:2222 works
- [ ] HTTPS clone via forge.eblu.me works
- [ ] UI shows forge.eblu.me for HTTPS clone, forge.ops.eblu.me for SSH
- [ ] /swagger returns 403
- [ ] Rapid login attempts trigger 429 rate limit
- [ ] fail2ban bans after 5 failed logins in 10 minutes
- [ ] ArgoCD can still sync (SSH unaffected)
- [ ] `mise run fly-shutoff` stops all public traffic
- [ ] `mise run services-check` passes

Reviewed-on: #278
2026-03-03 08:40:41 -08:00
a32c99a252 Limit ollama to one loaded model and one parallel request
Prevents OOM when switching between models — only one 14B model
fits in 16GB VRAM at a time with KV cache for context.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 21:23:12 -08:00
203e3cd567 Add NodePort service for ollama LAN access
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:57:18 -08:00
31d925814f Deploy Ollama LLM server on ringtail (#277)
## Summary
- Deploy Ollama as a new ArgoCD-managed service on ringtail's k3s cluster with GPU acceleration
- Declarative model management via `models.txt` + sidecar sync script (mirrors kiwix torrent pattern)
- Initial models: `qwen2.5:14b`, `deepseek-r1:14b`, `phi4:14b`, `gemma3:12b`
- hostPath PV on `/mnt/storage1/ollama` for fast local model storage (200Gi)
- Tailscale ingress at `ollama.ops.eblu.me` for API access from tailnet
- Enable GPU time-slicing (`replicas: 2`) on nvidia-device-plugin so Frigate and Ollama share the RTX 4080

## Deployment and Testing
- [ ] Deploy nvidia-device-plugin changes first: `argocd app sync nvidia-device-plugin`
- [ ] Verify GPU time-slicing: `kubectl describe node ringtail --context=k3s-ringtail` shows `nvidia.com/gpu: 2`
- [ ] Sync `apps` app with `--revision feature/ollama-ringtail`
- [ ] Set ollama app to branch: `argocd app set ollama --revision feature/ollama-ringtail && argocd app sync ollama`
- [ ] Verify model-sync sidecar pulls models: `kubectl logs -n ollama deploy/ollama -c model-sync --context=k3s-ringtail`
- [ ] Test API: `curl https://ollama.ops.eblu.me/api/tags`
- [ ] Test inference: `curl https://ollama.ops.eblu.me/api/generate -d '{"model":"qwen2.5:14b","prompt":"Hello"}'`
- [ ] Verify Frigate still works after GPU sharing change
- [ ] After merge: `argocd app set ollama --revision main && argocd app sync ollama`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/277
2026-03-02 20:39:51 -08:00
Forgejo Actions
0f79c61c42 Update docs release to v1.12.1
- Built changelog from towncrier fragments

[skip ci]
2026-03-02 18:17:07 -08:00
7a1875936c Switch git hooks from pre-commit to prek (#276) v1.12.1
## Summary

- Replace pre-commit with [prek](https://github.com/j178/prek), a faster Rust-native drop-in alternative
- Migrate config from `.pre-commit-config.yaml` (YAML) to `prek.toml` (TOML)
- Add new built-in checks: case conflicts, private key detection, executable shebangs
- Install prek via mise native registry (`aqua:j178/prek`) instead of pipx
- Update all doc references across README, contributing guide, and how-to docs

## Notes

- `check-yaml` still uses the remote `pre-commit-hooks` repo because prek's builtin fast path doesn't support `--unsafe` yet (needed for Ansible custom YAML tags)
- All existing custom hooks (docs validation, container version check, mikado invariant, workflow validation) work unchanged
- Tested: all hooks pass on clean tree, deliberate doc link breakage is caught

## Test plan

- [x] `prek run --all-files` passes all checks
- [x] Broken wiki-link correctly caught by `docs-check-links`
- [x] taplo-format auto-fixes TOML formatting on commit
- [x] commit-msg hook (mikado invariant) fires correctly

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/276
2026-03-02 18:15:23 -08:00
940219338a Slightly less confusing output when mikado cards share a prereq 2026-03-02 17:58:18 -08:00
2d54f93c68 Add changelog for impl-card-guard feature
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 17:45:01 -08:00
6131f881a8 Enforce impl commits can't modify Mikado card files
The mikado-branch-invariant-check hook now inspects staged files (commit-msg
hook) and historical commit files (standalone mode) to reject impl commits
that touch markdown files with Mikado frontmatter (requires:, status:, or
branch: mikado/). Cards should only be modified by plan, close, or finalize.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 17:44:34 -08:00
10d0636cee Review navidrome and miniflux: both at latest versions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 07:32:17 -08:00
9465b75815 Add changelog for authentik source chain doc review
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 07:28:09 -08:00
08b9570ac7 Review build-authentik-from-source Mikado chain docs
Fix go-server-derivation: wrong path target (webui not authentik-django)
and missing internal/web/static.go patch. Remove stale DRF fork content
from mirror-build-deps (no longer needed as of 2026.2.0). Add
last-reviewed to all 5 cards without it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 07:28:09 -08:00
Forgejo Actions
847e47eaf3 Update docs release to v1.12.0
- Built changelog from towncrier fragments

[skip ci]
2026-03-01 17:24:09 -08:00
2a2811d7a5 Review authentik-api-client-generation doc: fix stale content v1.12.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 17:21:46 -08:00
c9d273dc81 Update authentik changelog fragment to mention version upgrade
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 17:11:41 -08:00
503775085d Deploy authentik 2026.2.0 with migration ordering fix
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 16:32:10 -08:00
2d4098e480 Fix authentik 2026.2.0 migration ordering bug (#275)
All checks were successful
Build Container / detect (push) Successful in 2s
Build Container (Nix) / detect (push) Successful in 1s
Build Container / build (authentik) (push) Successful in 1s
Build Container (Nix) / build (authentik) (push) Successful in 3m6s
## Summary

- Patch `authentik_rbac/0010` migration to depend on `authentik_core/0056`, fixing non-deterministic ordering that crashes startup with `FieldError: Cannot resolve keyword 'group_id'`
- Upstream bug: goauthentik/authentik#19616, #20634 — no fix released yet
- Document the issue in the lessons-learned table

## Deployment and Testing

- [ ] CI builds container image
- [ ] Deploy from branch: `argocd app set authentik --revision fix/authentik-migration-ordering && argocd app sync authentik`
- [ ] Pods reach Running/Ready without crash-looping
- [ ] `kubectl logs` show 0056 migrating before 0010
- [ ] authentik UI loads at authentik.ops.eblu.me
- [ ] `mise run services-check`
- [ ] After merge: `argocd app set authentik --revision main && argocd app sync authentik`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/275
2026-03-01 16:28:36 -08:00
90621e4155 Deploy authentik 2026.2.0 with entry_points fix
Update image tag to v2026.2.0-78027eb-nix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 16:04:29 -08:00
78027eb783 Fix authentik: entry_points API for Python 3.14
All checks were successful
Build Container (Nix) / detect (push) Successful in 1s
Build Container / detect (push) Successful in 2s
Build Container / build (authentik) (push) Successful in 2s
Build Container (Nix) / build (authentik) (push) Successful in 3m7s
Python 3.14's EntryPoints uses string keys, not integer indices.
eps[0] raises KeyError(0); use next(iter(eps)) instead.
Verified on ringtail with the actual venv python.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 16:02:49 -08:00
e2c650b027 Deploy authentik 2026.2.0 with BASE_DIR fix
Update image tag to v2026.2.0-e49d966-nix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:55:50 -08:00
e49d966229 Fix authentik: symlink authentik/ for BASE_DIR resolution
All checks were successful
Build Container (Nix) / detect (push) Successful in 1s
Build Container / detect (push) Successful in 2s
Build Container / build (authentik) (push) Successful in 2s
Build Container (Nix) / build (authentik) (push) Successful in 3m4s
Django's BASE_DIR is $out but source lives in site-packages. Code
like BASE_DIR / "authentik" / "sources" / "scim" / "schemas" / ...
needs a top-level symlink to find data files alongside Python source.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:55:26 -08:00
c0e29476f3 Deploy authentik 2026.2.0 with TMPDIR fix
Update image tag to v2026.2.0-b7bfb0b-nix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:53:09 -08:00
b7bfb0bfae Fix authentik container: set TMPDIR=/tmp
All checks were successful
Build Container (Nix) / detect (push) Successful in 1s
Build Container / detect (push) Successful in 2s
Build Container / build (authentik) (push) Successful in 1s
Build Container (Nix) / build (authentik) (push) Successful in 45s
lifecycle/ak uses ${TMPDIR}/authentik-mode — without TMPDIR set it
tries to write /authentik-mode in root, which user 65534 can't do.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:52:36 -08:00
38da372f94 Deploy authentik 2026.2.0 with /tmp fix
Update image tag to v2026.2.0-2ac353b-nix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:51:17 -08:00
2ac353b7bf Fix authentik container: create /tmp for unprivileged user
All checks were successful
Build Container (Nix) / detect (push) Successful in 1s
Build Container / detect (push) Successful in 2s
Build Container / build (authentik) (push) Successful in 1s
Build Container (Nix) / build (authentik) (push) Successful in 54s
buildLayeredImage doesn't create /tmp by default. The container runs
as user 65534 (nobody) which can't mkdir /tmp at runtime.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:48:05 -08:00
098f3e517c Deploy authentik 2026.2.0 (source-built) to ArgoCD
Update image tag to v2026.2.0-efa9806-nix — the first source-built
authentik container from the build-authentik-from-source chain.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:44:35 -08:00
efa9806bfa C2: Build authentik from source (Mikado chain) (#274)
All checks were successful
Build Container / detect (push) Successful in 3s
Build Container (Nix) / detect (push) Successful in 1s
Build Container / build (authentik) (push) Successful in 2s
Build Container (Nix) / build (authentik) (push) Successful in 22s
## Mikado Chain: build-authentik-from-source

Replace `pkgs.authentik` from nixpkgs with a custom Nix derivation built from source.
This removes the dependency on the nixpkgs packaging timeline and gives full version control.

Target version: **2025.12.4** (nixpkgs reference, upgrading from deployed 2025.10.1).

### Dependency Graph

```
build-authentik-from-source (goal)
├── authentik-go-server-derivation
│   ├── authentik-api-client-generation  ← IN PROGRESS
│   └── authentik-python-backend-derivation
├── authentik-web-ui-derivation
│   └── authentik-api-client-generation  ← IN PROGRESS
└── authentik-python-backend-derivation
```

### Ready Leaves
- `authentik-api-client-generation` — Go + TypeScript client generation from OpenAPI schema
- `authentik-python-backend-derivation` — Django backend with 60+ deps, 4 in-tree packages

### Architecture
Ported from [nixpkgs `pkgs/by-name/au/authentik/package.nix`](https://github.com/NixOS/nixpkgs/tree/master/pkgs/by-name/au/authentik):
- `source.nix` — shared version/source fetch
- `client-go.nix` — Go API client generation
- `client-ts.nix` — TypeScript API client generation
- `api-go-vendor-hook.nix` — Go vendor directory injection hook
- (more components to follow as leaves are closed)

### Related Cards
- [[build-authentik-from-source]] — Goal card
- [[authentik-api-client-generation]]
- [[authentik-python-backend-derivation]]
- [[authentik-web-ui-derivation]]
- [[authentik-go-server-derivation]]

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/274
2026-03-01 13:45:00 -08:00
0aaf9bb8b2 Add Dagger local build step to authentik source build goal
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 08:39:25 -08:00
7094ea7d3e Start C2 Mikado chain: build authentik from source
Create goal card and 4 prerequisite cards for building authentik from a
custom Nix derivation instead of using pkgs.authentik from nixpkgs. This
removes the dependency on the nixpkgs packaging timeline and gives full
version control over authentik releases.

Chain: mikado/authentik-source-build
Leaf nodes: authentik-api-client-generation, authentik-python-backend-derivation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 08:20:17 -08:00
922265c88f Add changelog fragment for grafana doc review
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 07:28:47 -08:00
8d1e98617b Review build-grafana-container docs: stamp reviewed, fix cross-links
Also fix stale grafana.md reference card (Helm → Kustomize).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 07:28:06 -08:00
02eb169403 Pin blumeops-pg to PostgreSQL 18.3
Replace floating :18 tag with pinned :18.3 (upstream out-of-cycle
release fixing 18.2 regressions). Stamps service as reviewed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 16:25:32 -08:00
2312e5fbf8 Update ringtail flake inputs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 15:22:29 -08:00
7cecaf0471 Review forgejo-runner docs: stamp reviewed, fix cross-links
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 15:10:20 -08:00
776caa87f5 Sync Frigate zone coordinates from live API
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 07:52:09 -08:00
Forgejo Actions
fa223f8e3b Update docs release to v1.11.5
- Built changelog from towncrier fragments

[skip ci]
2026-02-26 07:56:02 -08:00
be3cdad1cb Add HA for CV and Docs: zero-downtime deploys (#273) v1.11.5
## Summary
- Set `replicas: 2` with `maxUnavailable: 0` / `maxSurge: 1` on CV and Docs deployments so rolling updates never drop below 2 ready pods
- Add PodDisruptionBudgets (`minAvailable: 1`) to protect against node drains and cluster maintenance
- Add Fly.io cache purge step to `cv-deploy.yaml` workflow (docs already had this) so CV deploys don't serve stale cached content

## Deployment and Testing
- [ ] `argocd app diff cv` / `argocd app diff docs` from branch
- [ ] Deploy from branch: `argocd app set cv --revision feature/ha-cv-docs-zero-downtime && argocd app sync cv`
- [ ] Verify 2 pods running: `kubectl get pods -n cv --context=minikube-indri`
- [ ] Test rolling restart: `kubectl rollout restart deployment/cv -n cv --context=minikube-indri`
- [ ] During rollout, confirm continuous availability via `curl -I https://cv.eblu.me`
- [ ] After merge: reset ArgoCD to main, re-sync both apps

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/273
2026-02-26 07:53:21 -08:00
8e9d89ca73 docs-review: print file path instead of content for LLM usage
The LLM should read the file itself using its tools rather than
receiving it inline in the task output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 07:24:37 -08:00
9a7acffa26 Review manage-forgejo-mirrors doc: clarify cron default, stamp reviewed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 07:17:18 -08:00
fb83c5c577 Add explicit ExternalSecret defaults for SSA sync parity
The external-secrets webhook injects conversionStrategy, decodingStrategy,
and metadataPolicy defaults on admission. Declaring them explicitly prevents
ArgoCD SSA from flagging the resource as OutOfSync.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 07:02:54 -08:00
db561c6b0e Upgrade ArgoCD v3.2.6 → v3.3.2 with Server-Side Apply (#272)
## Summary
- Upgrade ArgoCD from v3.2.6 to v3.3.2
- Enable `ServerSideApply=true` sync option (required by v3.3 — ApplicationSet CRD exceeds client-side apply annotation limit)
- Update service-versions.yaml with review for argocd and 1password-connect

## Breaking changes reviewed
- **Server-Side Apply required**: Added to syncOptions 
- **Source Hydrator git notes**: Not used — N/A
- **Application path cleaning removed**: Not used — N/A
- **Settings API field restriction**: Authenticated access only — N/A

## Deployment and Testing
- [ ] Sync the `apps` app first (picks up SSA syncOption change)
- [ ] `argocd app set argocd --revision feature/argocd-v3.3.2`
- [ ] `argocd app sync argocd`
- [ ] Verify all argocd pods running with v3.3.2 images
- [ ] Verify other apps still sync correctly
- [ ] After merge: `argocd app set argocd --revision main && argocd app sync argocd`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/272
2026-02-26 06:51:50 -08:00
95c8424e62 Add Transmission metrics exporter and Grafana dashboard (#271)
## Summary
- Add `metalmatze/transmission-exporter` as a sidecar container in the torrent deployment, exposing Prometheus metrics on port 19091
- Add metrics port to the torrent service for Prometheus scraping
- Add Prometheus scrape job targeting the transmission exporter
- Create Grafana dashboard with:
  - Overview stats (download/upload speed, active/total torrents)
  - Transfer speed timeseries (download + upload over time)
  - Transfer volume stats (total downloaded/uploaded in selected range)
  - Per-torrent download and upload rate timeseries
  - Per-torrent details table (ratio, uploaded, percent done)

## Deployment and Testing
- [ ] Sync ArgoCD `torrent` app from branch — verify exporter sidecar starts
- [ ] Verify exporter metrics: `kubectl exec` into pod, `curl localhost:19091/metrics`
- [ ] Verify Prometheus scrapes it: check targets at prometheus.ops.eblu.me
- [ ] Open Grafana, find "Transmission" dashboard, verify panels populate
- [ ] Sync ArgoCD `prometheus` app from branch
- [ ] Sync ArgoCD `grafana-config` app from branch

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/271
2026-02-25 22:23:33 -08:00
03d71544ec Add multi-cluster observability with ringtail metrics and dashboards (#270)
## Summary
- Add `cluster` label (indri/ringtail) to all Prometheus scrape jobs, Alloy k8s metrics/logs, and Alloy host metrics/logs
- Deploy kube-state-metrics on ringtail's k3s cluster (ArgoCD app + manifests)
- Deploy Alloy on ringtail to collect pod metrics and logs, remote-writing to indri's Prometheus and Loki
- Replace single-cluster "Minikube Kubernetes" and "K8s Services Health" dashboards with:
  - **Kubernetes Clusters** dashboard — multi-cluster with `cluster` and `namespace` template variables
  - **Ringtail (k3s)** dashboard — dedicated ringtail view with GPU usage panels

## Deployment and Testing
1. Sync `apps` on indri ArgoCD to pick up new app definitions (`kube-state-metrics-ringtail`, `alloy-ringtail`)
2. Sync `prometheus` → verify `cluster` label on scraped metrics
3. Sync `alloy-k8s` → verify `cluster=indri` on remote-written metrics and logs
4. Run `mise run provision-indri -- --tags alloy` → verify `cluster=indri` on host Alloy metrics/logs
5. Sync `kube-state-metrics-ringtail` → verify pods running on ringtail
6. Sync `alloy-ringtail` → verify pods running, check Prometheus for `kube_pod_info{cluster="ringtail"}`
7. Sync `grafana-config` → verify dashboards appear, cluster variable populates both values
8. Check Loki for `{cluster="ringtail"}` logs from ringtail pods

## Notes
- Alloy on ringtail uses `insecure_skip_verify=true` for TLS to Prometheus/Loki (Tailscale-managed certs not in container trust store) — tighten later
- DNS resolution for `*.tail8d86e.ts.net` from ringtail pods depends on CoreDNS inheriting host's MagicDNS resolver; may need CoreDNS forwarding rules if pods can't resolve
- The old services dashboard (blackbox probes) is removed — those probes are still running in alloy-k8s and the data is still in Prometheus, just not in a dedicated dashboard

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/270
2026-02-25 22:01:00 -08:00
2243f2e0a1 Filter driveway zone to person/dog/cat only in Frigate
Parked car was being re-detected every few minutes at night due to IR
illumination noise triggering motion detection. Restrict the driveway
zone to [person, dog, cat] so cars and birds no longer create events
there. Cars still alert via the driveway_entrance zone.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 20:45:07 -08:00
84338c32c2 Add authenticated GitHub PAT for Forgejo mirror sync (#269)
## Summary

- **mirror-create**: Auto-includes GitHub PAT from 1Password for authenticated upstream fetches at mirror creation time
- **mirror-update-pats**: New mise task that SSHes into indri and rewrites the git remote URL in every GitHub mirror's bare repo config to embed the PAT. Idempotent, supports `--dry-run`
- **app.ini.j2**: Explicit `[mirror]` section with `DEFAULT_INTERVAL = 8h` and `MIN_INTERVAL = 10m` (bakes in the defaults for visibility)
- **manage-forgejo-mirrors**: New how-to doc covering mirror creation, PAT storage, the `mirror-update-pats` task, and the full 20-day PAT rotation procedure

## Context

GitHub tightened unauthenticated rate limits for git clone/fetch in May 2025. With 23 GitHub mirrors syncing every 8 hours, authenticated fetches avoid throttling. The PAT is stored in 1Password (`Forgejo Secrets` → `github-mirror-pat`) and has been applied to all existing mirrors.

## Deployment and Testing

- [x] `mirror-update-pats` dry-run verified (23 mirrors detected)
- [x] `mirror-update-pats` applied to all 23 GitHub mirrors on indri
- [x] Idempotency confirmed (re-run shows 0 updated, 23 skipped)
- [ ] Provision indri with `--tags forgejo` to apply `[mirror]` config
- [ ] Trigger a manual mirror sync and verify success in Forgejo UI

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/269
2026-02-25 20:20:23 -08:00
23dc79058e Bake default display options into ai-docs mise task
The --style=header --color=never --decorations=always flags are now built
into the script so callers can just run `mise run ai-docs`. Also adds a
note to CLAUDE.md to never truncate the output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 17:42:47 -08:00
de54b4e33d Port CloudNative-PG off Helm to direct release manifest (#268)
## Summary
- Point ArgoCD app directly at forge-mirrored upstream repo (`mirrors/cloudnative-pg`) instead of the Helm charts repo
- Use `directory.include` to select the specific release manifest (`cnpg-1.27.1.yaml`) from the `releases/` directory
- No vendored files, no Helm — upgrades are a two-line change (`targetRevision` + `directory.include`)
- Delete unused `values.yaml` (was empty, all Helm defaults)

## Deployment and Testing
- [ ] Register mirror repo in ArgoCD: `argocd repo add ssh://forgejo@forge.ops.eblu.me:2222/mirrors/cloudnative-pg.git --ssh-private-key-path <key>`
- [ ] `argocd app set cloudnative-pg --revision feature/cnpg-direct-source && argocd app sync cloudnative-pg`
- [ ] Verify operator pod running: `kubectl get pods -n cnpg-system --context=minikube-indri`
- [ ] Verify CRDs exist: `kubectl get crd --context=minikube-indri | grep cnpg`
- [ ] Verify existing clusters healthy: `kubectl get clusters -A --context=minikube-indri`
- [ ] After merge: `argocd app set cloudnative-pg --revision main && argocd app sync cloudnative-pg`

## Notes
- The forge mirror was created via `mise run mirror-create` from `https://github.com/cloudnative-pg/cloudnative-pg.git`
- ArgoCD may need the mirror repo added to its known repositories if the credential template doesn't already match `mirrors/*`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/268
2026-02-25 17:37:53 -08:00