428 commits

Author SHA1 Message Date
391dd2dd10 Disable sway config check for runtime wallpaper path
The Nix build sandbox can't access ~/.config/sway/wallpaper.jpg,
so the config check fails. The config is valid at runtime.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 16:48:48 -08:00
62fb1744d0 Add bluetooth, improve waybar audio/network modules
- Enable bluetooth with blueman for speaker pairing
- Pulseaudio: headphone icon, mute indicator
- Network: show bandwidth up/down instead of interface name
- Bluetooth waybar module with catppuccin styling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 16:47:51 -08:00
ee21f80d35 Add wallpaper and waybar module pill styling
- Wallpaper from ~/.config/sway/wallpaper.jpg
- Waybar modules styled as rounded pills with gaps
- Semi-transparent waybar background

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 16:44:28 -08:00
354d745ec6 Add unzip for Mason LSP installs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 16:30:31 -08:00
36fb711ee3 Add librewolf and Catppuccin Macchiato theme for sway/waybar
- librewolf browser
- Sway: gaps (8 inner, 4 outer), 2px borders, catppuccin macchiato
  window colors, VictorMono Nerd Font, solid base color background
- Waybar: catppuccin macchiato styling with accent colors per module

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 16:29:56 -08:00
a295298366 Add compile-time flags for mise python-build on NixOS
python-build compiles from source and needs headers/library paths.
nix-ld only handles runtime linking for prebuilt binaries. Set
CFLAGS, LDFLAGS, and PKG_CONFIG_PATH via sessionVariables so
configure scripts find zlib, openssl, readline, etc.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 15:46:24 -08:00
a870b2c278 Fix changed_when check for nixos-rebuild (stderr not stdout)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 15:33:30 -08:00
6b946349c3 Move runtime libs to nix-ld.libraries for mise binaries
Dynamically linked binaries (dotnet, python) need libraries in
NIX_LD_LIBRARY_PATH, not just on PATH via systemPackages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 15:29:00 -08:00
8f7b7ea11a Add ICU and python build deps for mise runtimes
dotnet needs libicu for globalization support. python-build needs
zlib, readline, bzip2, xz, libffi, ncurses, and sqlite.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 15:18:11 -08:00
a1e308a43c Launch 1Password and Steam on sway startup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 12:39:53 -08:00
505799448d Update ringtail docs and changelog for PR
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 11:39:35 -08:00
24fc5df7ec Add gnupg and nix-ld for mise-installed runtimes
gnupg fixes GPG verification warnings. nix-ld provides a dynamic
linker shim so generic Linux binaries (dotnet, rustup, etc.)
downloaded by mise can run on NixOS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 11:21:39 -08:00
a42e73009f Add build toolchain for mise-managed language runtimes
gcc, gnumake, pkg-config, and openssl needed to compile
Python, Rust, Node, etc. via mise.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 11:16:37 -08:00
dbd389cd64 Map Caps Lock to Control in sway
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 11:14:48 -08:00
4668bf9978 Add mise to ringtail for managing node/npm
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 11:13:00 -08:00
5ad47ef42c Add VictorMono Nerd Font for wezterm
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 11:09:13 -08:00
421311ff75 Add waybar with system tray for sway
Configured via home-manager with workspaces, window title,
audio, network, clock, and tray modules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 11:05:03 -08:00
c1ec4851d5 Use NixOS 1Password modules for proper CLI-GUI integration
Raw _1password-cli and _1password-gui packages don't set up the
onepassword-cli group, setgid bit, or polkit policy needed for
CLI-to-desktop-app communication. The NixOS modules handle this.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 11:00:36 -08:00
7548fda5d7 Disable TPM2 to fix 90s boot delay
Crosshair VI Hero has no TPM module. systemd waits 90s for
/dev/tpm0 and /dev/tpmrm0 before timing out on every boot.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:44:20 -08:00
25feb2fb1e Fix /mnt/* ownership so eblume can use Steam library on /mnt/games
Drives mounted by disko default to root ownership. Use tmpfiles
rules to set eblume:users ownership at boot.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:38:07 -08:00
74352603cc Fix ringtail tailscale check: use jq instead of grep
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:36:08 -08:00
91ed79578f Add ringtail to services-check (SSH + Tailscale)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:31:59 -08:00
c56bc1d596 Fix flake-lock: enable experimental features, update lockfile
The nixos/nix container doesn't have flakes enabled by default.
Pass --extra-experimental-features flag. Also commit the updated
flake.lock with home-manager input resolved via Dagger.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:25:18 -08:00
df5d1bae4d Add Dagger flake-lock function and improve provision-ringtail
- New `flake-lock` Dagger function: runs `nix flake lock` in a
  nixos/nix container, returns the updated flake.lock file.
- provision-ringtail now: updates flake.lock via Dagger before
  deploy, verifies current commit is pushed to forge, and passes
  the exact commit SHA to the ansible playbook.
- Playbook accepts `ringtail_commit` var to deploy a specific ref.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:21:29 -08:00
1f97c5498e Add home-manager for sway keybinding, fix extraConfig error
The NixOS programs.sway module doesn't have extraConfig. Use
home-manager's wayland.windowManager.sway instead to set the
terminal to wezterm (which gives us $mod+Return automatically).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:03:55 -08:00
8daf990aa5 Add detailed hardware specs to ringtail reference card
Queried ringtail directly for CPU, RAM, GPU, storage, monitor,
and peripheral details via dmidecode, edid-decode, and lsusb.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:57:24 -08:00
8c99efee79 Polish ringtail NixOS config and add documentation
Sway keybinding for wezterm, fish as default shell, remove
initialPassword, add 1Password/chezmoi/dev tool packages.
Add ringtail reference card and update host inventory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:51:10 -08:00
b76f2314c2 Add force: true to ringtail git task
nixos-rebuild can dirty the tree (e.g. flake.lock updates), which
blocks the Ansible git module. Force ensures we always reset to the
upstream state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:32:23 -08:00
7bf46f4e28 Add flake.lock for ringtail NixOS config
Prevents 'Git tree is dirty' warnings during nixos-rebuild.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:31:21 -08:00
5a087c10df Fix deprecated greetd.tuigreet package reference
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:30:01 -08:00
4b7491c58f Add python3 to ringtail for Ansible compatibility
NixOS doesn't include Python by default. Ansible needs it on the
managed host for module execution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:29:09 -08:00
b08ed98881 Enable passwordless sudo for wheel group on ringtail
Required for Ansible unattended provisioning via become: true.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:25:32 -08:00
8ee6c1271a Add --accept-routes and --ssh to tailscale config
Makes tailscale settings declarative so they persist across rebuilds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:24:17 -08:00
aaf7e73c27 Fix sway on NVIDIA proprietary drivers
Sway/wlroots refuses to start on proprietary NVIDIA by default.
Add --unsupported-gpu flag and disable hardware cursors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:08:26 -08:00
104e49d337 Allow unfree packages for NVIDIA drivers and Steam
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 08:56:27 -08:00
b9d813cde1 Add NixOS configuration for ringtail workstation (#207)
## Summary
- NixOS flake for ringtail (gaming/compute workstation, RTX 4080) in `nixos/ringtail/`
- Declarative disk partitioning via disko (GPT, 512M EFI + ext4 root on NVMe)
- NVIDIA proprietary drivers, sway/Wayland desktop, greetd, PipeWire, Steam
- Tailscale integration for tailnet connectivity
- Ansible playbook + `mise run provision-ringtail` for ongoing management
- Pulumi auth key (`tag:homelab`, `tag:blumeops`) for tailnet bootstrap

## Deployment Order
1. **Merge PR**
2. `pulumi up` in tailscale stack → creates auth key
3. Retrieve auth key: `pulumi stack output ringtail_authkey --show-secrets`
4. On ringtail NixOS installer:
   - `nix run github:nix-community/disko -- --mode disko /tmp/disk-config.nix` (or from cloned repo)
   - `nixos-install --flake github:eblume/blumeops?dir=nixos/ringtail#ringtail`
5. Reboot, `tailscale up --auth-key=<key>`
6. Verify: `tailscale status`, SSH from gilbert

## Test plan
- [ ] Review NixOS configuration for completeness
- [ ] Verify disko partition layout matches ringtail hardware
- [ ] Run `pulumi preview` for tailscale stack
- [ ] Install NixOS on ringtail
- [ ] Confirm tailscale connectivity
- [ ] Confirm sway desktop works
- [ ] Test `mise run provision-ringtail` for ongoing management

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/207
2026-02-18 08:24:25 -08:00
5f9b024b4a Add Apple Silicon ZMQ detector for Frigate (#206)
## Summary

- New `frigate_detector` ansible role deploys the [apple-silicon-detector](https://github.com/frigate-nvr/apple-silicon-detector) as a LaunchAgent on indri
- Switches Frigate from ONNX CPU detector (~117ms) to ZMQ detector backed by CoreML/Neural Engine (~15ms)
- Removes detect FPS cap (no longer needed with fast inference)
- Updates Frigate docs and adds changelog fragment

## Deployment

### Phase 1: Deploy detector on indri (one-time setup + ansible)
```fish
ssh indri 'git clone https://github.com/frigate-nvr/apple-silicon-detector.git ~/code/3rd/apple-silicon-detector'
ssh indri 'cd ~/code/3rd/apple-silicon-detector && make install'
mise run provision-indri -- --tags frigate_detector --check --diff  # dry run
mise run provision-indri -- --tags frigate_detector                 # apply
ssh indri 'launchctl list mcquack.eblume.frigate-detector'          # verify running
ssh indri 'tail ~/Library/Logs/mcquack.frigate-detector.out.log'    # verify bound
```

### Phase 2: Test connectivity
```fish
kubectl --context=minikube-indri -n frigate exec deploy/frigate -- nc -vz host.minikube.internal 5555
```

### Phase 3: Deploy Frigate config (branch workflow)
```fish
argocd app set frigate --revision feature/frigate-zmq-detector && argocd app sync frigate
```

### Phase 4: Post-deploy checks
- [ ] Pod starts, no config errors
- [ ] `/api/stats` shows detector type zmq, inference_speed ~15ms
- [ ] detect_fps uncapped
- [ ] Recordings and MQTT events flowing
- [ ] After merge: `argocd app set frigate --revision main && argocd app sync frigate`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/206
2026-02-17 19:03:28 -08:00
f45897b7c7 Upgrade Frigate 0.16.4 → 0.17.0-rc2 (#205)
## Summary

- Bump Frigate image from `0.16.4-standard-arm64` to `0.17.0-rc2-standard-arm64`
- Adapt `record` config to 0.17 schema: `retain.days`/`mode: all` → `continuous.days`
- Update service docs and version tracker

This is the first step toward the Apple Silicon ZMQ detector. The existing ONNX detector is kept so we can validate the upgrade independently.

## What is NOT changing

- Detector config (still `type: onnx` with YOLO-NAS-s)
- go2rtc streams, MQTT, cameras, zones, review rules
- frigate-notify, storage PVs, Grafana dashboard

## Deployment and Testing

- [ ] `argocd app set frigate --revision upgrade-frigate-0.17 && argocd app sync frigate`
- [ ] Pod starts, `/api/version` returns `0.17.0-rc2`
- [ ] No config errors in pod logs
- [ ] Frigate web UI loads at `https://nvr.ops.eblu.me`
- [ ] Live view works, detection running (`/api/stats` shows `detection_fps > 0`)
- [ ] Recordings being created (`/api/recordings/summary`)
- [ ] MQTT events flowing (check frigate-notify logs)
- [ ] After merge: `argocd app set frigate --revision main && argocd app sync frigate`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/205
2026-02-17 16:56:12 -08:00
acd213559e Fix frigate live view by capping detect FPS (#204)
## Summary
- Cap detect FPS to 2 to prevent recording segment backlog from ONNX inference bottleneck (~750ms/frame on ARM64 CPU)
- Sync motion masks from live config (added second mask area)
- Update driveway_entrance zone coordinates from live config
- Add explicit alert labels `[person, car]` while keeping `required_zones: [driveway_entrance]`

## Context
The "No frames have been received" error on the gablecam live view was caused by the detect stream falling behind — ONNX YOLO-NAS-s takes ~750ms per inference on ARM64 CPU, but the sub-stream sends 5 FPS. This caused recording segments to pile up and the ffmpeg watchdog to repeatedly kill/restart the process, creating gaps in the live view.
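
The backlog described above follows from simple throughput arithmetic. A minimal sketch, using only the two figures quoted in this description (~750 ms per inference, 5 FPS sub-stream) and assuming, for illustration, that every frame is sent to the detector; the function names are hypothetical and not part of Frigate:

```python
def max_detect_fps(inference_ms: float) -> float:
    """Upper bound on frames per second a single detector can process."""
    return 1000.0 / inference_ms


def backlog_rate(stream_fps: float, inference_ms: float) -> float:
    """Frames per second that arrive but cannot be processed (never negative)."""
    return max(0.0, stream_fps - max_detect_fps(inference_ms))


if __name__ == "__main__":
    print(f"detector ceiling: {max_detect_fps(750):.2f} FPS")
    print(f"unprocessed at 5 FPS: {backlog_rate(5, 750):.2f} frames/s")
    print(f"unprocessed at 2 FPS: {backlog_rate(2, 750):.2f} frames/s")
```

Under these worst-case assumptions, a 2 FPS cap narrows the gap without fully closing it, which fits the test plan's expectation that `skipped_fps` drops significantly. Detection may not run on every frame in practice, so the real backlog may be smaller.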

## Test plan
- [ ] Sync ArgoCD `frigate` app to branch and verify pod restarts cleanly
- [ ] Check `/api/stats` — `skipped_fps` should drop significantly, `process_fps` should be close to 2
- [ ] Verify live view at https://nvr.ops.eblu.me/#gablecam no longer shows "No frames" error
- [ ] Verify detections and alerts still work in the driveway_entrance zone

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/204
2026-02-17 16:18:02 -08:00
1e96866dd3 Grafana helm chart upgrade plan
2026-02-17 11:15:34 -08:00
b9d1acaf3a Service review for external-secrets
2026-02-17 10:48:09 -08:00
105a2c8c08 Update External Secrets Helm chart 1.3.1 → 2.0.0 (#203)
## Summary
- Bump External Secrets Operator Helm chart from `helm-chart-1.3.1` to `helm-chart-2.0.0` (operator v1.3.2)
- Updates both the operator app and CRDs app `targetRevision`
- No Helm values changes needed — `installCRDs`, `resources`, `webhook`, `certController` keys are unchanged

## Breaking changes in chart 2.0.0
- **Removed providers:** Alibaba and Device42 (unmaintained) — does not affect our 1Password setup
- **Templating engine v1 deprecated** — our ExternalSecrets don't set `engineVersion`, so they use the default (v2)
- **Webhook `failurePolicy`** for SecretStore is now dynamic

## Deployment
1. Sync CRDs first: `argocd app set external-secrets-crds --revision update/external-secrets-helm-2.0.0 && argocd app sync external-secrets-crds`
2. Sync operator: `argocd app set external-secrets --revision update/external-secrets-helm-2.0.0 && argocd app sync external-secrets`
3. Verify: `kubectl --context=minikube-indri -n external-secrets get pods`
4. After merge, set both apps back to `--revision main`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/203
2026-02-17 10:43:21 -08:00
5fbe70d1ba Port ntfy to locally built container image (#202)
ntfy-v1.0.0
## Summary
- Add `containers/ntfy/Dockerfile` — three-stage build (Node web UI, Go+CGO server, Alpine runtime) pinned to commit SHA `a03a37fe` (v2.17.0), sourced from forge mirror
- Update ntfy deployment image from `binwiederhier/ntfy:v2.17.0` to `registry.ops.eblu.me/blumeops/ntfy:v1.0.0`
- Note fish shell in CLAUDE.md

## Deployment
After merge, release the container image:
```fish
mise run container-tag-and-release ntfy v1.0.0
```
Then sync:
```fish
argocd app sync ntfy
```

## Test plan
- [x] `docker build` succeeds
- [x] `dagger call build --src=. --container-name=ntfy` succeeds (exit 0, container ID printed)
- [x] `ntfy --help` works in built container
- [ ] Tag and release `ntfy-v1.0.0` after merge
- [ ] Verify ntfy pod starts with new image
- [ ] Verify health endpoint responds at `ntfy.ops.eblu.me/v1/health`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/202
2026-02-17 10:18:20 -08:00
3e604d8fdc Review ntfy: upgrade to v2.17.0 and add reference docs (#201)
## Summary
- Upgrade ntfy from v2.11.0 to v2.17.0 (6 minor releases, no breaking changes)
- Add reference doc for ntfy service
- Add reference doc for frigate service (ntfy's sole producer via frigate-notify)
- Update reference index and service-versions.yaml tracking

## Notable upstream changes (v2.12.0–v2.17.0)
- **v2.14.0:** Declarative users/ACL config in files
- **v2.15.0:** `require-login` flag for topic-level auth
- **v2.16.0:** Dead man's switch (heartbeat) notifications, notification update/delete
- **v2.17.0:** Priority templating, crash fixes (nil pointer panics)

## Deployment and Testing
- [ ] ArgoCD sync ntfy after merge
- [ ] Verify ntfy pod healthy with new image
- [ ] Send a test notification via `curl -d "test" https://ntfy.ops.eblu.me/test`
- [ ] Verify frigate-notify still delivers alerts to ntfy

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/201
2026-02-17 09:51:40 -08:00
54c3b0a5f3 Expanded some CLAUDE.md stuff manually
2026-02-17 07:54:34 -08:00
2f599a15bd Fix zk-docs broken path after how-to reorg
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 07:32:54 -08:00
Forgejo Actions
530460171a Update docs release to v1.9.4
- Built changelog from towncrier fragments

[skip ci]
2026-02-17 07:30:39 -08:00
27d8f3cf1f Review gandi-operations doc and reorganize how-to guides (#200) v1.9.4
## Summary
- **Doc review:** Reviewed `gandi-operations.md` — added `last-reviewed` frontmatter, verified all wiki-links, confirmed Pulumi state has no drift
- **Gandi reference fix:** Added missing `cv.eblu.me` CNAME row to `gandi.md` DNS records table (was present in Pulumi but undocumented)
- **Pulumi comment fix:** Updated stale `README.md` reference in `__main__.py` to point to `docs/how-to/gandi-operations.md`
- **How-to reorg:** Moved 14 how-to guides into 3 subdirectories (`deployment/`, `configuration/`, `operations/`), collapsed the Documentation and Database index sections into Configuration and Operations respectively

## Verification
- `docs-check-links` — all 180 wiki-links valid
- `docs-check-filenames` — all 90 filenames unique
- `dns-preview` — 5 resources unchanged, no drift
- All pre-commit hooks pass

## Test plan
- [ ] Verify docs site builds correctly with new paths
- [ ] Spot-check a few wiki-links from other pages to moved how-to guides

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/200
2026-02-17 07:29:33 -08:00
Forgejo Actions
8a48171acf Update docs release to v1.9.3
- Built changelog from towncrier fragments

[skip ci]
2026-02-16 21:25:47 -08:00
779b7d6709 Eliminate double towncrier run in release workflow (#199) v1.9.3
## Summary

- Added a new `build_quartz` Dagger function that builds the Quartz site from a pre-processed source tree (no towncrier)
- Reordered the release workflow so towncrier runs **once** on the runner, then passes the updated working tree to `build-quartz`
- `build_docs` and `build_changelog` are preserved for standalone use — `build_docs` now delegates to `build_quartz` internally

## Motivation

Previously towncrier ran twice per release: once inside a Dagger container (via `build_docs` → `build_changelog`) and once on the runner to capture CHANGELOG.md changes for the git commit. This was wasteful and fragile — if towncrier behavior changed, the two runs could produce different results.

## Test plan

- [ ] Review diff to confirm workflow step ordering is correct
- [ ] Trigger a release and confirm towncrier runs only once
- [ ] Verify the docs tarball contains the updated CHANGELOG.md
- [ ] `dagger call build-quartz --src=. --version=vX.Y.Z` should work standalone

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/199
2026-02-16 21:24:34 -08:00