Revert the webapi change: polling latency is too high for alerts.
MQTT with zone/label filters gives sub-second delivery. Add
TZ=America/Los_Angeles to frigate-notify for local timestamps.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
frigate-notify was firing on every MQTT detection event regardless of
zone, causing notification spam. Add filters to match the Frigate
review config: only alert for person/car in the driveway_entrance
zone, and drop all unzoned events.
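
A minimal sketch of the intended filter logic, in Python for
illustration only. frigate-notify applies these filters from its own
configuration; the function name and the event fields used here
(`label`, `entered_zones`) are assumptions modeled on Frigate's MQTT
event payload, not the tool's actual implementation.

```python
# Hypothetical constants mirroring the filters described above.
ALLOWED_LABELS = {"person", "car"}
ALLOWED_ZONES = {"driveway_entrance"}


def should_notify(event: dict) -> bool:
    """Return True only for an allowed label seen in an allowed zone."""
    zones = set(event.get("entered_zones") or [])
    if not zones:
        # Unzoned events are always dropped.
        return False
    if event.get("label") not in ALLOWED_LABELS:
        return False
    # At least one entered zone must be on the allow list.
    return bool(zones & ALLOWED_ZONES)
```

Under these assumptions, a person in driveway_entrance alerts, while a
car in any other zone or an event with no zones does not.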
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create a stable symlink at /etc/nvidia-driver/lib pointing to the
nvidia driver package's lib directory. The device plugin now mounts
only the driver libs it needs instead of the entire nix store.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These services moved to ringtail k3s and are no longer autodiscovered
by homepage (which runs on indri's minikube). Add them as static
service entries in the Infrastructure group.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The YOLOv9m model fails during CUDA graph capture on the tensorrt
image. Try YOLO-NAS-S, which has a different architecture that may be
fully partitionable to the CUDA execution provider.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The YOLOv9m ONNX model contains ops that cannot be fully partitioned to
the CUDA execution provider (EP), so CUDA graph capture fails on the
-tensorrt image. Use the default model that ships with the image and is
tested for GPU inference.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
GPU resources can't be shared during rolling updates: the old pod holds
the nvidia.com/gpu resource, which prevents the new pod from
scheduling. The Recreate strategy ensures the old pod is terminated
before the new one starts.
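
A toy model of the scheduling constraint, in Python for illustration
only. It assumes a single-replica deployment on a node with one GPU;
the function and parameter names are hypothetical and this is not how
the Kubernetes scheduler is implemented.

```python
def new_pod_schedulable(strategy: str, gpu_capacity: int = 1,
                        gpus_per_pod: int = 1) -> bool:
    """Can the replacement pod get its GPUs when the rollout starts?"""
    if strategy == "Recreate":
        # The old pod is terminated first, so it holds no GPUs.
        held_by_old = 0
    elif strategy == "RollingUpdate":
        # The old pod keeps running until the new one is ready.
        held_by_old = gpus_per_pod
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return gpu_capacity - held_by_old >= gpus_per_pod
```

With one GPU, the rolling update leaves the new pod unschedulable,
while Recreate frees the GPU before the new pod starts.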
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The CDI spec generated by NixOS uses index-based device names (0, all),
not UUIDs. The device plugin must match them by using
--device-id-strategy=index; otherwise nvidia-container-runtime.cdi
fails to resolve CDI devices.
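
A minimal sketch of the name mismatch, in Python for illustration only.
The device names come from the spec described above; the helper names
and the placeholder UUID are hypothetical, and the real lookup is done
by the CDI runtime, not by code like this.

```python
CDI_KIND = "nvidia.com/gpu"
# Device names present in the NixOS-generated spec, per this commit.
SPEC_DEVICE_NAMES = {"0", "all"}
# Placeholder UUID for illustration only.
FAKE_UUID = "GPU-00000000-0000-0000-0000-000000000000"


def requested_device(strategy: str, index: int, uuid: str) -> str:
    """Build the qualified CDI name the plugin would request."""
    name = str(index) if strategy == "index" else uuid
    return f"{CDI_KIND}={name}"


def resolves(qualified_name: str) -> bool:
    """Check whether a qualified name exists in the spec."""
    kind, sep, name = qualified_name.partition("=")
    return sep == "=" and kind == CDI_KIND and name in SPEC_DEVICE_NAMES
```

Here `requested_device("index", 0, FAKE_UUID)` resolves against the
spec, while the UUID-based name does not.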
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the CDI device-list-strategy approach (which fails because the
device plugin generates its own CDI specs and can't find libs on NixOS)
with the nvidia-container-runtime.cdi runtime handler approach:
- Add wrapper script at /etc/nvidia-container-runtime/ that provides
  runc in PATH for nvidia-container-runtime.cdi
- Register nvidia runtime handler in k3s containerd config
- Create RuntimeClass for GPU workloads
- Revert device plugin to default envvar strategy (already working)
- Add runtimeClassName: nvidia to Frigate deployment
The nvidia-container-runtime.cdi binary reads the NixOS-generated CDI
specs from /var/run/cdi/ and injects GPU devices and driver libraries
into containers at create time.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use CDI-based device injection instead of nvidia-container-runtime.
The NixOS nvidia-container-toolkit module generates CDI specs with all
the correct nix store paths, so containerd's native CDI support handles
GPU device and library injection without a custom runtime.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
NixOS /run/opengl-driver/lib contains symlinks to /nix/store paths.
Without mounting the nix store, the symlinks are dangling inside the
container and libnvidia-ml.so can't be loaded.
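
A small Python sketch of the failure mode, for illustration only. It
uses a temporary directory in place of the real paths, and removing the
target file stands in for the nix store not being mounted; all names
are hypothetical.

```python
import os
import tempfile


def symlink_state(link_path: str) -> tuple:
    """Return (is_symlink, target_reachable) for a path."""
    return os.path.islink(link_path), os.path.exists(link_path)


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as root:
        store = os.path.join(root, "nix-store")
        driver_lib = os.path.join(root, "opengl-driver-lib")
        os.makedirs(store)
        os.makedirs(driver_lib)

        # The real library lives in the store.
        target = os.path.join(store, "libnvidia-ml.so")
        open(target, "w").close()

        # The driver lib directory only holds a symlink to it.
        link = os.path.join(driver_lib, "libnvidia-ml.so")
        os.symlink(target, link)
        with_store = symlink_state(link)

        # Simulate a container that sees the link but not the store.
        os.remove(target)
        without_store = symlink_state(link)
        return with_store, without_store
```

The link itself is still present in the second case, but its target is
unreachable, which is why the loader cannot open the library.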
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CDI annotations require NVML validation that fails on NixOS. Use the
default envvar strategy for the device plugin — CDI device injection
still works at the containerd level via enable_cdi=true.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
NVML needs both libnvidia-ml.so and /dev/nvidia* device nodes.
Mount libs to a non-clobbering path and run privileged (matching
NVIDIA's official deployment) for device file access.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
go-nvml loads libnvidia-ml.so via dl.Open, which searches the standard
library paths. Mount the libs at /usr/lib/x86_64-linux-gnu for reliable
discovery.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The device plugin needs libnvidia-ml.so to discover GPUs even when using
CDI annotations. Mount /run/opengl-driver/lib (NixOS NVIDIA lib path)
into the pod and set LD_LIBRARY_PATH.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
NixOS splits nvidia-container-toolkit into separate derivations, making
the nvidia-container-runtime binary path unreliable in containerd config.
CDI (Container Device Interface) is the modern approach:
- Enable CDI in k3s containerd config (cdi_spec_dirs: /var/run/cdi)
- Device plugin uses CDI annotations to inject GPU devices
- Remove RuntimeClass (not needed with CDI)
- Remove runtimeClassName from Frigate deployment
- Mount CDI specs into device plugin pod
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The device plugin needs access to NVIDIA libraries (NVML) to discover
GPUs. Running with the nvidia runtime makes device files visible.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
K3s ships containerd 2.0+ which uses config v3 format. The plugin key
path is 'io.containerd.cri.v1.runtime' not 'io.containerd.grpc.v1.cri'.
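
A minimal sketch of the key change, in Python for illustration only.
The two plugin keys come from this commit; treating the old key as
config version 2, the `containerd.runtimes.<handler>` suffix, and the
helper name are assumptions, so check the rendered header against the
containerd config actually in use.

```python
# CRI plugin key by containerd config version (version 2 is assumed).
CRI_RUNTIME_PLUGIN_KEY = {
    2: "io.containerd.grpc.v1.cri",
    3: "io.containerd.cri.v1.runtime",
}


def runtime_table_header(config_version: int, handler: str) -> str:
    """Render the TOML table header for a runtime handler."""
    key = CRI_RUNTIME_PLUGIN_KEY[config_version]
    return f"[plugins.'{key}'.containerd.runtimes.{handler}]"
```

For example, `runtime_table_header(3, "nvidia")` renders the header
under the v3 key, so a handler registered under the old key would not
be picked up by a v3 config.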
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>