## Summary

- Enable NVIDIA container toolkit on ringtail NixOS and configure k3s containerd with nvidia runtime
- Add NVIDIA device plugin ArgoCD app (RuntimeClass + DaemonSet) to expose `nvidia.com/gpu` resources
- Re-target Frigate from indri minikube (arm64, ZMQ detector) to ringtail k3s (x86_64, TensorRT/ONNX)
- Switch Frigate image to `-tensorrt` variant with GPU resource limits and increased shared memory

## Manual Prerequisites

1. **NFS access**: Verify ringtail can mount `sifaka:/volume1/frigate`
   ```fish
   ssh ringtail 'sudo mount -t nfs sifaka:/volume1/frigate /mnt/storage1 && ls /mnt/storage1 && sudo umount /mnt/storage1'
   ```
2. **YOLO model**: Verify `/volume1/frigate/models/yolov9m.onnx` exists on sifaka

## Deployment Steps

1. Provision ringtail: `mise run provision-ringtail`
2. Sync ArgoCD apps: `argocd app sync apps --prune`
3. Deploy NVIDIA device plugin: `argocd app sync nvidia-device-plugin`
4. Verify GPU: `kubectl --context=k3s-ringtail get nodes -o json | jq '.items[].status.capacity'`
5. Deploy Frigate: `argocd app sync frigate`

## Verification

- [ ] `nvidia.com/gpu: 1` visible in node capacity
- [ ] Frigate pod running with GPU allocated
- [ ] Frigate UI loads at `https://nvr.ops.eblu.me`
- [ ] Detector shows ONNX/TensorRT on System page
- [ ] Camera feed with bounding boxes in live view
- [ ] TensorRT engine build completes (watch logs on first start)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/217
51 lines
1.3 KiB
YAML
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin
  namespace: nvidia-device-plugin
  labels:
    app: nvidia-device-plugin
spec:
  selector:
    matchLabels:
      app: nvidia-device-plugin
  template:
    metadata:
      labels:
        app: nvidia-device-plugin
    spec:
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      priorityClassName: system-node-critical
      containers:
        - name: nvidia-device-plugin
          image: nvcr.io/nvidia/k8s-device-plugin:v0.18.2
          args:
            - --device-id-strategy=index
          env:
            - name: LD_LIBRARY_PATH
              value: /run/nvidia/lib
          securityContext:
            privileged: true
          volumeMounts:
            - name: device-plugins
              mountPath: /var/lib/kubelet/device-plugins
            - name: cdi-specs
              mountPath: /var/run/cdi
              readOnly: true
            - name: nvidia-libs
              mountPath: /run/nvidia/lib
              readOnly: true
      volumes:
        - name: device-plugins
          hostPath:
            path: /var/lib/kubelet/device-plugins
        - name: cdi-specs
          hostPath:
            path: /var/run/cdi
        - name: nvidia-libs
          hostPath:
            path: /etc/nvidia-driver/lib
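
Once the DaemonSet above has registered `nvidia.com/gpu` with the kubelet, a throwaway pod is one way to confirm that a workload can actually be scheduled onto the GPU before Frigate is deployed. The manifest below is only a sketch and is not part of this repository: the pod name `gpu-smoke-test`, the `default` namespace, the CUDA image tag, and the RuntimeClass name `nvidia` are all assumptions and may need adjusting to match the RuntimeClass this change ships.

```yaml
---
# Hypothetical smoke-test pod (not in the repo): requests one GPU and runs nvidia-smi once.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test            # assumed name, for illustration only
  namespace: default              # assumed namespace
spec:
  restartPolicy: Never            # run once and keep the logs around
  runtimeClassName: nvidia        # assumption: name of the RuntimeClass added alongside the DaemonSet
  tolerations:
    - key: nvidia.com/gpu         # same toleration the device plugin DaemonSet uses
      operator: Exists
      effect: NoSchedule
  containers:
    - name: nvidia-smi
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04   # assumed tag; any CUDA base image should do
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1       # the extended resource exposed by the device plugin
```

If saved as, say, `gpu-smoke-test.yaml`, it could be applied with `kubectl --context=k3s-ringtail apply -f gpu-smoke-test.yaml` and inspected with `kubectl --context=k3s-ringtail logs gpu-smoke-test`. A pod stuck in `Pending` would suggest the node is not yet advertising `nvidia.com/gpu`.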