blumeops/service-versions.yaml
Erich Blume d5d32fe91f Port Frigate NVR to ringtail k3s with GPU acceleration (#217)
## Summary

- Enable NVIDIA container toolkit on ringtail NixOS and configure k3s containerd with nvidia runtime
- Add NVIDIA device plugin ArgoCD app (RuntimeClass + DaemonSet) to expose `nvidia.com/gpu` resources
- Re-target Frigate from indri minikube (arm64, ZMQ detector) to ringtail k3s (x86_64, TensorRT/ONNX)
- Switch Frigate image to `-tensorrt` variant with GPU resource limits and increased shared memory

## Manual Prerequisites

1. **NFS access**: Verify ringtail can mount `sifaka:/volume1/frigate`
   ```fish
   ssh ringtail 'sudo mount -t nfs sifaka:/volume1/frigate /mnt/storage1 && ls /mnt/storage1 && sudo umount /mnt/storage1'
   ```
2. **YOLO model**: Verify `/volume1/frigate/models/yolov9m.onnx` exists on sifaka

## Deployment Steps

1. Provision ringtail: `mise run provision-ringtail`
2. Sync ArgoCD apps: `argocd app sync apps --prune`
3. Deploy NVIDIA device plugin: `argocd app sync nvidia-device-plugin`
4. Verify GPU: `kubectl --context=k3s-ringtail get nodes -o json | jq '.items[].status.capacity'`
5. Deploy Frigate: `argocd app sync frigate`

## Verification

- [ ] `nvidia.com/gpu: 1` visible in node capacity
- [ ] Frigate pod running with GPU allocated
- [ ] Frigate UI loads at `https://nvr.ops.eblu.me`
- [ ] Detector shows ONNX/TensorRT on System page
- [ ] Camera feed with bounding boxes in live view
- [ ] TensorRT engine build completes (watch logs on first start)
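
The first checklist item can be checked mechanically instead of by eye. The sketch below is only an illustration, not part of this PR: it assumes the JSON shape emitted by `kubectl get nodes -o json` (`items[].metadata.name` and `items[].status.capacity`), and the function name `gpu_capacity` is hypothetical.

```python
import json


def gpu_capacity(nodes_json: str, resource: str = "nvidia.com/gpu") -> dict[str, int]:
    """Map each node name to the count it advertises for `resource`.

    Kubernetes reports capacity values as strings (e.g. "1"); nodes that do
    not advertise the resource at all are reported as 0.
    """
    nodes = json.loads(nodes_json)
    return {
        item["metadata"]["name"]: int(item["status"]["capacity"].get(resource, "0"))
        for item in nodes.get("items", [])
    }
```

One way to use it is to capture the output of `kubectl --context=k3s-ringtail get nodes -o json`, pass the text to `gpu_capacity`, and confirm that the ringtail node reports `1`.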

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/217
2026-02-19 14:27:04 -08:00


# Service Version Tracking
#
# Tracks when each BlumeOps service was last reviewed for version freshness.
# Used by `mise run service-review` to surface stale services.
#
# Fields:
#   name             - kebab-case service identifier
#   type             - argocd | ansible | hybrid (custom container + ArgoCD)
#   last-reviewed    - date (YYYY-MM-DD) or null
#   current-version  - deployed version string or null
#   upstream-source  - URL to upstream releases/changelog
#   notes            - optional context
services:
  # --- ArgoCD plain manifests ---
  - name: prometheus
    type: argocd
    last-reviewed: 2026-02-16
    current-version: "v3.9.1"
    upstream-source: https://github.com/prometheus/prometheus/releases
  - name: loki
    type: argocd
    last-reviewed: 2026-02-16
    current-version: "3.6.5"
    upstream-source: https://github.com/grafana/loki/releases
  - name: kube-state-metrics
    type: argocd
    last-reviewed: 2026-02-16
    current-version: "v2.18.0"
    upstream-source: https://github.com/kubernetes/kube-state-metrics/releases
  - name: mosquitto
    type: argocd
    last-reviewed: 2026-02-16
    current-version: "2.0.22"
    upstream-source: https://github.com/eclipse/mosquitto/releases
  - name: ntfy
    type: argocd
    last-reviewed: 2026-02-17
    current-version: "v2.17.0"
    upstream-source: https://github.com/binwiederhier/ntfy/releases
  - name: homepage
    type: argocd
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/gethomepage/homepage/releases
    notes: Deployed via Helm chart
  - name: nvidia-device-plugin
    type: argocd
    last-reviewed: 2026-02-19
    current-version: "v0.18.2"
    upstream-source: https://github.com/NVIDIA/k8s-device-plugin/releases
    notes: DaemonSet + RuntimeClass on ringtail for GPU workloads
  - name: frigate
    type: argocd
    last-reviewed: 2026-02-17
    current-version: "0.17.0-rc2"
    upstream-source: https://github.com/blakeblackshear/frigate/releases
  - name: frigate-notify
    type: argocd
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/0x2142/frigate-notify/releases
  - name: alloy-k8s
    type: argocd
    last-reviewed: 2026-02-16
    current-version: "v1.13.1"
    upstream-source: https://github.com/grafana/alloy/releases
  - name: tailscale-operator
    type: argocd
    last-reviewed: 2026-02-16
    current-version: "v1.94.2"
    upstream-source: https://github.com/tailscale/tailscale/releases
  # --- ArgoCD Helm charts ---
  - name: grafana
    type: argocd
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/grafana/grafana/releases
    notes: Deployed via Helm chart
  - name: cloudnative-pg
    type: argocd
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/cloudnative-pg/cloudnative-pg/releases
    notes: Deployed via Helm chart
  - name: immich
    type: argocd
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/immich-app/immich/releases
    notes: Deployed via Helm chart
  - name: external-secrets
    type: argocd
    last-reviewed: 2026-02-17
    current-version: "helm-chart-2.0.0"
    upstream-source: https://github.com/external-secrets/external-secrets/releases
    notes: Deployed via Helm chart (operator v1.3.2)
  - name: 1password-connect
    type: argocd
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/1Password/connect/releases
    notes: Deployed via Helm chart
  # --- ArgoCD infra ---
  - name: argocd
    type: argocd
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/argoproj/argo-cd/releases
  - name: blumeops-pg
    type: argocd
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/cloudnative-pg/cloudnative-pg/releases
    notes: CloudNativePG Cluster resource
  # --- Hybrid (custom container + ArgoCD) ---
  - name: navidrome
    type: hybrid
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/navidrome/navidrome/releases
  - name: miniflux
    type: hybrid
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/miniflux/v2/releases
  - name: teslamate
    type: hybrid
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/teslamate-org/teslamate/releases
  - name: transmission
    type: hybrid
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/transmission/transmission/releases
  - name: kiwix
    type: hybrid
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/kiwix/kiwix-tools/releases
  - name: devpi
    type: hybrid
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/devpi/devpi/releases
  - name: cv
    type: hybrid
    last-reviewed: null
    current-version: null
    upstream-source: null
    notes: Personal static site, no upstream
  - name: docs
    type: hybrid
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/jackyzha0/quartz/releases
    notes: Quartz static site generator
  - name: forgejo-runner
    type: hybrid
    last-reviewed: null
    current-version: null
    upstream-source: https://code.forgejo.org/forgejo/runner/releases
  # --- Ansible native ---
  - name: forgejo
    type: ansible
    last-reviewed: null
    current-version: null
    upstream-source: https://codeberg.org/forgejo/forgejo/releases
  - name: alloy
    type: ansible
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/grafana/alloy/releases
    notes: Built from source on indri
  - name: zot
    type: ansible
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/project-zot/zot/releases
    notes: Built from source on indri
  - name: caddy
    type: ansible
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/caddyserver/caddy/releases
    notes: Built from source with Gandi DNS plugin
  - name: borgmatic
    type: ansible
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/borgmatic-collective/borgmatic/releases
  - name: jellyfin
    type: ansible
    last-reviewed: null
    current-version: null
    upstream-source: https://github.com/jellyfin/jellyfin/releases
  - name: automounter
    type: ansible
    last-reviewed: null
    current-version: null
    upstream-source: null
    notes: Custom systemd service, no upstream
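
The file header says these entries are used by `mise run service-review` to surface stale services, but that task's implementation is not shown here. The following is a minimal sketch of how such a check might work, under stated assumptions: the function name `stale_services` and the 90-day default threshold are hypothetical, and the input is the already-parsed `services` list (a list of dicts, as a YAML loader would typically produce).

```python
from datetime import date, timedelta


def stale_services(services, today, max_age_days=90):
    """Return the names of services that were never reviewed, or whose
    last review is more than `max_age_days` days before `today`.

    `max_age_days=90` is an illustrative default, not a value taken from
    the real service-review task.
    """
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for svc in services:
        reviewed = svc.get("last-reviewed")
        # Some YAML loaders yield date objects for unquoted YYYY-MM-DD
        # values; accept ISO-format strings as well.
        if isinstance(reviewed, str):
            reviewed = date.fromisoformat(reviewed)
        if reviewed is None or reviewed < cutoff:
            stale.append(svc["name"])
    return stale
```

Treating `last-reviewed: null` as stale means the many entries that have never been reviewed surface first, which seems to match the intent described in the header comment.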