Fix services-check and update docs for Frigate migration to ringtail
Move mosquitto, ntfy, frigate, and frigate-notify pod checks from minikube-indri to k3s-ringtail context in services-check. Add nvidia-device-plugin check. Update documentation across 8 files to reflect the migration completed in PRs #216 and #217.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent d5d32fe91f
commit 703cbc59e7
10 changed files with 62 additions and 29 deletions
@@ -14,7 +14,7 @@ blumeops is Erich Blume's GitOps repository for personal infrastructure, orchest
 
 1. **Always run `mise run zk-docs -- --style=header --color=never --decorations=always` at session start**
 This will refresh your context with important information you will be assumed to know and follow.
-2. **Always use `--context=minikube-indri` with kubectl** - work contexts must never be touched
+2. **Always use `--context=minikube-indri` with kubectl** (or `--context=k3s-ringtail` for ringtail services) - work contexts must never be touched
 3. **Feature branches only** - checkout main, pull, create branch, commit often
 4. **Create PRs via `tea pr create`** - user reviews before deploy, merges after
 5. **Check PR comments with `mise run pr-comments <pr_number>`** before proceeding
@@ -52,7 +52,7 @@ encounter wiki-links (`[[like-this]]`) it is referring to docs/ cards.
 
 ### Kubernetes (ArgoCD)
 
-Most services run in minikube on indri via ArgoCD (app-of-apps, manual sync).
+Most services run in minikube on indri via ArgoCD (app-of-apps, manual sync). GPU workloads (Frigate, Mosquitto, ntfy) run on ringtail's k3s cluster, also managed by ArgoCD.
 
 **PR workflow:**
 1. Create branch, modify `argocd/manifests/<service>/`

1
docs/changelog.d/fix-services-check-ringtail-docs.doc.md
Normal file
@@ -0,0 +1 @@
+Update services-check and documentation to reflect Frigate, Mosquitto, and ntfy migration from indri minikube to ringtail k3s (PRs #216, #217).

@@ -1,6 +1,6 @@
 ---
 title: Architecture
-modified: 2026-02-09
+modified: 2026-02-19
 last-reviewed: 2026-02-09
 tags:
 - explanation
@@ -15,7 +15,7 @@ How all the BlumeOps pieces fit together.
 
 ## Physical Layer
 
-Two always-on devices form the infrastructure backbone:
+Three always-on devices form the infrastructure backbone:
 
 ```
 ┌─────────────────┐ ┌─────────────────┐
@@ -23,8 +23,13 @@ Two always-on devices form the infrastructure backbone:
 │ Mac Mini M1 │────▶│ Synology NAS │
 │ (compute) │ │ (storage) │
 └─────────────────┘ └─────────────────┘
-│
-│ Tailscale
+│ ▲
+│ Tailscale │ NFS
+│ ┌──────┴──────────┐
+│ │ Ringtail │
+│ │ NixOS PC │
+│ │ (GPU compute) │
+│ └─────────────────┘
 ▼
 ┌─────────────────┐
 │ Gilbert │
@@ -33,7 +38,8 @@ Two always-on devices form the infrastructure backbone:
 └─────────────────┘
 ```
 
-- **[[indri]]** runs all services (native and containerized)
+- **[[indri]]** runs most services (native and containerized)
+- **[[ringtail]]** runs GPU workloads (Frigate NVR) and related services (MQTT, ntfy)
 - **[[sifaka]]** provides bulk storage and backup targets
 - **[[gilbert]]** is the development workstation
 
@@ -61,11 +67,13 @@ See [[routing]] for the full service URL table and port map.
 
 ## Compute Layer
 
-Services run in two places on [[indri]]:
+Services run across three compute targets:
 
-**Native (Ansible)** — services that need host-level access run directly on macOS, managed via Ansible roles in `ansible/roles/`. See [[indri]] for the full list.
+**Native on indri (Ansible)** — services that need host-level access run directly on macOS, managed via Ansible roles in `ansible/roles/`. See [[indri]] for the full list.
 
-**Kubernetes (ArgoCD)** — most services run in minikube, managed via ArgoCD from `argocd/manifests/`. See [[apps]] for the application registry.
+**Minikube on indri (ArgoCD)** — most services run in minikube, managed via ArgoCD from `argocd/manifests/`. See [[apps]] for the application registry.
 
+**K3s on ringtail (ArgoCD)** — GPU workloads and related services run on [[ringtail]]'s single-node k3s cluster. Frigate NVR uses the RTX 4080 for object detection; Mosquitto and ntfy support its alerting pipeline.
+
 ## Data Flow
 

@@ -1,6 +1,6 @@
 ---
 title: "Plan: Operationalize ReoLink Camera"
-modified: 2026-02-11
+modified: 2026-02-19
 tags:
 - how-to
 - plans
@@ -277,6 +277,10 @@ Camera settings to apply: enable RTSP and ONVIF, set "fluency first" encoding mo
 | `argocd/manifests/prometheus/configmap.yaml` | Prometheus scrape target config |
 | `docs/reference/storage/sifaka.md` | NFS export documentation |
 
+## Post-Completion Update
+
+Frigate, Mosquitto, and ntfy were migrated from indri's minikube to [[ringtail]]'s k3s cluster with RTX 4080 GPU acceleration (PRs #216, #217). The ZMQ Apple Silicon Detector has been retired in favour of ONNX with CUDA execution provider. Object detection now runs on the GPU rather than CPU.
+
 ## Related
 
 - [[add-unifi-pulumi-stack]] — network segmentation (IoT VLAN for camera)

@@ -1,6 +1,6 @@
 ---
 title: Indri
-modified: 2026-02-09
+modified: 2026-02-19
 tags:
 - infrastructure
 - host
@@ -32,7 +32,7 @@ Primary BlumeOps server. Mac Mini M1 (2020).
 - [[caddy]] - Reverse proxy for `*.ops.eblu.me`
 
 **Kubernetes (via minikube):**
-- [[apps|All k8s applications]]
+- [[apps|Most k8s applications]] (Frigate, Mosquitto, ntfy migrated to [[ringtail]] k3s)
 
 **GUI Applications (manual start required):**
 - Docker Desktop - Container runtime for minikube

@@ -63,7 +63,13 @@ Sync order: `1password-connect-ringtail` -> `external-secrets-crds-ringtail` ->
 
 ### Workloads
 
-No k8s workloads currently deployed. K3s is available for future workloads (e.g. Frigate, running nix-built containers).
+| Workload | Namespace | Notes |
+|----------|-----------|-------|
+| [[frigate]] | `frigate` | NVR with GPU-accelerated detection (RTX 4080) |
+| [[frigate]]-notify | `frigate` | MQTT-to-ntfy alert bridge |
+| Mosquitto | `mqtt` | MQTT broker for Frigate events |
+| [[ntfy]] | `ntfy` | Push notification server |
+| nvidia-device-plugin | `nvidia-device-plugin` | Exposes GPU to pods via CDI + nvidia RuntimeClass |
 
 ### Manual Cluster Registration
 

@@ -1,13 +1,13 @@
 ---
 title: Cluster
-modified: 2026-02-07
+modified: 2026-02-19
 tags:
 - kubernetes
 ---
 
 # Kubernetes Cluster
 
-Single-node Minikube cluster running on [[indri]].
+BlumeOps runs two Kubernetes clusters: a Minikube cluster on [[indri]] (most services) and a k3s cluster on [[ringtail]] (GPU workloads, MQTT, notifications). Both are managed by [[argocd]] on indri.
 
 ## Cluster Specifications
 
@@ -33,6 +33,16 @@ Containerd uses [[zot]] as a pull-through cache at `host.minikube.internal:5050`
 
 Mirrors configured: `registry.ops.eblu.me`, `docker.io`, `ghcr.io`, `quay.io`
 
+## K3s on Ringtail
+
+Single-node k3s cluster for workloads requiring amd64 or GPU access. See [[ringtail]] for cluster specs, workload list, and secrets management.
+
+| Property | Value |
+|----------|-------|
+| **Context** | `k3s-ringtail` |
+| **API Server** | `https://ringtail.tail8d86e.ts.net:6443` |
+| **Workloads** | Frigate (GPU), Mosquitto, ntfy, frigate-notify, nvidia-device-plugin |
+
 ## Related
 
 - [[apps|Apps]] - ArgoCD applications

@@ -1,6 +1,6 @@
 ---
 title: Reference
-modified: 2026-02-17
+modified: 2026-02-19
 tags:
 - reference
 ---
@@ -21,7 +21,7 @@ Individual service reference cards with URLs and configuration details.
 | [[caddy]] | Reverse proxy & TLS termination | indri |
 | [[1password]] | Secrets management | cloud + k8s |
 | [[forgejo]] | Git forge & CI/CD | indri |
-| [[frigate]] | Network video recorder | k8s |
+| [[frigate]] | Network video recorder | k8s (ringtail) |
 | [[grafana]] | Dashboards & visualization | k8s |
 | [[immich]] | Photo management | k8s |
 | [[jellyfin]] | Media server | indri |
@@ -29,7 +29,7 @@ Individual service reference cards with URLs and configuration details.
 | [[loki]] | Log aggregation | k8s |
 | [[miniflux]] | RSS feed reader | k8s |
 | [[navidrome]] | Music streaming | k8s |
-| [[ntfy]] | Push notifications | k8s |
+| [[ntfy]] | Push notifications | k8s (ringtail) |
 | [[postgresql]] | Database cluster | k8s |
 | [[prometheus]] | Metrics collection | k8s |
 | [[teslamate]] | Tesla data logger | k8s |

@@ -1,6 +1,6 @@
 ---
 title: Frigate
-modified: 2026-02-17
+modified: 2026-02-19
 tags:
 - service
 - surveillance
@@ -17,7 +17,7 @@ Open-source network video recorder (NVR) with object detection. Runs cloud-free
 | **URL** | https://nvr.ops.eblu.me |
 | **Tailscale URL** | https://nvr.tail8d86e.ts.net |
 | **Namespace** | `frigate` |
-| **Image** | `ghcr.io/blakeblackshear/frigate:0.17.0-rc2-standard-arm64` |
+| **Image** | `ghcr.io/blakeblackshear/frigate:0.17.0-rc2-tensorrt` |
 | **Upstream** | https://github.com/blakeblackshear/frigate |
 | **Manifests** | `argocd/manifests/frigate/` |
 
@@ -30,7 +30,7 @@ ReoLink Camera (GableCam)
 Frigate pod (ringtail k3s)
 ├── go2rtc — RTSP restream proxy
 ├── FFmpeg — stream decoding
-├── detector — GPU-accelerated (RTX 4080, pending migration)
+├── detector — ONNX with CUDA (RTX 4080)
 ├── /media/frigate — NFS recordings (sifaka)
 └── /db — SQLite (local PVC)
 │
@@ -47,7 +47,7 @@ Camera credentials are stored in 1Password and synced via [[external-secrets]] t
 
 ## Detection
 
-Object detection will use GPU-accelerated inference on [[ringtail]]'s RTX 4080 (migration pending). The previous Apple Silicon Detector on [[indri]] has been retired.
+Object detection runs on [[ringtail]]'s RTX 4080 via the ONNX detector with CUDA execution provider. The model is YOLO-NAS-S (`yolo_nas_s.onnx`). The previous Apple Silicon Detector on [[indri]] has been retired.
 
 Two zones are configured: `driveway_entrance` (triggers review alerts for person/car) and `driveway` (triggers review detections).
 
@@ -66,7 +66,7 @@ Two zones are configured: `driveway_entrance` (triggers review alerts for person
 |-------|---------|------|
 | `/media/frigate` | NFS PV on [[sifaka]] (`/volume1/frigate`) | 2 Ti |
 | `/db` | Local PVC (`frigate-database`) | SQLite |
-| `/dev/shm` | Memory-backed `emptyDir` | 256 Mi |
+| `/dev/shm` | Memory-backed `emptyDir` | 512 Mi |
 
 ## Alerting (frigate-notify)
 

@@ -91,6 +91,14 @@ check_service "k3s" "ssh ringtail 'KUBECONFIG=/etc/rancher/k3s/k3s.yaml k3s kube
 check_service "k3s-apiserver (remote)" "kubectl --context=k3s-ringtail get --raw /healthz"
 check_service "forgejo-runner" "ssh ringtail 'systemctl is-active gitea-runner-nix_container_builder.service'"
 
+echo ""
+echo "Ringtail k3s pods:"
+check_service "mosquitto" "kubectl --context=k3s-ringtail -n mqtt get pods -l app=mosquitto -o jsonpath='{.items[0].status.phase}' | grep -q Running"
+check_service "ntfy" "kubectl --context=k3s-ringtail -n ntfy get pods -l app=ntfy -o jsonpath='{.items[0].status.phase}' | grep -q Running"
+check_service "frigate" "kubectl --context=k3s-ringtail -n frigate get pods -l app=frigate -o jsonpath='{.items[0].status.phase}' | grep -q Running"
+check_service "frigate-notify" "kubectl --context=k3s-ringtail -n frigate get pods -l app=frigate-notify -o jsonpath='{.items[0].status.phase}' | grep -q Running"
+check_service "nvidia-device-plugin" "kubectl --context=k3s-ringtail -n nvidia-device-plugin get pods -l app=nvidia-device-plugin -o jsonpath='{.items[0].status.phase}' | grep -q Running"
+
 echo ""
 echo "Public services (via Fly.io):"
 check_http "Docs (public)" "https://docs.eblu.me/"
@@ -102,17 +110,13 @@ echo "Database:"
 check_service "PostgreSQL (k8s)" "pg_isready -h pg.ops.eblu.me -p 5432"
 
 echo ""
-echo "Kubernetes pods:"
+echo "Indri minikube pods:"
 check_service "prometheus-0" "kubectl --context=minikube-indri -n monitoring get pod prometheus-0 -o jsonpath='{.status.phase}' | grep -q Running"
 check_service "loki-0" "kubectl --context=minikube-indri -n monitoring get pod loki-0 -o jsonpath='{.status.phase}' | grep -q Running"
 check_service "grafana" "kubectl --context=minikube-indri -n monitoring get pods -l app.kubernetes.io/name=grafana -o jsonpath='{.items[0].status.phase}' | grep -q Running"
 check_service "miniflux" "kubectl --context=minikube-indri -n miniflux get pods -l app=miniflux -o jsonpath='{.items[0].status.phase}' | grep -q Running"
 check_service "teslamate" "kubectl --context=minikube-indri -n teslamate get pods -l app=teslamate -o jsonpath='{.items[0].status.phase}' | grep -q Running"
 check_service "blumeops-pg" "kubectl --context=minikube-indri -n databases get pods -l cnpg.io/cluster=blumeops-pg -o jsonpath='{.items[0].status.phase}' | grep -q Running"
-check_service "mosquitto" "kubectl --context=minikube-indri -n mqtt get pods -l app=mosquitto -o jsonpath='{.items[0].status.phase}' | grep -q Running"
-check_service "ntfy" "kubectl --context=minikube-indri -n ntfy get pods -l app=ntfy -o jsonpath='{.items[0].status.phase}' | grep -q Running"
-check_service "frigate" "kubectl --context=minikube-indri -n frigate get pods -l app=frigate -o jsonpath='{.items[0].status.phase}' | grep -q Running"
-check_service "frigate-notify" "kubectl --context=minikube-indri -n frigate get pods -l app=frigate-notify -o jsonpath='{.items[0].status.phase}' | grep -q Running"
 
 echo ""
 echo "ArgoCD app sync status:"
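
Review note: the five Ringtail pod checks in services-check repeat one kubectl pipeline, with only the name, namespace, and label selector changing. As a minimal sketch (not part of this commit), the command string could be built by a small helper so that the context appears in one place. `pod_running_cmd` and `ringtail_pod_checks` are hypothetical names. `check_service` is assumed to be the script's existing function taking a label and a command string; when it is not defined, the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper (not in the commit): build the command string that
# services-check passes to check_service for a label-selected pod.
# $1 = kubectl context, $2 = namespace, $3 = label selector
pod_running_cmd() {
    printf "kubectl --context=%s -n %s get pods -l %s -o jsonpath='{.items[0].status.phase}' | grep -q Running" "$1" "$2" "$3"
}

# Hypothetical wrapper: loop over "name namespace selector" triples
# copied from the Ringtail checks in the diff.
ringtail_pod_checks() {
    while read -r name namespace selector; do
        cmd=$(pod_running_cmd k3s-ringtail "$namespace" "$selector")
        if command -v check_service >/dev/null 2>&1; then
            # Assumed signature: check_service <label> <command string>.
            # stdin is redirected so the check cannot consume the list below.
            check_service "$name" "$cmd" </dev/null
        else
            # Outside services-check, just show the generated command.
            printf '%s: %s\n' "$name" "$cmd"
        fi
    done <<'EOF'
mosquitto mqtt app=mosquitto
ntfy ntfy app=ntfy
frigate frigate app=frigate
frigate-notify frigate app=frigate-notify
nvidia-device-plugin nvidia-device-plugin app=nvidia-device-plugin
EOF
}

ringtail_pod_checks
```

With this shape, a future migration would edit one context argument and one list rather than every check line. Whether that is worth the indirection in a short status script is a judgment call.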