blumeops/docs/reference/kubernetes/cluster.md
Erich Blume bb55fa9566 Recurring review sweep: 4 doc cards + nvidia-device-plugin v0.19.2 (#366)
Knocks out the two daily recurring review tasks (doc review + service review) in one PR.

## Doc review (4 never-reviewed reference cards, `last-reviewed: 2026-06-04`)
- **cluster.md** — Kubernetes version v1.34.0 → **v1.35.0**; refreshed the stale ringtail workload list and noted the in-progress minikube→k3s migration (points to `[[ringtail]]` as the canonical list).
- **ntfy.md / tempo.md / alloy.md** — corrected image references: these are now **locally-built `registry.ops.eblu.me/blumeops/*` nix containers** (ntfy v2.19.2, tempo v2.10.3, alloy-k8s v1.16.0), not upstream Docker Hub. Fly.io alloy binary bumped to v1.16.1.

## Service review
- **nvidia-device-plugin** (ringtail GPU): v0.19.0 → **v0.19.2**. Upstream patch releases — CDI/Tegra fixes + dependency bumps, no breaking changes for our manifest-based CDI + RuntimeClass setup (the service-account change in the notes is helm-only).

## Not in this PR (need container rebuilds, deferred)
The other stale services are locally-built nix images, so upgrading them is a forge-runner rebuild rather than a clean tag bump — left untouched (not date-bumped, so they resurface): **prometheus** (v3.10.0→v3.12.0), **loki** (3.6.7→3.7.2), **kube-state-metrics**, **homepage**. Happy to do these as a follow-up rebuild PR.

## Deploy / verify
Not yet deployed — `nvidia-device-plugin` still points at `main`. After review:
```
argocd app set nvidia-device-plugin --revision reviews-jun4 && argocd app sync nvidia-device-plugin
# after merge:
argocd app set nvidia-device-plugin --revision main && argocd app sync nvidia-device-plugin
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #366
2026-06-04 13:37:02 -07:00


| title | modified | last-reviewed | tags |
|---|---|---|---|
| Cluster | 2026-06-04 | 2026-06-04 | kubernetes |

Kubernetes Cluster

BlumeOps runs two Kubernetes clusters: a minikube cluster on indri (most services) and a k3s cluster on ringtail (GPU workloads, notifications). Both are managed by ArgoCD running on indri.

Cluster Specifications

| Property | Value |
|---|---|
| Driver | docker |
| Container Runtime | docker |
| Kubernetes Version | v1.35.0 |
| CPUs | 6 |
| Memory | 11GB |
| Disk | 200GB |
| API Server | https://k8s.tail8d86e.ts.net |
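
A minimal sketch of a `minikube start` invocation that would match the specifications above. The flags are standard minikube options, but the command actually used on indri is not recorded here, so treat the exact flag set as an assumption. The script only prints the command for review.

```shell
#!/usr/bin/env bash
# Sketch only: assembles a `minikube start` command from the spec table.
# Assumption: the real cluster may have been created with additional flags.
minikube_start_args() {
  printf '%s\n' \
    start \
    --driver=docker \
    --container-runtime=docker \
    --kubernetes-version=v1.35.0 \
    --cpus=6 \
    --memory=11g \
    --disk-size=200g
}

# Print the command instead of running it; drop the `echo` to execute.
echo minikube $(minikube_start_args)
```

Printing rather than executing keeps the sketch safe to run on a machine that already hosts a cluster.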

Prerequisites: Docker Desktop with at least 12GB memory allocated.

Volume Mounting

Pods mount NFS directly from sifaka. Docker NATs outbound traffic through indri's LAN IP (192.168.1.50), which allows access to sifaka's NFS exports.
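
As an illustration of the direct NFS mount described above, the sketch below prints a throwaway Pod manifest that uses a Kubernetes `nfs` volume. The pod name, the `busybox:1.36` image, and the export path `/volume1/example` are hypothetical placeholders, and it assumes the name `sifaka` resolves from the cluster node; substitute an IP address if it does not.

```shell
#!/usr/bin/env bash
# Sketch only: emits a Pod manifest that mounts an NFS export and lists it.
# Assumptions: server name and export path are placeholders, not real exports.
nfs_pod_manifest() {
  local server="$1" export_path="$2"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: nfs-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: shell
      image: busybox:1.36
      command: ["ls", "/mnt/nfs"]
      volumeMounts:
        - name: data
          mountPath: /mnt/nfs
  volumes:
    - name: data
      nfs:
        server: ${server}
        path: ${export_path}
EOF
}

# Print the manifest; pipe it to `kubectl apply -f -` to try it on a cluster.
nfs_pod_manifest sifaka /volume1/example
```

The kubelet on the node performs the mount, so the node, not the pod, is what needs a network path to the NFS server.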

Registry Mirror

Containerd uses zot as a pull-through cache at host.minikube.internal:5050.

Mirrors configured: registry.ops.eblu.me, docker.io, ghcr.io, quay.io
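
One way such mirrors can be expressed is containerd's per-registry `hosts.toml` files. The sketch below prints one file per mirrored registry. It assumes the `hosts.toml` mechanism is what is in use here and that the zot endpoint is served over plain `http`; neither is confirmed by this card.

```shell
#!/usr/bin/env bash
# Sketch only: prints a containerd hosts.toml for each mirrored registry.
# Assumptions: hosts.toml-style configuration, and an http:// mirror URL.
mirror_hosts_toml() {
  local registry="$1" mirror="$2"
  cat <<EOF
server = "https://${registry}"

[host."${mirror}"]
  capabilities = ["pull", "resolve"]
EOF
}

for registry in registry.ops.eblu.me docker.io ghcr.io quay.io; do
  # Conventional location for containerd's per-registry configuration.
  echo "# /etc/containerd/certs.d/${registry}/hosts.toml"
  mirror_hosts_toml "$registry" "http://host.minikube.internal:5050"
  echo
done
```

With this layout containerd tries the mirror first and falls back to the `server` upstream if the mirror cannot serve the image.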

K3s on Ringtail

Single-node k3s cluster for workloads requiring amd64 or GPU access. See ringtail for cluster specs, workload list, and secrets management.

| Property | Value |
|---|---|
| Context | k3s-ringtail |
| API Server | https://ringtail.tail8d86e.ts.net:6443 |
| Workloads | GPU workloads (Frigate, Ollama), notifications (ntfy, frigate-notify), authentik, and services migrated off indri minikube (Immich, Mealie, Paperless, TeslaMate). See ringtail for the authoritative list. |

Services are being progressively migrated from indri's minikube to ringtail's k3s; the split above reflects an in-progress state, not a fixed boundary.
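
Because workloads move between the two clusters, it helps to name the kubeconfig context explicitly on every command. The sketch below prints a node check for each cluster. The context name `k3s-ringtail` comes from the table above; `minikube` is assumed to be the context name for indri, since that is minikube's default profile name.

```shell
#!/usr/bin/env bash
# Sketch only: prints a per-cluster node check with an explicit context.
# Assumption: the indri context is named "minikube" (minikube's default).
node_check_cmd() {
  local context="$1"
  echo "kubectl --context ${context} get nodes -o wide"
}

for context in minikube k3s-ringtail; do
  node_check_cmd "$context"
done
```

Piping the output to `sh` runs the checks; passing `--context` explicitly avoids depending on whichever context happens to be current.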

- apps - ArgoCD applications
- argocd - GitOps deployment
- zot - Registry mirror