Deploy Ollama LLM server on ringtail #277

Merged
eblume merged 5 commits from feature/ollama-ringtail into main 2026-03-02 20:39:52 -08:00
Owner

Summary

  • Deploy Ollama as a new ArgoCD-managed service on ringtail's k3s cluster with GPU acceleration
  • Declarative model management via models.txt + sidecar sync script (mirrors kiwix torrent pattern)
  • Initial models: qwen2.5:14b, deepseek-r1:14b, phi4:14b, gemma3:12b
  • hostPath PV on /mnt/storage1/ollama for fast local model storage (200Gi)
  • Tailscale ingress at ollama.ops.eblu.me for API access from tailnet
  • Enable GPU time-slicing (replicas: 2) on nvidia-device-plugin so Frigate and Ollama share the RTX 4080
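The time-slicing piece follows the upstream NVIDIA k8s-device-plugin's documented config format; where exactly this lives in the repo's manifests is an assumption, but the fragment looks like:

```yaml
# Sketch of the nvidia-device-plugin time-slicing config (upstream
# documented format). Advertises the single RTX 4080 as two
# schedulable nvidia.com/gpu resources so Frigate and Ollama can
# each request one.
version: v1
sharing:
  timeSlicing:
    resources:
      - name: nvidia.com/gpu
        replicas: 2
```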

Deployment and Testing

  • Deploy nvidia-device-plugin changes first: argocd app sync nvidia-device-plugin
  • Verify GPU time-slicing: kubectl describe node ringtail --context=k3s-ringtail shows nvidia.com/gpu: 2
  • Sync apps app with --revision feature/ollama-ringtail
  • Set ollama app to branch: argocd app set ollama --revision feature/ollama-ringtail && argocd app sync ollama
  • Verify model-sync sidecar pulls models: kubectl logs -n ollama deploy/ollama -c model-sync --context=k3s-ringtail
  • Test API: curl https://ollama.ops.eblu.me/api/tags
  • Test inference: curl https://ollama.ops.eblu.me/api/generate -d '{"model":"qwen2.5:14b","prompt":"Hello"}'
  • Verify Frigate still works after GPU sharing change
  • After merge: argocd app set ollama --revision main && argocd app sync ollama
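The `/api/tags` check above can be scripted against the declared model list. A minimal sketch, where the `check_models` helper name is illustrative and the `{"models":[{"name":...}]}` shape is Ollama's documented tags response (the substring match assumes compact JSON):

```shell
#!/bin/sh
# Hypothetical helper: given the JSON from /api/tags on stdin, report
# whether each model listed in a models.txt-style file is present.
check_models() {
  models_file="$1"
  tags_json="$(cat)"
  # Strip comments and blank lines from the model list, then test each
  # tag against the "name" fields in the response.
  sed 's/#.*//' "$models_file" | while read -r model; do
    [ -n "$model" ] || continue
    case "$tags_json" in
      *"\"name\":\"$model\""*) echo "ok $model" ;;
      *) echo "MISSING $model" ;;
    esac
  done
}

# Example: curl -s https://ollama.ops.eblu.me/api/tags | check_models models.txt
```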
Add Ollama as a new ArgoCD-managed service on ringtail's k3s cluster:
- Deployment with main ollama container and model-sync sidecar
- Declarative model list (qwen2.5:14b, deepseek-r1:14b, phi4:14b, gemma3:12b)
- hostPath PV on /mnt/storage1/ollama for fast local model storage
- Tailscale ingress at ollama.ops.eblu.me
- Enable GPU time-slicing (replicas: 2) on nvidia-device-plugin so
  Frigate and Ollama can share the RTX 4080

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -0,0 +81,4 @@
- name: sync-script
  configMap:
    name: ollama-sync-script
    defaultMode: 493
Author (Owner)

this... can't be right. there must be a way to use octal or u=,g=,o= or something. Try a string?
eblume marked this conversation as resolved
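For the record, the value is correct even if it reads oddly: `defaultMode` is a plain integer in the Kubernetes API, and 493 decimal is 0755 octal (rwxr-xr-x). In YAML manifests it can be written in octal, since Kubernetes' YAML 1.1 parser reads `0755` as 493; JSON has no octal literals, so serialized objects always show the decimal form:

```yaml
volumes:
  - name: sync-script
    configMap:
      name: ollama-sync-script
      # 0755 (octal) == 493 (decimal), making the sync script
      # executable inside the sidecar.
      defaultMode: 0755
```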
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ollama/ollama container image doesn't include curl. Use `ollama list`
and `ollama pull` commands directly, which are always available.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
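A minimal sketch of that sync loop, assuming one model tag per line in `models.txt` with `#` comments allowed; the `sync_models` name and the `OLLAMA_BIN` override are illustrative (the override lets the loop be dry-run):

```shell
#!/bin/sh
# Hypothetical model-sync loop: pull every model tag listed in a
# models.txt-style file using the ollama CLI directly (no curl in
# the ollama/ollama image).
sync_models() {
  models_file="$1"
  # Drop comments and blank lines, then pull each remaining tag.
  sed 's/#.*//' "$models_file" | while read -r model; do
    [ -n "$model" ] || continue
    "${OLLAMA_BIN:-ollama}" pull "$model"
  done
}

# Example: sync_models /config/models.txt
```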
Proxies to the Tailscale ingress at ollama.tail8d86e.ts.net.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
eblume merged commit 31d925814f into main 2026-03-02 20:39:52 -08:00
Reference
eblume/blumeops!277