Deploy Ollama LLM server on ringtail #277

Merged
eblume merged 5 commits from feature/ollama-ringtail into main 2026-03-02 20:39:52 -08:00

5 commits

Author SHA1 Message Date
bc5aa65491 Add ollama.ops.eblu.me to Caddy reverse proxy
Proxies to the Tailscale ingress at ollama.tail8d86e.ts.net.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:37:12 -08:00
dd678e7454 Use ollama CLI instead of curl in sync script
The ollama/ollama container image doesn't include curl. Use `ollama list`
and `ollama pull` commands directly, which are always available.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:20:19 -08:00
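
The sync logic this commit describes (compare the declarative model list against what `ollama list` reports, then `ollama pull` anything missing) can be sketched as follows. This is a minimal illustration in Python, not the script from this PR: the function names `installed_models` and `models_to_pull` are hypothetical, the sample IDs and sizes are made-up placeholders, and the sketch assumes `ollama list` prints a header row followed by one row per model with the model name in the first column.

```python
import subprocess


def installed_models(list_output: str) -> set[str]:
    """Return the names in the first column of `ollama list` output.

    Assumes the first line is a header row (NAME, ID, SIZE, MODIFIED).
    """
    lines = list_output.strip().splitlines()
    names = set()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def models_to_pull(desired: list[str], list_output: str) -> list[str]:
    """Return the desired models that are not installed, in declared order."""
    have = installed_models(list_output)
    return [name for name in desired if name not in have]


def sync(desired: list[str]) -> None:
    """Pull every missing model using the ollama CLI (no curl needed)."""
    listing = subprocess.run(
        ["ollama", "list"], capture_output=True, text=True, check=True
    ).stdout
    for name in models_to_pull(desired, listing):
        subprocess.run(["ollama", "pull", name], check=True)


# Model list taken from the initial commit in this PR.
DESIRED = ["qwen2.5:14b", "deepseek-r1:14b", "phi4:14b", "gemma3:12b"]

# Placeholder listing for illustration only; IDs and sizes are invented.
SAMPLE = """NAME            ID              SIZE      MODIFIED
qwen2.5:14b     aaaaaaaaaaaa    9.0 GB    2 days ago
phi4:14b        bbbbbbbbbbbb    9.1 GB    3 hours ago
"""

if __name__ == "__main__":
    print(models_to_pull(DESIRED, SAMPLE))
```

Keeping the parsing separate from the `subprocess` calls means the comparison can be checked without a running Ollama server; only `sync` needs the `ollama` binary on the PATH.
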
07376cc970 Fix nvidia-device-plugin config flag: --config-file not --config
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:10:19 -08:00
9cb235ab8e Use octal 0755 for defaultMode with yamllint inline disable
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:05:27 -08:00
0d1c2eb81a Deploy Ollama LLM server on ringtail with GPU time-slicing
Add Ollama as a new ArgoCD-managed service on ringtail's k3s cluster:
- Deployment with main ollama container and model-sync sidecar
- Declarative model list (qwen2.5:14b, deepseek-r1:14b, phi4:14b, gemma3:12b)
- hostPath PV on /mnt/storage1/ollama for fast local model storage
- Tailscale ingress at ollama.ops.eblu.me
- Enable GPU time-slicing (replicas: 2) on nvidia-device-plugin so
  Frigate and Ollama can share the RTX 4080

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:01:45 -08:00