## Summary

- Deploy Ollama as a new ArgoCD-managed service on ringtail's k3s cluster with GPU acceleration
- Declarative model management via `models.txt` + sidecar sync script (mirrors kiwix torrent pattern)
- Initial models: `qwen2.5:14b`, `deepseek-r1:14b`, `phi4:14b`, `gemma3:12b`
- hostPath PV on `/mnt/storage1/ollama` for fast local model storage (200Gi)
- Tailscale ingress at `ollama.ops.eblu.me` for API access from tailnet
- Enable GPU time-slicing (`replicas: 2`) on nvidia-device-plugin so Frigate and Ollama share the RTX 4080

## Deployment and Testing

- [ ] Deploy nvidia-device-plugin changes first: `argocd app sync nvidia-device-plugin`
- [ ] Verify GPU time-slicing: `kubectl describe node ringtail --context=k3s-ringtail` shows `nvidia.com/gpu: 2`
- [ ] Sync `apps` app with `--revision feature/ollama-ringtail`
- [ ] Set ollama app to branch: `argocd app set ollama --revision feature/ollama-ringtail && argocd app sync ollama`
- [ ] Verify model-sync sidecar pulls models: `kubectl logs -n ollama deploy/ollama -c model-sync --context=k3s-ringtail`
- [ ] Test API: `curl https://ollama.ops.eblu.me/api/tags`
- [ ] Test inference: `curl https://ollama.ops.eblu.me/api/generate -d '{"model":"qwen2.5:14b","prompt":"Hello"}'`
- [ ] Verify Frigate still works after GPU sharing change
- [ ] After merge: `argocd app set ollama --revision main && argocd app sync ollama`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/277
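
For context on the GPU time-slicing item in the summary, the sketch below shows roughly what a sharing config for the NVIDIA k8s-device-plugin looks like. It assumes the upstream plugin's config file format; how this repo actually wires the setting into the `nvidia-device-plugin` app (Helm values, ConfigMap name, and so on) is not shown in this change and may differ.

```yaml
# Illustrative sketch only: upstream k8s-device-plugin config file format.
# The surrounding ConfigMap / Helm values layout in this repo is an assumption.
version: v1
sharing:
  timeSlicing:
    resources:
      - name: nvidia.com/gpu
        # Advertise the single physical GPU as two schedulable
        # nvidia.com/gpu resources so two pods can each request one.
        replicas: 2
```

With `replicas: 2`, the node should report `nvidia.com/gpu: 2`, which is what the verification step in the checklist looks for. Time-slicing shares compute time only and does not partition GPU memory, so Frigate and Ollama still compete for the same VRAM.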
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ollama-tailscale
  namespace: ollama
  annotations:
    tailscale.com/proxy-class: "default"
    tailscale.com/proxy-group: "ingress"
    gethomepage.dev/enabled: "true"
    gethomepage.dev/name: "Ollama"
    gethomepage.dev/group: "AI"
    gethomepage.dev/icon: "ollama.png"
    gethomepage.dev/description: "LLM inference server"
    gethomepage.dev/href: "https://ollama.ops.eblu.me"
    gethomepage.dev/pod-selector: "app=ollama"
spec:
  ingressClassName: tailscale
  defaultBackend:
    service:
      name: ollama
      port:
        number: 11434
  tls:
    - hosts:
        - ollama