blumeops

Erich Blume	6d4929a66c	2026-03-11 18:55:51 -07:00

Add qwen3.5:27b to Ollama and bump memory limit to 22Gi

The 27B Q4_K_M model is ~17 GB, exceeding the 16 GB VRAM on the RTX 4080 by ~1 GB. Ollama will offload a few layers to CPU RAM, so the pod memory limit needs headroom beyond the previous 16Gi.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

deployment.yaml	Add qwen3.5:27b to Ollama and bump memory limit to 22Gi	2026-03-11 18:55:51 -07:00
ingress-tailscale.yaml	Deploy Ollama LLM server on ringtail (#277)	2026-03-02 20:39:51 -08:00
kustomization.yaml	Remove ollama LAN NodePort service	2026-03-03 10:00:05 -08:00
models.txt	Add qwen3.5:27b to Ollama and bump memory limit to 22Gi	2026-03-11 18:55:51 -07:00
pv-hostpath.yaml	Deploy Ollama LLM server on ringtail (#277)	2026-03-02 20:39:51 -08:00
pvc.yaml	Deploy Ollama LLM server on ringtail (#277)	2026-03-02 20:39:51 -08:00
service.yaml	Deploy Ollama LLM server on ringtail (#277)	2026-03-02 20:39:51 -08:00
sync-models.sh	Deploy Ollama LLM server on ringtail (#277)	2026-03-02 20:39:51 -08:00