blumeops/argocd/manifests/ollama/models.txt at c86b5d777254154024b47799a84a5193d646c184 - eblume/blumeops


Erich Blume 6d4929a66c Add qwen3.5:27b to Ollama and bump memory limit to 22Gi

The 27B Q4_K_M model is ~17 GB, exceeding the 16 GB VRAM on the RTX 4080
by ~1 GB. Ollama will offload a few layers to CPU RAM, so the pod memory
limit needs headroom beyond the previous 16Gi.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-11 18:55:51 -07:00

8 lines

148 B


 # Models to pull from Ollama registry
 # One model per line. Comments with #.
 qwen2.5:14b
 deepseek-r1:14b
 phi4:14b
 gemma3:12b
 qwen3.5:9b
 qwen3.5:27b
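
The file's header comments describe a simple format: one model per line, with `#` marking comments. A minimal Python sketch of a reader for that format follows. The function name `parse_model_list` and the `SAMPLE` constant are hypothetical, and how this repository actually consumes `models.txt` is not shown here. Stripping trailing `#` comments on a model line is also an assumption, since the file itself only uses whole-line comments.

```python
def parse_model_list(text: str) -> list[str]:
    """Return the model tags listed in a models.txt-style string.

    Assumed rules (taken from the file's own header comments):
    one model per line, '#' starts a comment, blank lines are skipped.
    """
    models = []
    for raw_line in text.splitlines():
        # Drop everything from the first '#' onward, then trim whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if line:
            models.append(line)
    return models


# Copy of the file contents shown above, used here only as sample input.
SAMPLE = """\
# Models to pull from Ollama registry
# One model per line. Comments with #.
qwen2.5:14b
deepseek-r1:14b
phi4:14b
gemma3:12b
qwen3.5:9b
qwen3.5:27b
"""

if __name__ == "__main__":
    # Print the pull command for each model instead of running it,
    # so the sketch has no side effects.
    for model in parse_model_list(SAMPLE):
        print(f"ollama pull {model}")
```

Printing the commands rather than executing them keeps the example safe to run anywhere; a real job would presumably invoke `ollama pull` for each entry.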