blumeops/argocd
Erich Blume c26026f4e9 Bump Ollama memory to 24Gi and enable flash attention
The 27B Q4_K_M model needs ~7.3 GiB of system RAM for CPU-offloaded
layers, but only 6.8 GiB was available within the 22Gi cgroup limit.
Bumping the limit to 24Gi and enabling flash attention (which reduces
KV cache memory) should provide enough headroom.
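
As an illustrative sketch only (the actual manifest is not shown in this listing), a change like the one described might look roughly like the fragment below in a Kubernetes container spec. The container name and surrounding structure are assumptions. `OLLAMA_FLASH_ATTENTION` is Ollama's environment variable for enabling flash attention, and the 22Gi and 24Gi figures come from the commit message.

```yaml
# Hypothetical fragment; not copied from the repository's manifests.
containers:
  - name: ollama                      # assumed container name
    env:
      - name: OLLAMA_FLASH_ATTENTION
        value: "1"                    # enable flash attention to reduce KV cache memory
    resources:
      limits:
        memory: 24Gi                  # raised from 22Gi, per the commit message
```

By the commit's own numbers the shortfall was about 0.5 GiB (7.3 GiB needed against 6.8 GiB available), so a 2Gi increase would leave roughly 1.5 GiB of headroom if nothing else in the cgroup changes.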

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 20:33:22 -07:00
apps Remove unused Mosquitto MQTT broker from ringtail 2026-03-11 18:37:31 -07:00
manifests Bump Ollama memory to 24Gi and enable flash attention 2026-03-11 20:33:22 -07:00