blumeops

History

Erich Blume c26026f4e9 Bump Ollama memory to 24Gi and enable flash attention

The 27B Q4_K_M model needs ~7.3 GiB system RAM for CPU-offloaded layers
but only 6.8 GiB was available within the 22Gi cgroup. Bumping to 24Gi
and enabling flash attention (reduces KV cache memory) should provide
enough headroom.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-11 20:33:22 -07:00
apps	Remove unused Mosquitto MQTT broker from ringtail	2026-03-11 18:37:31 -07:00
manifests	Bump Ollama memory to 24Gi and enable flash attention	2026-03-11 20:33:22 -07:00
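
The memory reasoning in commit c26026f4e9 can be sketched as simple arithmetic. This is only an illustration built from the figures quoted in the commit message; the function name `headroom` is hypothetical, and it assumes that everything else in the cgroup keeps using the same amount of memory after the limit is raised. The saving from flash attention is not quantified in the commit message, so it is left out here.

```python
# Figures quoted in the commit message, in GiB.
NEEDED_FOR_CPU_LAYERS = 7.3  # system RAM needed by the CPU-offloaded layers
AVAILABLE_AT_22GI = 6.8      # what was left inside the 22Gi cgroup
OLD_LIMIT = 22.0
NEW_LIMIT = 24.0


def headroom(limit_gib, other_usage_gib, needed_gib):
    """Spare memory once the CPU-offloaded layers are loaded.

    A negative result means a shortfall. (Hypothetical helper.)
    """
    return limit_gib - other_usage_gib - needed_gib


# Assumption: memory used by everything else in the cgroup stays constant.
other_usage = OLD_LIMIT - AVAILABLE_AT_22GI

before = headroom(OLD_LIMIT, other_usage, NEEDED_FOR_CPU_LAYERS)
after = headroom(NEW_LIMIT, other_usage, NEEDED_FOR_CPU_LAYERS)

print(f"headroom at 22Gi: {before:+.1f} GiB")
print(f"headroom at 24Gi: {after:+.1f} GiB")
```

Under that assumption the old limit leaves a shortfall of about half a GiB, while the new limit leaves roughly 1.5 GiB spare before any KV cache reduction from flash attention is counted.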
