The 27B Q4_K_M model needs ~7.3 GiB of system RAM for its CPU-offloaded layers, but only 6.8 GiB was available within the 22Gi cgroup. Bumping the limit to 24Gi and enabling flash attention (which reduces KV cache memory) should provide enough headroom.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
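
The headroom reasoning in the commit message can be checked with a little arithmetic. The sketch below uses only the figures quoted there and assumes that everything else inside the cgroup keeps using the same amount of memory after the bump; that assumption is not stated in the commit. It also ignores any KV cache savings from flash attention, since the commit does not quantify them. The function and variable names are illustrative.

```python
# Figures quoted in the commit message, in GiB (Kubernetes "Gi" means GiB).
NEEDED_GIB = 7.3       # system RAM for the CPU-offloaded layers
AVAILABLE_GIB = 6.8    # what was free inside the old cgroup
OLD_LIMIT_GIB = 22.0
NEW_LIMIT_GIB = 24.0


def headroom_gib(limit_gib: float, other_usage_gib: float, needed_gib: float) -> float:
    """Memory left over after other usage and the offloaded layers."""
    return limit_gib - other_usage_gib - needed_gib


# Assumption: the rest of the cgroup's usage stays constant across the change.
other_usage = OLD_LIMIT_GIB - AVAILABLE_GIB

before = headroom_gib(OLD_LIMIT_GIB, other_usage, NEEDED_GIB)
after = headroom_gib(NEW_LIMIT_GIB, other_usage, NEEDED_GIB)

print(f"before: {before:+.1f} GiB, after: {after:+.1f} GiB")
# prints "before: -0.5 GiB, after: +1.5 GiB"
```

Under that assumption the old limit was about 0.5 GiB short, and the new limit leaves roughly 1.5 GiB spare before counting whatever flash attention saves.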