Resources were under wrong Helm value keys (server.resources, machine-learning.resources) and never applied to pods. Move to correct bjw-s chart paths (*.controllers.main.containers.main.resources). Increase liveness/readiness probe timeouts from 1s to 5s to prevent kubelet from killing healthy-but-busy pods during ML inference load. Remove CPU limits (keep requests only) to avoid throttling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| apps | ||
| manifests | ||