blumeops/docs/changelog.d
Erich Blume 41dfae1f80 Add CNI conflict troubleshooting to restart-indri how-to (#139)
## Summary
- Documents a troubleshooting procedure for broken pod networking after unclean shutdown
- During minikube recovery, a stale `1-k8s.conflist` CNI config can override kindnet's `10-kindnet.conflist`. New pods then use bridge+firewall networking instead of kindnet's ptp, which breaks pod-to-pod communication
- Covers symptoms (DNS failures, liveness probe timeouts), diagnosis steps, and the fix
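
Why the stale file wins can be sketched in a few lines. This is a minimal illustration, assuming the container runtime loads the first CNI config in lexicographic filename order (common behavior, but not confirmed here for this cluster); the function name `select_cni_config` and the extension list are hypothetical and not part of the how-to.

```python
# Illustrative model only: assumes "first filename in sorted order wins".
# The extension list below is an assumption, not taken from the how-to.
CNI_EXTENSIONS = (".conf", ".conflist", ".json")


def select_cni_config(filenames):
    """Return the config name that sorts first, or None if there is none."""
    candidates = sorted(
        name for name in filenames if name.endswith(CNI_EXTENSIONS)
    )
    return candidates[0] if candidates else None


if __name__ == "__main__":
    healthy = ["10-kindnet.conflist"]
    conflicted = ["10-kindnet.conflist", "1-k8s.conflist"]

    # Only kindnet's config is present, so it is selected.
    print(select_cni_config(healthy))
    # "-" sorts before "0", so "1-k8s.conflist" precedes "10-kindnet.conflist".
    print(select_cni_config(conflicted))
```

Under that assumption, the presence of `1-k8s.conflist` alone is enough to shadow kindnet's config for newly created pods, which matches the symptom described above.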

## Context
Encountered this during the 2026-02-10 power outage. Immich, kiwix, and transmission were all crash-looping for ~8 hours due to the CNI conflict. The minikube Ansible role's clean boot detection has been improved (#137), so this may not recur, but the troubleshooting guide is valuable if it does.

## Test plan
- [x] Documentation only — no code changes
- [x] Pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/139
2026-02-10 07:24:42 -08:00
.gitkeep Add towncrier changelog system (#86) 2026-02-03 11:48:13 -08:00
doc-cni-conflict-troubleshooting.doc.md Add CNI conflict troubleshooting to restart-indri how-to (#139) 2026-02-10 07:24:42 -08:00
docs-power-infrastructure.doc.md Add power infrastructure reference card (#138) 2026-02-09 23:03:13 -08:00
feature-fly-proxy-error-page.feature.md Serve friendly error page when Fly.io proxy upstreams are unreachable (#133) 2026-02-09 12:01:24 -08:00
feature-op-backup.feature.md Add op-backup mise task for encrypted 1Password disaster recovery (#136) 2026-02-09 20:37:39 -08:00
feature-sifaka-ops-observability.feature.md Operations and observability for sifaka NAS (#135) 2026-02-09 17:44:05 -08:00
fix-deploy-healthcheck-race.bugfix.md Fix 502 errors during Fly.io proxy deploys (#131) 2026-02-09 11:07:36 -08:00
fix-minikube-status-check.bugfix.md Fix minikube role skipping start when kubelet/apiserver are stopped (#137) 2026-02-09 23:03:01 -08:00
fix-real-client-ip-logging.bugfix.md Log real client IPs via Fly-Client-IP header (#130) 2026-02-09 11:02:06 -08:00
fix-zero-downtime-deploy.infra.md Zero-downtime Fly.io deploys (#132) 2026-02-09 11:34:19 -08:00