Runbook: Pod Not Ready
Alert name: PodNotReady
A Kubernetes pod has been in a not-ready state for 5+ minutes.
Diagnostic Steps
-
Identify the pod from the alert labels (
pod,namespace):kubectl describe pod <pod> -n <namespace> --context=minikube-indri -
Check events — look for scheduling failures, image pull errors, or probe failures:
kubectl get events -n <namespace> --context=minikube-indri --sort-by='.lastTimestamp' | tail -20 -
Check logs:
kubectl logs <pod> -n <namespace> --context=minikube-indri --tail=50 -
Check node resources:
kubectl top nodes --context=minikube-indri kubectl top pods -n <namespace> --context=minikube-indri
Common Causes
- CrashLoopBackOff — app is crashing on startup, check logs
- ImagePullBackOff — container image not found or registry unreachable
- Pending — insufficient resources (CPU/memory), or PVC not bound
- Readiness probe failing — service is running but not healthy
- NFS mount issue — services depending on sifaka (kiwix, transmission, navidrome, jellyfin) will fail if NFS is down
Silencing
- Grafana → Alerting → Silences → Create Silence
- Match
alertname = PodNotReady - Optionally match
namespace = <namespace>to silence a specific service
Related
- deploy-infra-alerting — Alerting pipeline overview