I have a cluster running k8s 1.23 with several workloads. The workloads are spread across three PriorityClasses (sketched below):
- high: 10000000
- medium: 100000
- low: 1000
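For reference, the classes are defined roughly like this (a minimal sketch using the names and values above; no explicit preemptionPolicy is set, so the default PreemptLowerPriority should apply, meaning preemption is not disabled on the PriorityClass side):

```yaml
# Sketch of the three PriorityClass objects, values as listed above.
# With preemptionPolicy omitted, the default PreemptLowerPriority applies,
# so pods in these classes are allowed to preempt lower-priority pods.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high
value: 10000000
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: medium
value: 100000
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low
value: 1000
```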
All the pods have their priorityClassName set correctly. Then I had a major failure: some nodes became unavailable, and several pods in the high priority class remained Pending while all the low- and medium-priority pods were still running.
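Each workload references its class via priorityClassName, along the lines of this trimmed, hypothetical Deployment snippet (names and image are placeholders):

```yaml
# Trimmed, hypothetical example of how the pods get their priority;
# the name, labels, and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redacted
spec:
  replicas: 3
  selector:
    matchLabels:
      app: redacted
  template:
    metadata:
      labels:
        app: redacted
    spec:
      priorityClassName: high
      containers:
        - name: app
          image: redacted
```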
I could see no reference to preemption in kubectl get events. One of the FailedScheduling events shows:
43m Warning FailedScheduling pod/redacted-78455bb4b7-mtq5m 0/13 nodes are available: 1 Too many pods, 1 node(s) had taint {taint.kubernetes.io/zone: redacted}, that the pod didn't tolerate, 2 Insufficient cpu, 2 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) had taint {taint.kubernetes.io/redacted: redacted}, that the pod didn't tolerate, 6 node(s) were unschedulable.
Summarizing: after the failure I had only two nodes "available" (neither tainted nor unschedulable), but one was rejected due to "Too many pods" and the other due to "Insufficient cpu".
- Why weren't the low-priority pods preempted (evicted) to allow the high-priority pods to be scheduled?
- Are nodes reporting "Too many pods" or "Insufficient cpu" somehow ignored by preemption?