I have a cluster running k8s 1.23 with several workloads. The workloads are spread across three PriorityClasses (sketched below):
- high: 10000000
- medium: 100000
- low: 1000
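For reference, the classes are defined roughly like this (a minimal sketch using the names and values above; no explicit preemptionPolicy is set, so the default PreemptLowerPriority should apply, meaning preemption is not disabled on the PriorityClass side):

```yaml
# Sketch of the three PriorityClass objects, values as listed above.
# With preemptionPolicy omitted, the default PreemptLowerPriority applies,
# so pods in these classes are allowed to preempt lower-priority pods.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high
value: 10000000
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: medium
value: 100000
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low
value: 1000
```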
All the pods have their priorityClassName set correctly. Then I had a major failure: some nodes became unavailable, and several pods in the high priority class remained Pending while all the low- and medium-priority pods were still running.
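Each workload references its class via priorityClassName, along the lines of this trimmed, hypothetical Deployment snippet (names and image are placeholders):

```yaml
# Trimmed, hypothetical example of how the pods get their priority;
# the name, labels, and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redacted
spec:
  replicas: 3
  selector:
    matchLabels:
      app: redacted
  template:
    metadata:
      labels:
        app: redacted
    spec:
      priorityClassName: high
      containers:
        - name: app
          image: redacted
```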
I could see no reference to preemption in kubectl get events. One of the FailedScheduling events shows:
43m Warning FailedScheduling pod/redacted-78455bb4b7-mtq5m 0/13 nodes are available: 1 Too many pods, 1 node(s) had taint {taint.kubernetes.io/zone: redacted}, that the pod didn't tolerate, 2 Insufficient cpu, 2 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) had taint {taint.kubernetes.io/redacted: redacted}, that the pod didn't tolerate, 6 node(s) were unschedulable.
Summarizing: after the failure I had only two nodes "available" (neither tainted nor unschedulable), but one was rejected due to "Too many pods" and the other due to "Insufficient cpu".
- Why weren't the low-priority pods preempted (evicted) to allow the high-priority pods to be scheduled?
- Are nodes reporting "Too many pods" or "Insufficient cpu" somehow ignored by preemption?