
I have a K8s cluster which runs independent jobs (each job has one pod) and I expect them to run to completion. The scheduler, however, sometimes reschedules them on a different node. My jobs need to be single-run, and restarting them on a different node is not an acceptable outcome for me.

I was looking at Pod Disruption Budgets (PDBs), but from what I understand their selectors match pods by label. Since each of my jobs is different and has its own label, how do I use a PDB to tell K8s that all of my pods have a maxUnavailable of 0?

I have also used this annotation

"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"

but this does not prevent pod evictions under resource pressure.

Ideally, I should be able to tell K8s that none of my pods should be evicted until they have completed.

20kLeagues

1 Answer

You should specify resource requests and limits so that your job pods get the Guaranteed quality of service (QoS) class:

resources:
  limits:
    memory: "200Mi"
    cpu: "700m"
  requests:
    memory: "200Mi"
    cpu: "700m"

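Requests must be equal to limits, for both CPU and memory, in every container of the pod. The pod is then classed as Guaranteed and will not be evicted because of other pods' resource consumption. Guaranteed pods are the last candidates for eviction under node pressure, so this makes eviction very unlikely rather than strictly impossible.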

Read more: https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod
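
For context, here is a minimal sketch of how the resources block might sit inside a complete Job manifest. The job name, container name and image are hypothetical placeholders, and backoffLimit: 0 with restartPolicy: Never is my assumption about what "single-run" should mean here (the Job should not start a replacement pod if the first one fails or is evicted); adjust to your own setup.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: single-run-job                 # hypothetical name
spec:
  backoffLimit: 0                      # assumption: never retry a failed or evicted pod
  template:
    metadata:
      annotations:
        # annotation values must be strings, hence the quotes
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    spec:
      restartPolicy: Never             # do not restart the container in place
      containers:
        - name: worker                 # hypothetical container name
          image: example.com/worker:1.0   # hypothetical image
          resources:
            # requests == limits for both cpu and memory => Guaranteed QoS
            requests:
              memory: "200Mi"
              cpu: "700m"
            limits:
              memory: "200Mi"
              cpu: "700m"
```

Once the pod is running, you can check which class it was assigned with kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}', which should print Guaranteed.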

Vasili Angapov