
We are finding that our Kubernetes cluster tends to have hot-spots where certain nodes get far more instances of our apps than other nodes.

In this case, we are deploying lots of instances of Apache Airflow, and some nodes have 3x more web or scheduler components than others.

Is it possible to use anti-affinity rules to force a more even spread of pods across the cluster?

E.g. "prefer the node with the fewest pods labelled component=airflow-web"?

If anti-affinity does not work, are there other mechanisms we should be looking into as well?

– John Humphreys

3 Answers


Try adding this under the Deployment/StatefulSet .spec.template.spec:

      affinity:
        podAntiAffinity:
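          # Soft rule: nodes that already run a pod labelled component=airflow-web
          # are scored lower (weight 100), but the pod can still be placed there
          # if no emptier node is available.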
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: "component"
                  operator: In
                  values:
                  - airflow-web
              topologyKey: "kubernetes.io/hostname"
– Lukman
  • Can you explain what this does a bit more? – John Humphreys Jan 02 '21 at 14:07
  • Guessing it's this -> https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/ and requires 1.19. We're on 1.16, but still good to know. Up-voted, will accept if another solution doesn't present itself. Thanks! – John Humphreys Jan 02 '21 at 14:32

Have you tried configuring the kube-scheduler?

kube-scheduler selects a node for the pod in a 2-step operation:

  • Filtering: finds the set of Nodes where it's feasible to schedule the Pod.
  • Scoring: ranks the remaining nodes to choose the most suitable Pod placement.

Scheduling Policies can be used to specify the predicates and priorities that kube-scheduler runs to filter and score nodes:

kube-scheduler --policy-config-file <filename>

One of the priorities relevant to your scenario is listed below, followed by a sketch of a policy file:

  • BalancedResourceAllocation: Favors nodes with balanced resource usage.
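
A minimal sketch of such a policy file (the predicate and priority names are standard legacy-policy names, but the exact set and weights are illustrative rather than a tested recommendation; newer releases configure the scheduler through a KubeSchedulerConfiguration file instead of --policy-config-file):

{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "PodFitsResources"},
    {"name": "PodFitsHostPorts"}
  ],
  "priorities": [
    {"name": "BalancedResourceAllocation", "weight": 1},
    {"name": "SelectorSpreadPriority", "weight": 2}
  ]
}

SelectorSpreadPriority, which spreads pods belonging to the same Service/ReplicaSet/StatefulSet across nodes, may also help with the hot-spotting described in the question.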
– Kamol Hasan
  • One of our problems here is that the nodes all have decent / low resource usage and lots of ephemeral pods which are tiny. We get 2 problems with the tiny ephemeral pods - (1) is docker rate limiting, and (2) is PLEG issues - i.e. low memory/cpu, but pod management overhead becomes problematic as the node is creating/destroying too many pods. Would this let us achieve something resembling "round-robin-over-all-reasonable-node-options"? E.g. find all the nodes sensible for balancing and then select a random one so it doesn't hot spot to the first? (which seems to happen a lot). – John Humphreys Jan 02 '21 at 14:10

The right solution here is pod topology spread constraints: https://kubernetes.io/blog/2020/05/introducing-podtopologyspread/

Anti-affinity only works until each node has at least 1 matching pod; spread constraints actually balance based on the pod count per node. A sketch is below.
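
A minimal sketch for the Airflow web Deployment's pod template spec, assuming the pods are labelled component=airflow-web as in the question and the cluster is on 1.19+ (where the field is stable):

      topologySpreadConstraints:
      - maxSkew: 1                           # max allowed difference in matching-pod count between nodes
        topologyKey: kubernetes.io/hostname  # spread across individual nodes
        whenUnsatisfiable: ScheduleAnyway    # soft; use DoNotSchedule to make it a hard requirement
        labelSelector:
          matchLabels:
            component: airflow-web

With this, the scheduler favours nodes that keep the per-node count of component=airflow-web pods within maxSkew of the least-loaded eligible node.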

– James Hewitt