
The environment is in production. I have 156 GKE worker nodes in a cluster, and I want to assign at most one nginx pod to each node. That means I have to use PodAntiAffinity.

          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: project
                operator: In
                values:
                - nginx-web
            topologyKey: kubernetes.io/hostname
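
For context, here is roughly how that block sits in a Deployment manifest (the Deployment name, replica count, and image tag below are only illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-web               # illustrative name
spec:
  replicas: 3                   # illustrative count
  selector:
    matchLabels:
      project: nginx-web
  template:
    metadata:
      labels:
        project: nginx-web      # the anti-affinity rule matches this label
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: project
                operator: In
                values:
                - nginx-web
            topologyKey: kubernetes.io/hostname
      containers:
      - name: nginx
        image: nginx:1.19       # illustrative tag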

When I tested this in my staging environment, the result was as expected. My staging GKE cluster is High Availability (Zonal), meaning worker nodes are deployed to zones A, B, and C. Will PodAntiAffinity with the "required" mode spread the pods across zones A, B, and C, or is that handled automatically by the cloud provider (GKE)?

I am just curious how it works behind the scenes.

I need some suggestions from you; some of you have probably experienced this.

==============================

Second try

          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: project
                  operator: In
                  values:
                  - ingress-web
              topologyKey: topology.kubernetes.io/zone
            weight: 100
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: project
                operator: In
                values:
                - nginx-web
            topologyKey: kubernetes.io/hostname
Nicky Puff
  • Hello. Have you considered using a `DaemonSet`? It's designed to run exactly one replica on each node, which would satisfy the requirement of assigning at most one nginx pod per node. Please take a look at its official documentation: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/ . Please let me know if you would like to see an answer going more in depth on this solution. Also, please check whether you are running a `Zonal` or a `Regional` cluster, as it looks like you are using a `Regional` one instead of a `Zonal` one. – Dawid Kruk Jul 22 '20 at 13:35

1 Answer


Managing Pod distribution across a cluster is hard. The well-known Kubernetes features for Pod affinity and anti-affinity allow some control of Pod placement in different topologies. However, these features only resolve part of the Pod distribution use cases: either place unlimited Pods in a single topology, or disallow two Pods from co-locating in the same topology. In between these two extreme cases, there is a common need to distribute the Pods evenly across the topologies, so as to achieve better cluster utilization and high availability of applications.

The PodTopologySpread scheduling plugin (originally proposed as EvenPodsSpread) was designed to fill that gap. It was promoted to beta in Kubernetes 1.18.

The field pod.spec.topologySpreadConstraints is introduced as below:

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  topologySpreadConstraints:
    - maxSkew: <integer>
      topologyKey: <string>
      whenUnsatisfiable: <string>
      labelSelector: <object>

You can define one or more topologySpreadConstraints to instruct the kube-scheduler how to place each incoming Pod in relation to the existing Pods across your cluster. The fields are:

maxSkew describes the degree to which Pods may be unevenly distributed. It's the maximum permitted difference between the number of matching Pods in any two topology domains of a given topology type. It must be greater than zero.

topologyKey is the key of node labels. If two Nodes are labelled with this key and have identical values for that label, the scheduler treats both Nodes as being in the same topology. The scheduler tries to place a balanced number of Pods into each topology domain.

whenUnsatisfiable indicates how to deal with a Pod if it doesn't satisfy the spread constraint:

  • DoNotSchedule (default) tells the scheduler not to schedule it.

  • ScheduleAnyway tells the scheduler to still schedule it while prioritizing nodes that minimize the skew.

labelSelector is used to find matching Pods. Pods that match this label selector are counted to determine the number of Pods in their corresponding topology domain. See Label Selectors for more details.
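
As a sketch only (assuming your nginx Pods carry the label project: nginx-web, as in your anti-affinity rule), a constraint that spreads them evenly across zones could look like this:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-web
  labels:
    project: nginx-web
spec:
  topologySpreadConstraints:
  - maxSkew: 1                                # zones may differ by at most one matching Pod
    topologyKey: topology.kubernetes.io/zone  # spread across zones
    whenUnsatisfiable: ScheduleAnyway         # prefer balance, but schedule even if the skew cannot be met
    labelSelector:
      matchLabels:
        project: nginx-web
  containers:
  - name: nginx
    image: nginx:1.19                         # illustrative image tag

In a Deployment, this block goes under spec.template.spec, and it can be combined with the podAntiAffinity rule from your question to get both one Pod per node and an even spread across zones.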

There is a nice explanation in this blog post here.

Arghya Sadhu
  • Thanks for taking the time to share this important information. When I read the docs, they say "Kubernetes v1.18 [beta]". Does that mean the feature will be available in GKE 1.18? And what happens if my GKE is on 1.15? Anyway, I just added the "preferred" method in my question. Will it help us spread the pods across zones? – Nicky Puff Jul 22 '20 at 11:11
  • Yes, it's beta in 1.18 and alpha in 1.16. The older methodology is all or nothing, meaning you either place unlimited Pods in a single topology, or disallow two Pods from co-locating in the same topology. – Arghya Sadhu Jul 22 '20 at 11:16
  • In your opinion, if I add `preferredDuringSchedulingIgnoredDuringExecution` at the same time, will it try to spread the pods evenly? – Nicky Puff Jul 22 '20 at 11:43