
I'm running the redis chart (https://artifacthub.io/packages/helm/bitnami/redis/15.7.0) as a dependency of a custom chart. I enabled sentinel, so each pod runs two containers (redis and sentinel). I'm using the chart's default values and I defined 4 replicas. The cluster has 10 nodes, and I noticed that three of the redis-sentinel pods run on a single node and only one runs on another node:

myapp-redis-node-0    2/2    Running    8d     ip    k8s-appname-ctw9v
myapp-redis-node-1    2/2    Running    34d    ip    k8s-appname-ctw9v
myapp-redis-node-2    2/2    Running    34d    ip    k8s-appname-ctw9v
myapp-redis-node-3    2/2    Running    34d    ip    k8s-appname-crm3k

This is the affinity section of the pod spec:

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/component: node
              app.kubernetes.io/instance: myapp
              app.kubernetes.io/name: redis
          namespaces:
          - test
          topologyKey: kubernetes.io/hostname
        weight: 1

What can I do to have each pod scheduled on a different node?

Thanks!

Martín C.

3 Answers


You need to update the podAntiAffinity section of the pod template to add a specific key/value pair to the label selector. This ensures that if a pod with that key/value pair already exists on a node, the scheduler will attempt to schedule the pod on another node that doesn't have such a pod. I say attempt, because preferred anti-affinity rules are soft rules, and if no other nodes are available, a pod may still be scheduled on a node that violates the anti-affinity. Details are in the Kubernetes documentation on inter-pod affinity and anti-affinity.

Try patching the template as:

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: <ADD_LABEL_HERE>
              operator: In
              values:
              - <ADD_VALUE_HERE>
          topologyKey: kubernetes.io/hostname  # required; spreads pods per node
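
Note that `topologyKey` must be present in the `podAffinityTerm`, otherwise the term is invalid. Once the patched pods are rescheduled, you can check the spread with something like the following (the labels and namespace are taken from the question; adjust them to your release):

kubectl get pods -n test -o wide \
  -l app.kubernetes.io/name=redis,app.kubernetes.io/instance=myapp
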
zer0
  • Are you suggesting that `matchLabels` is the problem in the OP's configuration? – rock'n rolla Jul 07 '22 at 23:06
  • Probably, yes. They appear strange to me. – zer0 Jul 08 '22 at 00:15
  • Well, they are an accepted `selector` as well. Found something here: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#resources-that-support-set-based-requirements And the configuration posted by the OP works for me: 6 replicas on a 6-node cluster, all pods on different nodes. – rock'n rolla Jul 08 '22 at 10:23
  • Interesting. I wonder if the type of the cluster is the issue. For instance, if the nodes are ec2 instances, the node configuration can affect the scheduling policy of k8s. – zer0 Jul 08 '22 at 18:28

Thank you all for your answers. I finally solved it with:

spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/component: node
            app.kubernetes.io/instance: myapp
            app.kubernetes.io/name: redis
        namespaces:
        - test
        topologyKey: kubernetes.io/hostname

BTW, this is generated automatically by the chart by setting the following in the values for master and replica (I'm using v15.7.0):

podAntiAffinityPreset: hard
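
For reference, when the redis chart is declared as a dependency, the equivalent settings go under the dependency's key in the parent chart's values.yaml. A sketch, assuming the dependency alias is `redis` and using the master/replica sections the answer refers to:

redis:
  master:
    podAntiAffinityPreset: hard
  replica:
    replicaCount: 4           # 4 replicas, as in the question
    podAntiAffinityPreset: hard
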
Martín C.
  • Yes @Martin, that is one of the solutions, but you might wonder why no one recommended it. The reason is that it has some downsides. If the required no. of `pods` is greater than the no. of nodes in the cluster, the `pod` will go into the `Pending` state and wait for the `cluster-autoscaler` to create a new node - which is done one at a time and can be time-consuming if there are too many pods. Also, for some this might add unnecessary cost, because you have extra nodes running just to support this `hard` requirement, while there might be nodes with enough free resources to host the extra `pods` in the case of a `soft` requirement. – rock'n rolla Jul 08 '22 at 19:35

There's a dedicated configuration for keeping enough of your pods available, called: PodDisruptionBudget.

https://kubernetes.io/docs/tasks/run-application/configure-pdb/

It ensures that a minimum number of your pods stays available for high availability, and will help you when you want to drain or replace a node, etc.
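
For example, a minimal PodDisruptionBudget for the pods in the question might look like the following (a sketch; the name, namespace, labels and minAvailable value are assumptions based on the question's setup):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-redis-pdb      # hypothetical name
  namespace: test
spec:
  minAvailable: 3            # with 4 replicas, allow at most one voluntary disruption at a time
  selector:
    matchLabels:
      app.kubernetes.io/name: redis
      app.kubernetes.io/instance: myapp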

ItayB