
I've set up a Kubernetes 1.5 cluster with the three master nodes tainted dedicated=master:NoSchedule. Now I want to deploy the Nginx Ingress Controller on the master nodes only, so I've added tolerations:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-ingress-controller
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 3
  template:
    metadata:
      labels:
        k8s-app: nginx-ingress-lb
        name: nginx-ingress-lb
      annotations:
        scheduler.alpha.kubernetes.io/tolerations: |
          [
            {
              "key": "dedicated",
              "operator": "Equal",
              "value": "master",
              "effect": "NoSchedule"
            }
          ]
    spec:
    […]

Unfortunately this does not have the desired effect: Kubernetes schedules all Pods on the workers. When I scale the Deployment to a larger number of replicas, the new Pods are deployed on the workers, too.

How can I achieve scheduling to the Master nodes only?

Thanks for your help.

Stephan

4 Answers


A toleration does not mean that the pod must be scheduled on a node with such taints. It means that the pod tolerates such a taint. If you want your pod to be "attracted" to specific nodes, you will need to attach a label to your dedicated=master tainted nodes and set a nodeSelector in the pod spec to look for that label.

Attach the label to each of your special use nodes:

kubectl label nodes name_of_your_node dedicated=master
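To double-check that the label landed where you expect (a sketch; node names will differ in your cluster), you can list only the nodes carrying it:

```shell
# Show only nodes with the dedicated=master label;
# your three masters should be listed and the workers should not.
kubectl get nodes -l dedicated=master
```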

Kubernetes 1.6 and above syntax

Add the nodeSelector to your pod:

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nginx-ingress-controller
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 3
  template:
    metadata:
      labels:
        k8s-app: nginx-ingress-lb
        name: nginx-ingress-lb
    spec:
      nodeSelector:
        dedicated: master
      tolerations:
      - key: dedicated
        operator: Equal
        value: master
        effect: NoSchedule
    […]

If you don't fancy nodeSelector you can add affinity: under spec: instead:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: dedicated
          operator: In
          values: ["master"]

Pre 1.6 syntax

Add the nodeSelector to your pod:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-ingress-controller
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 3
  template:
    metadata:
      labels:
        k8s-app: nginx-ingress-lb
        name: nginx-ingress-lb
      annotations:
        scheduler.alpha.kubernetes.io/tolerations: |
          [
            {
              "key": "dedicated",
              "operator": "Equal",
              "value": "master",
              "effect": "NoSchedule"
            }
          ]
    spec:
      nodeSelector:
        dedicated: master
    […]

If you don't fancy nodeSelector you can also add an annotation like this:

scheduler.alpha.kubernetes.io/affinity: >
  {
    "nodeAffinity": {
      "requiredDuringSchedulingIgnoredDuringExecution": {
        "nodeSelectorTerms": [
          {
            "matchExpressions": [
              {
                "key": "dedicated",
                "operator": "In",
                "values": ["master"]
              }
            ]
          }
        ]
      }
    }
  }

Keep in mind that a NoSchedule taint will not evict pods that are already scheduled; only a NoExecute taint evicts running pods that do not tolerate it.

The information above is from https://kubernetes.io/docs/user-guide/node-selection/ and there are more details there.

Janos Lenart
  • Works. According to the docs _nodeSelector continues to work as usual, but will eventually be deprecated, as node affinity can express everything that nodeSelector can express_. So better go with Affinity and AntiAffinity directly… – Stephan Feb 14 '17 at 08:08
    At least in my case (kubernetes version 1.11) tolerations requires an array, so I had to add a '-' in front of the 'key' label, like: "- key: dedicated" . I don't know if this is a new addition in v.1.11 or is some previous version since 1.6, but it seemed worthy to mention it. – Liquid Oct 09 '18 at 13:33
  • @JanosLenart could the `dedicated->master` label be swapped with `node-role.kubernetes.io/master->` flag (label without a value) that is placed automatically by k8s? – Adirio Feb 06 '19 at 09:15
Many clusters (kubeadm-based ones, for example) taint their masters with the built-in node-role.kubernetes.io/master:NoSchedule taint instead of a custom one. The matching toleration is:

  tolerations:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
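A fuller sketch of the pod spec fragment, combining this toleration with the corresponding built-in label (an assumption here: your masters carry the node-role.kubernetes.io/master label, which kubeadm applies automatically):

```yaml
spec:
  # Allow the pod onto nodes carrying the built-in master taint...
  tolerations:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
  # ...and pin it to those nodes via the matching built-in label
  # (the label has no value, so an empty string is used here).
  nodeSelector:
    node-role.kubernetes.io/master: ""
```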
Boeboe
    Please don't post only code as an answer, but also provide an explanation what your code does and how it solves the problem of the question. Answers with an explanation are usually of higher quality, and are more like to attract upvotes. – Mark Rotteveel Apr 12 '20 at 07:34
    I was searching the last 15 minutes with multiple search engine results, this answer immediately helped me and gave me the answer I wanted. Just those two lines was the thing I was looking for. My mistake was I was not using /master at the end so my pod was not scheduled as needed. Sometimes there is no need for many words.. Just code. – maiky Oct 22 '21 at 19:45

You might want to dive into the Assigning Pods to Nodes documentation. Basically you should add some labels to your nodes with something like this:

kubectl label nodes <node-name> <label-key>=<label-value>

and then reference that within your Pod specification like this:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    <label-key>: <label-value>

But I'm not sure if this works for non-critical addons when the specific node is tainted. More details could be found here

pagid

In my case I had to specify the following tolerations:

tolerations:
# Tolerate every NoSchedule taint, regardless of its key.
- effect: NoSchedule
  operator: Exists
# Tolerate the CriticalAddonsOnly taint with any effect.
- key: CriticalAddonsOnly
  operator: Exists
# Tolerate every NoExecute taint, so the pod is not evicted.
- effect: NoExecute
  operator: Exists
papanito