While following the Kubernetes article Using kubeadm to Create a Cluster, I got stuck: the AddOn pods I was trying to install (Nginx, Tiller, Grafana, InfluxDB, Dashboard) always stayed in the Pending state.

Running `kubectl describe pod tiller-deploy-df4fdf55d-jwtcz --namespace=kube-system` showed the following event:

Type     Reason            Age                From               Message
----     ------            ----               ----               -------
Warning  FailedScheduling  51s (x15 over 3m)  default-scheduler  0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.

When I ran the command from the Master Isolation section, `kubectl taint nodes --all node-role.kubernetes.io/master-`, the AddOns installed as expected.

At this point I can only suspect (because the AddOns are now installed on the master node) that the reason was that I hadn't connected a worker node to the cluster yet for the scheduler to schedule the pods on.

The documentation states "your cluster will not schedule pods on the master for security reasons". I know that this is a non-production environment, so there is little risk in this situation, but what is the risk of removing that taint in a production cluster?

Follow-up: If this is a risk, how can I re-add that taint so I can then uninstall the AddOn pods and try to have the scheduler install them on my Worker Node?

Environment Details:
  • Operating System: CentOS 7.4.1708 (Core)
  • Kubernetes Version: 1.10

Flea

1 Answer

> the reason was that I hadn't connected a worker node to the cluster yet for the scheduler to schedule the pods on.

100% correct. You will for sure want some worker nodes; otherwise the idea of "scheduling work" becomes very weird.
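
To confirm that diagnosis before touching any taints, it can help to list the nodes and their taints. This is a rough sketch, assuming `kubectl` is already configured against the cluster; `node_summary` is a hypothetical helper name that simply filters the `Name:` and `Taints:` lines out of the `kubectl describe nodes` output.

```shell
# Keep only the "Name:" and "Taints:" lines from `kubectl describe nodes` output.
# Note: when a node carries several taints, only the first one sits on the
# "Taints:" line; the rest are printed on indented continuation lines.
node_summary() {
  grep -E '^(Name|Taints):'
}

if command -v kubectl >/dev/null 2>&1; then
  kubectl get nodes                      # a joined worker should appear here as Ready
  kubectl describe nodes | node_summary  # the master normally shows its NoSchedule taint
else
  echo "kubectl not found on PATH" >&2
fi
```

If the master is the only node listed and it still carries a `NoSchedule` taint, pods without a matching toleration have nowhere to go and stay Pending, which matches the `0/1 nodes are available` event in the question.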

> but what is the risk of removing that taint in a production cluster?

I am not a Kubernetes security expert, but a pragmatic risk is CPU, I/O, and/or memory exhaustion on the master nodes, which would have very severe consequences for the health of the cluster. There is almost never a reason to run a workload on a master node, and doing so almost always just increases risk, so the advice "just don't do it" is well founded.

> how can I re-add that taint so I can then uninstall the AddOn pods and try to have the scheduler install them on my Worker Node?

I'm not sure I follow that question, but I would for sure start by just adding a worker node before trying to do complicated stuff with taints and tolerations.
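
That said, if the goal is to put the master taint back once a worker has joined, a minimal sketch might look like the following. It assumes the taint kubeadm applies is the key `node-role.kubernetes.io/master` with an empty value and the `NoSchedule` effect, and `my-master` is a hypothetical node name (use the one shown by `kubectl get nodes`). The `run` helper and the `DRY_RUN` variable are my own additions: by default the script only prints the commands so they can be reviewed first.

```shell
# Print commands by default; set DRY_RUN=0 to execute them against the cluster.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical node name; replace with the master's name from `kubectl get nodes`.
MASTER_NODE="${MASTER_NODE:-my-master}"

# Re-add the taint (assumed: empty value, NoSchedule effect).
run kubectl taint nodes "$MASTER_NODE" node-role.kubernetes.io/master=:NoSchedule

# NoSchedule does not evict pods that are already running, so delete the add-on
# pod and let its Deployment recreate it; the scheduler should then place the
# replacement on an untainted node.
run kubectl delete pod tiller-deploy-df4fdf55d-jwtcz --namespace=kube-system
```

The pod name is the one from the question. With no worker joined, the recreated pod would simply go back to Pending, so the order matters: join the worker first, then re-taint.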

mdaniel
  • `kubectl taint nodes --all node-role.kubernetes.io/master-` Apparently that removed the taint that was preventing the installation. I wanted to reinstate it. – Flea Apr 09 '18 at 14:47
  • According to [this blog post](https://obviate.io/2017/06/26/kubernetes-1-6-taints-and-tolerances-for-monitoring-your-cluster/), it appears the taint is `node-role.kubernetes.io/master="":NoSchedule` – mdaniel Apr 10 '18 at 05:23
  • According to [Creating a single master cluster with kubeadm](https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#master-isolation), execute the command `kubectl taint nodes --all node-role.kubernetes.io/master-` – Marslo Jun 28 '18 at 08:57
  • @Flea I tried `kubectl taint nodes --all node-role.kubernetes.io/master` and got `error: at least one taint update is required` – Ashish Karpe Dec 13 '19 at 10:49
  • Running `kubectl taint nodes --all node-role.kubernetes.io/master-` on poc-k8s-master printed `node/poc-k8s-master untainted` followed by `error: taint "node-role.kubernetes.io/master" not found` – Ashish Karpe Dec 13 '19 at 10:52
  • `kubectl get nodes -o json | jq .items[].spec.taints` printed `null` and then `[ { "effect": "NoSchedule", "key": "node.kubernetes.io/unreachable", "timeAdded": "2019-12-13T09:40:40Z" } ]`. After that, `kubectl taint nodes --all node-role.kubernetes.io/unreachable` gave `error: at least one taint update is required` – Ashish Karpe Dec 13 '19 at 10:58