I'm trying to install a 3-node Kubernetes cluster with kubespray-2.16.0, following https://github.com/kubernetes-sigs/kubespray. All the installation steps go fine, except this one:
TASK [kubernetes-apps/ansible : Kubernetes Apps | Register coredns deployment annotation `createdby`] ***********************************************************************************************************************************
fatal: [node1]: FAILED! => {"changed": false, "cmd": "/usr/local/bin/kubectl get deploy -n kube-system coredns -o jsonpath='{ .spec.template.metadata.annotations.createdby }'", "delta": "0:00:00.152435", "end": "2021-08-10 08:34:56.875744", "msg": "non-zero return code", "rc": 1, "start": "2021-08-10 08:34:56.723309", "stderr": "Error from server (NotFound): deployments.apps \"coredns\" not found", "stderr_lines": ["Error from server (NotFound): deployments.apps \"coredns\" not found"], "stdout": "", "stdout_lines": []}
...ignoring
Tuesday 10 August 2021 08:34:56 +0000 (0:00:00.497) 0:10:30.178 ********
The playbook finishes like this:
PLAY RECAP ******************************************************************************************************************************************************************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
node1 : ok=568 changed=76 unreachable=0 failed=0 skipped=1169 rescued=0 ignored=1
node2 : ok=506 changed=73 unreachable=0 failed=0 skipped=1015 rescued=0 ignored=0
node3 : ok=425 changed=53 unreachable=0 failed=0 skipped=685 rescued=0 ignored=0
But when I check the status of the nodes after the installation, I can see that node3 (presumably the worker node) is in NotReady status:
[root@node1 kubespray-2.16.0]# kubectl get nodes -o=wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
node1 Ready control-plane,master 5m44s v1.20.7 10.32.51.91 <none> CentOS Linux 7 (Core) 3.10.0-957.12.2.el7.x86_64 docker://20.10.8
node2 Ready control-plane,master 5m12s v1.20.7 10.32.51.78 <none> CentOS Linux 7 (Core) 3.10.0-957.12.2.el7.x86_64 docker://20.10.8
node3 NotReady <none> 4m1s v1.20.7 10.32.51.61 <none> CentOS Linux 7 (Core) 3.10.0-957.12.2.el7.x86_64 docker://20.10.8
[root@node1 kubespray-2.16.0]#
The pods in the kube-system namespace look like this:
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-7c5b64bf96-zkdcx 0/1 Pending 0 82s
calico-node-cdzgx 1/1 Running 0 111s
calico-node-h5cz7 0/1 Running 0 111s
calico-node-vpsch 1/1 Running 0 111s
coredns-657959df74-lqpqw 1/1 Running 0 41s
coredns-657959df74-qj6md 1/1 Running 0 46s
dns-autoscaler-b5c786945-gvhff 1/1 Running 0 43s
kube-apiserver-node1 1/1 Running 0 3m34s
kube-apiserver-node2 1/1 Running 0 3m11s
kube-controller-manager-node1 1/1 Running 0 3m34s
kube-controller-manager-node2 1/1 Running 0 3m11s
kube-proxy-2pnf6 1/1 Running 0 2m7s
kube-proxy-4sxp9 1/1 Running 0 2m7s
kube-proxy-8mns8 1/1 Running 0 2m7s
kube-scheduler-node1 1/1 Running 0 3m34s
kube-scheduler-node2 1/1 Running 0 3m11s
nginx-proxy-node3 0/1 CrashLoopBackOff 2 2m13s
nodelocaldns-gvfst 1/1 Running 0 42s
nodelocaldns-ktc89 0/1 Pending 0 42s
nodelocaldns-vwtxn 1/1 Running 0 42s
And the ingress controller pods:
NAME READY STATUS RESTARTS AGE
ingress-nginx-controller-9pz4g 1/1 Running 0 68s
ingress-nginx-controller-lwb44 1/1 Running 0 68s
ingress-nginx-controller-sztht 0/1 Pending 0 68s
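To make the problem pods easier to spot in listings like the ones above, a small filter along these lines could be used. This is only a sketch: `not_ready` is a hypothetical helper name (not part of kubectl), and it assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, ...). The sample rows are copied from the output above.

```shell
#!/bin/sh
# Hypothetical helper: keep only rows whose READY count is incomplete
# or whose STATUS is not "Running". Assumes the default column layout
# of `kubectl get pods` (NAME READY STATUS RESTARTS AGE).
not_ready() {
  awk 'NR > 1 {
    split($2, r, "/")                      # READY looks like "0/1"
    if (r[1] != r[2] || $3 != "Running")
      print $1, $2, $3
  }'
}

# Demo on a few rows taken from the listing above (no cluster needed).
not_ready <<'EOF'
NAME READY STATUS RESTARTS AGE
calico-node-cdzgx 1/1 Running 0 111s
calico-node-h5cz7 0/1 Running 0 111s
nginx-proxy-node3 0/1 CrashLoopBackOff 2 2m13s
nodelocaldns-ktc89 0/1 Pending 0 42s
EOF
# prints:
# calico-node-h5cz7 0/1 Running
# nginx-proxy-node3 0/1 CrashLoopBackOff
# nodelocaldns-ktc89 0/1 Pending
```

Against a live cluster the same function would be fed from `kubectl get pods -n kube-system | not_ready`. Note that pods of finished jobs (STATUS `Completed`) would also be listed by this filter.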
I checked the pods that are not running, but as I'm a newbie in k8s, I don't know which of them could be the root cause and which ones are just consequences. Could you please help me with:
- what could be the root cause
- how to investigate it
- and how to fix this issue?
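For the "how to investigate" part, a possible set of first-look commands is sketched below. This is not a diagnosis, only an assumption about where to look first: the node name `node3`, the namespace `kube-system`, and the pod name `nginx-proxy-node3` are taken from the output above, while the `run` and `diagnose` helpers are hypothetical names. By default the script only prints the commands; setting `RUN=1` would execute them.

```shell
#!/bin/sh
# Sketch of a first-look investigation for the NotReady node.
# NODE and NS default to the values seen in the output above.
NODE="${NODE:-node3}"
NS="${NS:-kube-system}"

# Hypothetical helper: print the command (dry run) unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

diagnose() {
  # Node conditions and recent events (why the kubelet reports NotReady).
  run kubectl describe node "$NODE"
  # Which pods are scheduled on the affected node, and their state.
  run kubectl -n "$NS" get pods -o wide --field-selector "spec.nodeName=$NODE"
  # Logs of the last crashed container of the crash-looping proxy pod.
  run kubectl -n "$NS" logs "nginx-proxy-$NODE" --previous
}

diagnose
```

On the affected node itself, the kubelet log (for example via `journalctl -u kubelet`) would be another place to look, assuming the kubelet runs as a systemd unit there.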
I tried the installation at least 10 times with different parameters, a different order of nodes, etc. to debug it, but the result was always the same.
Thank you, s.