I recently upgraded my GKE cluster from 1.10.x to 1.11.x and since then my calico-node
pods fail to connect to the etcd cluster and end up in a CrashLoopBackOff
due to livenessProbe error.
I saw that the calico-etcd
DaemonSet has desired state 0 and was wondering about that. nodeSelector is at node-role.kubernetes.io/master=
.
From the logs of such calico-node
s:
2018-12-19 19:18:28.989 [INFO][7] etcd.go 373: Unhandled error: client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint http://10.96.232.136:6666 exceeded header timeout
2018-12-19 19:18:28.989 [INFO][7] startup.go 254: Unable to query node configuration Name="gke-brokerme-ubuntu-pool-852d0318-j5ft" error=client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint http://10.96.232.136:6666 exceeded header timeout
State of the DaemonSets:
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
calico-etcd 0 0 0 0 0 node-role.kubernetes.io/master= 3d
calico-node 2 2 0 2 0 <none> 3d
k get nodes --show-labels
:
NAME STATUS ROLES AGE VERSION LABELS
gke-brokerme-ubuntu-pool-852d0318-7v4m Ready <none> 4d v1.11.5-gke.5 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/fluentd-ds-ready=true,beta.kubernetes.io/instance-type=n1-standard-2,beta.kubernetes.io/os=linux,cloud.google.com/gke-nodepool=ubuntu-pool,cloud.google.com/gke-os-distribution=ubuntu,failure-domain.beta.kubernetes.io/region=europe-west1,failure-domain.beta.kubernetes.io/zone=europe-west1-b,kubernetes.io/hostname=gke-brokerme-ubuntu-pool-852d0318-7v4m,os=ubuntu
gke-brokerme-ubuntu-pool-852d0318-j5ft Ready <none> 1h v1.11.5-gke.5 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/fluentd-ds-ready=true,beta.kubernetes.io/instance-type=n1-standard-2,beta.kubernetes.io/os=linux,cloud.google.com/gke-nodepool=ubuntu-pool,cloud.google.com/gke-os-distribution=ubuntu,failure-domain.beta.kubernetes.io/region=europe-west1,failure-domain.beta.kubernetes.io/zone=europe-west1-b,kubernetes.io/hostname=gke-brokerme-ubuntu-pool-852d0318-j5ft,os=ubuntu
I did not modify any calico manifests, they should be 1:1 provisioned by GKE.
I would expect either the calico-node
s connect to the etc of my Kubernetes cluster, or to a calico-etcd
provisioned by the DaemonSet. As there is no master node that I can control in GKE, I kind of get why calico-etcd
is at state 0, but then, to which etc are the calico-node
s supposed to connect? What's wrong with my small and basic setup?