Today our lab k8s cluster stopped accepting work. When I dug in, I found that the certificates had expired.

I regenerated the certs and configs, but when I restart kubelet we still get connection refused from the apiserver. The apiserver is running, but port 6443 is not open. While going through the logs, I found the following error (maybe a red herring, but I'm not sure):

Current certificate CN (system:node:lab-02) does not match requested CN (system:node:control-plane-xm2c9)
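
To see which CN a certificate file actually carries, and when it expires, something like the following can help. It is only a sketch: it assumes `openssl` is installed, and the function name `show_cert_identity` and the example path are made up for illustration.

```shell
# Hypothetical helper: print the subject (which includes the CN) and the
# expiry date of a PEM-encoded certificate file passed as the first argument.
show_cert_identity() {
    openssl x509 -in "$1" -noout -subject -enddate
}

# Example (the path is an assumption; point it at the cert you want to check):
#   show_cert_identity /etc/kubernetes/pki/apiserver.crt
```

Comparing the printed CN against the hostname in the error shows whether a given file still belongs to the old node name.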

This is a kubeadm-created cluster. I used kubeadm to regenerate all of the certs (except the CA) with the following commands:

# kubeadm init phase certs apiserver --apiserver-cert-extra-sans control-plane-xm2c9 --apiserver-advertise-address 192.168.2.78
# kubeadm init phase certs apiserver-etcd-client
# kubeadm init phase certs apiserver-kubelet-client

I created the configs with:

# kubeadm init phase kubeconfig admin --apiserver-advertise-address 192.168.2.78
# kubeadm init phase kubeconfig kubelet --apiserver-advertise-address 192.168.2.78
# kubeadm init phase kubeconfig controller-manager --apiserver-advertise-address 192.168.2.78
# kubeadm init phase kubeconfig scheduler --apiserver-advertise-address 192.168.2.78

I'm still sifting through logs. I think this is the root of my problem, but I'm not sure how to resolve it. Any assistance would be great! TIA

Jim
  • Is `Current certificate CN (system:node:lab-02) does not match requested CN (system:node:control-plane-xm2c9)` the only thing you found useful in the logs? – Vit Aug 29 '20 at 00:20
  • Yeah, in the k8s logs. One thing I failed to do was check the `docker` logs, which finally indicated that the `sa` keys were missing from the `pki` directory. Then I also noticed that `/var/lib/kubernetes/pki` contained the `apiserver-client-*` key pair. That one was the old set of keys trying to use the old hostname. Lastly, `etcd` was also trying to use the old hostname, so I fixed that in `/etc/kubernetes/etcd/*`. It was a PITA. – Jim Aug 30 '20 at 07:22
  • Great to hear you fixed the issue. Please post your own answer if you resolved everything you asked in this question. Thanks – Vit Aug 30 '20 at 09:09

1 Answer

The problem, if I remember correctly, was that one required set of files was not regenerated by the above-mentioned commands: the sa keys. Fortunately, before I started any work on this, I backed up all of the certs to /var/tmp, and the sa keys were in there and still valid. Once I copied them back to the cert directory and restarted kubelet, everything worked again.
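
A minimal sketch of that restore step, in case it helps someone. The function name `restore_missing_pki` and both example paths are assumptions for illustration, not the exact commands I ran. It copies only files that are missing from the live directory and never overwrites anything that is already there:

```shell
# Hypothetical helper: copy files that exist in a backup directory but are
# missing from the live pki directory. Existing files are left untouched.
restore_missing_pki() {
    backup_dir=$1
    pki_dir=$2
    for src in "$backup_dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$src" ] || continue
        name=$(basename "$src")
        if [ ! -e "$pki_dir/$name" ]; then
            cp -p "$src" "$pki_dir/$name"   # -p keeps mode and timestamps
            echo "restored $name"
        fi
    done
}

# Example (both paths are assumptions; adjust to your backup and pki dirs):
#   restore_missing_pki /var/tmp/pki-backup /etc/kubernetes/pki
#   systemctl restart kubelet
```

Because it skips anything that already exists, the freshly regenerated certs stay in place and only the missing files come back from the backup.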

Jim