
OS: Ubuntu 20.04 LTS, x86_64

After I rebooted my system running a Kubernetes cluster, all the deployments, pods, and everything else stopped working. How do I diagnose the problem?

The response to the command sudo kubectl get status is: The connection to the server localhost:8080 was refused - did you specify the right host or port?
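(Note: when kubectl runs under sudo, HOME points at root's home directory, so it will not pick up the user's ~/.kube/config and falls back to the default localhost:8080. One quick check is to pass the kubeconfig explicitly; $HOME expands in the invoking shell, so it still points at the user's home:)

sudo kubectl --kubeconfig "$HOME/.kube/config" get nodes

If that works while plain sudo kubectl does not, the problem is the environment, not the cluster.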

EDIT:

The output of cat ~/.kube/config:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: xxx
    server: https://xxx:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubernetes-admin
  name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
    client-certificate-data: xxx
    client-key-data: xxx

Output of sudo systemctl status kubelet:

2042 kuberuntime_manager.go:815] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to setup network for sandbox>
Oct 20 15:30:16 xxx kubelet[2042]: E1020 15:30:16.872177    2042 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"coredns-6d4b75cb6d-hk8sz_kube-system(3a7dc6>
Oct 20 15:30:26 xxx kubelet[2042]: E1020 15:30:26.870800    2042 remote_runtime.go:201] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to setup network for sa>
Oct 20 15:30:26 xxx kubelet[2042]: E1020 15:30:26.870917    2042 kuberuntime_sandbox.go:70] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to setup network for sandbox>
Oct 20 15:30:26 xxx kubelet[2042]: E1020 15:30:26.870977    2042 kuberuntime_manager.go:815] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to setup network for sandbox>
Oct 20 15:30:26 xxx kubelet[2042]: E1020 15:30:26.871089    2042 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"coredns-6d4b75cb6d-hxqws_kube-system(3579f3>
Oct 20 15:30:29 xxx kubelet[2042]: E1020 15:30:29.873159    2042 remote_runtime.go:201] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to setup network for sa>
Oct 20 15:30:29 xxx kubelet[2042]: E1020 15:30:29.873268    2042 kuberuntime_sandbox.go:70] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to setup network for sandbox>
Oct 20 15:30:29 xxx kubelet[2042]: E1020 15:30:29.873319    2042 kuberuntime_manager.go:815] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to setup network for sandbox>
Oct 20 15:30:29 xxx kubelet[2042]: E1020 15:30:29.873415    2042 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"coredns-6d4b75cb6d-hk8sz_kube-system(3a7dc6>

Output of kubectl get nodes:

NAME        STATUS   ROLES           AGE   VERSION
xxx         Ready    control-plane   6d    v1.24.3

Output of sudo ~/.kube/config (note: this executes the config file as a shell script rather than printing it, which is why every line reports "not found"):

/home/xxx/.kube/config: 1: apiVersion:: not found
/home/xxx/.kube/config: 2: clusters:: not found
/home/xxx/.kube/config: 3: -: not found
/home/xxx/.kube/config: 4: certificate-authority-data:: not found
/home/xxx/.kube/config: 5: server:: not found
/home/xxx/.kube/config: 6: name:: not found
/home/xxx/.kube/config: 7: contexts:: not found
/home/xxx/.kube/config: 8: -: not found
/home/xxx/.kube/config: 9: cluster:: not found
/home/xxx/.kube/config: 10: user:: not found
/home/xxx/.kube/config: 11: name:: not found
/home/xxx/.kube/config: 12: current-context:: not found
/home/xxx/.kube/config: 13: kind:: not found
/home/xxx/.kube/config: 14: preferences:: not found
/home/xxx/.kube/config: 15: users:: not found
/home/xxx/.kube/config: 16: -: not found
/home/xxx/.kube/config: 17: user:: not found
/home/xxx/.kube/config: 18: client-certificate-data:: not found
/home/xxx/.kube/config: 19: client-key-data:: not found
2 Answers


This could be the solution:

Every time you reboot your servers, swap will be turned on again. (You can see that via service kubelet status; by default the kubelet refuses to start while swap is enabled.)

You should disable swap again:

sudo swapoff -a

Run this on every node that has been rebooted.
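Note that swapoff -a only lasts until the next reboot. To keep swap off permanently, the swap entry in /etc/fstab has to be commented out as well. A minimal sketch (back up the file first; the exact swap entry varies per system):

# Disable swap for the current boot.
sudo swapoff -a

# Keep it off across reboots by commenting out swap entries in /etc/fstab.
sudo cp /etc/fstab /etc/fstab.bak
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab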


If that doesn't work, you can check the kubelet logs:

  • Check the kubelet status:
    service kubelet status
    
  • Check extended logs of Kubelet service with journalctl:
    journalctl -u kubelet
    
  • Check where the kubelet binary is located:
    which kubelet
    
  • Open the kubelet configuration (Check the path via service kubelet status):
    sudo vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
    
  • In the ExecStart section, check that the /usr/bin/kubelet path matches the one you got from which kubelet (the checks are combined in the sketch below).
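
Putting those checks together, a minimal sketch (assuming a systemd-based kubeadm install with the usual drop-in path):

# Current kubelet state and the 50 most recent log lines.
service kubelet status
journalctl -u kubelet --no-pager -n 50

# Compare the installed binary with the path systemd actually starts.
which kubelet
grep ExecStart /etc/systemd/system/kubelet.service.d/10-kubeadm.conf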

Be sure to check this repo as well:

Troubleshooting In Kubernetes


Another possibility is a permission issue with the ~/.kube directory.

If so, it can be fixed with:

sudo chmod -R 777 ~/.kube
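
Note that 777 makes the directory world-writable. A less permissive fix, similar to what kubeadm init suggests for the admin kubeconfig after cluster creation, is to give the invoking user ownership instead:

# Make the current user the owner instead of opening the directory to everyone.
sudo chown -R "$(id -u)":"$(id -g)" ~/.kube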