Troubleshooting a fresh install of K3s is made easier by the Rancher DNS troubleshooting page, which gives plenty of sensible advice, including testing DNS resolution by spinning up one-time Busybox instances and invoking nslookup kubernetes.default:
Check if internal cluster names are resolving (in this example, kubernetes.default), the IP shown after Server: should be the same as the CLUSTER-IP from the kube-dns service.
kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup kubernetes.default
For instance:
vagrant@ubuntu-hirsute:~$ k3s kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup kubernetes.default
If you don't see a command prompt, try pressing enter.
Server: 10.43.0.10
Address 1: 10.43.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 10.43.0.1 kubernetes.default.svc.cluster.local
pod "busybox" deleted
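The comparison the Rancher page describes (the Server: IP against the CLUSTER-IP of kube-dns) can be sketched as a small script. This is only a rough sketch: the function names are made up for illustration, and it assumes kubectl is configured for the cluster and that the DNS service is kube-dns in the kube-system namespace, as the output above suggests.

```shell
#!/bin/sh
# Hypothetical helper: print the IP found on the "Server:" line of
# nslookup output read from stdin.
server_ip_from_nslookup() {
    awk '/^Server:/ { print $2; exit }'
}

# Hypothetical helper: compare that IP with the ClusterIP of the kube-dns
# service. Assumes kubectl can reach the cluster; on K3s, substitute
# `k3s kubectl` if plain kubectl is not set up.
check_dns_server() {
    expected=$(kubectl -n kube-system get svc kube-dns -o jsonpath='{.spec.clusterIP}')
    actual=$(server_ip_from_nslookup)
    if [ "$actual" = "$expected" ]; then
        echo "match: $actual"
    else
        echo "mismatch: nslookup used $actual, kube-dns is $expected"
        return 1
    fi
}
```

With the nslookup output saved to a file (nslookup-output.txt is just a placeholder name), this would be run as check_dns_server < nslookup-output.txt.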
I wondered whether pinning version 1.28 was simply outdated documentation, but the test only seems to work with 1.28. Using, say, 1.33:
vagrant@ubuntu-hirsute:~$ k3s kubectl run -it --rm --restart=Never busybox --image=busybox:1.33 -- nslookup kubernetes.default
If you don't see a command prompt, try pressing enter.
Server: 10.43.0.10
Address: 10.43.0.10:53
** server can't find kubernetes.default: NXDOMAIN
*** Can't find kubernetes.default: No answer
pod "busybox" deleted
pod default/busybox terminated (Error)
I couldn't get it to resolve kubernetes.default with Busybox 1.29, 1.30, 1.31, 1.32 or 1.33, but I can with ubuntu:hirsute or centos:7, and the result is the same when the host system for K3s is Rocky Linux instead of Ubuntu Hirsute.
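To reproduce the comparison across Busybox tags, something like the following loop could be used. It is a sketch rather than exactly what I ran: the function names and the KUBECTL variable are made up for illustration, and it assumes kubectl passes a failed lookup through as a non-zero exit status.

```shell
#!/bin/sh
# Command used to talk to the cluster; override with KUBECTL="kubectl"
# when not using the k3s wrapper.
KUBECTL="${KUBECTL:-k3s kubectl}"

# Hypothetical helper: run the one-off lookup with a given Busybox tag.
run_dns_test() {
    tag="$1"
    # Give each pod its own name, e.g. busybox-1-33 for tag 1.33.
    name="busybox-$(echo "$tag" | tr . -)"
    # $KUBECTL is left unquoted on purpose so "k3s kubectl" splits in two.
    $KUBECTL run -it --rm --restart=Never "$name" \
        --image="busybox:$tag" -- nslookup kubernetes.default
}

# Hypothetical helper: try every tag mentioned above and summarise.
run_all_dns_tests() {
    for tag in 1.28 1.29 1.30 1.31 1.32 1.33; do
        if run_dns_test "$tag"; then
            echo "busybox:$tag OK"
        else
            echo "busybox:$tag FAILED"
        fi
    done
}
```

Calling run_all_dns_tests from an interactive shell on the K3s host runs the six lookups one after another.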
What's special about Busybox 1.28, or rather, what's out of sorts with Busybox 1.29 and later that makes this simple test fail?
I'm using K3s v1.21.4+k3s1, but this was already the case in the v1.20 lineage.