
I understand that this question has been asked dozens of times, but nothing I have found through internet searching has helped me.

My set up:

CentOS Linux release 7.5.1804 (Core)
Docker Version: 18.06.1-ce
Kubernetes: v1.12.3

Installed following the official guide and this one: https://www.techrepublic.com/article/how-to-install-a-kubernetes-cluster-on-centos-7/

CoreDNS pods are in Error/CrashLoopBackOff state.

kube-system   coredns-576cbf47c7-8phwt                 0/1     CrashLoopBackOff   8          31m
kube-system   coredns-576cbf47c7-rn2qc                 0/1     CrashLoopBackOff   8          31m
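
The error quoted further down under "Error from the logs" came from one of the failing pods, e.g.:

kubectl -n kube-system logs coredns-576cbf47c7-8phwt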

My /etc/resolv.conf:

nameserver 8.8.8.8

I also tried it with my local DNS resolver (the router):

nameserver 10.10.10.1

Setup and init:

kubeadm init --apiserver-advertise-address=10.10.10.3 --pod-network-cidr=192.168.1.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

I tried to solve this by editing the CoreDNS configmap (kubectl edit cm coredns -n kube-system) and changing

proxy . /etc/resolv.conf

directly to

proxy . 10.10.10.1

or proxy . 8.8.8.8
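
For context, the stock v1.12 Corefile in that configmap looks roughly like this after such an edit (a sketch from the defaults of that era; the exact plugin list may vary with the CoreDNS version):

.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       upstream
       fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    proxy . 8.8.8.8
    cache 30
    loop
    reload
    loadbalance
}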

Also tried to:

kubectl -n kube-system get deployment coredns -o yaml | sed 's/allowPrivilegeEscalation: false/allowPrivilegeEscalation: true/g' | kubectl apply -f -

And still nothing helps me.

Error from the logs:

plugin/loop: Seen "HINFO IN 7847735572277573283.2952120668710018229." more than twice, loop detected

The other thread, "coredns pods have CrashLoopBackOff or Error state", didn't help at all, because none of the solutions described there applied to my case. Nothing helped.

– Hakon89

5 Answers


I got this error too, and I successfully got it working with the steps below.

However, you are missing the secondary nameserver 8.8.4.4:

sudo nano /etc/resolv.conf

nameserver 8.8.8.8
nameserver 8.8.4.4

Run the following commands to reload the systemd daemon and restart the Docker service:

sudo systemctl daemon-reload

sudo systemctl restart docker

If you are using kubeadm, make sure you delete the entire cluster from the master and provision it again:

kubectl drain <node_name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node_name>
kubeadm reset
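
A minimal re-provision sketch (the Calico manifest URL and pod CIDR are assumptions, based on the expected output below showing a calico-node pod):

sudo kubeadm init --pod-network-cidr=192.168.0.0/16
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml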

Once you have provisioned the new cluster:

kubectl get pods --all-namespaces

It should give the expected result below:

NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE
kube-system   calico-node-gldlr          2/2     Running   0          24s
kube-system   coredns-86c58d9df4-lpnj6   1/1     Running   0          40s
kube-system   coredns-86c58d9df4-xnb5r   1/1     Running   0          40s
kube-system   kube-proxy-kkb7b           1/1     Running   0          40s
kube-system   kube-scheduler-osboxes     1/1     Running   0          10s
– Narendranath Reddy

Run kubectl edit cm coredns -n kube-system, delete the loop line, save and exit, then restart the master node. It worked for me.
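
In other words, in the Corefile inside that configmap, remove the single loop line (a fragment sketch, matching the full Corefile shown in the last answer below):

.:53 {
    # ... other plugins unchanged ...
    cache 30
    loop        # <- delete this line, then save and exit
    reload
    loadbalance
}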

– k''
  • This means coredns can't detect loops; I did this and it worked (coredns could run and hostnames could resolve), but coredns had very high CPU utilization – Andre Jul 01 '19 at 04:59
  • @Hk Can you please put your answer in an understandable format? – ankit Jul 17 '19 at 10:20

I faced the same issue in my local Kubernetes-in-Docker (kind) setup: the CoreDNS pods were getting the CrashLoopBackOff error.

Steps I followed to get the pods into the Running state:

As Tim Chan said in this post, and by referring to the GitHub issue linked there, I did the following:

  1. kubectl -n kube-system edit configmaps coredns -o yaml
  2. Modify the section forward . /etc/resolv.conf to forward . 172.16.232.1 (in my case I set 8.8.8.8 for the time being).
  3. Delete one of the CoreDNS pods (see the sketch after this list), or wait for some time; the pods will come back in the Running state.
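
A sketch of step 3's pod deletion (the k8s-app=kube-dns label is the stock default for CoreDNS in kubeadm and kind clusters):

kubectl -n kube-system delete pod -l k8s-app=kube-dns

The Deployment recreates the pods immediately, and they start with the edited Corefile.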
– venkat

This usually happens when CoreDNS can't talk to the kube-apiserver.

Check that your kubernetes service is in the default namespace:

$ kubectl get svc kubernetes
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP        130d

Then (you might have to create a pod):

$ kubectl -n kube-system exec -it <any-pod-with-shell> sh
# ping kubernetes.default.svc.cluster.local
PING kubernetes.default.svc.cluster.local (10.96.0.1): 56 data bytes
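
If no suitable pod with a shell exists, a throwaway one works (a sketch; busybox:1.28 is just a convenient image whose nslookup behaves well):

$ kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- sh
# nslookup kubernetes.default.svc.cluster.local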

Also, try hitting port 443 from the pod:

# telnet kubernetes.default.svc.cluster.local 443 # or
# curl kubernetes.default.svc.cluster.local:443
– Rico

The error I got was:

connect: no route to host","time":"2021-03-19T14:42:05Z"} crashloopbackoff

in the log shown by kubectl -n kube-system logs coredns-d9fdb9c9f-864rz

The issue is mentioned in https://github.com/coredns/coredns/tree/master/plugin/loop#troubleshooting-loops-in-kubernetes-clusters

tl;dr: /etc/resolv.conf got updated somehow. The original is at /run/systemd/resolve/resolv.conf, e.g.:

nameserver 172.16.232.1
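
The troubleshooting page above also describes a longer-term fix: point kubelet at the real resolv.conf so pods never inherit the looping stub (a sketch; the /etc/sysconfig/kubelet path assumes the CentOS/RHEL kubeadm packages, and systemd-resolved is assumed to be in use):

echo 'KUBELET_EXTRA_ARGS=--resolv-conf=/run/systemd/resolve/resolv.conf' | sudo tee /etc/sysconfig/kubelet
sudo systemctl restart kubelet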

Quick fix, edit Corefile:

$ kubectl -n kube-system edit configmaps coredns -o yaml

and replace forward . /etc/resolv.conf with forward . 172.16.232.1, e.g.:

apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . 172.16.232.1 {
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  creationTimestamp: "2021-03-18T15:58:07Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "49996"
  uid: 428a03ff-82d0-4812-a3fa-e913c2911ebd
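
The reload plugin in the Corefile above should apply the configmap edit within a couple of minutes; watching the pods settle back into Running confirms it (sketch):

kubectl -n kube-system get pods -w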

Done. After that, you may need to restart Docker:

sudo systemctl restart docker

Update: it could be fixed by just running sudo systemctl restart docker.

– Tim C.