We upgraded our Kubernetes cluster from v1.24 to v1.25. The cluster was created with kubespray (version v1.2.21), and the upgrade to v1.25 completed successfully. However, pods we deploy afterwards cannot reach outside networks such as google.com: DNS lookups from inside a pod fail with the error below.

user@vm-util-mtm-wes-k8-upgrade-rnd:~$ kubectl exec -i -t dnsutils -- nslookup google.com
Server: 169.254.25.10
Address: 169.254.25.10#53

** server can't find google.com.reddog.microsoft.com: SERVFAIL

command terminated with exit code 1
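
The failing name in that output is google.com with the search domain reddog.microsoft.com appended, not google.com itself. As a minimal sketch of why that can happen, the hypothetical helper below (`candidates` is not a real tool) lists the names a stub resolver would typically try for a given `ndots` value and search list. The ordering is a simplified model, and the cluster search domains in the usage note are an assumption about what the pod's /etc/resolv.conf contains; they are not taken from this cluster.

```shell
#!/bin/sh
# candidates NAME NDOTS [SEARCH_DOMAIN...]
# Hypothetical helper: prints, in order, the names a stub resolver would
# typically try. Simplified model, not an exact copy of any resolver.
candidates() {
  name=$1
  ndots=$2
  shift 2

  # A trailing dot marks the name as fully qualified: no search list applies.
  case $name in
    *.) printf '%s\n' "$name"; return 0 ;;
  esac

  # Count the dots in the name.
  only_dots=$(printf '%s' "$name" | tr -cd '.')
  dots=${#only_dots}

  # Names with at least ndots dots are tried as-is before the search list.
  if [ "$dots" -ge "$ndots" ]; then
    printf '%s.\n' "$name"
  fi

  # Each search domain is appended in turn.
  for domain in "$@"; do
    printf '%s.%s.\n' "$name" "$domain"
  done

  # Names with fewer dots are tried as-is only after the search list.
  if [ "$dots" -lt "$ndots" ]; then
    printf '%s.\n' "$name"
  fi
}
```

For example, `candidates google.com 5 default.svc.cluster.local svc.cluster.local cluster.local reddog.microsoft.com` lists google.com.reddog.microsoft.com. before the bare google.com., whereas `candidates google.com. 5 reddog.microsoft.com` prints only google.com. because of the trailing dot. Running `kubectl exec -i -t dnsutils -- nslookup google.com.` (with the trailing dot) is therefore one way to check whether the bare name resolves when the search list is skipped.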

We have tried the steps in the Kubernetes guide "Debug DNS resolution", but the issue persists. Any suggestions?

Cluster information:


 - Kubernetes version: v1.25
 - Cloud being used: Azure VMs
 - Installation method: using kubespray
 - Host OS: ubuntu 20.04 LTS
 - CNI and version: Weave, v2.8.1
 - CRI and version: docker, v20.10

Below are the steps we have tried so far:

  1. In the Corefile of the coredns ConfigMap, the forward plugin pointed to 8.8.8.8 and 8.8.4.4 by default:
Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . 8.8.8.8 8.8.4.4 {
          prefer_udp
          max_concurrent 1000
        }
        cache 30

        loop
        reload
        loadbalance
    }

We changed it to forward to the /etc/resolv.conf file instead:

Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
          prefer_udp
          max_concurrent 1000
        }
        cache 30

        loop
        reload
        loadbalance
    }
  2. Contents of the /etc/resolv.conf file on the master nodes:
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 10.23.64.41
nameserver 10.23.64.42
nameserver 10.23.0.41
search reddog.microsoft.com

Three nameservers are defined there, but when we run the resolvectl command

resolvectl | grep "Current DNS Server"

it shows only one of them as the current server:

 Current DNS Server: 10.23.64.41
  3. We kept only one nameserver entry (i.e. 10.23.64.41) in the /etc/resolv.conf file, then ran daemon-reload and restarted kubelet:
systemctl daemon-reload
systemctl restart kubelet

But the issue still persisted.
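
To confirm which upstream servers a Corefile actually forwards to after an edit like the one in step 1, a small filter can pull the targets out of the `forward` line. This is only a sketch: `forward_targets` is a hypothetical helper name, and it assumes the forward directive sits on a single line as in the Corefiles shown above.

```shell
#!/bin/sh
# forward_targets: reads a Corefile on stdin and prints every upstream
# listed on a "forward" line, one per line (hypothetical helper).
forward_targets() {
  awk '
    $1 == "forward" {
      # Field 2 is the zone ("."); upstreams start at field 3.
      for (i = 3; i <= NF; i++) {
        if ($i != "{") print $i
      }
    }'
}
```

A possible use is `kubectl -n kube-system get configmap coredns -o jsonpath='{.data.Corefile}' | forward_targets`. The reply in the nslookup output came from 169.254.25.10, which I believe is Kubespray's default address for the node-local DNS cache; if that cache is enabled, it has its own ConfigMap with its own forward lines (the name `nodelocaldns` is an assumption), and the same filter can be applied to it. As far as I know, a running pod keeps the /etc/resolv.conf it was given at creation, so restarting kubelet alone would not refresh the copy inside the CoreDNS pods.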

Tapan Hegde
  • It seems the suffix in `google.com.reddog.microsoft.com` is the issue. Can you check the `pod.spec.dnsPolicy` of the `dnsutils` pod and the content of its `/etc/resolv.conf`? – Amila Senadheera Jun 07 '23 at 05:56
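
The check suggested in the comment can be sketched as follows. The pod's DNS policy can be read with `kubectl get pod dnsutils -o jsonpath='{.spec.dnsPolicy}'`, and the helper below (`summarize_resolv_conf`, a hypothetical name) condenses a resolv.conf into its nameservers, search list and `ndots` value so the pod's copy can be compared with the node's. It assumes the usual `nameserver`, `search` and `options` keywords and reports the resolver default of 1 when no `ndots` option is present.

```shell
#!/bin/sh
# summarize_resolv_conf: reads a resolv.conf on stdin and prints a
# three-line summary (hypothetical helper, not part of any tool).
summarize_resolv_conf() {
  awk '
    $1 == "nameserver" { ns = ns " " $2 }
    # Blank the keyword so $0 becomes " domain1 domain2 ...".
    $1 == "search"     { $1 = ""; search = $0 }
    $1 == "options"    {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^ndots:/) ndots = substr($i, 7)
      }
    }
    END {
      print "nameservers:" ns
      print "search:" search
      print "ndots: " (ndots == "" ? "1 (resolver default)" : ndots)
    }'
}
```

For example, `kubectl exec dnsutils -- cat /etc/resolv.conf | summarize_resolv_conf` summarizes the pod's view, and `summarize_resolv_conf < /etc/resolv.conf` on a node summarizes the host's view.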
