I want to deploy a small k8s test cluster using 3 VMs running on my host: one master and two workers. The required software is installed on all of them: docker, kubeadm, kubectl, and kubelet. I followed the steps described in the official documentation, but the flannel CNI images fail to pull and I can't work out why.

Here are the versions of the tools I installed:

root@kubernetes-master:~# docker version
Client: Docker Engine - Community
 Version:           20.10.19
 API version:       1.41
 Go version:        go1.18.7
 Git commit:        d85ef84
 Built:             Thu Oct 13 16:46:17 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
root@kubernetes-master:~# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.3", GitCommit:"434bfd82814af038ad94d62ebe59b133fcb50506", GitTreeState:"clean", BuildDate:"2022-10-12T10:55:36Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}
root@kubernetes-master:~# 

These are the steps I followed:

# init cluster
sudo kubeadm init --cri-socket=/var/run/containerd/containerd.sock

# deploy flannel
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

But the flannel pods are stuck in status Init:ErrImagePull:

root@kubernetes-master:/home/awadmin# kubectl get pods -A -o wide
NAMESPACE      NAME                                        READY   STATUS              RESTARTS       AGE   IP                NODE                 NOMINATED NODE   READINESS GATES
kube-flannel   kube-flannel-ds-c4q56                       0/1     Init:ErrImagePull   0              9h    192.168.122.135   kubernetes-master    <none>           <none>
kube-flannel   kube-flannel-ds-ktfh4                       0/1     Init:ErrImagePull   0              9h    192.168.122.211   kubernetes-worker2   <none>           <none>
kube-flannel   kube-flannel-ds-ztcxk                       0/1     Init:ErrImagePull   0              9h    192.168.122.202   kubernetes-worker1   <none>           <none>
kube-system    coredns-565d847f94-5zmgs                    0/1     Pending             0              9h    <none>            
...

Then I tried to pull the image manually with ctr and crictl, on both a worker and the master, but to no avail:

root@kubernetes-worker1:/home/awadmin# ctr images pull docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0                                                                                    
docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0: resolving      |--------------------------------------|                                                                                  
elapsed: 11.1s                                                   total:   0.0 B (0.0 B/s)                                                                                                                 
INFO[0011] trying next host                              error="failed to authorize: failed to fetch anonymous token: Get \"https://auth.docker.io/token?scope=repository%3Arancher%2Fmirrored-flannelcni-flannel-cni-plugin%3Apull&service=registry.docker.io\": net/http: TLS handshake timeout" host=registry-1.docker.io                                                                                         
ctr: failed to resolve reference "docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0": failed to authorize: failed to fetch anonymous token: Get "https://auth.docker.io/token?scope=repository%3Arancher%2Fmirrored-flannelcni-flannel-cni-plugin%3Apull&service=registry.docker.io": net/http: TLS handshake timeout                                                                                  
root@kubernetes-worker1:/home/awadmin# crictl pull docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead. 
ERRO[0000] unable to determine image API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory" 
E1020 11:26:29.791933   79655 remote_image.go:242] "PullImage from image service failed" err="rpc error: code = Unknown desc = failed to pull and unpack image \"docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0\": failed to resolve reference \"docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0\": failed to authorize: failed to fetch anonymous token: Get \"https://auth.docker.io/token?scope=repository%3Arancher%2Fmirrored-flannelcni-flannel-cni-plugin%3Apull&service=registry.docker.io\": dial tcp 44.207.96.114:443: i/o timeout" image="docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0"
FATA[0030] pulling image: rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0": failed to resolve reference "docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0": failed to authorize: failed to fetch anonymous token: Get "https://auth.docker.io/token?scope=repository%3Arancher%2Fmirrored-flannelcni-flannel-cni-plugin%3Apull&service=registry.docker.io": dial tcp 44.207.96.114:443: i/o timeout 
root@kubernetes-worker1:/home/awadmin# 
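
The crictl warning above says the default endpoints are deprecated and that the endpoint should be set explicitly. A minimal sketch of doing that, assuming containerd listens on unix:///run/containerd/containerd.sock (one of the defaults listed in the warning) and that crictl reads its configuration from /etc/crictl.yaml. The function name write_crictl_config is only illustrative:

```shell
# Write a crictl configuration that pins the CRI endpoints to containerd,
# so crictl stops probing the removed dockershim socket first.
# Assumption: containerd's socket is at /run/containerd/containerd.sock.
write_crictl_config() {
  # Default target is the system-wide crictl config; pass another path to test.
  config_path="${1:-/etc/crictl.yaml}"
  cat > "$config_path" <<'EOF'
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
EOF
}
```

Running write_crictl_config as root should only remove the dockershim.sock warning and error. The pull itself would presumably still fail on the token request to auth.docker.io.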

Pulls from registry.k8s.io apparently work fine.

I tried to check whether there were connectivity issues with the Docker registry, but apparently there are none:

root@kubernetes-master:~# curl -v https://docker.io/v2/_catalog
*   Trying 54.165.156.197:443...
* TCP_NODELAY set
* Connected to docker.io (54.165.156.197) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: CN=*.docker.com
*  start date: Jun 12 00:00:00 2022 GMT
*  expire date: Jul 11 23:59:59 2023 GMT
*  subjectAltName: host "docker.io" matched cert's "docker.io"
*  issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
*  SSL certificate verify ok.
> GET /v2/_catalog HTTP/1.1
> Host: docker.io
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 301 Moved Permanently
< content-length: 0
< location: https://www.docker.com/v2/_catalog
<
* Connection #0 to host docker.io left intact
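
Note that the curl above only reaches docker.io, whereas the failing pulls contact auth.docker.io and registry-1.docker.io (both names appear in the ctr error output). A minimal sketch of the same check against those hosts, assuming curl is installed on each node. The function name check_registry_hosts is only illustrative:

```shell
# Try a TCP + TLS connection to each host given as an argument.
# curl is run without -f, so any HTTP response (even 401 or 404) counts as
# reachable; only connection, TLS, or timeout failures are reported.
check_registry_hosts() {
  failed=0
  for host in "$@"; do
    if curl -sS -o /dev/null --connect-timeout 10 --max-time 20 "https://$host/"; then
      echo "$host: reachable"
    else
      echo "$host: FAILED"
      failed=1
    fi
  done
  # Non-zero exit status if at least one host could not be reached.
  return "$failed"
}
```

Calling check_registry_hosts auth.docker.io registry-1.docker.io on the master and on each worker would show whether the timeout is specific to those hosts or to a particular node.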

It must have something to do with the CRI endpoint, but I explicitly specified it at the init stage. Any ideas?

  • Hi creatldd1 creatldd1 welcome to S.F. One will observe it is requesting `auth.docker.io` and you requested `docker.io`. It's almost certainly some silly firewall something, but it's very hard to troubleshoot those things over S.F. Good luck! – mdaniel Oct 21 '22 at 02:25
  • @mdaniel thanks for the tip. There's also this post (https://stackoverflow.com/a/69294295/12512199) that makes me wonder if this issue is not ubuntu 20 x docker related. I'll work around that idea and see what it yields – creatldd1 creatldd1 Oct 21 '22 at 06:56
