I'm studying Kubernetes operators by working through the project https://github.com/kubernetes-operators-book/chapters.git. I ran the following steps (cwd: chapter/03):

k create -f etcd-operator-crd.yaml
k create -f etcd-operator-sa.yaml
k create -f etcd-operator-role.yaml
k create -f etcd-operator-rolebinding.yaml
k create -f etcd-operator-deployment.yaml
k create -f etcd-cluster-cr.yaml

Every resource is created except the etcd cluster :(

k version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.26.3
Kustomize Version: v4.5.7
Server Version: v1.26.1
k get pods
NAME                              READY   STATUS     RESTARTS   AGE
dnsutils                          1/1     Running    0          25h
etcd-operator-58bc7fbd7-n27kz     1/1     Running    0          24h
example-etcd-cluster-7f7fxlt9d7   0/1     Init:0/1   0          90m



kubectl exec -i -t dnsutils -- nslookup example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster

Server:     10.96.0.10
Address:    10.96.0.10#53
server can't find example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster: NXDOMAIN
command terminated with exit code 1


k describe pod example-etcd-cluster-7f7fxlt9d7
Name:             example-etcd-cluster-7f7fxlt9d7
Namespace:        default
Priority:         0
Service Account:  default
Node:             minikube/192.168.49.2
Start Time:       Mon, 01 May 2023 17:56:29 +0800
Labels:           app=etcd
                  etcd_cluster=example-etcd-cluster
                  etcd_node=example-etcd-cluster-7f7fxlt9d7
Annotations:      etcd.version: 3.1.10
Status:           Pending
IP:               10.244.0.13
IPs:
  IP:           10.244.0.13
Controlled By:  EtcdCluster/example-etcd-cluster
Init Containers:
  check-dns:
    Container ID:  containerd://00c3cda3e6eac0dad82750b36575187fc7bde83c9c04a4264bd6553468cdaff7
    Image:         busybox:1.28.0-glibc
    Image ID:      docker.io/library/busybox@sha256:0b55a30394294ab23b9afd58fab94e61a923f5834fba7ddbae7f8e0c11ba85e6
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c

                TIMEOUT_READY=0
                while ( ! nslookup example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc )
                do
                  # If TIMEOUT_READY is 0 we should never time out and exit
                  TIMEOUT_READY=$(( TIMEOUT_READY-1 ))
                              if [ $TIMEOUT_READY -eq 0 ];
                                  then
                                      echo "Timed out waiting for DNS entry"
                                      exit 1
                                  fi
                              sleep 1
                            done
    State:          Running
      Started:      Mon, 01 May 2023 17:56:30 +0800
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:         <none>
Containers:
  etcd:
    Container ID:
    Image:         :v3.1.10
    Image ID:
    Ports:         2380/TCP, 2379/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /usr/local/bin/etcd
      --data-dir=/var/etcd/data
      --name=example-etcd-cluster-7f7fxlt9d7
      --initial-advertise-peer-urls=http://example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc:2380
      --listen-peer-urls=http://0.0.0.0:2380
      --listen-client-urls=http://0.0.0.0:2379
      --advertise-client-urls=http://example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc:2379
      --initial-cluster=example-etcd-cluster-7f7fxlt9d7=http://example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc:2380
      --initial-cluster-state=new
      --initial-cluster-token=b6e75adf-c7e2-44fd-9fe5-7086de349a29
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Liveness:       exec [/bin/sh -ec ETCDCTL_API=3 etcdctl endpoint status] delay=10s timeout=10s period=60s #success=1 #failure=3
    Readiness:      exec [/bin/sh -ec ETCDCTL_API=3 etcdctl endpoint status] delay=1s timeout=5s period=5s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/etcd from etcd-data (rw)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  etcd-data:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:   <unset>
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  36m   default-scheduler  Successfully assigned default/example-etcd-cluster-7f7fxlt9d7 to minikube
  Normal  Pulled     36m   kubelet            Container image "busybox:1.28.0-glibc" already present on machine
  Normal  Created    36m   kubelet            Created container check-dns
  Normal  Started    36m   kubelet            Started container check-dns
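
A note on why the pod sits in Init:0/1 instead of failing: the check-dns script above starts with TIMEOUT_READY=0 and decrements before comparing against 0, so the counter goes to -1, -2, ... and never hits the exit branch. The following is only a sketch of that counter logic, not the operator's code: the nslookup call is replaced by a lookup that always fails, and the function name and the iteration cap (max_iter) are hypothetical, added so the demo terminates.

```shell
#!/bin/sh
# Sketch of the counter logic in the check-dns init container.
# Assumption: every DNS lookup fails, so the loop body always runs.
# would_time_out and max_iter are hypothetical names for this demo only.
would_time_out() {
  timeout_ready=$1   # same role as TIMEOUT_READY in the init container
  max_iter=$2        # demo-only cap; the real loop has no such cap
  i=0
  while [ "$i" -lt "$max_iter" ]; do
    # Decrement first, then compare with 0, as the original script does.
    timeout_ready=$(( timeout_ready - 1 ))
    if [ "$timeout_ready" -eq 0 ]; then
      echo "timed out after $(( i + 1 )) tries"
      return 0
    fi
    i=$(( i + 1 ))
  done
  echo "still waiting after $max_iter tries"
}

would_time_out 0 5   # prints: still waiting after 5 tries
would_time_out 3 5   # prints: timed out after 3 tries
```

So with the value 0 shown in the describe output, the init container keeps retrying for as long as the DNS name does not resolve.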

kubectl logs --namespace=kube-system -l k8s-app=kube-dns
[INFO] 10.244.0.13:48040 - 51805 "A IN example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc. udp 82 false 512" NXDOMAIN qr,aa,rd,ra 157 0.000162346s
[INFO] 10.244.0.13:56155 - 55592 "PTR IN 10.0.96.10.in-addr.arpa. udp 41 false 512" NOERROR qr,aa,rd 116 0.000186442s
[INFO] 10.244.0.13:56624 - 40757 "AAAA IN example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc.default.svc.cluster.local. udp 108 false 512" NXDOMAIN qr,aa,rd 201 0.000150069s
[INFO] 10.244.0.13:56624 - 34085 "A IN example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc.default.svc.cluster.local. udp 108 false 512" NXDOMAIN qr,aa,rd 201 0.000155803s
[INFO] 10.244.0.13:38005 - 62747 "A IN example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc.svc.cluster.local. udp 100 false 512" NXDOMAIN qr,aa,rd 193 0.000183339s
[INFO] 10.244.0.13:38005 - 49428 "AAAA IN example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc.svc.cluster.local. udp 100 false 512" NXDOMAIN qr,aa,rd 193 0.000297269s
[INFO] 10.244.0.13:42662 - 42938 "A IN example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc.cluster.local. udp 96 false 512" NXDOMAIN qr,aa,rd 189 0.000113432s
[INFO] 10.244.0.13:42662 - 20593 "AAAA IN example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc.cluster.local. udp 96 false 512" NXDOMAIN qr,aa,rd 189 0.000230624s
[INFO] 10.244.0.13:50139 - 20051 "A IN example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc. udp 82 false 512" NXDOMAIN qr,aa,rd,ra 157 0.000120582s
[INFO] 10.244.0.13:50139 - 8531 "AAAA IN example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc. udp 82 false 512" NXDOMAIN qr,aa,rd,ra 157 0.00021452s

It seems the DNS record wasn't set correctly, so the pod is stuck at the init stage in check-dns. I'm new to k8s. Has anyone had the same issue and fixed it? T-T Any help would be appreciated.
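
To read the CoreDNS log above: the different names being queried look like the resolver appending its search domains to the name from the init container. A minimal sketch, assuming the usual pod search list for the default namespace (default.svc.cluster.local, svc.cluster.local, cluster.local); the pod's /etc/resolv.conf is not shown here, so that list is an assumption, and expand_search is a hypothetical helper name. The output order is illustrative and not a claim about the order the resolver actually uses.

```shell
#!/bin/sh
# Prints the candidate names for a lookup: the name joined with each
# search domain, then the name on its own, each with a trailing dot
# (the form the CoreDNS log uses).
# expand_search is a hypothetical helper; the search list is assumed.
expand_search() {
  name=$1
  shift
  for domain in "$@"; do
    printf '%s.%s.\n' "$name" "$domain"
  done
  printf '%s.\n' "$name"
}

expand_search \
  example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc \
  default.svc.cluster.local svc.cluster.local cluster.local
```

All four candidates appear in the log with NXDOMAIN, including example-etcd-cluster-7f7fxlt9d7.example-etcd-cluster.default.svc.cluster.local., which is the full in-cluster name. That suggests the record itself is missing rather than the lookup name being malformed.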

I have tried upgrading the etcd and operator versions, but that didn't help. Some people report the same issue when using a newer busybox version, and someone else has the same issue but it can't be reproduced: https://github.com/sensu/sensu-operator/issues/5. Any kind of help is welcome T-T

JohnSmith
  • format body,make it better – JohnSmith May 01 '23 at 12:51
  • It seems that your link is not correct: ",after" is included in the URL. As a rule of thumb, I would recommend not learning from such an old exercise. What is the output of "kubectl version --short"? – aboitier May 01 '23 at 14:17
  • Hi, thanks for helping. The version output is: k version --short Flag --short has been deprecated, and will be removed in the future. The --short output will become the default. Client Version: v1.26.3 Kustomize Version: v4.5.7 Server Version: v1.26.1. I think the exercise is old too; I'm considering downgrading k8s to 1.16 to fit the operator :( – JohnSmith May 01 '23 at 15:37
  • Seems like a known issue: https://github.com/kubernetes-operators-book/chapters/issues/12. Also, Sensu is a whole different cup of tea and most likely unrelated to what you are seeing. – Rick Rackow May 02 '23 at 09:28
  • Yeah, after dozens of searches I think the code being out of date is what causes the issue. Giving up on that T-T – JohnSmith May 03 '23 at 08:36

0 Answers