
When I join a new node as a master to my cluster, I get an error. My cluster version is 1.17.0. The command I executed on the node is:

kubeadm join 192.168.1.120:6443 --token 5hbl78.99jlbgerstlkecss --discovery-token-ca-cert-hash sha256:0beb43185fa6a346fe57bd97cbb22afb128e6267bb80403ba2e7f388588e3256 --control-plane --certificate-key a056ad6f0ba73e736401027a1f078d7195b1aadaf2ac2eca6d773edc98d01483

I receive the following errors:

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
error execution phase preflight: 
One or more conditions for hosting a new control plane instance is not satisfied.

unable to add a new control plane instance a cluster that doesn't have a stable controlPlaneEndpoint address

Please ensure that:
* The cluster has a stable controlPlaneEndpoint address.
* The certificates that must be shared among control plane instances are provided.


To see the stack trace of this error execute with --v=5 or higher 

The kubeadm config on the master node is:

root@k8s-master01:kubernetes#kubectl -n kube-system get cm kubeadm-config -oyaml
apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      certSANs:
      - 192.168.1.120
      - 192.168.1.121
      - 192.168.1.122
      extraArgs:
        authorization-mode: Node,RBAC
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controllerManager: {}
    dns:
      type: CoreDNS
    etcd:
      external:
        caFile: /work/deploy/kubernetes/security/ca.pem
        certFile: /work/deploy/kubernetes/security/etcd.pem
        endpoints:
        - https://192.168.1.120:2379
        - https://192.168.1.121:2379
        - https://192.168.1.122:2379
        keyFile: /work/deploy/kubernetes/security/etcd.key
    imageRepository: registry.aliyuncs.com/google_containers
    kind: ClusterConfiguration
    kubernetesVersion: v1.17.0
    networking:
      dnsDomain: cluster.local
      podSubnet: 192.168.0.0/16
      serviceSubnet: 10.10.0.0/16
    scheduler: {}
  ClusterStatus: |
    apiEndpoints:
      k8s-master01:
        advertiseAddress: 192.168.1.120
        bindPort: 6443
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterStatus
kind: ConfigMap
metadata:
  creationTimestamp: "2020-02-20T05:27:10Z"
  name: kubeadm-config
  namespace: kube-system
  resourceVersion: "8315"
  selfLink: /api/v1/namespaces/kube-system/configmaps/kubeadm-config
  uid: a32b2f9b-41c3-4822-b8cb-c30c922fbddb

This problem has been mentioned on Stack Overflow before, but was not solved there.

I reset my cluster and cleaned up the etcd data. Then I configured a VIP with keepalived, and configured haproxy to load-balance the VIP to the two nodes I plan to use as masters. After that I edited kubeadm-config.yaml and set the controlPlaneEndpoint value to the load balancer's VIP:PORT. Then I executed "kubeadm init --config kubeadm.conf --upload-certs" and received the following errors:

[control-plane] Creating static Pod manifest for "kube-scheduler"
W0221 10:58:26.827277   18370 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
    timed out waiting for the condition

This error is likely caused by:
    - The kubelet is not running
    - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
    - 'systemctl status kubelet'
    - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
    - 'docker ps -a | grep kube | grep -v pause'
    Once you have found the failing container, you can inspect its logs with:
    - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
root@k8s-master01:kubernetes# journalctl -xeu kubelet
Feb 21 11:05:04 k8s-master01 kubelet[22546]: E0221 11:05:04.698260   22546 reflector.go:156] k8s.io/kubernetes/pkg/kubelet/kubelet.go:449: Failed to list *v1.Service: Get https://192.168.1.121:6443/api/v1/services?limit=500&resour
Feb 21 11:05:04 k8s-master01 kubelet[22546]: E0221 11:05:04.781676   22546 kubelet.go:2263] node "k8s-master01" not found
Feb 21 11:05:04 k8s-master01 kubelet[22546]: E0221 11:05:04.881928   22546 kubelet.go:2263] node "k8s-master01" not found
Feb 21 11:05:04 k8s-master01 kubelet[22546]: E0221 11:05:04.895805   22546 reflector.go:156] k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: Failed to list *v1.Node: Get https://192.168.1.121:6443/api/v1/nodes?fieldSelector=metadata
Feb 21 11:05:04 k8s-master01 kubelet[22546]: E0221 11:05:04.983615   22546 kubelet.go:2263] node "k8s-master01" not found
Feb 21 11:05:05 k8s-master01 kubelet[22546]: E0221 11:05:05.084247   22546 kubelet.go:2263] node "k8s-master01" not found
Feb 21 11:05:05 k8s-master01 kubelet[22546]: E0221 11:05:05.106561   22546 reflector.go:156] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://192.168.1.121:6443/api/v1/pods?fieldSelector=sp
Feb 21 11:05:05 k8s-master01 kubelet[22546]: E0221 11:05:05.184665   22546 kubelet.go:2263] node "k8s-master01" not found
Feb 21 11:05:05 k8s-master01 kubelet[22546]: E0221 11:05:05.284792   22546 kubelet.go:2263] node "k8s-master01" not found

Other info: my VIP's endpoint is 192.168.1.200:6001, and haproxy load-balances the VIP's endpoint to the two master apiserver endpoints (192.168.1.120:6443 and 192.168.1.121:6443).
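
A minimal TCP-passthrough haproxy configuration for this layout would look roughly like the following (the frontend/backend names below are illustrative, not my exact file):

# /etc/haproxy/haproxy.cfg (sketch)
frontend kube-apiserver
    bind *:6001
    mode tcp
    option tcplog
    default_backend kube-apiserver-nodes

backend kube-apiserver-nodes
    mode tcp
    balance roundrobin
    option tcp-check
    server k8s-master01 192.168.1.120:6443 check
    server k8s-master02 192.168.1.121:6443 check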

Esc

1 Answer


When you set up the first master node using kubeadm, you should have run the command below:

sudo kubeadm init --config kubeadm-config.yaml --upload-certs

Check the content of the kubeadm-config.yaml file. It should have a controlPlaneEndpoint, and the value should be LOAD_BALANCER_DNS:LOAD_BALANCER_PORT.

If you don't have a load balancer in front of your Kubernetes API server (which is the recommended setup), you can set this to the public IP of the master node.

The --upload-certs flag should take care of the error related to certificates.

You can also edit the kubeadm-config ConfigMap and add controlPlaneEndpoint there.
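
For example, the edit would be (a minimal sketch, assuming the kubeadm-config ConfigMap shown in the question):

    kubectl -n kube-system edit cm kubeadm-config
    # Inside the embedded ClusterConfiguration, add a line such as:
    #   controlPlaneEndpoint: "192.168.1.200:6001"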

The content of kubeadm-config.yaml should look like this:

    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterConfiguration
    kubernetesVersion: stable
    controlPlaneEndpoint: "192.168.1.200:6001"
    apiServer:
      certSANs:
      - 192.168.1.120
      - 192.168.1.121
      - 192.168.1.122
      - 192.168.1.200
      extraArgs:
        authorization-mode: Node,RBAC
    etcd:
      external:
        endpoints:
        - https://192.168.1.120:2379
        - https://192.168.1.121:2379
        - https://192.168.1.122:2379
        caFile: /work/deploy/kubernetes/security/ca.pem
        certFile: /work/deploy/kubernetes/security/etcd.pem
        keyFile: /work/deploy/kubernetes/security/etcd.key
    networking:
      dnsDomain: cluster.local
      podSubnet: 192.168.0.0/16
      serviceSubnet: 10.10.0.0/16
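
Once kubeadm init succeeds with this configuration, additional control-plane nodes join through the load balancer endpoint rather than a single master's IP. The token, discovery hash, and certificate key below are placeholders; a fresh certificate key can be printed with kubeadm init phase upload-certs --upload-certs:

    sudo kubeadm join 192.168.1.200:6001 \
      --token <token> \
      --discovery-token-ca-cert-hash sha256:<hash> \
      --control-plane \
      --certificate-key <certificate-key>
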
Arghya Sadhu
  • Hi, Sadhu. I took your suggestion and modified the kubeadm-config.yaml, but I get a new problem; you can see it above in the question – Esc Feb 21 '20 at 08:32
  • Before doing kubeadm init, can you do kubeadm reset --force and sudo rm -rf ~/.kube? – Arghya Sadhu Feb 21 '20 at 08:33
  • Yes, I added the --force flag and removed ~/.kube, but I still get the same error – Esc Feb 21 '20 at 10:59
  • Are you using stacked etcd or external etcd mode of kubeadm? Can you try this command: sudo kubeadm init --control-plane-endpoint "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT" --upload-certs – Arghya Sadhu Feb 21 '20 at 11:05
  • I use the external etcd mode; you can see the kubeadm config I pasted above in my question. I did not add ```--control-plane-endpoint "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT"``` to the command, but I did add it in the kubeadm config. Part of the config looks like this: ```kind: InitConfiguration controlPlaneEndpoint: 192.168.1.200:6001 localAPIEndpoint: advertiseAddress: 192.168.1.121 bindPort: 6443 nodeRegistration: criSocket: /var/run/dockershim.sock name: k8s-master01``` – Esc Feb 21 '20 at 16:25
  • I have updated my answer with a kubeadm config YAML file, try that – Arghya Sadhu Feb 21 '20 at 16:57
  • My configuration is essentially the same as what you wrote. – Esc Feb 21 '20 at 18:59
  • But in the comment above you have k8s-master-01 – Arghya Sadhu Feb 22 '20 at 03:22