
I have an account that can access an OpenStack site for my work, and every time I execute an OpenStack CLI command I have to add the extra "--insecure" option to make it work, something like this:

>> openstack server list --insecure
+--------------------------------------+------------------------------+--------+-----------------------------------------------------+--------------------------+-----------+
| ID                                   | Name                         | Status | Networks                                            | Image                    | Flavor    |
+--------------------------------------+------------------------------+--------+-----------------------------------------------------+--------------------------+-----------+
| 57bea5...                            | US-280-1                     | ACTIVE | main_network=10.31.1.162, 10.96.129.112             | N/A (booted from volume) | m1.xlarge |
| 7ace60...                            | US-280-2                     | ACTIVE | main_network=10.31.0.200, 10.96.130.120             | N/A (booted from volume) | m1.xlarge |

Anyway, today I wanted to create a k8s cluster using the Kubespray framework, and I set the external_cloud_provider to "openstack" as well. Basically I am trying to learn how to do the k8s setup.
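For reference, this is roughly what I set in my inventory group_vars (my file layout follows Kubespray's sample inventory, so yours may differ):

    # e.g. inventory/mycluster/group_vars/all/all.yml
    cloud_provider: external
    external_cloud_provider: openstack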

I checked out the code from https://github.com/kubernetes-sigs/kubespray and ran the setup without any errors.

But after everything was set up, when I checked the pod status I saw a failed one:

>> kubectl get pods -A
NAMESPACE     NAME                                                   READY   STATUS             RESTARTS       AGE
kube-system   openstack-cloud-controller-manager-v2qb8               0/1     CrashLoopBackOff   12 (38s ago)   23m
...

And in the pod log, it says:

I1117 00:09:29.487677       1 serving.go:348] Generated self-signed cert in-memory
W1117 00:09:29.642451       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
W1117 00:09:29.668751       1 openstack.go:173] New openstack client created failed with config: Post "http://<my_original_openstack_site>:5000/v3/auth/tokens": x509: certificate signed by unknown authority
F1117 00:09:29.668907       1 main.go:84] Cloud provider could not be initialized: could not init cloud provider "openstack": Post "https://<my_original_openstack_site>:5000/v3/auth/tokens": x509: certificate signed by unknown authority

I have a feeling I need to set an "insecure=true" flag somewhere during this OpenStack cloud provider setup. Does anyone know where I should put this flag?

Thanks a lot for the help.

Jack

user3595231
  • You need to provide the CA certificate so the server certificates can be verified. – eblock Nov 17 '22 at 22:50
  • Thanks for the tips. Do I need to create the CA certificate manually and have it copied over to the nodes? Or will it be handled by the clusters.yml? – user3595231 Nov 18 '22 at 04:07
  • I'm not really familiar with k8s, but the CA is the certificate authority that created/signed the certificates used by Keystone etc. So you'll need to get the CA and inject it into the containers. [Here's](https://github.com/rancher/rancher/issues/21011) something similar described. – eblock Nov 18 '22 at 08:04
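For completeness, a hedged sketch of where such settings usually live for the external OpenStack cloud provider: the [Global] section of the cloud config (cloud.conf) that the openstack-cloud-controller-manager reads. The option names below follow the cloud-provider-openstack configuration as I understand it; the CA file path, and how Kubespray templates and mounts this file, are assumptions to verify against your checkout.

    [Global]
    auth-url=https://<my_original_openstack_site>:5000/v3
    username=<openstack-user>
    password=<openstack-password>
    tenant-name=<project>
    domain-name=Default
    region=<region>
    # Preferred: trust the CA that signed the Keystone certificate
    ca-file=/etc/ssl/certs/openstack-ca.pem
    # Alternative, mirroring the CLI's --insecure (skips verification entirely):
    # tls-insecure=true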

1 Answer


Failed client-side validation of the server certificate is the cause of the "x509: certificate signed by unknown authority" error: the client does not trust the CA that signed the certificate presented by the server.
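A quick way to see which authority actually signed the certificate a server presents, using the Keystone endpoint from the logs above as an example (the host placeholder is kept as in the question):

# Print the issuer and subject of the certificate presented on the Keystone port (5000)
openssl s_client -connect <my_original_openstack_site>:5000 -showcerts </dev/null 2>/dev/null |
 openssl x509 -noout -issuer -subject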

The client credentials are generated by an auth-provider entry in the kubeconfig file; I am assuming you're using Google Cloud, for example:

    - name: kubectl-user
      user:
        auth-provider:
          config:
            cmd-args: config config-helper --format=json
            cmd-path: /usr/lib/google-cloud-sdk/bin/gcloud
            expiry-key: '{.credential.token_expiry}'
            token-key: '{.credential.access_token}'
          name: gcp

If the certificate-authority-data field is missing, or its issuer differs from the issuer of the certificate presented by the k8s API server, then any kubectl command that needs to reach the k8s API server will fail with the error "x509: certificate signed by unknown authority".

Troubleshooting and mitigation steps:

To check whether the CA in the kubeconfig file has the same issuer as the certificate presented by the kube API server, the following steps can help.

1) Get the certificate from the kubeconfig file:

kubectl config view --minify --raw --output 'jsonpath={..cluster.certificate-authority-data}' |
 base64 -d > /tmp/kubectl-cacert

2) Get the certificate presented by the k8s API server:

# Host (and optional port) taken from the kubeconfig server URL
CLUSTER_HOST=$(kubectl config view --minify --output 'jsonpath={..cluster.server}' | cut -d"/" -f3)

if [ -n "${CLUSTER_HOST}" ]; then
  # The server URL may or may not include a port; default to 443 if it does not
  case "${CLUSTER_HOST}" in *:*) ;; *) CLUSTER_HOST="${CLUSTER_HOST}:443" ;; esac
  openssl s_client -connect "${CLUSTER_HOST}" 2>/dev/null </dev/null |
    sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > /tmp/kube-api-cacert
else echo 'Cluster server not set.'; fi

3) Check the certificates; they should have the same issuer:

openssl x509 -in /tmp/kube-api-cacert -issuer -noout
openssl x509 -in /tmp/kubectl-cacert -issuer -noout

Example output:

$ openssl x509 -in /tmp/kube-api-cacert -issuer -noout
issuer=CN = 38d76ff6-cc21-474b-b919-c746d845d03d
$ openssl x509 -in /tmp/kubectl-cacert -issuer -noout
issuer=CN = 38d76ff6-cc21-474b-b919-c746d845d03d

If the issuers are different, you'll get the "Unable to connect to the server: x509: certificate signed by unknown authority" error. It is possible to use the certificate from the k8s API server in the kubeconfig instead, and kubectl should work again, for example:

kubectl config set clusters.$(kubectl config current-context).certificate-authority-data $(cat /tmp/kube-api-cacert | base64 -w0)

4) Run kubectl; it should now work just fine:

$ kubectl get node
NAME                                                STATUS   ROLES    AGE   VERSION
gke-cert-error-cluster-default-pool-32170571-1f1r   Ready    <none>   21h   v1.21.10-gke.2000
gke-cert-error-cluster-default-pool-32170571-1r6h   Ready    <none>   21h   v1.21.10-gke.2000
gke-cert-error-cluster-default-pool-32170571-4mbj   Ready    <none>   21h   v1.21.10-gke.2000

The Error "CrashLoopBackOff" represents a Kubernetes state representing a restart loop that is happening in a Pod: a container in the Pod is started, but crashes and is then restarted, over and over again. Kubernetes will wait an increasing back-off time between restarts to give you a chance to fix the error.

Common reasons for a CrashLoopBackOff:

Some of the errors linked to the actual application are:

1) An error in your Docker image, so the container/Pod is not able to start. I would advise you to double-check your application configuration (for example an nginx conf file such as etc/nginx/conf.d/project.conf) for any misconfiguration.

2) A resource is not available: for example, a PersistentVolume that is not mounted.

3) Wrong command-line arguments: either missing or incorrect.

4) Bugs and exceptions: that can be anything, and is very specific to your application.

And finally, errors related to the network and permissions are:

1) You tried to bind to a port that is already in use.

2) The memory limits are too low, so the container is OOM-killed (see the check sketched below).

3) Errors in the liveness probes, so the Pod is not reported as ready.

4) Read-only filesystems, or a lack of permissions in general.

A similar back-off status is ImagePullBackOff, the waiting state when a container image couldn't be pulled. The above is just a list of possible causes; there could be many others.
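As a hedged example for cause 2) in the network/permissions list above (pod name taken from the question; the jsonpath simply reads the container's last termination reason):

# Prints e.g. "OOMKilled" if the container hit its memory limit, or "Error" for other crashes
kubectl -n kube-system get pod openstack-cloud-controller-manager-v2qb8 \
 -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'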

How to debug, troubleshoot, and fix a CrashLoopBackOff state (example commands follow the list):

1) Check the pod description.

2) Check the pod logs.

3) Check the events.

4) Check the deployment.
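A minimal set of commands for the four steps above, using the failing pod from the question (whether the controller manager runs as a DaemonSet or a Deployment depends on how Kubespray deployed it, so the last command is an assumption to adjust):

# 1) Pod description: look at State, Last State and the Events at the bottom
kubectl -n kube-system describe pod openstack-cloud-controller-manager-v2qb8
# 2) Pod logs, including the previously crashed container
kubectl -n kube-system logs openstack-cloud-controller-manager-v2qb8 --previous
# 3) Events in the namespace, newest last
kubectl -n kube-system get events --sort-by=.metadata.creationTimestamp
# 4) The controlling workload (here assumed to be a DaemonSet)
kubectl -n kube-system get daemonset openstack-cloud-controller-manager -o yaml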

Please refer to CrashLoopBackOff and how to fix it for more information.

Veera Nagireddy