We are running an Anthos cluster on VMware and are having issues pulling container images from the registry.k8s.io registry. We are seeing error messages such as:
Failed to pull image "registry.k8s.io/csi-secrets-store/driver-crds:v1.3.3": rpc error: code = Unknown desc = failed to pull and unpack image "registry.k8s.io/csi-secrets-store/driver-crds:v1.3.3": failed to resolve reference "registry.k8s.io/csi-secrets-store/driver-crds:v1.3.3": failed to do request: Head "https://europe-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/csi-secrets-store/driver-crds/manifests/v1.3.3": x509: certificate signed by unknown authority
Warning Failed 3s (x4 over 88s) kubelet Error: ErrImagePull
We've checked our firewall rules and nothing is blocking access, so I think it's an issue with the trusted certificates in the Anthos node images. We're using the default Ubuntu containerd image, which I believe is based on Ubuntu 18.04. Our Anthos version is 1.14.1-gke.39.
If I try to curl https://europe-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/csi-secrets-store/driver-crds/manifests/v1.3.3
from the Anthos admin workstation (which also uses a Google-provided OS image), I also get an error: curl failed to verify the legitimacy of the server and therefore could not establish a secure connection to it.
If I do the same from our jumpbox (in the same VLAN), which runs standard Ubuntu 20.04.5, it's all OK. I can also pull the image on that box.
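For anyone wanting to reproduce the comparison between the two machines, a rough sketch along the following lines could be run on both the admin workstation and the jumpbox. It assumes Python 3 is available on each host, and it uses Python's default trust store, which may not be exactly the store containerd consults. The function names `endpoint_from_url` and `check_tls` are illustrative only. It attempts a verified TLS handshake against the same endpoint and prints either the certificate issuer or the verification error, along with the CA paths in use.

```python
import socket
import ssl
import sys
from urllib.parse import urlsplit

# Endpoint taken from the error message above.
URL = ("https://europe-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/"
       "csi-secrets-store/driver-crds/manifests/v1.3.3")


def endpoint_from_url(url):
    """Return (host, port) for an https URL, defaulting to port 443."""
    parts = urlsplit(url)
    return parts.hostname, parts.port or 443


def check_tls(host, port=443, timeout=10.0):
    """Attempt a verified TLS handshake using the default trust store.

    Returns (True, issuer_dict) on success or (False, error_text) on failure.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
                # The issuer is a tuple of RDNs; keep the first pair of each.
                issuer = dict(rdn[0] for rdn in cert.get("issuer", ()))
                return True, issuer
    except ssl.SSLError as exc:
        # Certificate verification failures are reported here.
        return False, str(exc)
    except OSError as exc:
        # DNS, routing, or firewall problems rather than a trust problem.
        return False, str(exc)


if __name__ == "__main__":
    host, port = endpoint_from_url(sys.argv[1] if len(sys.argv) > 1 else URL)
    print("default CA paths:", ssl.get_default_verify_paths())
    ok, detail = check_tls(host, port)
    print("verified" if ok else "FAILED", host, detail)
```

If the handshake is verified on the jumpbox but fails on the workstation, comparing the issuer printed on the jumpbox against the CA bundle on the workstation may help narrow down which certificate is missing.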
So I think the issue is with the Anthos OS images, and it seems the only solution is for Google to update them with the required certificates. I don't think installing them ourselves is very practical.
And it seems we're not the only ones experiencing this issue: https://www.googlecloudcommunity.com/gc/Anthos/Anthos-config-management-operator/m-p/542966#M275
Any suggestions for other things to try or possible solutions welcome!