GKE: metric server crashlooping (crosspost from r/googlecloud)

Asked Jul 14 '23 at 07:37

Active Jul 14 '23 at 07:37

Viewed 27 times

I have several (<10) gke clusters, all but one are all in the same condition and I can't figure out what and why is it happening. I hope to find someone that managed to solve the same issue :)

Some time ago, i noticed that our HPA stopped working, having no way to read metrics from pods. Long story short, our pod named "metrics-server-v0.5.2-*" crashloops outputting a stack trace like this one: https://nopaste.net/aA5rwAeBWI and this one: https://nopaste.net/3uQsD6EDfA .

I tried to restart the deployment, without any success. What I don't understand is why one of the clusters (the oldest to be created) is working...

From r/googlecloud (reddit) it seems aknowledged that is not specifically tied to our workloads.

All the clusters are updated to the same version: v1.24.12-gke.500 and have been created by terraform resources.

Do any of you have any pointer?

Thanks.

asked Jul 14 '23 at 07:37

Luca Gervasi

GKE: metric server crashlooping (crosspost from r/googlecloud)

0 Answers0