I have a long-running GKE cluster with several pods based on the same Java environment and overall structure. Earlier today I upgraded the nodes to the latest stable Kubernetes version (from v1.23.14 to v1.23.16). After the upgrade completed, the majority of my pods recovered; however, a few of them (7) are stuck in a crash loop where they throw a java.lang.NullPointerException when using the Java SecretManagerServiceClient class to read secrets:
The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
Note: all of these pods worked BEFORE the GKE upgrade. Many services with identical logic (they all use the same library for the secret-reading code) work just fine, but this small set is stuck.
Note that I do not define GOOGLE_APPLICATION_CREDENTIALS in my pods because they are running in GKE.
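Since the pods rely on the credentials that GKE provides rather than on a key file, one first check is whether an affected pod can reach the metadata server at all. The following is only a minimal diagnostic sketch, not part of my services: the class name AdcProbe and its helper methods are hypothetical, and it assumes Java 11+ and the standard metadata endpoint for the default service account, queried with the required Metadata-Flavor header.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Map;

// Hypothetical diagnostic class; run it inside an affected pod.
public class AdcProbe {
    // Standard metadata endpoint for the default service account's email.
    static final String METADATA_URL =
        "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email";

    // Reports whether a key file is configured via the environment.
    static String describeEnv(Map<String, String> env) {
        String path = env.get("GOOGLE_APPLICATION_CREDENTIALS");
        if (path == null || path.isEmpty()) {
            return "GOOGLE_APPLICATION_CREDENTIALS unset: ADC must fall back to the metadata server";
        }
        return "GOOGLE_APPLICATION_CREDENTIALS set to " + path;
    }

    // Sends one GET to the metadata server and summarizes the outcome.
    static String probeMetadata(String url, Duration timeout) {
        try {
            HttpClient client = HttpClient.newBuilder().connectTimeout(timeout).build();
            HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Metadata-Flavor", "Google") // required by the metadata server
                .timeout(timeout)
                .GET()
                .build();
            HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
            return "HTTP " + response.statusCode() + ": " + response.body().trim();
        } catch (Exception e) {
            // DNS failure, connection refused, timeout, or interruption.
            return "unreachable: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(describeEnv(System.getenv()));
        System.out.println("metadata server -> "
            + probeMetadata(METADATA_URL, Duration.ofSeconds(3)));
    }
}
```

If a healthy pod prints a service account email while a crashing pod prints "unreachable" or a non-200 status, that would suggest the difference lies in how those pods or their nodes reach the metadata server rather than in the secret-reading library.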
Any thoughts on how to debug this issue?