I just upgraded my EKS cluster from 1.15 to 1.16, and now I can't get the deployments in my namespaces up and running. When I run kubectl get po to list my pods, they are all stuck in the CrashLoopBackOff state. I described one of the pods, and this is what I get in the Events section:

Events:
  Type     Reason   Age                  From     Message
  ----     ------   ----                 ----     -------
  Normal   Pulling  56m (x8 over 72m)    kubelet  Pulling image "xxxxxxx.dkr.ecr.us-west-2.amazonaws.com/xxx-xxxx-xxxx:master.697.7af45fff8e0"
  Warning  BackOff  75s (x299 over 66m)  kubelet  Back-off restarting failed container

Kubernetes version:

Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"clean", BuildDate:"2021-03-18T01:10:43Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.15-eks-e1a842", GitCommit:"e1a8424098604fa0ad8dd7b314b18d979c5c54dc", GitTreeState:"clean", BuildDate:"2021-07-31T01:19:13Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

2 Answers

1

It looks like your container is stuck pulling its image. Here are some things you can check:

  1. Ensure the image is present in ECR.
  2. Ensure the EKS cluster is able to connect to ECR. If it's a private repo, pulling requires credentials.
  3. Run docker pull and see whether the image can be pulled directly (most likely it will fail or ask for credentials if none have been passed).
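
The three checks above can be sketched as shell commands. This is only a rough sketch, assuming a recent AWS CLI and Docker are installed and that your AWS credentials are already configured. The IMAGE value (account ID, repository and tag) is a made-up placeholder, and check_ecr_pull is a hypothetical helper name, so substitute your own image reference.

```shell
# Placeholder image reference (hypothetical account ID, repo and tag).
IMAGE="123456789012.dkr.ecr.us-west-2.amazonaws.com/my-app:master.1"

# Split the reference into the pieces the AWS CLI expects.
REGISTRY="${IMAGE%%/*}"                            # host part before the first "/"
REST="${IMAGE#*/}"                                 # "repo:tag"
REPO="${REST%%:*}"                                 # repository name
TAG="${REST##*:}"                                  # image tag
REGION="$(printf '%s' "$REGISTRY" | cut -d. -f4)"  # 4th dot-separated field of the ECR hostname

# Hypothetical helper: runs checks 1-3 in order and stops at the first failure.
check_ecr_pull() {
  # 1. Is the tag present in the repository?
  aws ecr describe-images --region "$REGION" \
    --repository-name "$REPO" --image-ids imageTag="$TAG" || return 1

  # 2. Can we authenticate against the private registry?
  aws ecr get-login-password --region "$REGION" |
    docker login --username AWS --password-stdin "$REGISTRY" || return 1

  # 3. Can the image be pulled directly?
  docker pull "$IMAGE"
}
```

Calling check_ecr_pull from a machine with the same network access as the worker nodes gives the most representative result. A successful pull here does not prove the container will start, only that the image is reachable.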
Jay
  • It's not just my app pods; the pods in the kube-system namespace are affected too. coredns, kube-proxy and aws-node are running, but here is what the description of my metrics-server pod looks like:

Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Warning  BackOff  3m19s (x702 over 155m)  kubelet  Back-off restarting failed container

– DevopsinAfrica Aug 31 '21 at 08:12
1

So the problem was that I was trying to run x86 containers on ARM node instances. Everything worked once I changed the image in the launch template for my node group.
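
For anyone hitting the same thing, here is a rough sketch of how the mismatch could be spotted. It assumes kubectl is pointed at the cluster and that the image has been pulled locally with Docker. The function names are hypothetical helpers, not part of any tool. A binary built for the wrong architecture typically also fails with an "exec format error" in the output of kubectl logs --previous.

```shell
# Map the common spellings of a CPU architecture onto the names
# that Kubernetes and Docker report (amd64, arm64).
normalize_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo "$1" ;;
  esac
}

# Succeeds (exit 0) when the two architecture names refer to the same platform.
arch_matches() {
  [ "$(normalize_arch "$1")" = "$(normalize_arch "$2")" ]
}

# Hypothetical helper: print every node with the architecture it reports.
list_node_arch() {
  kubectl get nodes \
    -o custom-columns='NODE:.metadata.name,ARCH:.status.nodeInfo.architecture'
}

# Hypothetical helper: print the architecture of a locally pulled image.
image_arch() {
  docker image inspect --format '{{.Architecture}}' "$1"
}
```

Run list_node_arch to see what the nodes report, then something like arch_matches "$(image_arch "$IMAGE")" arm64 to compare. Note that for a multi-architecture image, Docker pulls the variant matching the local machine, so the local result may differ from what a node would pull.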