
omsagent-win is the pod in the kube-system namespace that ships with AKS if you have Azure insights enabled. I am running a hybrid cluster here, with both Windows and Linux nodes.

Output of kubectl get nodes:

NAME                            STATUS   ROLES   AGE     VERSION
aks-nplin-21116150-vmss000002   Ready    agent   2d21h   v1.21.2
aks-nplin-21116150-vmss000003   Ready    agent   2d21h   v1.21.2
aks-nplin-21116150-vmss000006   Ready    agent   2d21h   v1.21.2
aksnpwin000003                  Ready    agent   2d20h   v1.21.2
aksnpwin000004                  Ready    agent   2d20h   v1.21.2
aksnpwin000005                  Ready    agent   2d20h   v1.21.2

On the Linux nodes everything works fine.

NAME                                    READY   STATUS    RESTARTS   AGE     IP             NODE                            NOMINATED NODE   READINESS GATES
omsagent-xscbv                          2/2     Running   0          2d21h   10.240.1.108   aks-nplin-21116150-vmss000006   <none>           <none>
omsagent-k2zlx                          2/2     Running   0          2d21h   10.240.0.137   aks-nplin-21116150-vmss000002   <none>           <none>
omsagent-pzd4s                          2/2     Running   0          2d21h   10.240.0.79    aks-nplin-21116150-vmss000003   <none>           <none>

But on the Windows nodes the pods restart constantly. I have also checked the nodeSelector.

NAME                                    READY   STATUS    RESTARTS   AGE     IP             NODE                            NOMINATED NODE   READINESS GATES
omsagent-win-2vwqd                      1/1     Running   283        2d20h   10.240.2.64    aksnpwin000005                  <none>           <none>
omsagent-win-5kz2h                      1/1     Running   73         2d20h   10.240.1.178   aksnpwin000003                  <none>           <none>
omsagent-win-gmwk6                      1/1     Running   25         2d20h   10.240.1.46    aksnpwin000004                  <none>           <none>
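
To make the difference between the nodes easier to see, the listing can be filtered by restart count. This is only a minimal sketch, assuming the usual kubectl get pods column layout in which RESTARTS is the fourth field. The function name high_restarts and the threshold of 50 are hypothetical, and the sample text is copied from the listing above with the columns shortened.

```shell
# Hypothetical helper: print pods whose RESTARTS column (4th field)
# is greater than the given threshold. Reads the listing from stdin.
high_restarts() {
  threshold="$1"
  # NR > 1 skips the header line; "+ 0" forces a numeric comparison.
  awk -v t="$threshold" 'NR > 1 && $4 + 0 > t + 0 { print $1, $4 }'
}

# Sample input taken from the listing above. On a live cluster, pipe in
# the output of: kubectl -n kube-system get pods -o wide
sample='NAME READY STATUS RESTARTS AGE IP NODE
omsagent-win-2vwqd 1/1 Running 283 2d20h 10.240.2.64 aksnpwin000005
omsagent-win-5kz2h 1/1 Running 73 2d20h 10.240.1.178 aksnpwin000003
omsagent-win-gmwk6 1/1 Running 25 2d20h 10.240.1.46 aksnpwin000004'

printf '%s\n' "$sample" | high_restarts 50
```

With the sample above this prints the two pods on aksnpwin000005 and aksnpwin000003, which are the ones restarting most often.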

Output of kubectl -n kube-system describe pod omsagent-win-2vwqd:

Events:
  Type     Reason     Age                    From     Message
  ----     ------     ----                   ----     -------
  Warning  Unhealthy  10m (x950 over 2d20h)  kubelet  Liveness probe failed:
  Normal   Killing    19s (x708 over 2d20h)  kubelet  Container omsagent-win failed liveness probe, will be restarted

I have already tried giving the pods more CPU and RAM. That worked at first, but after a while (about 30 minutes) the settings went back to their original values.

Any ideas on how to examine this in a different way?

Thanks in advance!

  • How did it go for you? We are experiencing the same. Suddenly the omsagent-win restarts a lot. Did you find a solution for this? – gorhal Jun 13 '22 at 05:53
  • I opened a similar ticket in the repo of AKS. The answer was that it is a known bug and they are working on it. I should add that the restarts have become much less frequent (not completely gone, but fewer) after upgrading to the latest k8s version (1.22.6). https://github.com/microsoft/OMS-docker/issues/436 – Andreas Enti Jun 14 '22 at 06:27
  • Thanks for the info. This was good to hear, then I know. We've recently upgraded to 1.22.6. Let's hope for the best. – gorhal Jun 14 '22 at 20:03
