We run a cluster on EKS, and for the past two days we have been seeing network issues in it. Imagine a scenario with two worker nodes (w1 and w2) and three pods, each with its own service (A, B and C). Pods A and B are located on w1 and pod C is located on w2.

The problem is that A cannot reach C, but B can. When I open a shell in pod A and run curl -vvv http://C/, DNS resolves to the IP of C's service, but then the request hangs and eventually times out. The strange thing is that restarting/deleting pod A didn't solve the issue, but deleting pod C did.
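To make the two stages of that curl test explicit (name resolution first, then the TCP connection), a minimal Python sketch along these lines could be run from inside pod A and pod B. The function name `probe`, the default port 80, and the 5-second timeout are assumptions for illustration, not part of our setup.

```python
import socket


def probe(host, port=80, timeout=5.0):
    """Resolve host, then attempt a TCP connect, and report which stage failed."""
    # Stage 1: DNS. In the scenario above this stage succeeds.
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return {"stage": "dns", "ok": False, "error": str(exc)}

    ip = infos[0][4][0]  # first resolved address

    # Stage 2: TCP connect to the resolved IP. This is where the hang shows up.
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return {"stage": "connect", "ok": True, "ip": ip}
    except socket.timeout:
        return {"stage": "connect", "ok": False, "ip": ip, "error": "timeout"}
    except OSError as exc:
        return {"stage": "connect", "ok": False, "ip": ip, "error": str(exc)}


if __name__ == "__main__":
    # "C" is the service name from the question; replace it with pod C's IP
    # to bypass the service and test the pod directly.
    print(probe("C", 80))
```

Calling it once with the service name and once with pod C's own IP (as shown by kubectl get pod -o wide) would at least show whether the timeout is specific to the service IP or also affects direct pod-to-pod traffic from w1 to w2.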

I have never seen anything like this. I checked the kube-proxy logs on the worker nodes but didn't see any errors or anything unusual. Does anyone have an idea what is going on here?

AVarf
  • Please provide the YAML configuration for your Pods and Services. It is not possible to tell what went wrong without any details about it – RadekW Feb 04 '22 at 13:23
  • There is nothing strange in the YAML files and they work: we deploy the exact same set of components on multiple K8s clusters, and this is the first time in 2 years that we have seen something like this. – AVarf Feb 04 '22 at 13:55
  • Has the problem occurred again? It could be an issue with the EKS CNI, not with K8s directly – RadekW Feb 07 '22 at 08:56
  • No, after I restarted C everything has been working fine. I agree that it might be the CNI, but why does this happen and how can we prevent it? – AVarf Feb 07 '22 at 10:47

0 Answers