Intermittent timeouts on EKS when pods run on multiple nodes

Question

We have an application running in two pods. When the two pods run on different nodes, we get intermittent timeouts (tested through the ALB and also from the node itself). When both pods run on a single node, we don't see any issue.

Detailed Scenario:

We are using an EKS cluster with 2 nodes
We are using Calico on EKS

Use Case: Everything is working fine

Hello-world is running on 2 pods (A & B); both pods are running on Node1

Curl from ALB - 200
All good

Use Case: 504 timeout

Hello-world is running on 2 pods (A & B); now pod A is running on Node1 and pod B is running on Node2

Curl from anywhere - every alternate request returns 504
Curl from inside the container - all 200
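
The alternating pattern can be measured instead of eyeballed. Below is a minimal Python sketch (not part of our actual setup) that sends repeated requests and tallies the status codes. The helper names `fetch_status` and `tally_statuses` are hypothetical, and the URL has to be supplied by whoever runs it.

```python
from collections import Counter
from typing import Callable
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET request, or -1 if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error responses such as 504 are raised as HTTPError and carry the code.
        return err.code
    except OSError:
        # Connection failures and client-side timeouts (URLError, socket timeouts).
        return -1


def tally_statuses(fetch: Callable[[], int], attempts: int) -> Counter:
    """Call fetch() `attempts` times and count how often each status code occurs."""
    return Counter(fetch() for _ in range(attempts))
```

For example, `tally_statuses(lambda: fetch_status(url), 20)` with `url` set to the ALB DNS name or the node address gives a count per status code. A roughly even split between 200 and 504 would be consistent with every other request failing.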

Use Case: Everything is working fine

Hello-world is running with a single pod; everything works fine

Summary: The app fails only when the 2 pods are running on different nodes
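
To confirm which of the scenarios is in effect at a given moment, the pod-to-node placement can be read from the JSON printed by `kubectl get pods -o json`. The following is a minimal sketch under that assumption; `pods_by_node` is a hypothetical helper, and the sample data is made up for illustration and trimmed to the two fields the function reads.

```python
import json
from collections import defaultdict
from typing import Dict, List


def pods_by_node(pod_list_json: str) -> Dict[str, List[str]]:
    """Group pod names by the node they are scheduled on."""
    placement = defaultdict(list)
    for pod in json.loads(pod_list_json).get("items", []):
        # spec.nodeName is absent until the pod has been scheduled.
        node = pod.get("spec", {}).get("nodeName", "<unscheduled>")
        placement[node].append(pod["metadata"]["name"])
    return dict(placement)


# Illustrative sample only; real output contains many more fields.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "pod-a"}, "spec": {"nodeName": "node1"}},
        {"metadata": {"name": "pod-b"}, "spec": {"nodeName": "node2"}},
    ]
})
print(pods_by_node(sample))
```

If the result has two node keys with one pod each, the cluster is in the failing layout described above. The same information is visible in the NODE column of `kubectl get pods -o wide`.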

Did you take a look at https://aws.amazon.com/premiumsupport/knowledge-center/eks-pod-status-troubleshooting/ and https://aws.amazon.com/premiumsupport/knowledge-center/eks-pod-connections/ ? — Malgorzata, Oct 29 '20 at 13:54
Thanks @Malgorzata, but I checked these too and all seems to be fine. FYI, I updated the question with more detail. — Kapil Yadav, Oct 30 '20 at 05:57
Make sure that you configured security groups and network ACLs to allow data to move between the load balancer and the backend targets. Did you follow Getting started with Application Load Balancers - https://docs.aws.amazon.com/elasticloadbalancing/latest/application/application-load-balancer-getting-started.html ? — Malgorzata, Oct 30 '20 at 12:31
Can you post your fix as an answer so that it is visible to the community in the future? — Malgorzata, Nov 02 '20 at 08:51
