
We have an application running in two pods. When the pods are scheduled on different nodes, we see intermittent timeouts (tested both through the ALB and from the node itself). When both pods run on a single node, we don't see any issue.

Detailed Scenario:

  1. We are using an EKS cluster with 2 nodes
  2. We are using Calico on EKS

Use Case: Everything is working fine

  1. Hello-world is running on 2 pods (A & B), and both pods are running on Node1
  • Curl from ALB - 200
  • All good

Use Case: 504 timeout

  1. Hello-world is running on 2 pods (A & B); pod A is now running on Node1 and pod B on Node2
  • Curl from anywhere - every alternate request returns 504
  • Curl from inside the container - all 200
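
To make the "every alternate request" pattern easier to show, here is a minimal Python sketch that sends repeated GET requests and tallies the status codes. It is only an illustration, not the exact test we ran: the function names `fetch_status` and `tally_statuses` are hypothetical, and the URL you pass in is whatever your ALB or node endpoint is.

```python
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for a GET to url, or "error" on a network failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses (such as 504) are raised as HTTPError; keep the code.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, socket timeout, and similar.
        return "error"


def tally_statuses(url, attempts=20, fetch=fetch_status):
    """Call fetch(url) `attempts` times; return the ordered results and their counts."""
    statuses = [fetch(url) for _ in range(attempts)]
    return statuses, Counter(statuses)
```

Calling `tally_statuses("http://<your-alb-dns-name>/")` returns the ordered list of results plus a count per status. If the results alternate between 200 and 504, that would be consistent with requests being balanced across two backends where only one is reachable.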

Use Case: Everything is working fine

  1. Hello-world is running with a single pod (pod1), all working fine

Summary: The app fails only when the 2 pods are running on different nodes.
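
To confirm which node each pod landed on in each use case, a small sketch like the following can group pod names by node from the JSON printed by `kubectl get pods -o json`. The helper name `pods_by_node` is hypothetical, and the sketch assumes the standard pod list layout (`items[].metadata.name` and `items[].spec.nodeName`).

```python
import json
from collections import defaultdict


def pods_by_node(pod_list_json):
    """Group pod names by the node they are scheduled on.

    pod_list_json is the text printed by `kubectl get pods -o json`.
    Pods without a spec.nodeName are listed under "<unscheduled>".
    """
    data = json.loads(pod_list_json)
    placement = defaultdict(list)
    for item in data.get("items", []):
        node = item.get("spec", {}).get("nodeName", "<unscheduled>")
        placement[node].append(item["metadata"]["name"])
    return dict(placement)
```

In the failing case the result would have two keys (one per node) with one pod each; in the working cases all pods would appear under a single node.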

  • Did you take a look at https://aws.amazon.com/premiumsupport/knowledge-center/eks-pod-status-troubleshooting/ and https://aws.amazon.com/premiumsupport/knowledge-center/eks-pod-connections/ ? – Malgorzata Oct 29 '20 at 13:54
  • Thanks @Malgorzata, but I checked this too and all seems to be fine. FYI, I updated the issue with more detail to help you. – Kapil Yadav Oct 30 '20 at 05:57
  • Make sure that you configured security groups and network ACLs to allow data to move between the load balancer and the backend targets. Did you follow Getting started with Application Load Balancers - https://docs.aws.amazon.com/elasticloadbalancing/latest/application/application-load-balancer-getting-started.html ? – Malgorzata Oct 30 '20 at 12:31
  • It is fixed. This was caused by some N/E restriction – Kapil Yadav Nov 01 '20 at 08:46
  • Can you post your fix as an answer so it is visible to the future community? – Malgorzata Nov 02 '20 at 08:51

0 Answers