I am occasionally getting 504 responses from my ELB. They are not frequent, but they keep occurring.
The architecture is as follows:
- ELB with an idle timeout of 180 seconds.
- Target group of 2 instances, balanced round-robin.
I tried to follow this: https://aws.amazon.com/premiumsupport/knowledge-center/504-error-classic/ but my targets' response time appears to be lower than the 180-second idle timeout.
According to the target response time metric, the maximum response time is around 170 seconds, and even that is very rare, yet the load balancer still returns 504s. I suspect the problem is the default keep-alive timeout of my backend server: the pre-opened idle TCP connections may be closing before the load balancer's idle timeout is reached.
As the backend I am using a Spring Boot application with embedded Tomcat 8.5.51 as the server. I haven't changed the default keep-alive timeout of the service, and I couldn't find a default keep-alive timeout in the documentation for this Tomcat version.
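One way to check the keep-alive suspicion without relying on the documentation is to measure how long the backend keeps an idle connection open. The following is only a minimal sketch under some assumptions: the class name `KeepAliveProbe` and the method `measureKeepAliveMillis` are made up for illustration, the instance is reachable directly over plain HTTP (bypassing the ELB), and the server answers `HEAD` on the given path without sending `Connection: close`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper (not part of Tomcat, Spring Boot, or AWS).
public class KeepAliveProbe {

    /**
     * Sends one HEAD request, then leaves the connection idle and waits for
     * the server to close it.
     *
     * @return milliseconds between the last byte received and the server
     *         closing the connection, or -1 if the connection was still open
     *         after maxWaitMillis of inactivity
     */
    public static long measureKeepAliveMillis(String host, int port, String path, int maxWaitMillis)
            throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 5_000);
            // Each read gives up after maxWaitMillis without any data or EOF.
            socket.setSoTimeout(maxWaitMillis);

            String request = "HEAD " + path + " HTTP/1.1\r\n"
                    + "Host: " + host + ":" + port + "\r\n"
                    + "Connection: keep-alive\r\n\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            InputStream in = socket.getInputStream();
            byte[] buffer = new byte[4096];
            long lastActivity = System.nanoTime();
            try {
                // Drain the response; read() returns -1 once the server closes.
                while (in.read(buffer) != -1) {
                    lastActivity = System.nanoTime();
                }
            } catch (SocketTimeoutException e) {
                return -1; // server kept the idle connection open the whole time
            }
            return (System.nanoTime() - lastActivity) / 1_000_000;
        }
    }
}
```

Calling it with a `maxWaitMillis` above 180000 against one instance and getting back a value below 180000 would mean the backend drops idle connections before the ELB idle timeout, which would be consistent with the suspicion above. A result of -1 would point away from it.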
Any ideas what I can do, and whether this is really the root cause?