I have a Node (Express) app running on AWS that randomly returns 504 (GATEWAY_TIMEOUT) without reaching the actual timeout threshold (60 sec):

[Screenshot: 504 requests]

You can see that the requests following the failed ones take more time than the "timed out" ones...

In my Express app I have:

server.keepAliveTimeout = 65000;
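
For readers unsure where that line goes, here is a minimal sketch, assuming the load balancer idle timeout is the 60 sec mentioned above. The helper name `configureTimeouts` and the margins are illustrative and not taken from my app. Node also exposes `server.headersTimeout`; keeping it above `keepAliveTimeout` is a commonly suggested precaution, and whether it matters for this problem is an assumption.

```javascript
const http = require('http');

// Hypothetical helper: the name and the margins are illustrative assumptions.
function configureTimeouts(server, lbIdleTimeoutMs = 60000) {
  // Keep idle sockets open longer than the load balancer's idle timeout,
  // so the load balancer is the side that closes idle connections.
  server.keepAliveTimeout = lbIdleTimeoutMs + 5000;
  // Assumption: keep the header-parsing timeout above keepAliveTimeout.
  server.headersTimeout = server.keepAliveTimeout + 1000;
  return server;
}

// With Express, app.listen() returns the same kind of http.Server:
// const server = configureTimeouts(app.listen(3000));
const server = configureTimeouts(http.createServer((req, res) => res.end('ok')));
console.log(server.keepAliveTimeout, server.headersTimeout); // prints 65000 66000
```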

Any ideas?

EDIT: Adding ELB Logs:

2019-01-18T09:06:56.554353Z a38e67823174c11e9a984022fe7c311b 189.58.239.206:51399 - -1 -1 -1 504 0 0 0 "GET <app_endpoint> HTTP/1.1" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2

2019-01-18T09:06:56.564478Z a38e67823174c11e9a984022fe7c311b 189.58.239.206:51400 - -1 -1 -1 504 0 0 0 "GET <app_endpoint> HTTP/1.1" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2

2019-01-18T09:06:56.580591Z a38e67823174c11e9a984022fe7c311b 189.58.239.206:51401 - -1 -1 -1 504 0 0 0 "GET <app_endpoint> HTTP/1.1" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2

2019-01-18T09:06:56.602049Z a38e67823174c11e9a984022fe7c311b 189.58.239.206:51398 - -1 -1 -1 504 0 0 0 "GET <app_endpoint> HTTP/1.1" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2
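
To make these entries easier to read, here is a rough sketch that splits a line into named fields. It assumes the Classic Load Balancer access log layout (15 space-separated fields, with the request and user agent quoted), and the function names are made up for illustration. As I understand the AWS documentation, a backend of `-` together with processing times of `-1` means the load balancer could not dispatch the request to a registered instance, which is what the helper checks.

```javascript
// Field order assumed from the Classic Load Balancer access log format.
const FIELDS = [
  'timestamp', 'elb', 'client', 'backend',
  'request_processing_time', 'backend_processing_time', 'response_processing_time',
  'elb_status_code', 'backend_status_code', 'received_bytes', 'sent_bytes',
  'request', 'user_agent', 'ssl_cipher', 'ssl_protocol',
];

// Split on whitespace, but keep "quoted strings" together as one token.
function parseElbLine(line) {
  const tokens = line.match(/"[^"]*"|\S+/g) || [];
  const entry = {};
  FIELDS.forEach((name, i) => {
    entry[name] = (tokens[i] || '').replace(/^"|"$/g, '');
  });
  return entry;
}

// True when the entry shows no backend and no measured request processing time.
function neverReachedBackend(entry) {
  return entry.backend === '-' && Number(entry.request_processing_time) === -1;
}

const sample = '2019-01-18T09:06:56.554353Z a38e67823174c11e9a984022fe7c311b ' +
  '189.58.239.206:51399 - -1 -1 -1 504 0 0 0 "GET <app_endpoint> HTTP/1.1" ' +
  '"Mozilla/5.0" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2';
const entry = parseElbLine(sample);
console.log(entry.elb_status_code, entry.backend, neverReachedBackend(entry));
```

Run against the four entries above, every line has `-` as its backend, so the helper returns `true` for each of them.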

Frankra
  • Is there any load balancer behind the web server? – laika Jan 17 '19 at 12:55
  • Are you using CloudFront distribution? – Matus Dubrava Jan 17 '19 at 12:59
  • @laika yep, ELB – Frankra Jan 17 '19 at 16:07
  • @MatusDubrava, no – Frankra Jan 17 '19 at 16:07
  • Have you reviewed the log entries for the ELB and your app server for these requests? – Michael - sqlbot Jan 17 '19 at 17:17
  • What @Michael-sqlbot suggested would be best: check the access and app logs and determine whether your app server was even hit, or whether the request got lost between the ELB and the app server. 504 errors on an ELB can have several causes, e.g. the app server being under heavy load and hence dropping connections. If you're using Elastic Beanstalk and the app is updated with the all-at-once deployment procedure, then downtime is unavoidable. – laika Jan 18 '19 at 09:26
  • @laika I have added the logs to the description. They don't seem to say much... Also, I can assure you there is no heavy load on the server, as I am the only one calling its APIs, and the app is not restarting at any point according to our Kibana logs. It just randomly returns a 504 for the request... – Frankra Jan 18 '19 at 13:18
  • Is your application not logging anything about these requests? Have you tried watching the server for unusual activity (including closed connections that correlate in time) with wireshark? Some implementations limit the number of requests on a keep-alive connection, after which the server closes the connection. I don't see a parameter for that limit in the Node docs but I may be overlooking it. – Michael - sqlbot Jan 18 '19 at 14:10
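
One way to follow up on the last comment without Wireshark is to have the Node server log its own socket closes, so they can be lined up against the ELB timestamps. This is only a sketch: `trackConnections` and the shape of the log record are hypothetical, and it relies on the standard `'connection'` and `'request'` events of `http.Server` plus the socket's `'close'` event.

```javascript
// Hypothetical helper: logs one record whenever a client socket closes.
function trackConnections(server, log = console.log) {
  const requestsPerSocket = new WeakMap();

  server.on('connection', (socket) => {
    const openedAt = Date.now();
    requestsPerSocket.set(socket, 0);

    socket.on('close', (hadError) => {
      log({
        event: 'socket closed',
        remote: `${socket.remoteAddress}:${socket.remotePort}`,
        requestsServed: requestsPerSocket.get(socket),
        openForMs: Date.now() - openedAt,
        hadError,
        at: new Date().toISOString(),
      });
    });
  });

  // Count how many requests each keep-alive connection has carried.
  server.on('request', (req) => {
    const count = requestsPerSocket.get(req.socket) || 0;
    requestsPerSocket.set(req.socket, count + 1);
  });

  return server;
}

// Usage with Express (app.listen() returns an http.Server):
// const server = trackConnections(app.listen(3000));
```

If a socket close logged here lines up with a 504 in the ELB log, that would suggest the server side closed a connection the ELB was still trying to reuse.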

1 Answer

I don't know whether this is a useful comment, but here is what happened to me.

I had just registered a domain in Route 53 and linked it to my ELB configuration, which runs in front of an HTTP Express server. That worked like a charm. Then I added security certificates and created the 443 part in the ELB. The certificates were shown and the linking went perfectly, but when I typed the domain into the browser it kept spinning and, after a while, returned a 504.

I looked in lots of places and even wanted to add HTTPS to my web server. Then I realised that was not needed, because the ELB setup acts as a reverse proxy in front of it.

What solved it for me was a total cache refresh in Chrome (Command+Option+R). Literally nothing else changed, and my web app is now served over HTTPS.

Rik D