My Elastic Beanstalk environment raises a warning every now and again:
Environment health has transitioned from Degraded to Warning. 4.1 % of the requests to the ELB are failing with HTTP 5xx (3 minutes ago).
However, when I check the instance logs, I see no reference to 5xx errors. When I check the environment health, no 5xx counts are shown.
So I enabled access logging on the load balancer, and there I can see 20 requests with a 503 error out of 5,000 in one time span.
Looking closer, about 10 of those show a 503 for both the load balancer and the instance. The other 10 show a 503 for the load balancer only.
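For reference, here is a minimal sketch of how the 503s in the access logs could be split into those two groups. It assumes the Classic Load Balancer access log format, where the eighth and ninth space-separated fields are `elb_status_code` and `backend_status_code`; an Application Load Balancer uses a different field layout. The function name and the sample log lines are made up for illustration and are not taken from our real logs.

```python
from collections import Counter

# Hypothetical sample lines in Classic Load Balancer access log format
# (assumption: fields 8 and 9 are elb_status_code and backend_status_code).
SAMPLE_LINES = [
    '2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 10.0.0.1:80 '
    '0.000073 0.001048 0.000057 503 503 0 0 '
    '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -',
    '2015-05-13T23:39:44.945958Z my-loadbalancer 192.168.131.39:2817 - '
    '-1 -1 -1 503 - 0 0 '
    '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -',
    '2015-05-13T23:39:45.945958Z my-loadbalancer 192.168.131.39:2817 10.0.0.1:80 '
    '0.000073 0.001048 0.000057 200 200 0 29 '
    '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -',
]


def classify_503s(lines):
    """Count 503 responses by whether the instance also returned a 503."""
    counts = Counter()
    for line in lines:
        # The first nine fields contain no spaces, so a plain split is
        # enough; the quoted request and user agent stay in the remainder.
        fields = line.split(None, 9)
        if len(fields) < 9:
            continue  # skip blank or truncated lines
        elb_status, backend_status = fields[7], fields[8]
        if elb_status != "503":
            continue
        if backend_status == "503":
            counts["elb_and_instance"] += 1
        else:
            # backend_status is "-" when the load balancer never got a
            # response from an instance.
            counts["elb_only"] += 1
    return counts


if __name__ == "__main__":
    print(dict(classify_503s(SAMPLE_LINES)))
```

In this format, a `-` in the backend status field means the load balancer returned the 503 itself without getting a response from an instance, which is the second group described above.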
From some reading, I'd assume that's because the load balancer was overloaded? But the queue length in the metrics is always very short, which doesn't seem to support that.
I think that, as a result of the above, Elastic Beanstalk is constantly adding instances to the group and removing them again.
I would love some insight into what else we can check.
Note: application errors are all logged to Slack, and we're not seeing any of them come through, so I really feel like this is an EC2 / ELB issue, but I'm not sure what else we can check.
Thanks