Please note that this question is about the ELB itself, not the EC2 instances behind it.
Situation
We have experienced the following ELB issue recently:
- 50% of requests did not reach our backend, and it seems they did not reach the ELB itself either
- ELB monitoring via AWS console didn't show anything unusual (zero ELB 4xx and ELB 5xx)
- external checks verified that our backend EC2 instances were running well and could be reached
Our assumption is that the EC2 instance the ELB runs on had connectivity issues. Our ad hoc fix was to create a new ELB (in front of the same set of EC2 instances) and change the DNS records.
Questions
- is this something that can happen often?
- are there any tools that can detect this quickly enough? (We always assume that the fault is ours, and only after thorough checks did we start to look at AWS.)
- is there a way to avoid this happening at all?
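
To make the second question concrete, here is a minimal sketch of the kind of external check we have in mind. It assumes the ELB DNS name resolves to one address per load balancer node, so probing each address separately could show whether only some nodes are unreachable (which would fit the 50% failure rate we saw). The function names, the port, the threshold and the hostname in the usage note are all hypothetical, and a plain TCP connect only tests reachability, not whether requests are actually forwarded.

```python
import socket
from typing import Callable, Dict, Iterable, List


def resolve_ips(hostname: str, port: int = 80) -> List[str]:
    """Return the distinct IPv4 addresses the name currently resolves to."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def tcp_probe(ip: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused connections, unreachable hosts
        return False


def probe_all(
    ips: Iterable[str],
    probe: Callable[[str], bool] = tcp_probe,
    attempts: int = 5,
) -> Dict[str, float]:
    """Probe every address `attempts` times and return the failure rate per address."""
    rates = {}
    for ip in ips:
        failures = sum(1 for _ in range(attempts) if not probe(ip))
        rates[ip] = failures / attempts
    return rates


def suspect_nodes(rates: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return the addresses whose failure rate is at or above the (assumed) threshold."""
    return sorted(ip for ip, rate in rates.items() if rate >= threshold)
```

Run from outside AWS, something like `suspect_nodes(probe_all(resolve_ips("my-elb.example.com")))` (placeholder hostname) would list the addresses that mostly fail. If only a subset of addresses shows up there while the backend instances answer directly, that would point at the load balancer rather than at our own setup.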