I have a network with one nginx load balancer, lb1, which distributes traffic across four web servers: web1, web2, web3, and web4. nginx balances between the four using round-robin.
All four servers are set to max_fails=1 and fail_timeout=5s, so a server that goes down should be taken out of rotation fairly quickly.
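For reference, the relevant part of the configuration on lb1 looks roughly like the sketch below. This is a simplified illustration, not a copy of the real file: the upstream name web_backend, port 80, and the bare proxy_pass location are placeholders. Only the server names and the max_fails and fail_timeout values are as described above.

```nginx
# Sketch only: "web_backend" and port 80 are assumed placeholders.
upstream web_backend {
    # Round-robin is nginx's default balancing method, so no directive selects it.
    server web1:80 max_fails=1 fail_timeout=5s;
    server web2:80 max_fails=1 fail_timeout=5s;
    server web3:80 max_fails=1 fail_timeout=5s;
    server web4:80 max_fails=1 fail_timeout=5s;
}

server {
    listen 80;

    location / {
        # Forward every request to the upstream group defined above.
        proxy_pass http://web_backend;
    }
}
```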
I should note that the average response time for pages from each web server is around 50-150ms when all four web servers are online. The issue arises when just ONE web server is offline. When one goes offline and a user tries to load another page, the response time varies anywhere from 50ms to 25s. Yes, 25 seconds.
I am confused, because I would expect round-robin combined with the max_fails and fail_timeout settings to make nginx skip the offline server.
Additional, possibly relevant notes: all four web servers are running Apache with PHP 5, and memcached is enabled across the four.