We are seeing intermittent timeouts that are driving me crazy, even under virtually no load (probably a couple of people hitting the servers per minute).
We use nginx to redirect non-SSL to SSL, terminate the SSL, and then reverse proxy the request to haproxy, which sends it to one of our app servers.
Our app servers run passenger (rails) + nginx. We also have a MySQL master + slave and a memcached instance, which we recently started using for some queries.
Here is a typical error from the error log of the first-layer nginx, the one that passes requests to haproxy (details obfuscated):
2012/02/25 06:42:15 [error] 7838#0: *60797 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 1.2.3.4, server: domain.com, request: "GET /api/v1/some_route HTTP/1.1", upstream: "http://127.0.0.1:82/api/v1/some_route", host: "domain.com"
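In case it helps anyone dig through a log like this, here is a minimal Python sketch that pulls the fields out of an error line in the format shown above. The regular expression and the function name `parse_error_line` are my own assumptions based on this one sample line, not an official nginx log grammar, so other error messages may not match.

```python
import re
from datetime import datetime

# Sample line copied from the error log excerpt above.
LINE = (
    '2012/02/25 06:42:15 [error] 7838#0: *60797 upstream timed out '
    '(110: Connection timed out) while reading response header from upstream, '
    'client: 1.2.3.4, server: domain.com, '
    'request: "GET /api/v1/some_route HTTP/1.1", '
    'upstream: "http://127.0.0.1:82/api/v1/some_route", host: "domain.com"'
)

# Assumed layout, inferred from the sample line only:
# timestamp [level] pid#tid: *connection message, client: ..., server: ...,
# request: "...", upstream: "...", host: "..."
PATTERN = re.compile(
    r'^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?P<level>\w+)\] '
    r'(?P<pid>\d+)#(?P<tid>\d+): \*(?P<conn>\d+) (?P<message>.*?), '
    r'client: (?P<client>[^,]+), server: (?P<server>[^,]+), '
    r'request: "(?P<request>[^"]*)"'
    r'(?:, upstream: "(?P<upstream>[^"]*)")?'
    r'(?:, host: "(?P<host>[^"]*)")?'
)


def parse_error_line(line):
    """Return a dict of fields for a matching line, or None if it does not match."""
    match = PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the timestamp text into a datetime for later comparisons.
    record["ts"] = datetime.strptime(record["ts"], "%Y/%m/%d %H:%M:%S")
    return record


if __name__ == "__main__":
    record = parse_error_line(LINE)
    print(record["ts"], record["upstream"], record["message"])
```

Having the timestamp and the upstream address as separate fields makes it easier to line these errors up against the haproxy and app server logs for the same second.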
I am not sure whether the culprit is haproxy, passenger + nginx, rails, or memcached. One empirical data point is that the timeouts seem to happen in bunches: if we get one, we see several others, and then they go away.
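To check the "bunches" impression more rigorously, the timeout timestamps could be grouped into bursts. The following is only a sketch: the function name `group_bursts`, the 60-second gap threshold, and the sample timestamps are all made up for illustration and are not taken from our real logs.

```python
from datetime import datetime, timedelta


def group_bursts(timestamps, max_gap=timedelta(seconds=60)):
    """Group timestamps into bursts.

    Consecutive timestamps that are at most max_gap apart go into the
    same burst; a larger gap starts a new burst.
    """
    bursts = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][-1] <= max_gap:
            bursts[-1].append(ts)
        else:
            bursts.append([ts])
    return bursts


if __name__ == "__main__":
    # Hypothetical timestamps, for illustration only.
    sample = [
        datetime(2012, 2, 25, 6, 42, 15),
        datetime(2012, 2, 25, 6, 42, 40),
        datetime(2012, 2, 25, 6, 43, 5),
        datetime(2012, 2, 25, 9, 10, 0),
    ]
    for burst in group_bursts(sample):
        print(burst[0], "to", burst[-1], "count:", len(burst))
```

If the bursts start at regular intervals or coincide with something like a cron job, a deploy, or a log rotation, that would help narrow down which layer is stalling.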
Any help would be greatly appreciated. Happy to post any configs or anything that would help.