
I am running several fastcgi servers behind Nginx. I run 3 Nginx workers and 6 fastcgi servers as upstream backends.

When I run load tests at 1 request per second, I can clearly see that the average reply takes 0.1 s, but from time to time there are 3.1 s responses.

It is a suspiciously deterministic number, and it happens from time to time even at very small loads. Neither CPU nor memory is an issue at all.

Any idea where this delay may be coming from? Any suggestions on how to debug this?

Many thanks, Barry.

2 Answers

3 seconds is the classic TCP retransmission timeout when a server does not respond to the initial connection attempt. Check your nginx error logs for upstream connection timeouts.
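
One way to confirm this (a sketch only; merge it with your own `log_format`) is to log nginx's timing variables, so each slow request shows whether the time was spent on the upstream or elsewhere:

```nginx
# Hypothetical log_format for debugging upstream delays.
# $request_time           - total time nginx spent on the request
# $upstream_response_time - time spent on the fastcgi backend
log_format timing '$remote_addr [$time_local] "$request" '
                  'req_time=$request_time upstream=$upstream_response_time';

access_log /var/log/nginx/timing.log timing;
```

If the slow requests show the full 3 s attributed to the upstream, the delay is between nginx and the backend rather than inside your application code.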

DukeLion

Are you running with parallel requests? If so, please see below. If not, I would suggest looking into your backend servers. Maybe your environment has some kind of profiling tools, which you can use to see the time from when a request is received until a response is sent.

If you are running in parallel: if all six of your fastcgi servers are busy serving requests, the next request has to wait until one fastcgi server is free to process it. If the requests you are making in your trials are of a somewhat similar nature and have similar response times, you will see the same pattern over and over again.
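This queuing effect is easy to see in a toy model (a sketch only; the numbers are illustrative, not measurements from your setup): with 6 workers and a fixed per-request service time, requests only wait once every worker is busy, and the waits come out as near-identical multiples of the service time, which is exactly the kind of suspiciously repeatable number you describe.

```python
def simulate(arrival_interval, service_time, n_workers, n_requests):
    """Toy queueing model: requests arrive at a fixed interval and go to
    whichever of n_workers backends frees up first. Returns the time
    each request spent waiting for a free worker."""
    free_at = [0.0] * n_workers          # when each worker next becomes free
    waits = []
    for i in range(n_requests):
        arrival = i * arrival_interval
        idx = min(range(n_workers), key=lambda w: free_at[w])
        start = max(arrival, free_at[idx])
        waits.append(start - arrival)
        free_at[idx] = start + service_time
    return waits

# 6 backends at 1 req/s with 0.1 s handlers: nobody ever waits.
print(max(simulate(1.0, 0.1, 6, 100)))        # → 0.0
# A burst of 20 requests 10 ms apart with 3 s handlers: later requests
# queue for whole multiples of the 3 s service time.
print(max(simulate(0.01, 3.0, 6, 20)) > 2.5)  # → True
```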

What is your backend? Is it threaded?

Here's what I suggest:

  1. Set up some monitoring so you can see the response times over time, and maybe even create a nice graph.
  2. Try increasing the number of fastcgi processes and/or use threads if it suits your environment.
  3. Inspect the response times, then go back to step 2.
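
For step 1, even a crude script over the access log is a starting point (a sketch; it assumes you have appended `$request_time` as the last field of your nginx `log_format`):

```python
def slow_requests(log_path, threshold=1.0):
    """Return (request_time, line) pairs for log entries slower than
    `threshold` seconds, assuming $request_time is the final field."""
    slow = []
    with open(log_path) as f:
        for line in f:
            parts = line.rsplit(None, 1)   # split off the final field
            if len(parts) != 2:
                continue
            try:
                t = float(parts[1])
            except ValueError:
                continue
            if t >= threshold:
                slow.append((t, line.strip()))
    return slow
```

Run it periodically and feed the counts into whatever graphing tool you have; a cluster of entries around 3.1 s would support the timeout theory from the other answer.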

By the way, use just one nginx worker; you don't need more than one unless you have tons of traffic.
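
In nginx.conf that is just the one directive (a minimal fragment; the rest of your configuration stays as it is):

```nginx
worker_processes 1;
```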

knutin
  • Thanks for the answer. Let me add some details: I am running Django as the server. I have profiled the server from entry of the request to the reply, and the delay is NOT there. FCGI_METHOD=threaded. I see this delay even when the load is VERY low, just happens every hundred queries or so. I will try working with single worker to see how it goes. Any other suggestions? – Barry Nov 13 '10 at 08:09