
Working on a project where we need to serve a small static XML file at roughly 40k requests per second.

All incoming requests are sent to the server from HAProxy. However, none of the requests will be persistent.

The issue is that when benchmarking with non-persistent requests, the nginx instance caps out at 19,114 req/s. When persistent connections are enabled, performance increases by nearly an order of magnitude, to 168,867 req/s. The results are similar with G-WAN.

When benchmarking non-persistent requests, CPU usage is minimal.

What can I do to increase performance with non-persistent connections and nginx?


[root@spare01 lighttpd-weighttp-c24b505]# ./weighttp -n 1000000 -c 100 -t 16 "http://192.168.1.40/feed.txt"
finished in 52 sec, 315 millisec and 603 microsec, 19114 req/s, 5413 kbyte/s
requests: 1000000 total, 1000000 started, 1000000 done, 1000000 succeeded, 0 failed, 0 errored
status codes: 1000000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 290000000 bytes total, 231000000 bytes http, 59000000 bytes data




[root@spare01 lighttpd-weighttp-c24b505]# ./weighttp -n 1000000 -c 100 -t 16 -k "http://192.168.1.40/feed.txt"
finished in 5 sec, 921 millisec and 791 microsec, 168867 req/s, 48640 kbyte/s
requests: 1000000 total, 1000000 started, 1000000 done, 1000000 succeeded, 0 failed, 0 errored
status codes: 1000000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 294950245 bytes total, 235950245 bytes http, 59000000 bytes data

1 Answer


Your two tests are identical except for HTTP Keep-Alives (the -k flag):

 ./weighttp -n 1000000 -c 100 -t 16 "http://192.168.1.40/feed.txt"
 ./weighttp -n 1000000 -c 100 -t 16 -k "http://192.168.1.40/feed.txt"

And the one with HTTP Keep-Alives is almost 10x faster:

finished in 52 sec, 19114 req/s, 5413 kbyte/s
finished in 5 sec, 168867 req/s, 48640 kbyte/s

First, HTTP Keep-Alives (persistent connections) make HTTP requests run faster because:

  • Without HTTP Keep-Alives, the client must establish a new CONNECTION for EACH request (this is slow because of the TCP handshake).

  • With HTTP Keep-Alives, the client can send all requests over the SAME CONNECTION. This is faster because there are fewer round trips and no repeated handshakes (see the curl sketch below).
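
You can see the difference on the wire with curl (the URL is taken from your test; the flags are standard curl options), purely as an illustration:

    # two requests over one re-used connection (curl keeps the connection alive by default)
    curl -sv -o /dev/null -o /dev/null \
        "http://192.168.1.40/feed.txt" "http://192.168.1.40/feed.txt"

    # same two requests, but force a new TCP connection (and handshake) for each
    curl -sv -o /dev/null -o /dev/null -H "Connection: close" \
        "http://192.168.1.40/feed.txt" "http://192.168.1.40/feed.txt"

In the verbose output, the first invocation reports that it re-uses the existing connection for the second URL, while the second one connects again from scratch.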


Second, you say that the static file XML size is "small".

Is "small" nearer to 1 KB or 1 MB? We don't know. But that makes a huge difference in terms of available options to speedup things.

Huge files are usually served through sendfile() because it works in the kernel, freeing the usermode server from the burden of reading from disk and buffering.

Small files can use more flexible options available for application developers in usermode, but here also, file size matters (bytes and kilobytes are different animals).
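
As an illustration only (the directive names are standard nginx; the values are made up for this example), the usual knobs for serving one small static file look like this:

    # nginx http{} fragment -- example values, tune for your own box
    sendfile              on;                      # kernel-side copy, avoids userland read/write
    tcp_nopush            on;                      # send the headers and file start in one packet
    open_file_cache       max=1000 inactive=60s;   # skip open()/stat() on every request
    open_file_cache_valid 60s;

Whether these move the needle depends on that file-size question above: for a tiny file they mostly cut per-request syscalls rather than copy time.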


Third, you are using 16 threads with your test. Are you really enjoying 16 PHYSICAL CPU Cores on BOTH the client and the server machines?

If that's not the case, then you are simply slowing down the test to the point where you are no longer testing the web servers.
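
A quick way to check before re-running the benchmark (these are standard Linux tools; the -t value below is just an example):

    # count physical cores vs. hardware threads
    lscpu | egrep 'Thread|Core|Socket'
    nproc

    # then match the load generator's thread count to the physical cores, e.g.:
    ./weighttp -n 1000000 -c 100 -t 8 "http://192.168.1.40/feed.txt"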


As you see, many factors have an influence on performance. And there are more with OS tuning (the TCP stack options, available file handles, system buffers, etc.).

To get the most out of a system, you need to examine all those parameters and pick the best values for your particular exercise.
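
For example, a few sysctl entries that are commonly looked at for high connection-rate tests (the keys are real Linux sysctls; the values below are only illustrative, so check them against the ip-sysctl documentation for your kernel):

    # /etc/sysctl.conf fragment -- example values only
    net.ipv4.ip_local_port_range = 1024 65535   # more ephemeral ports on the client side
    net.ipv4.tcp_fin_timeout     = 15           # recycle closing sockets sooner
    net.ipv4.tcp_tw_reuse        = 1            # re-use TIME_WAIT sockets for new outgoing connections
    net.core.somaxconn           = 4096         # larger accept backlog
    net.ipv4.tcp_max_syn_backlog = 4096         # larger SYN backlog
    fs.file-max                  = 1000000      # plenty of file handles

    # apply without reboot
    sysctl -p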

Gil
  • Thanks @Gil. I will respond in kind in three parts. First, the distinction makes sense after your explanation. My analysis of the situation is that because nginx can serve 168k req/s WITH keepalive, the bottleneck keeping it at 19k must be in the OS or TCP tuning. Is this correct? Second, the XML file is 59 bytes. I have tried enabling and disabling sendfile() to (obviously) no effect. Third, you are correct that there are not 16 cores, but 8 with hyperthreading on both servers. I will adjust these settings and re-benchmark. – user2370628 May 12 '13 at 14:11
  • Finally, as the bottleneck seems to be in the connection stage, can you recommend any resources for addressing this? Most pages I have been able to find simply prescribe a huge array of sysctl TCP settings with no explanation. None have had any great effect. Thanks again! – user2370628 May 12 '13 at 14:11
  • Since TCP handshaking is done in the kernel, tuning the TCP stack (implemented in the kernel) is your only option, unless you believe you can do better in usermode. And this tuning is done by modifying the sysctl configuration, officially "documented" in http://linux.die.net/man/5/sysctl.conf but this other reference is, well, more useful: https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt . Can you accept my answer? – Gil May 14 '13 at 13:38
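
Following up on the connection-stage question in the comments, the nginx side of the accept path is governed by a handful of directives (all standard nginx; the worker count and values below are only a sketch based on the 8-core detail mentioned above):

    # hypothetical nginx.conf fragment for the accept/connection path -- example values only
    worker_processes  8;                    # one worker per physical core

    events {
        worker_connections  10240;
        multi_accept        on;             # accept as many queued connections as possible per wake-up
    }

    http {
        server {
            listen 80 backlog=4096;         # keep in step with net.core.somaxconn
            root   /var/www;                # assumption: wherever feed.txt lives
        }
    }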