
I am benchmarking nginx/node.js topologies with the following scenarios:

  1. Benchmark a single node.js server directly
  2. Benchmark 2 node.js servers behind nginx (round-robin load balanced)

For both benchmarks, "wrk" is used with the following configuration:

wrk -t12 -c20 -d20s --timeout 2s

All node.js instances are identical. On each HTTP GET request, they loop over a given number "n" of iterations, incrementing a variable on every pass.
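For reference, a minimal sketch of such an instance could look like the following (hypothetical code; the actual test servers, the ports, and the way "n" is chosen are not shown in the question):

```javascript
// Hypothetical minimal test server: loop n times per GET request,
// incrementing a counter (CPU-bound work). Port and n are assumptions.
const http = require('http');

const port = Number(process.argv[2] || 3001);
const n = 1e6; // e.g. 1 million iterations per request

http.createServer((req, res) => {
  let counter = 0;
  for (let i = 0; i < n; i++) {
    counter += 1;
  }
  res.end(`port ${port}, counter ${counter}\n`);
}).listen(port);
```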

When I perform the test cases, I get the somewhat surprising results outlined below. I do not understand why the dual node.js setup (topology 2) performs worse at 1 million iterations: it is even slower than the same 1 million loops on topology 1.

1037 req/s (single) vs. 813 req/s (LB)

I certainly do expect a bit of overhead, since the single setup does not have nginx in front of the node.js instance, but the test results still seem really strange.

The runs with 10 and 5 million iterations seem to be doing OK, because there the throughput increases as expected.

Is there a reasonable explanation for that behavior?

The test is executed on a single computer; each node.js instance is listening on a different port.

Nginx uses a standard configuration, with nothing more than the following (a minimal sketch is shown after the list):

  • port 80
  • 2 upstream servers
  • proxy_pass on "/" route
  • 1024 (default) worker_connections (increasing this does not change the results)
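A minimal nginx configuration matching these points might look roughly like this (a sketch only; the upstream ports 3001/3002 and the name node_backend are placeholders, not taken from the actual setup):

```nginx
worker_processes auto;

events {
    worker_connections 1024;   # default
}

http {
    upstream node_backend {
        # round-robin is the default balancing method
        server 127.0.0.1:3001;
        server 127.0.0.1:3002;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://node_backend;
        }
    }
}
```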
Scenario 1 (single node.js server):

n [millions]   req/s    avg/max [ms]     requests
          10     134    87.81/166.28         2633
           5     271     44.12/88.48         5413
           1    1037     11.48/24.99        20049
Scenario 2 (nginx as load balancer in front of 2 node.js servers):

n [millions]   req/s    avg/max [ms]     requests
          10     220    51.95/124.87         4512
           5     431    27.79/152.93         8376
           1     813      6.85/35.64        16156  --> ???
– JoeFrizz
  • are you **100%** sure the requests are really being shared across both instances? *sticky* situation – EMX Aug 30 '17 at 17:25
  • @EMX yeah, I added the port to the output and tested it from the browser... then, started wrk – JoeFrizz Aug 30 '17 at 19:36

1 Answer


I have been digging... and it is probably related to the nginx default configuration not being efficient enough.

Using HTTP/1.1 spares the overhead of establishing a connection between nginx and node.js with every proxied request and has a significant impact on response latency.

So this could be one of the reasons, if you are proxying with HTTP/1.0 (the nginx default).


Interesting feature: keepalive

Sets the maximum number of idle keepalive connections to upstream servers that are preserved in the cache of each worker process.
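Putting both points together, the proxy could be switched to HTTP/1.1 with upstream keepalive roughly like this (a sketch; the upstream name, the ports, and the keepalive value of 64 are placeholders):

```nginx
upstream node_backend {
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
    keepalive 64;                        # idle connections kept per worker process
}

server {
    listen 80;

    location / {
        proxy_pass http://node_backend;
        proxy_http_version 1.1;          # nginx proxies with HTTP/1.0 by default
        proxy_set_header Connection "";  # clear "Connection: close" so keepalive can be used
    }
}
```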


Sources:

http://www8.org/w8-papers/5c-protocols/key/key.html#SECTION00050000000000000000

https://engineering.gosquared.com/optimising-nginx-node-js-and-networking-for-heavy-workloads

http://blog.argteam.com/coding/hardening-node-js-for-production-part-2-using-nginx-to-avoid-node-js-load/

– EMX
  • Thanks for the research. I have added keepalive 512; to the proxy section as well as proxy_http_version 1.1; to nginx.conf, but the throughput only increases marginally, to 950 req/s. What I found interesting by looking at CPU utilization: after about 10 s of running the test, the CPU utilization drops to near zero. Before the drop, the utilization looked fine for 2 cores being involved (double the utilization of a single node.js instance). – JoeFrizz Aug 30 '17 at 21:20