9

I am investigating the possibility of using Node as a reverse proxy. One of the primary goals of my project is VERY high performance, so I've set up a Node server that proxies requests to a target Node server, which responds with 'hello world' no matter the request.

Using Apache Bench (ab) I've compared the number of requests processed per second. The proxy, target and caller are each on separate M1 Large instances in AWS. My results are frustrating and confusing.

Direct from caller to Target:

ab -c 100 -n 10000 http://target-instance/

= ~2600 requests/second

From caller through proxy to target:

ab -c 100 -n 10000 http://proxy-instance/

= ~1100 requests/second

Using lighttpd I was able to get ~3500 requests/second on both proxy and target.

I'm disappointed that the proxy server is less performant than the target server. With other products like lighttpd I've seen the proxy achieve results comparable to the target, so I'm confused about why Node (supposedly lightning fast) is not achieving the same.

Here's my proxy code in Node v0.5.9. Am I missing something?

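var http = require('http');

// Proxy: answer every incoming request by fetching GET / from the target
var server =
http.createServer(function(req, res){
    var opts = { host: 'target-instance',
                 port: 80,
                 path: '/',
                 method: 'GET'};
    var proxyRequest = http.get(opts, function(response){
            // Relay the target's response body to the caller chunk by chunk
            response.on('data', function(chunk){
                    res.write(chunk);
            });
            // Finish the caller's response once the target is done
            response.on('end', function(){
                    res.end();
            });
    });
});
server.listen(80);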
jonnysamps
  • Why do I think I need what? I am investigating building a proxy product and am trying to choose the best base technology. – jonnysamps Nov 11 '11 at 19:16

4 Answers

8

While Node.js is very efficient, it is not multi-threaded. The proxy has to handle more connections than the target (for each request it holds one connection from the caller and one to the target) with only a single thread, so it becomes the bottleneck. There are two ways around this:

  1. Use a multi-threaded load balancer (e.g. nginx) in front of multiple instances of your Node proxy.
  2. Change your Node proxy to use multiple processes. Several Node modules do this, but Node now includes "cluster" out of the box, which appears to me to be the simplest method.
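
As a rough illustration of the second option, here is a minimal sketch using the built-in cluster module in a current Node release (not the v0.5.9 from the question). The helper names workerCount, createProxy and start are made up for this example, the target host is taken from the question, and one worker per CPU core is only an assumed starting point, not a tuned value.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Upstream server, copied from the question's example.
const TARGET = { host: 'target-instance', port: 80, path: '/', method: 'GET' };

// Assumption: one worker per CPU core, and never fewer than one.
function workerCount(cpus) {
  return Math.max(1, cpus);
}

// Each worker runs the same single-threaded proxy as in the question.
function createProxy(target) {
  return http.createServer(function (req, res) {
    const upstream = http.get(target, function (response) {
      // Pass the target's status, headers and body through to the caller.
      res.writeHead(response.statusCode, response.headers);
      response.pipe(res);
    });
    upstream.on('error', function () {
      // Report a bad gateway if the target could not be reached.
      if (!res.headersSent) res.statusCode = 502;
      res.end();
    });
  });
}

// The primary process only forks workers; the workers share the listening port.
function start(port) {
  // isPrimary replaced isMaster in newer Node versions; support both.
  const isPrimary = cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
  if (isPrimary) {
    const n = workerCount(os.cpus().length);
    for (let i = 0; i < n; i++) cluster.fork();
  } else {
    createProxy(TARGET).listen(port);
  }
}
```

Calling start(80) at the bottom of the file would launch the workers. Whether this closes the gap to the target's throughput depends on how many cores the proxy instance has, so it is worth re-running the ab comparison after the change.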
ColinM
2

Try bouncy: https://github.com/substack/bouncy

It was optimized for very high performance.

thejh
  • Don't know about performance, is that true? The API is so damn simple to use that I thought of it as the "naïve approach" to proxying. – Camilo Martin Jul 09 '14 at 00:58
0

From the http.request docs:

Sending a 'Connection: keep-alive' will notify Node that the connection to the server should be persisted until the next request.

So I bet your proxy is re-connecting to the target-instance with each request, which is very inefficient. I think your options variable should look like this to speed it up:

var opts = {
    host: 'target-instance',
    port: 80,
    path: '/',
    headers: {
        // Ask for the connection to the target to be kept open
        "Connection": "keep-alive"
    }
};
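
In current Node releases (not the v0.5.9 from the question), connection reuse towards the target is usually configured on an http.Agent rather than through a header. The following is a minimal sketch under that assumption: the names keepAliveAgent, upstreamOptions and createProxy are invented for this example, and the maxSockets value of 100 is only chosen to mirror the -c 100 used with ab, not a measured optimum.

```javascript
const http = require('http');

// A shared agent keeps idle sockets to the target open between requests.
// Assumption: 100 sockets, mirroring the concurrency used in the ab runs.
const keepAliveAgent = new http.Agent({ keepAlive: true, maxSockets: 100 });

// Request options for the target from the question, bound to the given agent.
function upstreamOptions(agent) {
  return { host: 'target-instance', port: 80, path: '/', method: 'GET', agent: agent };
}

// Proxy that reuses upstream connections through the shared agent.
function createProxy(agent) {
  return http.createServer(function (req, res) {
    const upstream = http.get(upstreamOptions(agent), function (response) {
      // Stream the target's response straight through to the caller.
      res.writeHead(response.statusCode, response.headers);
      response.pipe(res);
    });
    upstream.on('error', function () {
      // Report a bad gateway if the target could not be reached.
      if (!res.headersSent) res.statusCode = 502;
      res.end();
    });
  });
}
```

Starting it would look like createProxy(keepAliveAgent).listen(80). This only illustrates the idea of reusing upstream connections; it is not a claim that it resolves the throughput gap measured in the question.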
Dan List
  • This doesn't seem to help. The Node docs also mention that it uses Connection: keep-alive by default. In fact, I get much better performance without adding that to the opts. – jonnysamps Nov 11 '11 at 20:18
0

After you add the Connection: keep-alive header, you should also test with keep-alive enabled in ab (the -k option):

ab -c 100 -n 10000 -k http://xxxx/

Tereska
  • This greatly speeds up requests sent directly to the target, to around 6k/second, but doesn't help through the proxy. – jonnysamps Nov 11 '11 at 20:15