I recently set up a Node.js-based WebSocket server that, in testing, handles around 2,000 new connection requests per second on a small EC2 instance (m1.small). Considering the cost of an m1.small instance, and the ability to put multiple instances behind a WebSocket-capable proxy server such as HAProxy, we are very happy with the results.

However, we realised we had not yet done any testing with SSL, so we looked into a number of SSL options. It became apparent that terminating SSL connections at the proxy server is ideal, because the proxy server can then inspect the traffic and insert headers such as X-Forwarded-For so that the server knows which IP the request came from.
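To illustrate the header point, here is a minimal sketch of how a Node.js backend might recover the client IP once the proxy has added X-Forwarded-For. The helper name `clientIpFrom` is hypothetical, and the sketch assumes the proxy in front is trusted, since a client can otherwise forge the header.

```javascript
// Hypothetical helper: pick the client IP out of a request that has passed
// through a trusted proxy which sets or appends to X-Forwarded-For.
function clientIpFrom(req) {
  var header = req.headers['x-forwarded-for']; // Node lower-cases header names
  if (typeof header === 'string' && header.trim() !== '') {
    // The header is a comma-separated list; the left-most entry is the original client.
    return header.split(',')[0].trim();
  }
  // No proxy header present: fall back to the address of the TCP peer.
  return req.socket.remoteAddress;
}
```

Without SSL termination at the proxy, the backend would only ever see the proxy's own address in `req.socket.remoteAddress`.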

The SSL termination solutions I looked at were Pound, stunnel and stud, all of which allow incoming connections on port 443 to be terminated and then passed on to HAProxy on port 80, which in turn passes the connection on to the web servers. Unfortunately, I found that sending traffic to the SSL termination proxy on a c1.medium (High CPU) instance very quickly consumed all CPU resources, at a rate of only 50 or so requests per second. I tried all three of the solutions listed above, and all of them performed roughly the same, I assume because under the hood they all rely on OpenSSL anyway. I then tried a 64-bit extra large High CPU instance (c1.xlarge) and found that performance only scales linearly with cost. So based on EC2 pricing, I'd need to pay roughly $600p/m for 200 SSL requests per second, as opposed to $60p/m for 2,000 non-SSL requests per second. The former price becomes economically unviable very quickly when we start planning to accept 1,000s or 10,000s of requests per second.
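The arithmetic behind that comparison can be sketched as follows. It uses only the figures quoted above; the variable and function names are made up for illustration, and the last line assumes that cost keeps scaling linearly, as observed on the c1.xlarge.

```javascript
// Figures quoted above: monthly cost and sustained new requests per second.
var ssl = { dollarsPerMonth: 600, requestsPerSecond: 200 };
var plain = { dollarsPerMonth: 60, requestsPerSecond: 2000 };

// Monthly dollars needed for each request/second of capacity.
function costPerRequestPerSecond(setup) {
  return setup.dollarsPerMonth / setup.requestsPerSecond;
}

var sslUnit = costPerRequestPerSecond(ssl);     // 3 dollars
var plainUnit = costPerRequestPerSecond(plain); // 0.03 dollars
console.log('SSL is about ' + Math.round(sslUnit / plainUnit) + 'x more expensive per request/second');

// Linear extrapolation to 10,000 SSL requests per second.
console.log('10,000 SSL requests/second: $' + (sslUnit * 10000) + ' per month');
```

In other words, SSL capacity works out at roughly 100 times the price of non-SSL capacity, or about $30,000 per month for 10,000 SSL requests per second.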

I also tried terminating the SSL using Node.js' https server, and the performance was very similar to Pound, stunnel and stud, so no clear advantage to that approach.

So what I am hoping someone can help with is advising how I can get around this ridiculous cost we have to absorb to provide SSL connections. I have heard that SSL hardware accelerators provide much better performance as the hardware is designed for SSL encryption and decryption, but as we are currently using Amazon EC2 for all of our servers, using SSL hardware accelerators is not an option unless we have a separate data centre with physical servers. I am just struggling to see how the likes of Amazon, Google, Facebook can provide all their traffic over SSL when the cost of this is so high. There must be a better solution out there.

Any advice or ideas would be greatly appreciated.

Thanks, Matt

  • Amazon's Elastic Load Balancer can handle SSL termination. – ceejayoz Feb 01 '12 at 16:37
  • Are you assuming each of those SSL connections forms a new session? Or are you testing with some level of session reuse? It makes a *huge* difference. – David Schwartz Feb 01 '12 at 17:07
  • @ceejayoz, unfortunately ELB does not support WebSockets – Matthew O'Riordan Feb 01 '12 at 19:08
  • @DavidSchwartz unfortunately we cannot rely on session reuse because we're talking about new sessions from new clients – Matthew O'Riordan Feb 01 '12 at 19:09
  • I'm curious to know what was your ultimate decision on this topic? – Mxx Aug 14 '12 at 01:55
  • I went with ELB balancing TCP sockets, which seems to work really well. [See my tests of ELB with WebSockets.](http://blog.mattheworiordan.com/post/24620577877/part-2-how-elastic-are-amazon-elastic-load-balancers) The only downside is that because ELB is load balancing at a TCP level and not an HTTP level, the source IP cannot be determined. Apparently something Amazon are addressing, but no news on that as yet. – Matthew O'Riordan Aug 14 '12 at 15:46

3 Answers

Firstly, good on you for starting with benchmarks. My first instinct is to ask what key size you're using. It seems to me you should be able to terminate far more than 200 connections per second. If you're using a key size larger than 1024 bits, know that performance drops off very quickly.
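One rough way to see the key size effect on your own hardware is to time RSA private-key signatures, which are the expensive server-side step in an RSA handshake. This is only a sketch: it is a single-core micro-benchmark of the signing operation rather than a full TLS handshake, the function name `signaturesPerSecond` is made up, and it assumes a Node.js version recent enough to provide `crypto.generateKeyPairSync` and `crypto.sign`.

```javascript
var crypto = require('crypto');

// Returns RSA signatures per second for a freshly generated key of the given size.
function signaturesPerSecond(modulusLength, iterations) {
  var keyPair = crypto.generateKeyPairSync('rsa', { modulusLength: modulusLength });
  var message = Buffer.from('handshake stand-in');
  var start = process.hrtime.bigint();
  for (var i = 0; i < iterations; i++) {
    crypto.sign('sha256', message, keyPair.privateKey);
  }
  var elapsedSeconds = Number(process.hrtime.bigint() - start) / 1e9;
  return iterations / elapsedSeconds;
}

var keySizes = [1024, 2048];
keySizes.forEach(function (bits) {
  console.log(bits + '-bit key: ' + Math.round(signaturesPerSecond(bits, 200)) + ' signatures/sec');
});
```

The absolute numbers will vary with the CPU and the OpenSSL build, so the ratio between the two key sizes is the more useful figure.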

If you're using a smaller key and still running into issues, I'd take a close look at the GPU instances that EC2 offers. SSLShader might be a cost-effective switch beyond a certain number of connections per second.

Also, investigating @ceejayoz's mention of Elastic Load Balancer has merit.

Jeff Ferland
  • Thanks @JeffFerland. FYI, since this post I have revisited my benchmarks with a 1024 bit key instead of a 2048 bit key, and fortunately I do get around a 5x performance increase, in line with [F5's benchmarking for different key size](http://support.f5.com/kb/en-us/solutions/public/13000/000/sol13067.html). However, that still means we're getting only 125 SSL connections per second per core, as opposed to the recommended 1,500 in the Imperial Violet article you sent. SSLShader looks like an interesting solution, I'm going to look at that now. – Matthew O'Riordan Feb 01 '12 at 19:20
  • SSLShader certainly looks interesting, but unfortunately it does not look like it's in production and the library itself is not yet available... what a shame. – Matthew O'Riordan Feb 01 '12 at 19:37
  • @MatthewO'Riordan Well, that's a shame. I guess never actually loading it up let me overlook that. What symmetric cipher are you using? For speed, RC4 would be the way to go. See also http://devcentral.f5.com/weblogs/macvittie/archive/2011/01/31/dispelling-the-new-ssl-myth.aspx – Jeff Ferland Feb 06 '12 at 21:54

You're possibly doing the benchmarking wrong. I doubt you're really expecting 200 unique new SSL visitors every second. If any of those connections are re-connections from people who recently visited, you should be using SSL session caching, this kind of thing:

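    var tlsSessionStore = {};
    // Current Node.js versions pass a callback to 'newSession' that must be called, or the connection stalls.
    server.on('newSession', function(id, data, cb) { tlsSessionStore[id.toString('hex')] = data; cb(); });

    server.on('resumeSession', function(id, cb) { cb(null, tlsSessionStore[id.toString('hex')] || null); });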
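A plain object used as a session store grows without limit, so in practice you would probably want to cap it. The following is a minimal sketch of one way to do that; `SessionCache` and `attachSessionCache` are hypothetical names, the size and lifetime limits are arbitrary, and it assumes a current Node.js version where session ids are Buffers and `newSession` receives a callback.

```javascript
// Hypothetical bounded cache for TLS session data with a time-to-live.
function SessionCache(maxEntries, ttlMs) {
  this.maxEntries = maxEntries;
  this.ttlMs = ttlMs;
  this.entries = new Map(); // a Map iterates in insertion order, oldest first
}

SessionCache.prototype.set = function (id, data) {
  var key = id.toString('hex');
  this.entries.delete(key); // re-inserting moves the key to the newest position
  this.entries.set(key, { data: data, expires: Date.now() + this.ttlMs });
  if (this.entries.size > this.maxEntries) {
    // Over capacity: evict the oldest entry.
    this.entries.delete(this.entries.keys().next().value);
  }
};

SessionCache.prototype.get = function (id) {
  var key = id.toString('hex');
  var entry = this.entries.get(key);
  if (!entry) return null;
  if (entry.expires <= Date.now()) {
    this.entries.delete(key); // expired: forget it and force a full handshake
    return null;
  }
  return entry.data;
};

// Wire the cache into a tls/https server's session events.
function attachSessionCache(server, cache) {
  server.on('newSession', function (id, data, cb) { cache.set(id, data); cb(); });
  server.on('resumeSession', function (id, cb) { cb(null, cache.get(id)); });
}
```

For example, `attachSessionCache(server, new SessionCache(10000, 5 * 60 * 1000))` would keep at most 10,000 sessions for five minutes each. Note that an in-process cache like this is not shared between instances behind a load balancer.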

And, of course, your benchmark needs to use the proportion of brand-new connections to resumed/reused sessions that makes sense for your application.

Also, the ciphers and key sizes you choose, as mentioned earlier, probably affect the speed too.

anon

It appears SSL speed depends on the algorithm and the desired security level. I haven't yet benchmarked my EC2 instance, but I wanted to share some tips anyway about enabling Google-style ECDHE key exchange with pre-selected SSL algorithms, to avoid BEAST and other SSL misconfigurations.
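As a rough sketch of what pre-selecting algorithms could look like in Node.js, the snippet below builds a cipher preference string and tells the server to enforce its own ordering. The cipher list is an assumed example, not a recommendation from this post, and which suites are actually available depends on the OpenSSL build that Node.js was compiled against.

```javascript
var tls = require('tls');

// Assumed example preference list: suites that use ECDHE key exchange.
var preferredCiphers = [
  'ECDHE-RSA-AES128-GCM-SHA256',
  'ECDHE-RSA-AES256-GCM-SHA384',
  'ECDHE-RSA-AES128-SHA256'
];

// tls.getCiphers() lists the suites this OpenSSL build supports, in lower case.
var supported = tls.getCiphers();
var usable = preferredCiphers.filter(function (name) {
  return supported.indexOf(name.toLowerCase()) !== -1;
});

// Options that could be passed to https.createServer() alongside the key and cert.
var tlsOptions = {
  ciphers: usable.join(':'),
  honorCipherOrder: true // pick by the server's preference order, not the client's
};
console.log(tlsOptions.ciphers);
```

Filtering against `tls.getCiphers()` first avoids handing OpenSSL a suite name it does not recognise.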

Some good links to get started: (no one place yet has everything, I should write a handbook but until then I've made this post a community wiki if anyone wants to contribute links and tips!)

And take a look at https://www.vbulletin.com/forum/showthread.php/401411-Time-to-improve-the-site-security for some conversation about why SSL isn't "just SSL" these days.