I've seen something odd regarding OpenSSL performance.

This is the output of 'openssl speed aes-128-cbc' on a physical HP BL460c Gen8 with dual E5-2680s running RHEL/OEL 6.4 x64 and OpenSSL 1.0.0-fips:

Doing aes-128 cbc for 3s on 16 size blocks: 19853475 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 64 size blocks: 5366868 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 256 size blocks: 1364167 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 1024 size blocks: 343297 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 8192 size blocks: 43002 aes-128 cbc's in 3.00s

I installed OpenSSL 1.0.1f on the same blade and retested, getting these results:

Doing aes-128 cbc for 3s on 16 size blocks: 19887908 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 64 size blocks: 5367604 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 256 size blocks: 1365296 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 1024 size blocks: 343261 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 8192 size blocks: 42996 aes-128 cbc's in 2.99s

They're broadly similar.

But then, for reference, I ran the same test on an appliance VM (4 x vCPU, 8GB, ESXi 5.5 on an identical blade to the one above) running SuSE 11 and OpenSSL 0.9.8-fips and got the following result:

Doing aes-128 cbc for 3s on 16 size blocks: 31056333 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 64 size blocks: 10296043 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 256 size blocks: 2772200 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 1024 size blocks: 712440 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 8192 size blocks: 89701 aes-128 cbc's in 2.99s

Roughly double the performance at 64-byte blocks and above, and about 1.6x at 16 bytes!
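To put numbers on that comparison, here is a minimal sketch that turns the quoted 'openssl speed' lines into bytes per second and prints the VM-to-physical ratio for each block size. The run labels and the helper name `throughput` are my own, and only the first and third runs are included for brevity.

```python
import re

# Output lines copied verbatim from the runs quoted above.
PHYSICAL_1_0_0 = """\
Doing aes-128 cbc for 3s on 16 size blocks: 19853475 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 64 size blocks: 5366868 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 256 size blocks: 1364167 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 1024 size blocks: 343297 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 8192 size blocks: 43002 aes-128 cbc's in 3.00s
"""

VM_0_9_8 = """\
Doing aes-128 cbc for 3s on 16 size blocks: 31056333 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 64 size blocks: 10296043 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 256 size blocks: 2772200 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 1024 size blocks: 712440 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 8192 size blocks: 89701 aes-128 cbc's in 2.99s
"""

# Captures block size, operation count and elapsed seconds from one line.
LINE_RE = re.compile(r"on (\d+) size blocks: (\d+) aes-128 cbc's in ([\d.]+)s")


def throughput(text):
    """Return {block_size: bytes_per_second} for one run's output."""
    result = {}
    for size, count, secs in LINE_RE.findall(text):
        result[int(size)] = int(size) * int(count) / float(secs)
    return result


physical = throughput(PHYSICAL_1_0_0)
vm = throughput(VM_0_9_8)

# Ratio of VM throughput to physical throughput per block size.
ratios = {size: vm[size] / physical[size] for size in physical}

for size in sorted(ratios):
    print(f"{size:>5} bytes: physical {physical[size] / 1e6:8.1f} MB/s, "
          f"VM {vm[size] / 1e6:8.1f} MB/s, ratio {ratios[size]:.2f}x")
```

The same helper can be pointed at the 1.0.1f output to confirm that the two runs on the physical blade are within a fraction of a percent of each other.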

Does anyone have any idea what's going on here, please? I've read a whole bunch of OpenSSL documents, as well as Intel's documents on OpenSSL and their hardware AES-NI components, but I'm still confused by this.
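Since AES-NI comes up here, one basic thing worth ruling out is whether the `aes` CPU flag is visible on both the physical host and the VM guest. The sketch below is only an assumption about how one might check that on Linux by reading /proc/cpuinfo; the function name `has_aes_flag` is hypothetical, and a present flag does not by itself show that a given OpenSSL build or benchmark code path actually uses the instructions.

```python
def has_aes_flag(cpuinfo_text):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Match the exact token 'aes' so that e.g. 'vaes' does not count.
        if key.strip() == "flags" and "aes" in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("aes flag present:", has_aes_flag(f.read()))
    except OSError as exc:
        # Non-Linux systems will not have this file.
        print("could not read /proc/cpuinfo:", exc)
```

Running it on the blade and inside the appliance VM would show whether the hypervisor exposes the flag to the guest.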

Chopper3
  • I know some of the Intel sample code for the AES instructions had issues with HT that resulted in 50% perf drops unless HT was disabled. If the VM is being allocated real cores here then you might be seeing something similar. I don't know enough about how OpenSSL handles multicore by default, though, so it's a long shot. The relevant article is here: http://software.intel.com/en-us/articles/ipp-crypto-sample-performance-for-openssl-too-slow-on-hyper-threading-systems – Helvick Jan 21 '14 at 13:31
  • @Helvick - I just tested that and it doesn't seem to be the issue, thanks for the suggestion though :) – Chopper3 Jan 21 '14 at 13:46
  • It could be the compiler used, the compiler options, the library build options, or Intel Turbo Boost. I just ran the same command on my laptop and got slightly faster results than your first run on the HP. Would you mind adding the long list of build information that openssl speed emits after it's finished? – etherfish Jan 21 '14 at 14:38
