I am writing code that uses AES across multiple cores. It is very efficient on my laptop (8-core Intel), but when I move to a machine with more cores, such as 48 to 72 (Xeon), the performance is bad. I suspect that AES-NI works badly on multi-core machines because all the CPUs share the same memory for AES-NI instructions. Is there a way to disable hardware AES-NI usage when using the crypto/aes library?
-
Have you tried benchmarking or profiling with `pprof` for example? – Hymns For Disco Jan 15 '21 at 15:43
-
Each core has its own AES-NI instructions, so that doesn't make much sense. The data must of course pass through the shared cache(s). Beware that not every thread will have its own core if hyperthreading is enabled (presuming that you don't go over the number of logical cores in the first place, of course). Make sure that any software you use is actually configured to use AES-NI as well... – Maarten Bodewes Jan 15 '21 at 16:07
-
By the way, I don't see a software-only AES implementation outperforming AES-NI, so disabling AES-NI is very unlikely to help (much). – Maarten Bodewes Jan 15 '21 at 16:09
-
Two questions. 1) Have you tried pinning your process to 8 cores on the Xeon? I mean, you should try to rule out a problem with your approach simply scaling badly (notice that we have zero information about your program). 2) Have you tried reading the source code (it's freely available) to figure out what instruction set gets used on the Xeon? In either case, this sounds like a question for [the mailing list](https://groups.google.com/forum/#!forum/golang-nuts), not for SO. Also, please be sure to mention the version of Go you're using and its flavor (stock or a part of GCC). – kostix Jan 15 '21 at 16:09
-
I think my problem comes from the fact that I do a lot of AES operations on a small number of blocks (2), changing the key every time. – AntonioMuso Jan 15 '21 at 17:05
-
But you do that on both systems, I suppose, so that would still mean that there is a difference. You could at least try and see what happens if you use more blocks per key change, so you can determine whether this is actually causing the issue. Note that profiling applications from a distance is a pretty hard thing to do... – Maarten Bodewes Jan 16 '21 at 11:58
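The comparison suggested above (many key changes with few blocks each versus many blocks under one key) could be sketched as follows. The `encryptRekeyed` helper, the key derivation, and the counts are assumptions for illustration, not the asker's actual code.

```go
package main

import (
	"crypto/aes"
	"fmt"
	"log"
	"time"
)

// encryptRekeyed creates a fresh cipher (and therefore a fresh key schedule)
// for every key, encrypts blocksPerKey blocks with it, and returns the
// total elapsed time.
func encryptRekeyed(keys [][]byte, blocksPerKey int) (time.Duration, error) {
	src := make([]byte, aes.BlockSize)
	dst := make([]byte, aes.BlockSize)
	start := time.Now()
	for _, k := range keys {
		c, err := aes.NewCipher(k)
		if err != nil {
			return 0, err
		}
		for i := 0; i < blocksPerKey; i++ {
			c.Encrypt(dst, src)
		}
	}
	return time.Since(start), nil
}

func main() {
	const n = 100000
	keys := make([][]byte, n)
	for i := range keys {
		keys[i] = make([]byte, 16)
		keys[i][0], keys[i][1], keys[i][2] = byte(i), byte(i>>8), byte(i>>16)
	}

	// n key changes, 2 blocks each.
	rekeyed, err := encryptRekeyed(keys, 2)
	if err != nil {
		log.Fatal(err)
	}
	// The same total number of blocks under a single key.
	single, err := encryptRekeyed(keys[:1], 2*n)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("rekey every 2 blocks: %v\nsingle key:           %v\n", rekeyed, single)
}
```

Running this on both machines, single-threaded first and then from several goroutines, would indicate whether the per-key setup cost (including the allocation done by `aes.NewCipher`) is what fails to scale.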