I have a dual-core machine with 4 logical processors thanks to Hyper-Threading. I am running a SHA-1 pre-image brute-force test in C#. Each thread basically runs a for loop that computes a SHA-1 hash and compares it to the hash I am looking for. I made sure that all threads execute in complete isolation: no memory is shared between them, except for one variable, long count, which I increment in each thread using:
System.Threading.Interlocked.Increment(ref count);
I get about 1 million SHA-1 hashes/s with 2 threads and 1.3 million hashes/s with 4 threads. I fail to see why I get a 30% bonus from HT in this case. Both cores should already be busy with 2 threads, so increasing the number of threads beyond 2 should not give me any benefit. Can anyone explain why?
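For reference, the setup described above can be sketched roughly as follows. This is a simplified illustration, not the exact code: the names Sha1Bench, Worker, Matches and Run, the 8-byte candidate layout, and the all-zero target hash are placeholders chosen for the example.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;
using System.Threading;

static class Sha1Bench
{
    // The only state written by more than one thread.
    static long count;

    // Hypothetical worker: hashes `iterations` distinct candidates and
    // compares each digest to the target.
    static void Worker(int id, int iterations, byte[] sharedTarget)
    {
        // Private copy, so nothing but `count` is shared between threads.
        byte[] target = (byte[])sharedTarget.Clone();
        byte[] candidate = new byte[8];

        using (SHA1 sha1 = SHA1.Create()) // one hasher instance per thread
        {
            for (int i = 0; i < iterations; i++)
            {
                // Assumed candidate layout: thread id followed by loop index.
                BitConverter.GetBytes(id).CopyTo(candidate, 0);
                BitConverter.GetBytes(i).CopyTo(candidate, 4);

                byte[] hash = sha1.ComputeHash(candidate);
                if (Matches(hash, target))
                    Console.WriteLine("Match found by thread " + id);

                Interlocked.Increment(ref count);
            }
        }
    }

    // Byte-by-byte comparison of two digests.
    static bool Matches(byte[] a, byte[] b)
    {
        if (a.Length != b.Length) return false;
        for (int i = 0; i < a.Length; i++)
            if (a[i] != b[i]) return false;
        return true;
    }

    // Runs `threadCount` workers, prints the throughput, and returns the
    // total number of hashes computed.
    public static long Run(int threadCount, int iterationsPerThread)
    {
        Interlocked.Exchange(ref count, 0);
        byte[] target = new byte[20]; // placeholder 20-byte SHA-1 digest

        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            int id = t; // per-iteration copy for the lambda to capture
            threads[t] = new Thread(() => Worker(id, iterationsPerThread, target));
        }

        Stopwatch sw = Stopwatch.StartNew();
        foreach (Thread th in threads) th.Start();
        foreach (Thread th in threads) th.Join();
        sw.Stop();

        long total = Interlocked.Read(ref count);
        Console.WriteLine(threadCount + " threads: " +
            (total / sw.Elapsed.TotalSeconds).ToString("F0") + " hashes/s");
        return total;
    }
}
```

Calling Sha1Bench.Run(2, 1000000) and then Sha1Bench.Run(4, 1000000) from Main would give the kind of 2-thread versus 4-thread comparison quoted above; the absolute numbers will depend on the machine and runtime.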