Hardware: We use a 24-core machine (2 × 12 cores). The SSD and the SAS RAID 0 array are attached to two separate controllers. OS: Windows 8.1. Hyper-threading is disabled.
Software:
2.1. A master fills a work queue for the workers and afterwards collects the results from a result queue.
2.2. There are n workers that retrieve work from the work queue. Each worker writes small input files to disk and starts an external process to carry out the actual computation. After the external process has finished, output files of 10-15 MB must be read from the file system and parsed. Finally, the worker places the results in the result queue and carries on with the next item from the work queue.
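The master/worker layout in 2.1 and 2.2 can be sketched roughly as follows for the multi-threading variant. This is only an illustration under assumptions: the language (Python) is assumed, the names `worker`, `run_master` and `FAKE_TOOL` are hypothetical, and the third-party program is replaced by a tiny stand-in script that upper-cases its input file.

```python
import queue
import subprocess
import sys
import tempfile
import threading
from pathlib import Path

# Stand-in for the third-party program (an assumption, not the real tool):
# it reads the input file and writes an output file next to it.
FAKE_TOOL = (
    "import sys, pathlib;"
    "src = pathlib.Path(sys.argv[1]);"
    "src.with_suffix('.out').write_text(src.read_text().upper())"
)


def worker(work_q, result_q, workdir):
    """Take items from the work queue until a None sentinel arrives."""
    while True:
        item = work_q.get()
        if item is None:  # sentinel: no more work
            break
        job_id, payload = item
        in_file = workdir / f"job_{job_id}.in"
        in_file.write_text(payload)  # write the small input file
        # Start the external process and wait for it to finish.
        subprocess.run([sys.executable, "-c", FAKE_TOOL, str(in_file)], check=True)
        # Read the output file back and "parse" it (here: keep the raw text).
        parsed = in_file.with_suffix(".out").read_text()
        result_q.put((job_id, parsed))


def run_master(payloads, n_workers, workdir):
    """Fill the work queue, start n_workers threads, collect all results."""
    work_q, result_q = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(work_q, result_q, workdir))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for job_id, payload in enumerate(payloads):
        work_q.put((job_id, payload))
    for _ in threads:
        work_q.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # Every job has put exactly one result by the time the threads have joined.
    return dict(result_q.get() for _ in payloads)
```

Called as `run_master(["a", "b"], 2, Path(some_dir))`, it returns a dict mapping job index to the parsed output. A multi-processing variant would presumably swap in `multiprocessing.Process` and `multiprocessing.Queue` with the same structure.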
File-system access is distributed evenly across both disks among the worker processes.
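One simple way to get such an even split is a round-robin assignment of working directories to workers. The sketch below is an assumption about how this might look, not the actual setup; `assign_workdirs` and the directory names are hypothetical.

```python
from pathlib import Path


def assign_workdirs(n_workers, disk_roots):
    """Round-robin: worker i works under disk_roots[i % len(disk_roots)]."""
    return [
        Path(disk_roots[i % len(disk_roots)]) / f"worker_{i}"
        for i in range(n_workers)
    ]
```

With two roots (one on the SSD, one on the RAID array), even-numbered workers land on the first disk and odd-numbered workers on the second, so each disk serves half of the workers.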
Observations
4.1. From 0 to 10 workers, there is an almost linear speed-up for both multi-threading and multi-processing. From 10 to 28 workers, multi-processing still shows a reasonable but sub-linear speed-up, whereas multi-threading shows almost no further increase.
4.2. We timed the multi-threading case extensively and found that the computation time stays almost constant, with only a negligible increase, as the number of workers grows. In contrast, when the number of workers increases from 10 to 40, the time for reading the files from the disks increases dramatically and leaves the cores idle.
4.3. With multi-processing, the workers seem able to take full advantage of the two independent file I/O channels (RAID and SSD) and outperform multi-threading by far.
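Per-phase timings like those in 4.2 could be collected with a small thread-safe accumulator along these lines. This is a hedged sketch, not the instrumentation actually used; `PhaseTimer` and the phase names are hypothetical.

```python
import threading
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock time per named phase across worker threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self.totals = defaultdict(float)  # phase name -> summed seconds
        self.counts = defaultdict(int)    # phase name -> number of samples

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            with self._lock:  # several workers may report at the same time
                self.totals[name] += elapsed
                self.counts[name] += 1

    def mean(self, name):
        """Average seconds per sample for one phase."""
        return self.totals[name] / self.counts[name]
```

Inside a worker, wrapping the external call in `with timer.phase("compute"):` and the file read in `with timer.phase("read"):` lets the two means be compared as the number of workers grows.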
Finally, the question: What is the bottleneck in the multi-threading case, and how can we circumvent it?
Note 1: Avoiding file-system access altogether is not an option, since the external process is third-party software.
Note 2: I'm aware of these answers, but they don't address my question.
Update 2019: On a different machine with 18 cores and Windows 10, we observe exactly the same behavior.