The documentation for concurrent.futures.ThreadPoolExecutor says:

Changed in version 3.5: If max_workers is None or not given, it will default to the number of processors on the machine, multiplied by 5, assuming that ThreadPoolExecutor is often used to overlap I/O instead of CPU work and the number of workers should be higher than the number of workers for ProcessPoolExecutor.

I want to understand why the default max_workers value depends on the number of CPUs. Regardless of how many CPUs I have, only one Python thread can run at any point in time.

Let us assume each thread is I/O intensive: it spends only 10% of its time on the CPU and 90% of its time waiting for I/O. Let us then assume we have 2 CPUs. Because only one thread runs at any point in time, 10 such threads (10 × 10%) already account for 100% of the CPU time Python can use, and adding more threads cannot use any more CPU. This holds true even if there are 4 CPUs.
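
To make the arithmetic concrete, here is a minimal sketch of the two numbers I am comparing. The function names are my own (hypothetical, not part of concurrent.futures), the first function only mirrors the rule as worded in the 3.5 changelog quoted above, and the second encodes my assumption that only one thread runs at a time:

```python
import math
import os


def documented_default_workers(cpu_count=None):
    """Default max_workers as worded in the 3.5 changelog: processors * 5.

    Hypothetical helper for illustration; it does not read the value
    from ThreadPoolExecutor itself.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    return cpu_count * 5


def threads_to_fill_one_running_slot(cpu_percent):
    """Threads needed to keep a single running slot busy.

    Assumes each thread spends cpu_percent of its time on the CPU and
    that only one thread can run at any point in time.
    """
    return math.ceil(100 / cpu_percent)


for cpus in (2, 4):
    print(cpus, documented_default_workers(cpus), threads_to_fill_one_running_slot(10))
# prints:
# 2 10 10
# 4 20 10
```

Under my assumption the last column never changes with the CPU count, while the documented default doubles from 10 to 20.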

So why is the default max_workers decided based on the number of CPUs?