
I'm trying to create many parallel processes to leverage a 32-core machine, but when I looked at top, it showed only 5 Python processes. This is my code:

from multiprocessing import Pool, cpu_count

max_processes = min(len(corpus_paths), cpu_count()*2)
__log.debug("Max processes being used: " + str(max_processes))
pool = Pool(max_processes)
for path in corpus_paths:
    # each corpus file is submitted as its own task
    pool.apply_async(...)
pool.close()
pool.join()

And this is the configuration of the machine:

[minh.lengoc@compute-1-5 ~]$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    8
CPU socket(s):         2
NUMA node(s):          4
Vendor ID:             AuthenticAMD
CPU family:            21
Model:                 1
Stepping:              2
CPU MHz:               2099.877
BogoMIPS:              4199.44
Virtualization:        AMD-V
L1d cache:             16K
L1i cache:             64K
L2 cache:              2048K
L3 cache:              6144K
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14
NUMA node1 CPU(s):     16,18,20,22,24,26,28,30
NUMA node2 CPU(s):     1,3,5,7,9,11,13,15
NUMA node3 CPU(s):     17,19,21,23,25,27,29,31

Thank you!


It works now. There must have been something wrong with my code, but I couldn't roll back to see what it was. Closed.

Kev
minhle_r7

2 Answers


One possible reason why not all the cores are used is if the target function being run by pool.apply_async completes too fast. The solution in that case would be to send more data to the target function (so it does more work per call).

It's like shoveling coal into 32 furnaces. If you use a tiny shovel, you might only get to the 5th furnace before the coal in the first furnace is used up. Then you have to refill the first furnace. You never get to use all the furnaces, even if you have a huge pile of coal. If you use a large enough shovel, then you can get all the furnaces burning.
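For example, here is a minimal sketch of that idea; process_paths() and the batch size are hypothetical stand-ins for the real worker, not the asker's actual code:

from multiprocessing import Pool, cpu_count

def process_paths(paths):
    # Stand-in for the real per-file work; handling a whole batch per
    # task keeps each worker busy long enough to show up in top.
    for path in paths:
        pass  # placeholder for the actual processing of one file

if __name__ == '__main__':
    corpus_paths = [...]  # the actual list of corpus files
    max_processes = min(len(corpus_paths), cpu_count())
    # Submit one batch of paths per worker instead of one tiny task per path.
    chunk = max(1, len(corpus_paths) // max_processes)
    batches = [corpus_paths[i:i + chunk]
               for i in range(0, len(corpus_paths), chunk)]

    pool = Pool(max_processes)
    for batch in batches:
        pool.apply_async(process_paths, (batch,))
    pool.close()
    pool.join()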

unutbu

I had a similar issue. In my case I am using Gearman and want one worker per core. Initially I used "Pool" but noticed that only one worker was processing the messages, so I replaced the "Pool" with the code below to use all cores minus one, so that the workers can read the queues simultaneously:

import multiprocessing

if __name__ == '__main__':
    jobs = []
    for i in range(multiprocessing.cpu_count() - 1):
        # start_worker is the function that blocks on the Gearman queue
        p = multiprocessing.Process(target=start_worker)
        jobs.append(p)
        p.start()

    for j in jobs:
        j.join()
        print '%s.exitcode = %s' % (j.name, j.exitcode)

What do you think? Is there a better way to handle this?

nbari