
I'm trying to create many parallel processes to leverage a 32-core machine, but when I looked at top, it showed only 5 Python processes. This is my code:

from multiprocessing import Pool, cpu_count

max_processes = min(len(corpus_paths), cpu_count()*2)
__log.debug("Max processes being used: " + str(max_processes))
pool = Pool(max_processes)
for path in corpus_paths:
    # each corpus file is submitted as its own task
    pool.apply_async(...)
pool.close()
pool.join()

And this is the configuration of the machine:

[minh.lengoc@compute-1-5 ~]$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    8
CPU socket(s):         2
NUMA node(s):          4
Vendor ID:             AuthenticAMD
CPU family:            21
Model:                 1
Stepping:              2
CPU MHz:               2099.877
BogoMIPS:              4199.44
Virtualization:        AMD-V
L1d cache:             16K
L1i cache:             64K
L2 cache:              2048K
L3 cache:              6144K
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14
NUMA node1 CPU(s):     16,18,20,22,24,26,28,30
NUMA node2 CPU(s):     1,3,5,7,9,11,13,15
NUMA node3 CPU(s):     17,19,21,23,25,27,29,31

Thank you!


It works now. There must have been something wrong with my code, but I couldn't roll back to see what it was. Closed.

Kev
minhle_r7

2 Answers


One possible reason why not all the cores are used is if the target function being run by pool.apply_async completes too fast. The solution in that case would be to send more data to the target function (so it does more work per call).

It's like shoveling coal into 32 furnaces. If you use a tiny shovel, you might only get to the 5th furnace before the coal in the first furnace is used up. Then you have to refill the first furnace. You never get to use all the furnaces, even if you have a huge pile of coal. If you use a large enough shovel, then you can get all the furnaces burning.
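For example, here is a minimal sketch of that idea; process_paths() and the batch size are hypothetical stand-ins for the real worker, not the asker's actual code:

from multiprocessing import Pool, cpu_count

def process_paths(paths):
    # Stand-in for the real per-file work; handling a whole batch per
    # task keeps each worker busy long enough to show up in top.
    for path in paths:
        pass  # placeholder for the actual processing of one file

if __name__ == '__main__':
    corpus_paths = [...]  # the actual list of corpus files
    max_processes = min(len(corpus_paths), cpu_count())
    # Submit one batch of paths per worker instead of one tiny task per path.
    chunk = max(1, len(corpus_paths) // max_processes)
    batches = [corpus_paths[i:i + chunk]
               for i in range(0, len(corpus_paths), chunk)]

    pool = Pool(max_processes)
    for batch in batches:
        pool.apply_async(process_paths, (batch,))
    pool.close()
    pool.join()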

unutbu

I had a similar issue. In my case I am using Gearman and want one worker per core. Initially I used "Pool" but noticed that only one worker was processing the messages, so I replaced the "Pool" with the code below to use all cores minus one, so that the workers can read the queues simultaneously:

import multiprocessing

if __name__ == '__main__':
    jobs = []
    for i in range(multiprocessing.cpu_count() - 1):
        # start_worker is the function that blocks on the Gearman queue
        p = multiprocessing.Process(target=start_worker)
        jobs.append(p)
        p.start()

    for j in jobs:
        j.join()
        print '%s.exitcode = %s' % (j.name, j.exitcode)

What do you think? Is there a better way to handle this?

nbari