0

I am currently using joblib's Parallel and it is working great but running into issues when I run in the server because the processes are not killed.

I have a function that is called on a list of tuples.

I will use a simplified example with numbers but my actual function is more complicated and thats why multiprocessing is necessary -- I have a tuple that has two numbers. function_to_call will return the product and the sum of the two numbers

lst_of_tuples = [(10, 5), (0.5, 10), (8, 3)]
from joblib import Parallel, delayed
def multiprocess(lst_of_tuples):
    return Parallel(n_jobs=-1)(delayed(function_to_call)(t[0], t[1]) for t in lst_of_tuples)

products, sums = list(map(list, zip(*multiprocess(my_lst))))

The expected result of products is (50, 5, 24) and sums is (15, 10.5, 11).

Is it possible to convert this line of code with multiprocessing pool so I can use pool.close() and pool.join() and kill all processes?

codenoodles
  • 133
  • 1
  • 11

1 Answers1

0

This is how I would do it:

from multiprocessing import Pool, cpu_count


def function_to_call(x, y):
    return x * y, x + y

def multiprocess(lst_of_tuples):
    # No sense in creating more processes than can be used:
    pool_size = min(cpu_count(), len(lst_of_tuples))
    pool = Pool(pool_size)
    results = pool.starmap(function_to_call, lst_of_tuples)
    pool.close()
    pool.join()
    return results

# Required for Windows:
if __name__ == '__main__':
    lst_of_tuples = [(10, 5), (0.5, 10), (8, 3)]
    products, sums = zip(*multiprocess(lst_of_tuples))
    print(products, sums)

Note

The call pool.close() will prevent further tasks from being submitted, then wait for all submitted tasks to complete and then the processes will terminate by themselves. The call to pool.join() then just waits (blocks) until all processes have terminated. In the code above, after the map call returns there are no uncompleted tasks left and so the processes will immediately terminate upon calling pool.close().

You can also call pool.terminate() to kill all pool processes immediately. This call will automatically be done when you use the pool as a context handler, as in:

    with Pool() as pool:
        # submit tasks
        ...
    # on exit from the above block a call to pool.terminate() is made
Booboo
  • 38,656
  • 3
  • 37
  • 60