
Given the following Python code:

import multiprocessing

def unique(sublist):
    # True when the list contains no duplicate elements
    return len(set(sublist)) == len(sublist)


if __name__ == '__main__':
    somelist = [[1,2,3,4,5,6,7,8,9,10,11,12,13,2], [1,2,3,4,5], [1,2,3,4,5,6,7,8,9,1], [0,1,5,1]]

    pool = multiprocessing.Pool()
    reslist = pool.map(unique, somelist)
    pool.close()
    pool.join()
    print "Done!"

    print reslist

Now imagine that the lists of integers in this toy example are extremely long, and what I'd like to achieve is the following: if unique returns True for one of the lists in somelist, kill all running processes.

This leads to two questions (and probably more which I haven't come up with):

  • How can I "read" or "listen" for the result of a finished process while the other processes are still running? E.g., if a process is dealing with [1,2,3,4,5] from somelist and finishes before all the others, how can I read out its result at that very moment?

  • Assuming it is possible to read out the result of a finished process while others are still running: how can I use that result as a condition to terminate all the other running processes?

    e.g. If one process has finished and returned True, how can I use this as a condition to terminate all the other (still) running processes?

Daniyal

2 Answers


Use pool.imap_unordered to consume the results in whatever order they come up.

reslist = pool.imap_unordered(unique, somelist)
pool.close()
for res in reslist:
    if res:  # or set other condition here
        pool.terminate()  # kill all still-running workers
        break             # stop iterating; no further results will arrive
pool.join()

You can iterate over an imap_unordered result in your main process while the pool's worker processes are still generating results.
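
For reference, here is the same pattern as a self-contained sketch, reusing the question's unique function (the input data here is shortened, made-up values for illustration):

import multiprocessing

def unique(sublist):
    # True when the sublist contains no duplicate elements
    return len(set(sublist)) == len(sublist)

if __name__ == '__main__':
    somelist = [[1, 2, 3, 2], [1, 2, 3, 4, 5], [1, 2, 1], [0, 1, 5, 1]]
    pool = multiprocessing.Pool()
    # imap_unordered yields each result as soon as its worker finishes,
    # in completion order rather than input order
    results = pool.imap_unordered(unique, somelist)
    pool.close()
    for res in results:
        if res:
            pool.terminate()  # kill all still-running workers
            break
    pool.join()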

Tore Eschliman
  • Also break the loop, otherwise you could get stuck waiting for the next result after the pool terminates. – Bi Rico Sep 10 '16 at 19:05

Without fancy IPC (inter-process communication) tricks, the easiest approach is to use a Pool method with a callback function instead. The callback runs in the main program (in a thread created by multiprocessing) and consumes each result as it becomes available. When the callback sees a result you like, it can terminate the Pool. For example:

import multiprocessing as mp

def worker(i):
    from time import sleep
    sleep(i)            # simulate a long-running task
    return i, (i == 5)  # report the input and whether it meets the condition

def callback(t):
    # runs in the main process as each worker's result arrives
    i, quit = t
    result[i] = quit
    if quit:
        pool.terminate()  # kill all still-running workers

if __name__ == "__main__":
    N = 50
    pool = mp.Pool()
    result = [None] * N
    for i in range(N):
        pool.apply_async(func=worker, args=(i,), callback=callback)
    pool.close()
    pool.join()
    print(result)

This will almost certainly display the following (OS scheduling vagaries may allow another input or two to be consumed):

[False, False, False, False, False, True, None, None, None, None,
 None, None, None, None, None, None, None, None, None, None,
 None, None, None, None, None, None, None, None, None, None,
 None, None, None, None, None, None, None, None, None, None,
 None, None, None, None, None, None, None, None, None, None]
Tim Peters
  • The same basic idea could also be used to terminate a `Pool` when an error occurs in one of the worker processes because `apply_async()` also supports an `error_callback` function. I just used it to answer [Terminating all processes in Multiprocessing Pool](https://stackoverflow.com/questions/69320469/terminating-all-processes-in-multiprocessing-pool). – martineau Sep 24 '21 at 23:47
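
To illustrate that comment, here is a minimal sketch using error_callback, which is available in Python 3's Pool.apply_async (the failing input and the exception are invented for the example):

import multiprocessing as mp

def worker(i):
    if i == 5:
        raise ValueError("bad input: %d" % i)  # simulate a worker failure
    return i

def on_error(exc):
    # runs in the main process with the exception raised by a worker
    print("terminating pool:", exc)
    pool.terminate()  # kill all still-running workers

if __name__ == "__main__":
    pool = mp.Pool()
    for i in range(50):
        pool.apply_async(worker, args=(i,), error_callback=on_error)
    pool.close()
    pool.join()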