I have a lot of data (more than one million values) and need to run a calculation on it that finds one particular value among the millions. This is time-consuming, so to speed things up I am trying to use all cores of the CPU. I am using a multiprocessing Pool and calling the worker with starmap_async, because I need to pass multiple arguments. This basically works, with one limitation: I have to wait until every value in the list has been processed and all processes have finished before I can continue. Is there a way to end the starmap run as soon as one of the processes finds a correct value?
I have already tried several things, such as terminating from inside the worker process and changing the whole structure to a for loop, but it seems that the starmap call has to run to completion and cannot be stopped. The only alternative seems to be extracting each list value individually and feeding it to a separate Process, which adds a lot of overhead and slows everything down significantly. Does anyone have an idea?
The solution described in "Terminate a Python multiprocessing program once a one of its workers meets a certain condition" looks like the same problem, but it is not. I have tried it and it doesn’t work for me. The difference seems to be that in the linked question none of the arguments is an iterable. I have played with it and couldn’t get the processes to end before the starmap call had finished completely. In the recommended solution the processes are started and run independently until one of them finds a solution. In my case starmap seems to keep feeding the processes without checking any termination condition.
import multiprocessing

def worker(x, arg1, arg2):
    # some calculation with all arguments
    # here I need a possibility to cancel all processes and return the current x value
    pass

if __name__ == '__main__':
    arg1 = something
    arg2 = something_else
    value_list = (a, b, c, d, e, .......)
    pool = multiprocessing.Pool(multiprocessing.cpu_count())
    # build one argument tuple per value in value_list
    p = pool.starmap_async(worker, [(x, arg1, arg2) for x in value_list])
    pool.close()
    pool.join()
    # starmap_async returns an AsyncResult; get() yields the list of results
    for y in p.get():
        if y == correct_value:
            print(correct_value)
            break
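
For comparison, here is a minimal sketch of one way the early exit might be done. It swaps starmap_async for imap_unordered, which hands results back as they complete, so the parent can stop as soon as a hit arrives. The worker body (x is a hit when x * arg1 == arg2), the helper names unpack_and_call and find_first, the chunksize, and the sample numbers are all made up for illustration; the real calculation would replace them.

```python
import multiprocessing


def worker(x, arg1, arg2):
    # Hypothetical stand-in for the real calculation:
    # x counts as a hit when x * arg1 equals arg2.
    return x if x * arg1 == arg2 else None


def unpack_and_call(args):
    # imap_unordered passes a single argument per task,
    # so the (x, arg1, arg2) tuple is unpacked here.
    return worker(*args)


def find_first(value_list, arg1, arg2, chunksize=1000):
    tasks = ((x, arg1, arg2) for x in value_list)
    with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
        # Results arrive in completion order, not input order.
        for result in pool.imap_unordered(unpack_and_call, tasks,
                                          chunksize=chunksize):
            if result is not None:
                # Leaving the with-block calls pool.terminate(),
                # which stops the remaining workers.
                return result
    return None


if __name__ == '__main__':
    print(find_first(range(1_000_000), 3, 2_250_000))
```

A larger chunksize reduces inter-process overhead, but a worker always finishes its current chunk before its results are returned, so a hit is reported a little later. Workers are not interrupted mid-chunk from the inside; the stop happens when the parent terminates the pool.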