We run large CPU-intensive jobs in Python 2.7, each consisting of many independent parallel processes, on our development machine, and they last for days at a time. The machine becomes much less responsive when these jobs run with a large number of processes. Ideally, I would like to limit the number of CPUs they use during the day, when we're developing code, and overnight run as many processes as the machine can handle efficiently.
The Python multiprocessing library lets you specify the number of processes when you create a Pool. Is there a way to change this number dynamically each time a new task is started?
For instance, allow 20 processes to run between 19:00 and 07:00 and 10 processes between 07:00 and 19:00.
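To make that schedule concrete, here is a minimal sketch of a helper that returns the limit for a given time. The name `allowed_processes` and the constants are hypothetical; the hours and limits are just the ones from the example above.

```python
import datetime

# Values from the example above; the names are hypothetical.
DAY_START, DAY_END = 7, 19        # daytime window, in hours
DAY_LIMIT, NIGHT_LIMIT = 10, 20   # allowed processes by day and by night


def allowed_processes(now=None):
    """Return the process limit that applies at `now` (default: current time)."""
    if now is None:
        now = datetime.datetime.now()
    # 07:00 up to (but not including) 19:00 counts as daytime.
    if DAY_START <= now.hour < DAY_END:
        return DAY_LIMIT
    return NIGHT_LIMIT
```

Each task could call `allowed_processes()` before starting to find out which limit currently applies.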
One way would be to have each task check how many processes are currently using significant CPU before it starts its work. This is how I would like it to work:
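from multiprocessing import Pool
import time


def big_task(x):
    # Wait until fewer than 10 processes are busy.
    # check_n_process does not exist yet.
    while not check_n_process(processes=10):
        time.sleep(60 * 60)
    x += 1
    return x


if __name__ == '__main__':
    # Create the pool after big_task is defined so the workers can find it.
    pool = Pool(processes=20)
    x = 1
    # The arguments must be a tuple: (x,), not (x).
    multiple_results = [pool.apply_async(big_task, (x,)) for i in range(1000)]
    print([res.get() for res in multiple_results])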
But I would need to write the 'check_n_process' function.
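A rough sketch of what I have in mind, assuming a Unix machine where `os.getloadavg()` is available, and assuming the 1-minute load average is an acceptable stand-in for the number of processes competing for CPU (it also counts work from other users and programs). The `load` parameter is a hypothetical addition so the function can be tried without touching the real system load.

```python
import os


def check_n_process(processes, load=None):
    """Return True if the machine appears to have room for another busy process.

    Assumption: the 1-minute load average approximates the number of
    processes currently competing for CPU (Unix only).
    """
    if load is None:
        load = os.getloadavg()[0]  # (1 min, 5 min, 15 min) averages
    return load < processes
```

I am not sure this is reliable enough: the load average lags behind actual usage, so many sleeping workers could wake up at once and overshoot the limit.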
Are there any other ideas for how this problem could be solved?
(The code needs to run in Python 2.7 - a bash implementation is not feasible).