The problem
Say I have 20 processors available. From IPython, I want to pass arguments to an external program that runs best with 4 threads per job, and use map_async to keep dispatching jobs until all of them are finished. Below is example code in which, I believe, just one process would be assigned to each job at a time. Is this a case where you would use the 'chunksize' argument? It seems that would do the opposite, i.e., send multiple jobs to one processor.
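To make the chunksize concern concrete, here is a rough pure-Python sketch of what I understand chunking to mean: the argument list is cut into groups, and each group is handed to a single worker as one task. The helper name `chunk` and the integer stand-in jobs are my own illustration, not part of ipyparallel.

```python
def chunk(seq, chunksize):
    """Split seq into consecutive pieces of at most chunksize items."""
    return [seq[i:i + chunksize] for i in range(0, len(seq), chunksize)]


jobs = list(range(10))   # hypothetical stand-ins for the real arguments
tasks = chunk(jobs, 4)   # each inner list would be one task on one worker
print(tasks)
```

If that reading is right, a larger chunksize packs more jobs onto one worker rather than giving one job more processors.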
Start engines outside of IPython
ipcluster start -n 20 --daemon
IPython code
import ipyparallel as ipp
import subprocess
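def func(args):
    """Call the external program with 4 threads."""
    # some_external_program and "--nthreads" are placeholders for the real
    # executable and its thread-count option.
    subprocess.call([some_external_program, args, "--nthreads", "4"])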
args = [...]
ipyclient = ipp.Client().load_balanced_view()
results = ipyclient.map_async(func, args)
results.get()
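One direction that might work, sketched here under assumptions rather than as a confirmed solution, is to restrict the load-balanced view to every 4th engine, so that only 20 / 4 = 5 jobs run at once and each has room for its 4 threads. The helper names `spaced_targets` and `run_jobs` are hypothetical; the sketch relies on `Client.ids` and the `targets` argument of `load_balanced_view`.

```python
def spaced_targets(engine_ids, threads_per_job):
    """Pick every threads_per_job-th engine id (20 ids, 4 threads -> 5 ids)."""
    return list(engine_ids)[::threads_per_job]


def run_jobs(func, args, threads_per_job=4):
    """Map func over args on a subset of engines (needs a running ipcluster)."""
    import ipyparallel as ipp  # imported here so spaced_targets works without it

    client = ipp.Client()
    targets = spaced_targets(client.ids, threads_per_job)
    view = client.load_balanced_view(targets=targets)
    return view.map_async(func, args).get()


print(spaced_targets(range(20), 4))
# [0, 4, 8, 12, 16]
```

Calling `run_jobs(func, args)` would then replace the last three lines of the code above. Starting only 5 engines in the first place would presumably have a similar effect, at the cost of leaving no spare engines for single-threaded work.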