I am trying to use dask.delayed to build up a task graph. This mostly works nicely, but I regularly run into situations like the following: I have a number of delayed objects, each with a method that returns a list whose length cannot easily be computed from the information available while the graph is being built:
from dask import delayed

items = get_collection()  # known length

def do_work(item):
    # get_list_of_things returns a list of "unknown" length
    return map(lambda x: x.DoStuff(), item.get_list_of_things())

results = [delayed(do_work(x)) for x in items]
This gives a
TypeError: Delayed objects of unspecified length are not iterable
Is there any way in dask to work around this, preferably without calling .compute() on the intermediate results? Doing so would destroy most of the benefit of having a task graph. In effect, the graph cannot be fully resolved until some of its steps have run, but the only thing that varies is the width of a parallel section; neither the structure nor the depth of the graph changes.
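
For reference, here is a minimal sketch of one direction that seems to avoid the error: wrap the function itself, `delayed(do_work)(x)`, rather than its result, so that the iteration over the list only happens inside the task at run time, when its length is known. `Thing`, `Item`, and the body of `get_collection` are hypothetical stand-ins for my real objects, and the inner `map` is replaced by a list comprehension so each task returns a concrete list.

```python
import dask
from dask import delayed


class Thing:
    """Hypothetical stand-in for the objects in the inner list."""

    def __init__(self, value):
        self.value = value

    def DoStuff(self):
        return self.value * 2


class Item:
    """Hypothetical stand-in for the objects in the outer collection."""

    def __init__(self, n):
        self.n = n

    def get_list_of_things(self):
        # The length depends on data that only exists at run time.
        return [Thing(i) for i in range(self.n)]


def get_collection():
    # Assumption: the outer collection has a known length and
    # holds delayed objects.
    return [delayed(Item)(n) for n in (1, 2, 3)]


def do_work(item):
    # Runs inside a task, so `item` is a concrete Item here and
    # the list can be iterated normally.
    return [thing.DoStuff() for thing in item.get_list_of_things()]


items = get_collection()

# Wrap the function, not its result: nothing is iterated at graph-build time.
results = [delayed(do_work)(x) for x in items]

# Downstream steps can still be chained without an intermediate .compute().
total = delayed(sum)([delayed(len)(r) for r in results])

computed = dask.compute(*results)
total_value = total.compute()
print(computed, total_value)
```

The trade-off is that the variable-width part is no longer visible to the scheduler: all things belonging to one item are processed sequentially inside a single task, so the parallelism is per item rather than per thing.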