I need help with a problem that I'm pretty sure dask can solve, but I don't know how to tackle it.
I need to build a tree recursively.
For each node, if a criterion is met, a computation (compute_val) is performed; otherwise two new children are created. The same treatment is then applied to the children (build).
Once all the children of a node have finished their computation, we can proceed to a merge (merge). The merge either fuses the children (if together they meet a criterion) or leaves them as they are.
So far I have only been able to parallelize the first level, and I don't know which dask tools I should use to be more effective.
Here is a simplified, sequential MRE of what I want to achieve:
import time

import numpy as np


class Node:
    def __init__(self, level):
        self.level = level
        self.val = None
        self.childs = []  # stays empty for leaves and for fused nodes


def merge(node, childs):
    values = [child.val for child in childs]
    # Fuse only if every child holds a value and their sum is small enough.
    if all(v is not None for v in values) and sum(values) < 0.1:
        node.val = np.mean(values)
    else:
        node.childs = childs
    return node


def compute_val():
    time.sleep(0.1)  # stands in for the expensive computation
    return np.random.rand()  # scalar, so leaf values and merged means mix safely


def build(node):
    print(node.level)
    if (np.random.rand() < 0.1 and node.level > 1) or node.level > 5:
        node.val = compute_val()
    else:
        childs = [build(Node(level=node.level + 1)) for _ in range(2)]
        node = merge(node, childs)
    return node


tree = build(Node(level=0))
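
One direction I have been considering, as a rough sketch rather than a verified solution, is to make the recursion lazy with dask.delayed: the split decision is cheap, so the whole task graph can be built up front, and only compute_val and merge run as tasks. This assumes the split criterion does not depend on any computed value (true in the MRE, maybe not in my real code). The names build_lazy and set_val and the rng argument are hypothetical helpers introduced for this sketch only.

```python
import time

import numpy as np
from dask import delayed


class Node:
    def __init__(self, level):
        self.level = level
        self.val = None
        self.childs = []


def compute_val():
    time.sleep(0.1)  # stands in for the expensive computation
    return float(np.random.rand())


def set_val(node, val):
    # Hypothetical helper: attach a computed value to a leaf node.
    node.val = val
    return node


def merge(node, childs):
    values = [child.val for child in childs]
    if all(v is not None for v in values) and sum(values) < 0.1:
        node.val = float(np.mean(values))
    else:
        node.childs = childs
    return node


def build_lazy(node, rng):
    # The split decision is taken while the graph is being built,
    # so nothing expensive runs here; this only returns a Delayed object.
    if (rng.random() < 0.1 and node.level > 1) or node.level > 5:
        return delayed(set_val)(node, delayed(compute_val)())
    childs = [build_lazy(Node(node.level + 1), rng) for _ in range(2)]
    # dask.delayed traverses the list, so merge receives finished child nodes.
    return delayed(merge)(node, childs)


rng = np.random.default_rng(0)  # assumption: seeded only to make the shape reproducible
tree = build_lazy(Node(level=0), rng).compute()
```

With the default threaded scheduler, the leaves that sleep in compute_val should be able to run concurrently at every level, not just the first one. If the split criterion did depend on computed values, the graph could not be built ahead of time like this, and a different mechanism would be needed.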