If I pre-scatter a data object across worker nodes, does it get copied in its entirety to each of the worker nodes? Is there an advantage in doing so if that data object is big?
Using the `futures` interface as an example:

```python
client.scatter(data, broadcast=True)
results = dict()
for i in tqdm_notebook(range(replicates)):
    results[i] = client.submit(nn_train_func, data, **params)
```
Using the `delayed` interface as an example:

```python
client.scatter(data, broadcast=True)
results = dict()
for i in tqdm_notebook(range(replicates)):
    results[i] = delayed(nn_train_func)(data, **params)
```
The reason I ask is because I noticed the following phenomena:

- If I pre-scatter the data, `delayed` appears to re-send the data to the worker nodes, approximately doubling memory usage. Pre-scattering is not doing what I expected it to do, which is to let the worker nodes reference the pre-scattered data.
- The `futures` interface takes significantly longer to iterate through the loop. I am currently not sure how to identify where the bottleneck is.
- Using the `delayed` interface, there is an extensive delay between the time `compute()` is called and the time that activity is reflected on the dashboard, which I suspected was due to data copying.