The logs of a function submitted via the client are displayed immediately on submission. Instead, the logs are expected to be displayed when client.gather(futures) is called. The expected behavior can be achieved using Delayed but not using Futures.
Here is the code to reproduce the issue:
from logging import warning  # assumed source of warning(); the original snippet omits this import

from dask.distributed import Client

client = Client(processes=False, n_workers=2)

def inc(x):
    warning(f"{x}")
    return x + 1

output = []
for x in [1, 2, 3, 4, 5]:
    a = client.submit(inc, x)
    output.append(a)
The code above already displays the logs on submission, as shown below.
Output:
2022-09-19 20:55:23 ⚡ [root] 1
2022-09-19 20:55:23 ⚡ [root] 2
2022-09-19 20:55:23 ⚡ [root] 3
2022-09-19 20:55:23 ⚡ [root] 4
2022-09-19 20:55:23 ⚡ [root] 5
Output of client.gather(output):
[2, 3, 4, 5, 6]
However, the logs are expected to appear only when client.gather(output) is executed, along with the returned results.
Intended behavior using Dask Delayed:
from logging import warning  # assumed source of warning(); the original snippet omits this import

import dask

@dask.delayed
def inc(x):
    warning(f"{x}")
    return x + 1

data = [1, 2, 3, 4, 5]
output = []
for x in data:
    a = inc(x)
    output.append(a)

total = dask.delayed(output)
total.compute()
Output:
2022-09-19 21:05:01 ⚡ [root] 3
2022-09-19 21:05:01 ⚡ [root] 1
2022-09-19 21:05:01 ⚡ [root] 4
2022-09-19 21:05:01 ⚡ [root] 5
2022-09-19 21:05:01 ⚡ [root] 2
[2, 3, 4, 5, 6]
Could we get the expected behavior using Dask futures?
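One possible workaround, offered as a sketch rather than a confirmed Dask answer: a future starts running as soon as it is submitted, so the logs can only be delayed by delaying the submission itself. The example below queues the calls as plain (func, args) pairs and submits them only when the results are requested. It uses the standard library's ThreadPoolExecutor so that it runs without a cluster. The helper name run_deferred and the pending list are hypothetical names introduced for illustration, not part of Dask.

```python
import logging
from concurrent.futures import ThreadPoolExecutor


def inc(x):
    logging.warning(f"{x}")  # emitted when the task runs, not when it is queued
    return x + 1


def run_deferred(submit, pending):
    """Submit every queued (func, args) pair now and block for the results.

    `submit` is any callable with the shape submit(func, *args) that returns
    an object exposing .result(), for example ThreadPoolExecutor.submit.
    (Hypothetical helper, not a Dask API.)
    """
    futures = [submit(func, *args) for func, args in pending]
    return [future.result() for future in futures]


# Queue the calls without submitting anything: no task runs, so nothing is logged.
pending = [(inc, (x,)) for x in [1, 2, 3, 4, 5]]

# Submission, execution, and log output all happen inside this block.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = run_deferred(pool.submit, pending)
```

Assuming Dask futures behave the same way here (client.submit takes a function plus arguments, and the returned future exposes .result()), passing client.submit in place of pool.submit should give the same effect. Another option that stays closer to the Delayed example is to keep building delayed objects and hand them to client.compute only when the results are needed.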