
Since .persist() caches data in the background, I'm wondering whether it is possible to wait until it finishes caching before moving on to the next steps. In addition, is there a way to show a progress bar for the caching process? Thank you very much.

user3716774

1 Answer


Yes, the functions you're looking for are aptly named wait and progress.

from dask.distributed import wait, progress

The progress function takes any Dask object (a persisted collection or one or more futures) and renders a progress bar:

>>> progress(x)
[XXXXXXX................]  5.2 seconds

If you are in the Jupyter (IPython) notebook, progress is non-blocking and uses IPython widgets. If you are in the IPython console or a plain Python script, progress is blocking and will not return until the computation completes.

If you do not want a progress bar, or if you are in the Jupyter notebook (where progress returns immediately), you may want to use the wait function separately. It blocks until the computations finish.

wait(x)
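To tie this back to the .persist() question, here is a minimal sketch of the whole flow. It assumes a throwaway in-process cluster created with Client(processes=False) and a made-up Dask array; neither comes from the original question, so swap in your own client and collection.

```python
import dask.array as da
from dask.distributed import Client, wait


def persist_and_wait():
    # Assumption: a local, in-process cluster used only for illustration.
    with Client(processes=False):
        # Hypothetical data: a 1000 x 1000 array of ones in 16 chunks.
        x = da.ones((1000, 1000), chunks=(250, 250))

        # persist() returns immediately; chunks are computed in the background.
        y = (x + 1).persist()

        # Block here until every chunk of y is held in cluster memory.
        wait(y)

        # Later steps now run against the cached data.
        return float(y.sum().compute())


total = persist_and_wait()
print(total)
```

If you would rather see a bar while waiting, progress(y) can stand in for wait(y) in a script or console session, since it blocks there as well.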

http://distributed.readthedocs.io/en/latest/api.html#distributed.client.wait
http://distributed.readthedocs.io/en/latest/api.html#distributed.diagnostics.progress

MRocklin