
I have a distributed Dask cluster that I send a bunch of work to via the Dask Distributed Client.

After sending all that work, I'd love to get a report or something that tells me the peak memory usage of each worker.

Is this possible via existing diagnostics tools? https://docs.dask.org/en/latest/diagnostics-distributed.html

Thanks! Best,

Jenna Kwon

1 Answer


Specifically for memory, it's possible to extract information from the scheduler (while it's running) using client.scheduler_info() (this can be dumped as JSON). For peak memory there would have to be an extra function that compares the current usage with the highest value seen so far and keeps the maximum.
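
For example, here is a minimal sketch of such a function, polled from a background thread. It assumes a client object already connected to your cluster, and that your version of distributed reports current worker memory under workers[...]["metrics"]["memory"] in the scheduler_info() dict (check this against your version):

import threading
import time

def track_peak_memory(client, peaks, stop_event, interval=1.0):
    # Poll scheduler_info() and keep the highest memory reading per worker.
    while not stop_event.is_set():
        workers = client.scheduler_info().get("workers", {})
        for addr, info in workers.items():
            mem = info.get("metrics", {}).get("memory", 0)
            peaks[addr] = max(peaks.get(addr, 0), mem)
        time.sleep(interval)

# usage: start polling before submitting work, stop it afterwards
peaks = {}
stop = threading.Event()
poller = threading.Thread(target=track_peak_memory, args=(client, peaks, stop), daemon=True)
poller.start()
# ... submit work and gather results here ...
stop.set()
poller.join()
print(peaks)  # highest observed memory (bytes) per worker address

Note that polling can miss short-lived spikes that occur between samples, so a smaller interval gives a more accurate (but noisier) peak.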

For a lot of other useful information (though not peak memory consumption), there's the built-in performance report:

from dask.distributed import Client, performance_report

client = Client()  # connect to your scheduler; pass its address if needed

with performance_report(filename="dask-report.html"):
    # some dask computation, for example:
    client.submit(sum, range(100)).result()

(adapted from the documentation: https://docs.dask.org/en/latest/diagnostics-distributed.html)

Update: there is also a dedicated plugin for Dask that records min/max memory usage per task: https://github.com/itamarst/dask-memusage
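
For illustration, attaching the plugin to a local cluster looks roughly like this, based on the plugin's README (treat the install() call and the single-threaded-worker requirement as assumptions to verify against your installed version):

from dask.distributed import Client, LocalCluster
from dask_memusage import install

# the plugin requires single-threaded workers so memory can be attributed per task
cluster = LocalCluster(n_workers=2, threads_per_worker=1)
install(cluster.scheduler, "memusage.csv")  # logs min/max memory per task to the CSV

client = Client(cluster)
# ... run computations, then inspect memusage.csv ...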

Update 2: there is a nice blog post with code for tracking memory usage in Dask: https://blog.dask.org/2021/03/11/dask_memory_usage

SultanOrazbayev
  • Hello! Thanks for the reply. I've actually used that performance report but couldn't get the peak memory information for the workers, which is why I raised the question. – Jenna Kwon Feb 09 '21 at 16:24
  • When you say 'for peak memory, there would have to be an extra function', do you mean this function would inspect `scheduler_info`? Would this give the peak memory of any given worker? And how do you suggest I write a function that continuously monitors the ever-updating `scheduler_info()`? – Jenna Kwon Feb 09 '21 at 16:25
  • Ah, ok, so `client.scheduler_info()` returns a dictionary with the memory/tasks of each worker, so I would either dump that dictionary every now and then as a json-line log (to inspect later) OR create a new dictionary with the memory of each worker set to zero and then update it with the highest observed value for each worker. Whether to update the dictionary every X seconds or after specific function calls depends on your specific situation... – SultanOrazbayev Feb 09 '21 at 17:44
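
A minimal sketch of the json-lines logging approach described in the last comment (the file name, polling interval, and snapshot count are arbitrary placeholders; the "metrics"/"memory" keys are the same assumption as above):

import json
import time

def dump_worker_memory(client, path="scheduler_info.jsonl", interval=5.0, snapshots=10):
    # Append one snapshot per line: the current memory of each worker at that moment.
    with open(path, "a") as f:
        for _ in range(snapshots):
            workers = client.scheduler_info().get("workers", {})
            record = {
                "time": time.time(),
                "memory": {addr: w.get("metrics", {}).get("memory")
                           for addr, w in workers.items()},
            }
            f.write(json.dumps(record) + "\n")
            time.sleep(interval)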