
I've got a Dask cluster with 32 workers running on a local machine, and have tried to run the following Streamz workflow against it:

[image: Streamz workflow]

I'm only seeing a couple of the workers occupied at any given time:

[image: worker occupancy]

I see increased occupancy when running locally using:

from dask.distributed import Client
client = Client(n_workers=32, processes=True, threads_per_worker=1, memory_limit='32GB')

but still nowhere near all 32 workers are occupied at any given time (about 8 at most).

Why is this, and why does the task stream appear to show more tasks running in parallel than the occupancy would suggest?

sgccarey
What is the input of your stream, and what is the typical event rate? Are you finding that events are not being processed fast enough? – mdurant Jul 15 '21 at 14:22
