0

When I use C pointers in python and try to process it using dask, it working like a pro. But when I try to use python's multiprocessing module, it splits the pointer reference error.

How is dask able to overcome the multiprocessing module when using C pointers

Naren Babu R
  • 453
  • 2
  • 9
  • 33

1 Answers1

3

The default scheduler for dask ("threaded") works with threads in the same process where you have defined your graph, so each worker has access to the original memory space - the C pointers are therefore valid. In a new process, whether using the builtin multiprocessing module or otherwise, you would need to recreate whatever C structures you need, and make new pointers to them; this can be done either in de/serialisation (dask-distributed has a lot of logic dedicated to this), or by reloading modules/data in the worker.

mdurant
  • 27,272
  • 5
  • 45
  • 74