
I saw here that numpy memmaps should be used for multiprocessing:

https://joblib.readthedocs.io/en/latest/parallel.html#working-with-numerical-data-in-shared-memory-memmapping

As this problem can often occur in scientific computing with numpy based datastructures, joblib.Parallel provides a special handling for large arrays to automatically dump them on the filesystem and pass a reference to the worker to open them as memory map on that file using the numpy.memmap subclass of numpy.ndarray. This makes it possible to share a segment of data between all the worker processes.
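A minimal sketch of the documented behavior, as I understand it (the array size, function, and `max_nbytes` value are illustrative, not my actual workload):

```python
import numpy as np
from joblib import Parallel, delayed

def column_mean(shared, i):
    # With auto-memmapping, each worker receives a numpy.memmap view
    # backed by the same dumped file, not a pickled copy of the array.
    return shared[:, i].mean()

data = np.random.rand(10_000, 100)

# Arrays larger than max_nbytes (default "1M") are dumped to disk
# and passed to the workers as memory maps.
results = Parallel(n_jobs=4, max_nbytes="1M")(
    delayed(column_mean)(data, i) for i in range(data.shape[1])
)
```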

I was wondering if there's a maximum number of processes a memmap can handle. I ask because my Jupyter notebook seems to crash when around 24 processes access the same large memmap, and I am trying to isolate whether this is due to the number of processes.
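
A minimal sketch of one way to isolate whether the process count alone is the trigger, by sweeping `n_jobs` against an explicitly created memmap (the file name, shape, and worker counts below are placeholders):

```python
import numpy as np
from joblib import Parallel, delayed

def row_sum(path, shape, i):
    # Only the path is pickled and sent to the worker; each process
    # re-opens the same file read-only as its own memory map.
    mm = np.memmap(path, dtype="float64", mode="r", shape=shape)
    return float(mm[i].sum())

shape = (48, 1_000_000)  # ~380 MB of float64
mm = np.memmap("shared.dat", dtype="float64", mode="w+", shape=shape)
mm[:] = np.random.rand(*shape)
mm.flush()

# Increase the worker count until the crash reproduces.
for n in (8, 16, 24, 32):
    Parallel(n_jobs=n)(
        delayed(row_sum)("shared.dat", shape, i % shape[0]) for i in range(n)
    )
    print(n, "workers OK")
```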
