Hi, I have a Python script that uses the dask library to handle a very large data frame, larger than physical memory. I notice that the job gets killed in the middle of a run if the machine's memory usage stays at 100% for some time.
Is this expected? I would have thought the data would be spilled to disk, and there is plenty of disk space left.
Is there a way to limit its total memory usage? Thanks
EDIT:
I also tried:
dask.set_options(available_memory=12e9)
It did not work: it did not seem to limit memory usage. Again, when memory usage reaches 100%, the job gets killed.
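
For reference, a process-wide cap can also be set outside of dask with Python's standard-library resource module. This is only a sketch and assumes Linux; cap_address_space is a hypothetical helper name, not part of dask. It does not make dask spill to disk. At best it turns the silent kill into a MemoryError inside the script, and because it limits virtual address space rather than resident memory, the right value may need some experimenting.

```python
import resource


def cap_address_space(max_bytes):
    """Lower the soft address-space limit (RLIMIT_AS) of the current process.

    Hypothetical helper for illustration; Unix only.
    Returns the soft limit that is in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # The soft limit may not exceed the hard limit, so clamp the request.
    if hard != resource.RLIM_INFINITY:
        max_bytes = min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))
    return resource.getrlimit(resource.RLIMIT_AS)[0]


# Example (assumption: called once, before any large allocation is made):
# cap_address_space(int(12e9))
```

With the cap in place, an allocation that would push the process past the limit should fail with a MemoryError that the script can catch and log, instead of the whole job being killed.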