I am trying to load and process large files using Ray.
I am using Ray to process the files in parallel and speed up the solution.
I keep running into this pyarrow error: pyarrow.lib.ArrowInvalid: Maximum size exceeded (2GB).
It seems to have something to do with the plasma object store.
I have tried enabling huge_pages and mounting them for the plasma store, as well as increasing the size of the Ray object store on init.
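One direction that might be worth checking, assuming the error comes from a single serialized object exceeding 2 GB rather than from the total size of the object store, is to split each file into byte ranges that stay below that limit. The sketch below only computes the ranges. The helper name chunk_ranges and the 1 GiB default chunk size are illustrative assumptions, and Ray itself is left out so the snippet runs on its own.

```python
def chunk_ranges(total_size, chunk_size=1 << 30):
    """Return (offset, length) pairs that together cover total_size bytes.

    chunk_size defaults to 1 GiB (an assumed value), which keeps every
    piece below the 2 GB limit mentioned in the error message.
    """
    if total_size < 0:
        raise ValueError("total_size must be non-negative")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    ranges = []
    offset = 0
    while offset < total_size:
        # The last chunk may be shorter than chunk_size.
        length = min(chunk_size, total_size - offset)
        ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    # Example with a hypothetical 5 GiB file size.
    for offset, length in chunk_ranges(5 * (1 << 30)):
        print(offset, length)
```

Each (offset, length) pair could then be handed to a separate Ray task that opens the file, seeks to offset, and reads length bytes, so that no single object passed through the object store is larger than the chunk size. Record-oriented formats would need extra handling for records that cross a chunk boundary.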
Any help would be great.