I have a large dataset of 11 million rows that I loaded into pandas. I then want to build a spatial index, such as an R-tree or a quadtree, but building it in memory consumes a lot of RAM on top of what the loaded file already uses.
To reduce the memory footprint, I was thinking of pushing the index to disk. Can the tree be stored in a table? Or even in a DataFrame that is then written to an HDF table? Is there a better strategy?
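To make the idea concrete, here is a rough sketch of what I mean by keeping the tree in a table on disk. It assumes the Python build's `sqlite3` module was compiled with SQLite's R*Tree extension (common, but not guaranteed). The table name `pts` and the helpers `build_disk_index` and `query_box` are made up for illustration, and I am not sure this is the right approach for my data.

```python
import sqlite3


def build_disk_index(db_path, points):
    """Insert (id, x, y) points into an SQLite R*Tree virtual table.

    Assumption: SQLite was built with the R*Tree module enabled.
    `points` can be any iterable, so rows can be streamed in batches.
    """
    conn = sqlite3.connect(db_path)
    # Each entry is an id plus a bounding box; a point is a box of zero size.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS pts "
        "USING rtree(id, minx, maxx, miny, maxy)"
    )
    conn.executemany(
        "INSERT INTO pts VALUES (?, ?, ?, ?, ?)",
        ((i, x, x, y, y) for i, x, y in points),
    )
    conn.commit()
    return conn


def query_box(conn, minx, miny, maxx, maxy):
    """Return ids of entries fully inside the query rectangle, sorted by id."""
    rows = conn.execute(
        "SELECT id FROM pts "
        "WHERE minx >= ? AND maxx <= ? AND miny >= ? AND maxy <= ? "
        "ORDER BY id",
        (minx, maxx, miny, maxy),
    )
    return [row[0] for row in rows]
```

With a file path as `db_path`, the index lives on disk rather than in RAM, and the points could presumably be fed in batches (for example from `pandas.read_csv` with `chunksize`) so the whole file never has to be loaded at once. One caveat I am aware of: the R*Tree module stores coordinates as 32-bit floats, so some precision is lost.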
Thanks