I need to sort a data table that is far larger than the physical memory of the machine I am using. Pandas cannot handle it because it needs to load the entire dataset into memory. Can Dask handle that?
Thanks!
Yes, by calling set_index on the column that you wish to sort by. The result is a DataFrame whose index is that column, in sorted order. On a single machine, Dask spills data that does not fit in memory to your hard drive.