4

I need to sort a data table that is much larger than the physical memory of the machine I am using. Pandas cannot handle it because it needs to read the entire dataset into memory. Can Dask handle that?

Thanks!

gerrit
Bo Qiang

1 Answer

3

Yes. Call set_index on the column you want to sort by; Dask then arranges the rows so that the table is ordered by that column. On a single machine it uses your hard drive for the data that does not fit in memory.

MRocklin