
I hear this question repeatedly, but I don't have any hands-on experience with these scenarios. Since there seem to be many possible approaches and ideas for doing this, I would like to understand:

1) What would be the best approach?
2) What would be the most efficient way to do this?

My own thinking is to break the huge file down into smaller pieces (i.e., batches). Say I have two such files whose data needs to be manipulated (one sorted, one unsorted). Reading a file that large in one go will certainly cause a memory error, since it cannot be loaded whole given the available RAM.
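Roughly, something like the following minimal sketch of the batching idea is what I have in mind (the filename and batch size are just placeholders):

```python
from itertools import islice

BATCH_SIZE = 100_000  # lines per batch; tune to the available RAM

# "huge_file.txt" is a placeholder name for the large input file.
with open("huge_file.txt") as f:
    while True:
        # Read at most BATCH_SIZE lines without loading the whole file.
        batch = list(islice(f, BATCH_SIZE))
        if not batch:
            break
        # Process/manipulate this batch here, then let it be discarded.
```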

1) How can this be achieved in Python?
2) What is the most time-saving and efficient method?
3) Can Python's pandas achieve this? If yes, how?

I'm very curious to hear from you. Please help me.

StackGuru
  • Possible duplicate of [Python read stream](https://stackoverflow.com/questions/26127889/python-read-stream) – Milan Velebit Sep 30 '19 at 14:28
  • @MilanVelebit: Will https://stackoverflow.com/questions/26127889/python-read-stream solve this problem? – StackGuru Sep 30 '19 at 14:31
  • The typical advice is to use [`Dask`](https://docs.dask.org/en/latest/) or if you want, just [process the files in chunks with `pandas`](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html?highlight=chunksize#iterating-through-files-chunk-by-chunk). – ALollz Sep 30 '19 at 14:37
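To illustrate the pandas chunking suggested in the comment above, a minimal sketch; the filename `big_data.csv` and the column `amount` are placeholders, not from the question:

```python
import pandas as pd

total = 0
# chunksize makes read_csv return an iterator of DataFrames,
# so only 100,000 rows are held in memory at a time.
for chunk in pd.read_csv("big_data.csv", chunksize=100_000):
    total += chunk["amount"].sum()

print(total)
```

The Dask alternative mentioned in the same comment would look roughly like this; Dask reads the file lazily in partitions and only materializes the final result when `.compute()` is called:

```python
import dask.dataframe as dd

# Nothing is loaded here; the file is split into partitions lazily.
df = dd.read_csv("big_data.csv")

# Each partition is processed separately, then the results are combined.
total = df["amount"].sum().compute()
print(total)
```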

0 Answers