
I am trying to read data from a large 30 GB parquet file. My machine does not have enough memory to read it all at once with fastparquet in Python, so I don't know what I should do to lower the memory usage of the reading process.

Kehan Chen

1 Answer


You can use pyarrow's `ParquetFile.iter_batches` to read the file incrementally in chunks of rows, so only one batch has to fit in memory at a time.
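A minimal sketch of what that can look like (the path `data.parquet`, the batch size, and the per-chunk processing are placeholders, and converting to pandas assumes pandas is installed):

    import pyarrow.parquet as pq

    # Open the file lazily; this does not load the data into memory.
    parquet_file = pq.ParquetFile("data.parquet")

    # Stream the file in chunks of rows instead of reading it all at once.
    for batch in parquet_file.iter_batches(batch_size=100_000):
        # Each batch is a pyarrow.RecordBatch; convert only one chunk
        # at a time to a pandas DataFrame.
        df = batch.to_pandas()
        # ... process df here ...
        print(df.shape)

You can also pass `columns=[...]` to `iter_batches` to read only the columns you need, which further reduces memory use.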

Micah Kornfield