I have a Feather data file that weighs approximately 300 MB; call it df.ftr. I can read it with pandas using the following command:

import pandas as pd
df = pd.read_feather('df.ftr')

However, this dataset contains over 21 million rows, and once loaded it does not fit in my local computer's memory. What I would like to do is read only the first 1 million rows.

If it were a df.h5 file, I would read it using the stop argument of the read_hdf method (documentation available here):

import pandas as pd
df = pd.read_hdf('df.h5', 'table', stop=1000000)

However, after checking the read_feather() documentation, I could not find any argument that produces the same effect.

How can it be done?

Marioanzas

0 Answers