I have a Feather data file that weighs approximately 300 MB; call it df.ftr. I can read it with pandas using the following command:
import pandas as pd
df = pd.read_feather('df.ftr')
However, this dataset contains over 21 million rows and does not fit in my local computer's memory. What I would like to do is read only the first 1 million rows.
If it were a df.h5 file, I would read it using the stop argument of the read_hdf function (documentation available here):
import pandas as pd
df = pd.read_hdf('df.h5', 'table', stop=1000000)
However, after checking the read_feather() documentation, I found no argument that seems to produce the same effect.
How can it be done?