I have a very large data file (a 1000 by 1400000 array) that contains only the integers 0, 1, 2 and 4. Loading it into a numpy array with h5py takes a very long time, because my memory (4 GB) cannot hold that much data and the program falls back on swap space. Since the data contains only 4 distinct values, I want to store it in an 8-bit integer array. Currently I load the data first and convert it to an 8-bit int array afterwards:
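
    import h5py
    import numpy as np

    with h5py.File("largedata", "r") as f:
        # Walk every top-level item in the file
        for name, data in f.items():
            # Only datasets hold array data (skip groups)
            if isinstance(data, h5py.Dataset):
                # Read the whole dataset into memory at its stored dtype
                # (data.value was removed in h5py 3.0; data[()] is equivalent)
                value = data[()]
                if name == 'foo':
                    # Convert to 8-bit int (this makes a second copy)
                    nparray = np.array(value, dtype=np.int8)
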
Is it possible to load the data directly into an 8-bit int array, so that less memory is needed while loading?
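
For reference, here is a minimal sketch of one approach that may do this, not a measured solution. It assumes the dataset is called 'foo' as above and that its stored integer type can be cast to int8. The helper name `load_as_int8` is hypothetical. It preallocates the int8 array (1000 x 1400000 bytes, about 1.4 GB) and fills it with h5py's `Dataset.read_direct`, which lets HDF5 convert the type during the read, so the data is never first loaded at its stored dtype:

```python
import h5py
import numpy as np


def load_as_int8(path, name):
    """Read dataset `name` from the HDF5 file at `path` into an int8 array.

    Hypothetical helper for illustration; assumes every stored value
    fits in the int8 range.
    """
    with h5py.File(path, "r") as f:
        dset = f[name]
        # Preallocate the destination with the narrow dtype up front.
        out = np.empty(dset.shape, dtype=np.int8)
        # read_direct fills `out` in place; HDF5 converts the stored
        # integer type to int8 as part of the read.
        dset.read_direct(out)
    return out
```

With the file from the question, this would be called as `nparray = load_as_int8("largedata", "foo")`. Whether it is actually faster depends on how the file is stored (chunking, compression), so it is worth timing on a small slice first.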