I am looking for an efficient way to load a large (20 GB) GeoTIFF raster file into PyTables for further out-of-core computation.
Currently I am reading it into a NumPy array with GDAL and writing that array into PyTables using the code below:
from osgeo import gdal
import numpy as np
import tables as tb

# Read the whole raster into memory as a float32 array
inraster = gdal.Open('infile.tif').ReadAsArray().astype(np.float32)

# Create an on-disk chunked array of the same shape and copy the data in
f = tb.open_file('myhdf.h5', 'w')
dataset = f.create_carray(f.root, 'mydata', atom=tb.Float32Atom(),
                          shape=np.shape(inraster))
dataset[:] = inraster
dataset.flush()
f.close()
inraster = None
Unfortunately, since my input file is extremely large, reading it into a NumPy array in one go raises a MemoryError. Is there an alternative way to feed the data into PyTables, or any suggestions for improving my code?
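One direction I have been considering, shown as a rough sketch below, is to read the raster in strips of rows with GDAL's windowed ReadAsArray and write each strip into the CArray, so that only one strip is ever held in memory. I am assuming a single-band raster here, and CHUNK_ROWS is just a placeholder strip height:

from osgeo import gdal
import numpy as np
import tables as tb

CHUNK_ROWS = 1024  # placeholder strip height; tune to available memory

ds = gdal.Open('infile.tif')
band = ds.GetRasterBand(1)           # assuming a single-band raster
ny, nx = ds.RasterYSize, ds.RasterXSize

f = tb.open_file('myhdf.h5', 'w')
dataset = f.create_carray(f.root, 'mydata',
                          atom=tb.Float32Atom(), shape=(ny, nx))

# Copy the raster strip by strip so the full array never exists in memory
for y0 in range(0, ny, CHUNK_ROWS):
    nrows = min(CHUNK_ROWS, ny - y0)
    block = band.ReadAsArray(0, y0, nx, nrows).astype(np.float32)
    dataset[y0:y0 + nrows, :] = block

dataset.flush()
f.close()
ds = None

I do not know whether aligning CHUNK_ROWS with the file's internal block size (band.GetBlockSize()) or with the CArray's chunkshape matters much for speed, so advice on that would also be welcome.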