From the h5py docs, I see that I can cast an HDF dataset to another type using the astype method on datasets. This returns a context manager which performs the conversion on the fly.
However, I would like to read in a dataset stored as uint16 and then cast it to float32. Thereafter, I would like to extract various slices from this dataset, as the cast type float32, in a different function. The docs explain the use as:
with dataset.astype('float32'):
    castdata = dataset[:]
This would cause the entire dataset to be read in and converted to float32, which is not what I want. I would like to have a reference to the dataset, but cast as float32, equivalent to numpy's astype. How do I create a reference to the .astype('float32') object so that I can pass it to another function for use?
An example:
import h5py as HDF
import numpy as np

intdata = (100*np.random.random(10)).astype('uint16')

# create the HDF dataset
def get_dataset_as_float():
    hf = HDF.File('data.h5', 'w')
    d = hf.create_dataset('data', data=intdata)
    print(d.dtype)
    # uint16

    with d.astype('float32'):
        # This won't work since the context expires. Returns a uint16 dataset reference
        return d

    # this works but causes the entire dataset to be read & converted
    # with d.astype('float32'):
    #     return d[:]
Furthermore, it seems like the astype context only applies when the data elements are accessed. This means that
def use_data():
    d = get_dataset_as_float()
    # this is a uint16 dataset

    # try to use it as a float32
    with d.astype('float32'):
        print(np.max(d))     # --> output is uint16
        print(np.max(d[:]))  # --> output is float32, but entire data is loaded
So is there not a numpy-esque way of using astype?
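For illustration, one possible workaround is a thin wrapper that keeps the dataset reference and casts only the slice that is requested. This is a minimal sketch and not part of the h5py API: AsTypeView is a hypothetical name, and a plain numpy array stands in for the HDF dataset so the snippet runs without a file on disk. It assumes the wrapped object returns numpy-compatible data when sliced, as an h5py dataset does.

```python
import numpy as np


class AsTypeView:
    """Hypothetical helper (not part of h5py): casts each slice on read."""

    def __init__(self, dataset, dtype):
        # Anything sliceable with a .shape works here: h5py dataset, ndarray, ...
        self._dataset = dataset
        self.dtype = np.dtype(dtype)
        self.shape = dataset.shape

    def __getitem__(self, key):
        # Only the requested slice is read; the cast is applied to that slice alone.
        return np.asarray(self._dataset[key]).astype(self.dtype)


# Stand-in for the uint16 HDF dataset (assumption made for this sketch only).
intdata = np.arange(10, dtype='uint16')

view = AsTypeView(intdata, 'float32')
chunk = view[2:5]
print(chunk.dtype)
```

A function that receives view can then take slices such as view[2:5] and get float32 arrays back without the whole dataset being converted. Whole-array reductions would still need an explicit slice, e.g. np.max(view[:]).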