I have an HDF5 file as seen below. I would like to edit the index column and create a new timestamp index. Is there any way to do this?
- What are you using: h5py, HDFStore, PyTables? Any attempt to answer this question will depend on your workflow. Can you share your existing code? Please also read: [mcve]. – jpp Feb 12 '18 at 11:13
- I don't have any existing code because I'm completely new to HDF5 file writing. I'm using h5py only to read HDF5 files. I'm extracting features from audio files using YAAFE in Python. The YAAFE module writes the HDF5 file. Since the original audio file doesn't have any timestamps, the output HDF5 file from YAAFE doesn't have timestamps either. I'm now looking for a method to just insert/edit my existing index column in the HDF5 file. If that isn't possible, I will have to read the files one by one into a pandas dataframe, insert the index, and then convert them back into HDF5 for further processing. – thileepan Feb 12 '18 at 12:01
1 Answer
This isn't possible, unless you have the schema / specification used to create the HDF5 files in the first place.
Many things can go wrong if you attempt to use HDF5 files like a spreadsheet (even via h5py). For example:
- Inconsistent chunk shape, compression, data types.
- Homogeneous data becoming non-homogeneous.
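Before modifying anything, it can help to look at how the existing datasets are laid out. The following is a minimal sketch, assuming h5py and NumPy are installed; `describe_datasets`, the file name `layout_demo.h5`, and the `features` dataset are hypothetical names used only for this illustration, not anything YAAFE is known to produce.

```python
import os
import tempfile

import h5py
import numpy as np


def describe_datasets(hdf_file):
    """Return {dataset name: layout info} for every dataset in the file."""
    info = {}

    def visitor(name, obj):
        # visititems calls this for every group and dataset; keep datasets only
        if isinstance(obj, h5py.Dataset):
            info[name] = {
                "shape": obj.shape,
                "dtype": obj.dtype,
                "chunks": obj.chunks,            # None if stored contiguously
                "compression": obj.compression,  # None if uncompressed
            }

    with h5py.File(hdf_file, "r") as hdf:
        hdf.visititems(visitor)
    return info


# Demo on a throwaway file (hypothetical layout, for illustration only)
demo_path = os.path.join(tempfile.mkdtemp(), "layout_demo.h5")
with h5py.File(demo_path, "w") as hdf:
    hdf.create_dataset(
        "features",
        data=np.zeros((100, 13)),
        chunks=(10, 13),
        compression="gzip",
    )

layout = describe_datasets(demo_path)
print(layout["features"])
```

Anything you write back into the file would need to stay consistent with the shape, dtype, chunking and compression reported here, which is why editing a dataset in place is risky without the original specification.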
What you could do is add a list as an attribute to the dataset. In fact, this is probably the right thing to do. Sample code below, with the input as a dictionary. When you read in the data, you link the attributes to the homogeneous data (by row, column, or some other identifier).
import os

import h5py


def add_attributes(hdf_file, attributes, path='/'):
    """Add or change attributes in path provided.

    Default path is root group.
    """
    assert os.path.isfile(hdf_file), "File Not Found Exception '{0}'.".format(hdf_file)
    assert isinstance(attributes, dict), "attributes argument must be a key: value dictionary: {0}".format(type(attributes))

    # 'r+' opens the existing file for reading and writing without truncating it
    with h5py.File(hdf_file, 'r+') as hdf:
        for k, v in attributes.items():
            hdf[path].attrs[k] = v

    return "The following attributes have been added or updated: {0}".format(list(attributes.keys()))
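To illustrate the read side of this approach, here is a rough sketch of storing timestamps as an attribute and linking them back to the rows of the dataset. It assumes one timestamp per row, stored as seconds in a float array; the file name `attrs_demo.h5`, the dataset name `features`, the attribute key `timestamps`, and the 0.5 s spacing are all made up for the example.

```python
import os
import tempfile

import h5py
import numpy as np

# Throwaway file with a small homogeneous dataset: 4 rows (frames), 3 columns
demo_path = os.path.join(tempfile.mkdtemp(), "attrs_demo.h5")
features = np.arange(12, dtype="float64").reshape(4, 3)
with h5py.File(demo_path, "w") as hdf:
    hdf.create_dataset("features", data=features)

# Hypothetical timestamps: one per row, in seconds, 0.5 s apart
timestamps = np.arange(features.shape[0]) * 0.5

# Attach the timestamps to the dataset without touching the data itself
with h5py.File(demo_path, "r+") as hdf:
    hdf["features"].attrs["timestamps"] = timestamps

# Read back and link each timestamp to its row
with h5py.File(demo_path, "r") as hdf:
    dset = hdf["features"]
    stored_ts = dset.attrs["timestamps"]  # comes back as a NumPy array
    rows = dset[:]
    by_time = {float(t): row for t, row in zip(stored_ts, rows)}

print(by_time[1.0])
```

If you work in pandas afterwards, the same two arrays could be combined with something like `pd.DataFrame(rows, index=pd.to_timedelta(stored_ts, unit="s"))`. Note that HDF5 attributes are meant for small metadata, so for very long recordings a separate dataset holding the timestamps may be a better fit than an attribute.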

jpp