Getting multiple datasets from group in HDF5

Question

I am comparing two different HDF5 files to make sure that they match. I want to create a list of all the datasets in a group in the HDF5 file so that a loop can run through them, instead of entering each one manually. I can't seem to find a way to do this. Currently I am getting each dataset with this code:

tdata21 = ft['/PACKET_0/0xeda9_data_0004']

The names of the sets are located in the "PACKET_0" group. Once I arrange all of the datasets, I compare the data in the datasets in this loop:

for i in range(len(data1)):
    print("%d\t%g\t%g" % (i, data1[i], tdata1[i]))
    if data1[i] != tdata1[i]:
        # record every mismatching line
        x = "data file: data1 \nline:" + str(i) + "\noriginal data:" + str(data1[i]) + "\nreceived data:" + str(tdata1[i]) + "\n\n"
        correct.append(x)

If there is a smarter way to compare HDF5 files I would like to see it as well, but mainly I am just looking for a way to get the names of all of the datasets in the group into a list. Thank you.

I know that a similar question exists in this post, but I do not really understand it, so if it would work for my case, could someone explain how to use it. [link](http://stackoverflow.com/questions/35953404/listing-datasets-in-a-group-in-hdf5?rq=1) — Nikita Belooussov, Jan 06 '17 at 00:36
http://docs.h5py.org/en/latest/high/group.html#dict-interface-and-links - on accessing elements of a group as though it were a dictionary, including the use of `keys()`, `items()` etc. — hpaulj, Jan 06 '17 at 07:38

score 2 · Accepted Answer · answered Jan 06 '17 at 01:09

To get the names of the datasets or groups that exist in an HDF5 group or file, just call list() on that group or file. Using your example, you'd have

datasets = list(ft['/PACKET_0'])
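
Note that list() returns the name of every member of the group, subgroups included. If PACKET_0 might hold subgroups as well as datasets, a small filter along these lines could keep only the datasets. This is a sketch rather than part of the original answer, and `dataset_names` is a hypothetical helper name, not an h5py function.

```python
import h5py


def dataset_names(group):
    """Return the sorted names of the datasets directly inside `group`.

    `dataset_names` is a hypothetical helper, not part of h5py.
    Subgroups are skipped; only immediate children are examined.
    """
    return sorted(
        name for name, obj in group.items() if isinstance(obj, h5py.Dataset)
    )
```

With the file from the question open as ft, `dataset_names(ft['/PACKET_0'])` would then give the list to loop over.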

You can also just iterate over them directly, by doing:

for name, data in ft['/PACKET_0'].items():
    # do stuff for each dataset, e.g. show its name
    print(name)

If you want to compare two datasets for equality (i.e., they have the same data), the easiest way would be to do this:

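(dataset1[()] == dataset2[()]).all()

Indexing with [()] reads each dataset into a NumPy array (the older .value attribute was removed in h5py 3.0). The expression compares those arrays element-by-element and returns True if they match everywhere and False otherwise. It assumes both datasets have the same shape.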

You can combine these two concepts to compare every dataset in two different files.
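
One way the combination might look is sketched below. It assumes both files keep their datasets directly inside a group of the same name, and `compare_groups` is a hypothetical helper name rather than anything h5py provides. It uses np.array_equal instead of == so that datasets with different shapes are reported as different rather than causing an error.

```python
import h5py
import numpy as np


def compare_groups(group1, group2):
    """Return a list of messages describing how two HDF5 groups differ.

    `compare_groups` is a hypothetical helper, not part of h5py.
    An empty list means both groups hold the same dataset names with
    equal contents. Only immediate child datasets are examined.
    """
    names1 = {n for n, obj in group1.items() if isinstance(obj, h5py.Dataset)}
    names2 = {n for n, obj in group2.items() if isinstance(obj, h5py.Dataset)}

    problems = []
    # Names that appear in exactly one of the two groups.
    for name in sorted(names1 ^ names2):
        problems.append("%s: present in only one group" % name)

    # Names that appear in both groups: compare the stored data.
    for name in sorted(names1 & names2):
        # [()] reads the whole dataset into memory as a NumPy array.
        data1 = group1[name][()]
        data2 = group2[name][()]
        # array_equal also returns False when the shapes differ.
        if not np.array_equal(data1, data2):
            problems.append("%s: contents differ" % name)
    return problems
```

With both files opened read-only through h5py.File, calling `compare_groups(f1['/PACKET_0'], f2['/PACKET_0'])` would return an empty list when everything matches. Keep in mind that NaN values count as unequal under np.array_equal, and that each dataset is loaded fully into memory.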
