
If I'm given an *.hdf file, how can I print out all the data it contains?

>>> import h5py
>>> f = h5py.File('my_file.hdf', 'r')
>>> # What's next?

All the existing questions here describe how to either create an HDF file or read it without printing out the data it contains, so please don't mark this as a duplicate.

2 Answers


You might want to use the visititems method. The h5py documentation describes it as follows:

Recursively visit all objects in this group and subgroups. Like Group.visit(), except your callable should have the signature: callable(name, object) -> None or return value. In this case object will be a Group or Dataset instance.

So the idea is to write a function that takes the name of the visited group (or dataset) and the group (or dataset) instance and prints them, then pass that function to the visititems method of the opened file.

Here is a simple example implementation:

import h5py


def log_hdf_file(hdf_file):
    """
    Print the groups, attributes and datasets contained in the given HDF file handler to stdout.

    :param h5py.File hdf_file: HDF file handler to log to stdout.
    """
    def _print_item(name, item):
        """Print to stdout the name and attributes or value of the visited item."""
        print(name)
        # Format item attributes if any
        if len(item.attrs) > 0:
            print('\tattributes:')
            for key, value in item.attrs.items():
                print('\t\t{}: {}'.format(key, str(value).replace('\n', '\n\t\t')))

        # Format Dataset value. item[()] reads the whole dataset into memory;
        # the older Dataset.value attribute is deprecated.
        if isinstance(item, h5py.Dataset):
            print('\tValue:')
            print('\t\t' + str(item[()]).replace('\n', '\n\t\t'))

    # Here we first print the file attributes as they are not accessible from File.visititems()
    _print_item(hdf_file.filename, hdf_file)
    # Print the content of the file
    hdf_file.visititems(_print_item)


# Open read-only so the file cannot be modified by accident
with h5py.File('my_file.h5', 'r') as hdf_file:
    log_hdf_file(hdf_file)
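
If the attributes are not needed, the same visititems idea can be reduced to a few lines. The following is only a minimal sketch, assuming the file is HDF5 (h5py does not read HDF4 files) and that every dataset is small enough to load into memory. The function name print_datasets is made up for this illustration.

```python
import h5py


def print_datasets(path):
    """Print name, shape, dtype and contents of every dataset in an HDF5 file.

    Returns the list of dataset names that were printed.
    (print_datasets is a hypothetical helper, not part of h5py.)
    """
    names = []

    def visit(name, obj):
        # visititems calls this for every group and dataset below the root;
        # groups are skipped, datasets are printed.
        if isinstance(obj, h5py.Dataset):
            names.append(name)
            print(name, obj.shape, obj.dtype)
            print(obj[()])  # obj[()] reads the whole dataset into memory
        # Returning None tells visititems to keep going.

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return names
```

Calling print_datasets('my_file.hdf') would then print each dataset in turn. For very large datasets it may be better to print a slice such as obj[:10] instead of obj[()].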
Gall
  • does that have to be so complicated? –  May 11 '15 at 12:27
  • I just want to print out the data, what external links? –  May 11 '15 at 12:30
  • I don't understand: why are those class methods? I don't need a class, I need a simple way to print it out. –  May 11 '15 at 18:07
  • you didn't understand, I don't need a class. What's __call__, __init__? –  May 12 '15 at 11:51

This is not a proper answer to this question, but the one other answer is a bit unsatisfactory.

To have a look at what's inside an .hdf file, I usually use NASA's Panoply software. It can be downloaded from http://www.giss.nasa.gov/tools/panoply/, and it lets you open, explore and plot data in all sorts of geo-referenced formats, including netCDF and HDF.

Then I can find out the name of the subdataset I'm interested in and open it in my python script.
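
As a rough sketch of that last step, assuming the file is HDF5 and therefore readable with h5py (an HDF4 file would need a different library), a dataset whose name is already known could be read like this. The helper read_dataset and the example names below are hypothetical.

```python
import h5py


def read_dataset(path, dataset_name):
    """Return the contents of one named dataset from an HDF5 file.

    (read_dataset is a hypothetical helper, not part of h5py.)
    """
    with h5py.File(path, "r") as f:
        if dataset_name not in f:
            raise KeyError(
                "{!r} not found; top-level names are {}".format(
                    dataset_name, list(f.keys())
                )
            )
        obj = f[dataset_name]
        if not isinstance(obj, h5py.Dataset):
            raise TypeError("{!r} is a group, not a dataset".format(dataset_name))
        # [()] copies the data out of the file before it is closed.
        return obj[()]
```

For example, read_dataset('my_file.hdf', 'some_group/temperature') would return that dataset as a NumPy array, where 'some_group/temperature' stands in for whatever name Panoply showed.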

Hope this will be a helpful tip for some people looking up this question!

Cynthia GS