
I'm using Silo with HDF5, and I'm having trouble accessing some of the metadata with h5py. Silo writes a rather unusual HDF5 structure, in which a DATATYPE is nested inside a DATATYPE. Here's an excerpt of the output from h5dump:

DATATYPE "sigma_t" H5T_STD_I32LE;
   ATTRIBUTE "silo" {
      DATATYPE  H5T_COMPOUND {
         H5T_STRING {
            STRSIZE 5;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         } "meshid";
         H5T_STRING {
            STRSIZE 15;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         } "value0";
         H5T_STD_I32LE "ndims";
         H5T_STD_I32LE "nvals";
         H5T_STD_I32LE "nels";
         H5T_IEEE_F32LE "time";
         H5T_STD_I32LE "use_specmf";
         H5T_STD_I32LE "centering";
         H5T_ARRAY { [3] H5T_STD_I32LE } "dims";
         H5T_ARRAY { [3] H5T_STD_I32LE } "zones";
         H5T_ARRAY { [3] H5T_STD_I32LE } "min_index";
         H5T_ARRAY { [3] H5T_STD_I32LE } "max_index";
         H5T_ARRAY { [3] H5T_IEEE_F32LE } "align";
      }
      DATASPACE  SCALAR
      DATA {
      (0): {
            "mesh",
            "/.silo/#000004",
            2,
            1,
            100,
            0,
            -1000,
            111,
            [ 10, 10, 0 ],
            [ 9, 9, 0 ],
            [ 0, 0, 0 ],
            [ 9, 9, 0 ],
            [ 0.5, 0.5, 0 ]
         }
      }
   }
   ATTRIBUTE "silo_type" {
      DATATYPE  H5T_STD_I32LE
      DATASPACE  SCALAR
      DATA {
      (0): 501
      }
   }

Basically, f['sigma_t'].attrs['silo'] returns a tuple with all of the correctly formatted data but without the field names of the compound type. (I need to know the names meshid, value0, etc.) Is there a way to get them? I'm at a loss.

Example file and script

The HDF5 file contains the "sigma_t" field, and the actual data is stored in /.silo/#000004.

Script:

import h5py
f = h5py.File('xsn.silo', 'r')
print(f['sigma_t'].attrs['silo'])

Result:

('mesh', '/.silo/#000004', 2, 1, 100, 0.0, -1000, 111, array([10, 10,  0], dtype=int32), array([9, 9, 0], dtype=int32), array([0, 0, 0], dtype=int32), array([9, 9, 0], dtype=int32), array([ 0.5,  0.5,  0. ], dtype=float32))

What I also want is something like:

('meshid','value0','ndims', ..., 'align')

Is this possible?

Seth Johnson
  • Any way you can point to an example file? – diliop May 11 '11 at 23:34
  • From a quick look it appears that the compound dtype does not translate well into h5py and the labels are indeed lost. The ordering is preserved, though, so you could define a dict mapping names to the tuple indices and access the values by name. It might also be worth looking into h5py.h5t, since this is the module that handles such dtypes. – diliop May 12 '11 at 03:10
  • @diliop It's a bug in h5py, see my answer and the link to the google group page. – Seth Johnson May 12 '11 at 03:16

2 Answers

I got an answer from the developer via the h5py Google groups page: it's a bug that will be fixed in h5py 1.4.

What I ended up doing is:

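import h5py
f = h5py.File('xsn.silo', 'r')
group = f['sigma_t']
# open the attribute through the low-level API, which keeps the field names
attr_id = h5py.h5a.open(group.id, 'silo')
# pair each field name with the matching value from the high-level tuple
data = dict(zip(attr_id.dtype.names, group.attrs['silo']))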
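The name/value pairing used above can be tried without h5py or the Silo file. The following is only a sketch: the field names are copied by hand from the h5dump output in the question, the values are copied from the printed tuple (with the NumPy arrays written as plain lists), and `label_fields` is a hypothetical helper name, not part of h5py.

```python
# Field names, in order, as listed in the h5dump output of the "silo" attribute.
FIELD_NAMES = (
    "meshid", "value0", "ndims", "nvals", "nels", "time", "use_specmf",
    "centering", "dims", "zones", "min_index", "max_index", "align",
)

# Values copied from the tuple printed in the question.
# Assumption: plain lists stand in for the NumPy arrays h5py returns.
VALUES = (
    "mesh", "/.silo/#000004", 2, 1, 100, 0.0, -1000, 111,
    [10, 10, 0], [9, 9, 0], [0, 0, 0], [9, 9, 0], [0.5, 0.5, 0.0],
)


def label_fields(names, values):
    """Pair each field name with the value at the same position."""
    if len(names) != len(values):
        raise ValueError("number of names and values differ")
    return dict(zip(names, values))


record = label_fields(FIELD_NAMES, VALUES)
print(record["meshid"], record["value0"], record["dims"])
```

This relies on the compound fields and the tuple entries being in the same order, which is what the comment on the question reports.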
Seth Johnson

Thanks for answering, Seth! Your answer helped me, but this might make it a little bit easier:

    # path of the table that you want to look at
    group = f[path]
    # the attributes include entries such as FIELD_0_NAME and TITLE
    for attribute in group.attrs:
        # I only want the ones that end with NAME
        if attribute.endswith('NAME'):
            # print the actual name (e.g. TrialTime) instead of FIELD_0_NAME
            print(group.attrs[attribute])
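If the names are needed as a list in column order rather than printed, the same filter can be sketched as below. This is an illustration under assumptions: a plain dict stands in for `group.attrs`, only `FIELD_0_NAME`, `TITLE`, and `TrialTime` come from the snippet above, the other keys and values are made up, and `field_names` is a hypothetical helper.

```python
def field_names(attrs):
    """Return the values of FIELD_<n>_NAME attributes, ordered by n."""
    found = []
    for key in attrs:
        parts = key.split("_")
        # keep only keys shaped like FIELD_<digits>_NAME
        if (len(parts) == 3 and parts[0] == "FIELD"
                and parts[1].isdigit() and parts[2] == "NAME"):
            found.append((int(parts[1]), attrs[key]))
    # sort numerically so FIELD_10_NAME comes after FIELD_2_NAME
    return [name for _, name in sorted(found)]


# Hypothetical stand-in for group.attrs; only TrialTime is from the answer.
example_attrs = {
    "TITLE": "hypothetical table title",
    "FIELD_10_NAME": "hypothetical_last_column",
    "FIELD_0_NAME": "TrialTime",
    "FIELD_2_NAME": "hypothetical_middle_column",
}

print(field_names(example_attrs))
```

Sorting on the parsed integer avoids the lexicographic ordering that a plain sort of the attribute keys would give.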
J Klein