
Is there a simple way to transparently access a dataset distributed over several HDF5 files in Python?

Assume I have two HDF5 files, h1 and h2. Both contain one-dimensional datasets dd and cc, say the date in dd and the temperature of that date in cc. I am interested in the concatenations d = [h1.dd h2.dd] and c = [h1.cc h2.cc], so that I can access the series by an index i as d[i] and c[i]. I know that I could combine both files into one, but I don't need the combined file and would only delete it again afterwards. Also, I have not just two such files but several.
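One way to get exactly this without materializing a combined copy of the data is an HDF5 virtual dataset, exposed in h5py as `VirtualLayout`/`VirtualSource` (this requires HDF5 >= 1.10 and h5py >= 2.9, so it may not apply to older installations). A minimal sketch, assuming two files named h1.h5 and h2.h5 with 3 elements each (all names and sizes here are illustrative):

```python
import numpy as np
import h5py

# Build two small example files standing in for h1 and h2.
for fname, lo in [("h1.h5", 0), ("h2.h5", 3)]:
    with h5py.File(fname, "w") as f:
        f.create_dataset("dd", data=np.arange(lo, lo + 3, dtype="i8"))

# Map each file's "dd" dataset into consecutive slices of one virtual layout.
layout = h5py.VirtualLayout(shape=(6,), dtype="i8")
offset = 0
for fname in ["h1.h5", "h2.h5"]:
    vsource = h5py.VirtualSource(fname, "dd", shape=(3,))
    layout[offset:offset + 3] = vsource
    offset += 3

# The "combined" file stores only references to the sources, not the data.
with h5py.File("combined.h5", "w", libver="latest") as f:
    f.create_virtual_dataset("dd", layout, fillvalue=-1)

# Reads are forwarded transparently to the underlying files.
with h5py.File("combined.h5", "r") as f:
    d = f["dd"]
    print(d[4])  # element 4 is served from h2.h5
```

Note that combined.h5 holds only mapping metadata, so it stays tiny; the source files must remain in place for reads to succeed.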

I have already written a class that keeps track of the files and of which file contains a given index, but I wonder whether this isn't already part of PyTables or a similar module.
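For reference, the bookkeeping described above can be done compactly with a cumulative-offset table and `bisect`. This is a minimal sketch, not the asker's actual class; the `parts` list could hold open h5py or PyTables dataset objects, but plain lists stand in for them here:

```python
import bisect

class ConcatIndex:
    """Present several 1-D array-likes as one concatenated, indexable series."""

    def __init__(self, parts):
        self.parts = parts
        # offsets[k] is the global index at which part k starts.
        self.offsets = [0]
        for p in parts:
            self.offsets.append(self.offsets[-1] + len(p))

    def __len__(self):
        return self.offsets[-1]

    def __getitem__(self, i):
        if i < 0:
            i += len(self)
        if not 0 <= i < len(self):
            raise IndexError(i)
        # Find the part whose range contains the global index i.
        k = bisect.bisect_right(self.offsets, i) - 1
        return self.parts[k][i - self.offsets[k]]

d = ConcatIndex([[10, 11, 12], [13, 14]])
print(len(d), d[0], d[4])  # 5 10 14
```

Lookup is O(log n) in the number of files, and nothing is copied; each access is forwarded to the file that owns the index.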

Thanks in advance

monos
  • I don't think that's directly possible. You can place the open file descriptors in a list or dict, but to do what you want you will need to write a class yourself (as you did). – wpoely86 May 04 '15 at 17:07

0 Answers