Questions tagged [hdf5]

The Hierarchical Data Format (HDF5) is a binary file format designed to store large amounts of numerical data.

HDF5 refers to:

  • A binary file format designed to efficiently store large amounts of numerical data
  • Libraries of functions to create and manipulate these files

Main features

  • Free
  • Completely portable
  • Very mature
  • No limit on the number and size of the datasets
  • Flexible in the kind and structure of the data and meta-data
  • Complete, well-documented libraries in C and Fortran
  • Many wrappers and tools are available (Python, MATLAB, Java, …)
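
The features above can be illustrated with a minimal sketch, assuming the third-party Python wrapper h5py and NumPy are installed. The group, dataset, and attribute names are hypothetical, and the file lives only in memory (the `core` driver with `backing_store=False`), so nothing is written to disk.

```python
import h5py
import numpy as np


def build_example():
    """Create a small in-memory HDF5 file with a group, a dataset and metadata."""
    # driver="core" with backing_store=False keeps the file in memory only.
    with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
        grp = f.create_group("run_001")  # hypothetical group name
        dset = grp.create_dataset("temperature", data=np.linspace(20.0, 25.0, 6))
        dset.attrs["units"] = "degC"  # metadata attached to the dataset
        return list(f.keys()), dset.shape, dset.attrs["units"]


names, shape, units = build_example()
```

Groups behave much like directories and attributes carry the metadata, which is what makes the format flexible in the kind and structure of data it can hold.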


2598 questions
9
votes
1 answer

Is it possible to read field names from a compound Dataset in an HDF5 file in Python?

I have an HDF5 file that contains a 2D table with column names. It shows up as such in HDFView when I look at this object, called results. It turns out that results is a "compound Dataset", a one-dimensional array where each element is a row. Here…
germ
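
For a question like this one, a minimal sketch, assuming h5py and NumPy: a compound dataset exposes a NumPy structured dtype, so its column names are available as `dtype.names`. The dataset name `results` comes from the excerpt; the field names `id` and `score` are hypothetical.

```python
import h5py
import numpy as np


def field_names(dataset):
    """Return the column names of a compound dataset, or () if it has none."""
    return dataset.dtype.names or ()


# Hypothetical row layout: one integer column and one float column.
row_type = np.dtype([("id", "i4"), ("score", "f8")])
rows = np.array([(1, 0.5), (2, 0.75)], dtype=row_type)

# In-memory file, so the sketch leaves nothing on disk.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    results = f.create_dataset("results", data=rows)
    names = field_names(results)
    scores = results["score"]  # read a single column by field name
```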
9
votes
2 answers

Writing Tables in Torch to file

I am trying to save some tables of strings to files in Torch. I have tried using this Torch extension by Deepmind: hdf5. require 'hdf5' label = {'a', 'b','c','d'} local myFile = hdf5.open(features_repo .. 't.h5', 'w') myFile:write('label',…
Chris Parry
9
votes
1 answer

Concatenate two big pandas.HDFStore HDF5 files

This question is somewhat related to "Concatenate a large number of HDF5 files". I have several huge HDF5 files (~20GB compressed), which cannot fit in RAM. Each of them stores several pandas.DataFrames of identical format and with indexes that…
Vladimir
9
votes
2 answers

Convert HDF5 file to other formats

I have a few large sets of HDF5 files and I am looking for an efficient way of converting the data in these files into XML, TXT or some other easily readable format. I tried working with the Python package (www.h5py.org), but I was not able…
visakh
9
votes
2 answers

Pandas and HDF5, querying a table, string containing '&' character

I've run into a problem grouping with HDFStore which turned out to extend to selecting rows based on strings that contain the '&' character. This should illustrate the problem >>> from pandas import HDFStore, DataFrame >>> df = DataFrame({'a': ['a',…
jan zegan
9
votes
2 answers

R and HDF5 Troubles

I am trying to load an HDF5 file into R and am running into some problems. Here are the steps I took to configure my environment: R 2.10.0 (x64) on Mac OS X 10.6; hdf5 1.8.3 installed via MacPorts; hdf5_1.6.9.tar.gz from CRAN. I suspect the problem I am…
user174014
9
votes
2 answers

Append new columns to HDFStore with pandas

I'm using pandas and creating an HDFStore object. I calculate 500 columns of data and write them to a table-format HDFStore object. Then I close the file, delete the data from memory, do the next 500 columns (labelled by an increasing integer), open up…
StevenMurray
9
votes
2 answers

Compression of existing file using h5py

I'm currently working on a project regarding compression of HDF5 datasets and recently began using h5py. I followed the basic tutorials and was able to open, create and compress a file while it was being created. However, I've been unsuccessful when…
kromegaman
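
One possible approach to the task in the title, sketched under assumptions: filters such as gzip are set when a dataset is created, so this sketch copies each dataset into a new file with compression enabled instead of compressing in place. The helper name `copy_with_gzip` is hypothetical. It loads each dataset fully into memory, and it does not copy attributes attached to groups.

```python
import h5py
import numpy as np


def copy_with_gzip(src, dst, level=4):
    """Copy every dataset from src to dst, gzip-compressing where possible."""

    def visit(name, obj):
        if not isinstance(obj, h5py.Dataset):
            return  # intermediate groups are created implicitly by create_dataset
        # Compression needs chunked storage, so scalar and empty datasets
        # are copied without a filter.
        if obj.shape and obj.size:
            new = dst.create_dataset(
                name, data=obj[()], compression="gzip", compression_opts=level
            )
        else:
            new = dst.create_dataset(name, data=obj[()])
        for key, value in obj.attrs.items():
            new.attrs[key] = value  # carry dataset-level metadata over

    src.visititems(visit)
```

Both arguments are open `h5py.File` objects, the source opened for reading and the destination for writing. The `h5repack` command-line tool that ships with HDF5 can also rewrite a file with a different filter.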
9
votes
1 answer

Write data to hdf file using multiprocessing

This seems like a simple issue but I can't get my head around it. I have a simulation which runs in a double for loop and writes the results to an HDF file. A simple version of this program is shown below: import tables as pt a = range(10) b =…
user2143958
9
votes
1 answer

Is hdf5 suitable for real-time measurements

I would like to know whether HDF5 is suitable for real-time data logging. More precisely: I work on a project in which we want to continuously (sampling rate ranging from 30 to 400 Hz) mix a fair amount of data (several hours) of different natures…
Cheatboy2
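
As an illustration of how continuous logging is often approached, here is a minimal sketch assuming h5py and NumPy: a chunked dataset created with an unlimited `maxshape` can be grown block by block as samples arrive. The dataset name `samples`, the helper `append_block`, and the block size of 400 are hypothetical, and the sketch says nothing about whether the throughput meets a given real-time requirement.

```python
import h5py
import numpy as np


def append_block(dset, block):
    """Grow a 1-D resizable dataset and write block at its end."""
    start = dset.shape[0]
    dset.resize(start + len(block), axis=0)
    dset[start:] = block


# In-memory file for the sketch; a real logger would write to disk.
with h5py.File("log.h5", "w", driver="core", backing_store=False) as f:
    # maxshape=(None,) makes the first axis unlimited; chunks=True lets
    # h5py pick a chunk size.
    samples = f.create_dataset(
        "samples", shape=(0,), maxshape=(None,), dtype="f8", chunks=True
    )
    for i in range(3):
        append_block(samples, np.full(400, float(i)))  # one simulated block
        f.flush()  # push buffered data to the file after each block
    total = samples.shape[0]
    last = samples[-1]
```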
8
votes
2 answers

write a boost::multi_array to hdf5 dataset

Are there any libraries or headers available to make writing C++ vectors or boost::multi_arrays to HDF5 datasets easy? I have looked at the HDF5 C++ examples and they just use C++ syntax to call C functions, and they only write static C arrays to…
AdamC
8
votes
2 answers

Data persistency of scientific simulation data, Mongodb + HDF5?

I'm developing a Monte Carlo simulation software package that involves multiple physics and simulators. I need to do online analysis, track the dependency of derived data on raw data, and perform queries like "give me the waveforms for…
Shen Chen
8
votes
3 answers

AttributeError: 'Dataset' object has no attribute 'value'

I got this error when using a package to read hdf5 files: dataset.value Error: Traceback (most recent call last): File "train.py", line 163, in train(0, False, args.gpu_list, args.model_path) File "train.py", line 76, in train …
Jacob Stern
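
For context on this error, a minimal sketch assuming h5py and NumPy: the `Dataset.value` property was deprecated and then removed in h5py 3.0, and indexing with an empty tuple, `dataset[()]`, reads the whole dataset instead. The helper `read_all` and the dataset name `weights` are hypothetical.

```python
import h5py
import numpy as np


def read_all(dataset):
    """Read the full contents of a dataset (stand-in for the old `.value`)."""
    return dataset[()]


# In-memory file, so the sketch leaves nothing on disk.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("weights", data=np.arange(6).reshape(2, 3))
    data = read_all(dset)  # a NumPy array holding every element
```

When the failing call sits inside a third-party package, the options are usually to update that package or to pin h5py to a release older than 3.0.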
8
votes
3 answers

Is it possible to specify the pickle protocol when writing pandas to HDF5?

Is there a way to tell Pandas to use a specific pickle protocol (e.g. 4) when writing an HDF5 file? Here is the situation (much simplified): Client A is using python=3.8.1 (as well as pandas=1.0.0 and pytables=3.6.1). A writes some DataFrame using…
Pierre D
8
votes
3 answers

Get training hyperparameters from a trained keras model

I am trying to figure out some of the hyperparameters used for training some old keras models I have. They were saved as .h5 files. When using model.summary(), I get the model architecture, but no additional metadata about the model. When I open…
Mark