Questions tagged [hdf]

Hierarchical Data Format (HDF, HDF4, or HDF5) is a set of file formats and libraries designed to store and organize large amounts of numerical data.

Originally developed at the National Center for Supercomputing Applications, it is supported by the non-profit HDF Group, whose mission is to ensure continued development of HDF5 technologies, and the continued accessibility of data stored in HDF.

In keeping with this goal, the HDF format, libraries and associated tools are available under a liberal, BSD-like license for general use. HDF is supported by many commercial and non-commercial software platforms, including Java, MATLAB/Scilab, Octave, IDL, Python, and R. The freely available HDF distribution consists of the library, command-line utilities, test suite source, Java interface, and the Java-based HDF Viewer (HDFView).

There are two major versions of HDF, HDF4 and HDF5, which differ significantly in design and API.

Wikipedia: http://en.wikipedia.org/wiki/Hierarchical_Data_Format

344 questions
1
vote
0 answers

What do these parameters mean in package h5 in R

I am using the package h5 in R to write a library. So for the same reason I am trying to build a dataset from scratch with the function createDataSet like…
FoldedChromatin
  • 217
  • 1
  • 4
  • 12
1
vote
3 answers

HDF5 Functions and Smart Destructors - std::unique_ptr()

Many HDF5 functions are initialized as follows: hid_t handler = DoSomething(someHandler); and one has to manually free the memory reserved by such an operation using something like: freeme(handler); So it's the same nightmare/problems that come…
The Quantum Physicist
  • 24,987
  • 19
  • 103
  • 189
1
vote
1 answer

How to retrieve sorted records from an hdf table

I'm looking for a way to retrieve sorted records from an hdf table. Here is a python MWE: import tables import numpy as np class Measurement(tables.IsDescription): time = tables.Float64Col() value = tables.Float64Col() h5 =…
remus
  • 2,635
  • 2
  • 21
  • 46
1
vote
1 answer

Replace hdf5 with sqlite

HDF5 does not handle adding new data to an existing file well (files become bigger). What would be the drawback of replacing it with SQLite + SQLAlchemy?
Brook
  • 199
  • 1
  • 2
  • 9
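
For context on the question above, appending rows to an existing SQLite table can be sketched as follows. This is a minimal illustration using Python's standard-library `sqlite3` module rather than SQLAlchemy; the table name `measurements`, its columns, and the helper `append_rows` are hypothetical, and the sketch does not settle the trade-offs the question asks about.

```python
import sqlite3

# In-memory database for illustration only; a file path would be used in practice.
conn = sqlite3.connect(":memory:")
# Hypothetical table layout, not taken from the question.
conn.execute("CREATE TABLE measurements (time REAL, value REAL)")

def append_rows(connection, rows):
    # Using the connection as a context manager commits on success
    # and rolls back if an exception is raised.
    with connection:
        connection.executemany("INSERT INTO measurements VALUES (?, ?)", rows)

# Two separate appends to the same table.
append_rows(conn, [(0.0, 1.5), (1.0, 2.5)])
append_rows(conn, [(2.0, 3.5)])

row_count = conn.execute("SELECT COUNT(*) FROM measurements").fetchone()[0]
print(row_count)
```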
1
vote
0 answers

Using space in pd.read_hdf column while reading

I have an HDF file where a column name is "COL ABC". I need to query it using pandas: df = pd.read_hdf('test.h5', 'df', where=['COL ABC in ["Sample Condition"]']) But I am getting an error with the "COL ABC" which has spaces in it, working fine…
Vinay Ranjan
  • 294
  • 3
  • 14
1
vote
0 answers

Why h5dcreate_f function in Fortran prints so many digits?

I am using a Fortran code that writes REAL type variables in h5files, so it uses this code call h5dcreate_f(file_id,varn,H5T_NATIVE_REAL, filespace,dset_id, hdferr) and of course other h5 functions like geth5dims. But I am confused as to why when…
Herman Toothrot
  • 1,463
  • 3
  • 23
  • 53
1
vote
1 answer

How do I save variables from hdf5 file using strings as variable names?

I have a hdf5 file with multiple variables that I want to automatically store in a list or a matrix. library(rhdf5) file = H5Fopen("myfile.h5") file HDF5 FILE name / filename name otype dclass …
Herman Toothrot
  • 1,463
  • 3
  • 23
  • 53
1
vote
0 answers

h5py (HDF5) - random error with large ndarray - IOError: Can't prepare for writing data

Running into a very strange issue when trying to create a rather large numpy ndarray dataset. e.g. import h5py import numpy as np test_h5=h5py.File('test.hdf5','w') n=3055693983 # fail n=10000000000 # works n=40000000000 # fail n=100000000000 #…
Bryan
  • 103
  • 1
  • 6
1
vote
3 answers

Data in HDF file using Python missing

I am trying to read in a hdf file but no groups show up. I have tried a couple different methods using tables and h5py but neither work in displaying the groups in the file. I checked and the file is 'Hierarchical Data Format (version 5) data' (See…
BenT
  • 3,172
  • 3
  • 18
  • 38
1
vote
0 answers

HDF5 with Python, Pandas: Data Corruption and Read Errors

So I'm trying to store Pandas DataFrames in HDF5 and getting strange errors, rather inconsistently. At least half the time, some part of the read-process-move-write cycle fails, often with no clearer explanation than "HDF5 Read Error". Even worse,…
1
vote
0 answers

Lambda function is affected by variables outside of scope

When accessing HDF5 via pandas, I sometimes face the documented bug that one cannot make more than 31 select conditions. To circumvent this, I decided to split up the select conditions, create a batch of iterators and then concatenate the results at…
cheesecake
  • 93
  • 4
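
The symptom in the title above is consistent with Python's late-binding closures: a lambda created in a loop looks up the loop variable when it is called, not when it is defined. The sketch below is independent of pandas and HDF5. The names `conditions`, `make_selectors_late`, and `make_selectors_bound` are hypothetical, and since the question's code is not shown in the excerpt, this is an assumption about the cause rather than a diagnosis.

```python
# Hypothetical batches of select conditions (stand-ins for pandas `where` terms).
conditions = [["a > 1"], ["b < 2"], ["c == 3"]]

def make_selectors_late():
    # Each lambda refers to the variable `batch` itself,
    # so all of them see its final value once the loop has finished.
    return [lambda: batch for batch in conditions]

def make_selectors_bound():
    # A default argument is evaluated when the lambda is defined,
    # which freezes the value of `batch` for that iteration.
    return [lambda batch=batch: batch for batch in conditions]

late = [f() for f in make_selectors_late()]
bound = [f() for f in make_selectors_bound()]
print(late)
print(bound)
```

With the late-binding version every selector returns the last batch; binding the value through a default argument (or `functools.partial`) keeps one batch per selector.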
1
vote
1 answer

Pandas HDF limiting number of rows of CSV file

I have a 3 GB CSV file. I'm trying to save it to HDF format with Pandas so I can load it faster. import pandas as pd import traceback df_all = pd.read_csv('file_csv.csv', iterator=True, chunksize=20000) for _i, df in enumerate(df_all): …
Frias
  • 10,991
  • 9
  • 33
  • 40
1
vote
1 answer

Using HDF5 Thread Safe Library

I've got a question regarding the use of the HDF5 Thread Safe library. I currently work with an instance of the HDF5 C++ library (static) that was compiled by a co-worker of mine using neither the "HDF5_ENABLE_PARALLEL" nor the…
YoLieR
  • 11
  • 1
  • 4
1
vote
1 answer

Reading an array of unknown length from an HDF file in Fortran

I want to read an array of dimension one of arbitrary size from an hdf file. I'm working off of the "Read / Write to External Dataset" example here, but since I don't know the array dimensions a priori, I need to call a few extra subroutines. The…
moo
  • 15
  • 6
1
vote
2 answers

How to incorporate a SQL style "is not null" into the where statement of read_hdf

I'm trying to figure out how to block out null responses from a selection, and was wondering how to formulate the where statement such that it produces the correct selection. For instance, let's say I have the following code: df = pd.DataFrame({'A'…
halsdunes
  • 1,199
  • 5
  • 16
  • 28