Questions tagged [hdf]

Hierarchical Data Format (HDF, HDF4, or HDF5) is a set of file formats and libraries designed to store and organize large amounts of numerical data.

Originally developed at the National Center for Supercomputing Applications, it is supported by the non-profit HDF Group, whose mission is to ensure continued development of HDF5 technologies, and the continued accessibility of data stored in HDF.

In keeping with this goal, the HDF format, libraries and associated tools are available under a liberal, BSD-like license for general use. HDF is supported by many commercial and non-commercial software platforms, including Java, MATLAB/Scilab, Octave, IDL, Python, and R. The freely available HDF distribution consists of the library, command-line utilities, test suite source, Java interface, and the Java-based HDF Viewer (HDFView).

There are two major versions of HDF: HDF4 and HDF5, which differ significantly in design and API.

Wikipedia: http://en.wikipedia.org/wiki/Hierarchical_Data_Format

344 questions
5
votes
2 answers

Why are CSV files smaller than HDF5 files when writing with Pandas?

import numpy as np import pandas as pd df = pd.DataFrame(data=np.zeros((1000000,1))) df.to_csv('test.csv') df.to_hdf('test.h5', 'df') ls -sh test* 11M test.csv 16M test.h5 If I use an even larger dataset then the effect is even bigger. Using an…
jeffalstott
  • 2,643
  • 4
  • 28
  • 34
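A rough back-of-the-envelope sketch of the sizes in the excerpt above, using only the standard library. It assumes (these are assumptions, not stated in the question) that `to_csv` writes a `,0` header and then each row as `<index>,0.0` plus a newline, and that the default fixed HDF5 layout stores each value and each index label as an 8-byte number with negligible metadata overhead. The function names are hypothetical.

```python
def csv_size_estimate(n_rows: int) -> int:
    """Estimate the byte size of a CSV holding n_rows zeros plus an integer index.

    Assumption: header line is ",0" and every data row is "<index>,0.0".
    """
    header = len(",0\n")
    rows = sum(len(f"{i},0.0\n") for i in range(n_rows))
    return header + rows


def fixed_binary_size_estimate(n_rows: int,
                               bytes_per_value: int = 8,
                               bytes_per_index: int = 8) -> int:
    """Estimate the raw payload of a binary store.

    Assumption: one float64 value and one int64 index label per row,
    ignoring file-level metadata.
    """
    return n_rows * (bytes_per_value + bytes_per_index)


if __name__ == "__main__":
    n = 1_000_000
    print("CSV bytes (estimate):   ", csv_size_estimate(n))
    print("binary bytes (estimate):", fixed_binary_size_estimate(n))
```

Under these assumptions, one million rows come to about 10.9 million bytes of text versus 16 million bytes of binary data, in line with the 11M and 16M reported by `ls -sh`. A zero takes only three characters as text but eight bytes as a float64, so for data like this the text form can plausibly be the smaller one.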
4
votes
0 answers

How to merge multiple MODIS swaths into one plot in Python?

I want to mosaic/merge multiple swaths of the MODIS dataset (MOD06_L2) using Python. I used the example (http://hdfeos.org/zoo/MORE/LAADS/MOD/MOD04_L2_merge.py) to read multiple files and merge. But I am getting an error while doing so, how to…
Krishnaap
  • 297
  • 3
  • 18
4
votes
4 answers

How can I combine multiple .h5 files?

Everything that is available online is too complicated. My database is large, so I exported it in parts. I now have three .h5 files and I would like to combine them into one .h5 file for further work. How can I do it?
ktt_11
  • 41
  • 1
  • 1
  • 5
4
votes
1 answer

Images saved as HDF5 aren't colored

I'm currently working on a program that converts text files and JPG images into the HDF5 format. Opened with HDFView 3.0, it seems that the images are only saved in greyscale. hdf = h5py.File("Sample.h5") img = Image.open("Image.jpg") data =…
b0r.py
  • 71
  • 1
  • 7
4
votes
0 answers

Can I do a "lazy" read with h5py when slicing with dynamic axis?

I have a HDF5_generator that returns data like this: for element_i in range(n_elements): img = f['data'][:].take(indices=element_i, axis=element_axis) yield img, label, weights I do slicing, because h5py doesn't seem to provide a different…
Honeybear
  • 2,928
  • 2
  • 28
  • 47
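One way the per-element read in the excerpt above might avoid loading the whole dataset first is to build an index tuple that selects a single position along the chosen axis and pass it straight to the dataset, instead of calling `[:]` and then `take`. The helper name `axis_index` is hypothetical, and the sketch assumes `f['data']` is an h5py dataset, which accepts tuples of integers and slices as indices. Only the tuple construction is executed here, so the block runs without h5py.

```python
def axis_index(ndim: int, axis: int, position: int) -> tuple:
    """Return an index tuple that picks `position` along `axis`
    and takes everything along all other axes."""
    if not -ndim <= axis < ndim:
        raise ValueError(f"axis {axis} is out of range for {ndim} dimensions")
    index = [slice(None)] * ndim   # equivalent to ":" on every axis
    index[axis] = position         # a single integer on the chosen axis
    return tuple(index)


if __name__ == "__main__":
    # For a 3-D dataset, select element 5 along axis 1.
    print(axis_index(3, 1, 5))

    # Hypothetical use inside the generator (assumes h5py, not executed here):
    #     dset = f['data']
    #     img = dset[axis_index(len(dset.shape), element_axis, element_i)]
```

Because the dataset is indexed directly, only the selected slab should be read from disk rather than the full array.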
4
votes
0 answers

Fastest way to load large sparse matrix

I've been playing around with trying to find the fastest way to access large datasets in Python. In my real world case, I have a roughly 10,000 by 10,000 csv file which I'm loading into a pandas MultiIndex DataFrame, because I'm mainly taking dot…
BdB
  • 471
  • 5
  • 18
4
votes
3 answers

C/C++ HDF5 Read string attribute

A colleague of mine used LabVIEW to write an ASCII string as an attribute in an HDF5 file. I can see that the attribute exists, and read it, but I can't print it. The attribute is, as shown in HDF Viewer: Date = 2015\07\09 So "Date" is its…
The Quantum Physicist
  • 24,987
  • 19
  • 103
  • 189
3
votes
0 answers

R Error in FUN(X[[i]], ...) : '"C:\OSGeo4W\apps\gdal\share\bash-completion\completions\gdalinfo"' not found

Lately I have been encountering many problems I have never seen before in R with geospatial packages, especially GDAL. In particular I encounter this error (which I did not encounter when I wrote the code): > sds <-…
chiarar
  • 31
  • 2
3
votes
1 answer

Working with hdf files in Databricks cluster

I am trying to create a simple .hdf file in the Databricks environment. I can create the file on the driver, but when the same code is executed with rdd.map(), it throws the following exception. Py4JJavaError: An error occurred while calling…
3
votes
1 answer

Pandas to_hdf() TypeError: object of type 'int' has no len()

I would like to store a pandas DataFrame such that when I later load it again, I only load certain columns of it and not the entire thing. Therefore, I am trying to store a pandas DataFrame in hdf format. The DataFrame contains a numpy array and I…
r0f1
  • 2,717
  • 3
  • 26
  • 39
3
votes
2 answers

Xarray: Loading several CSV files into a dataset

I have several comma-separated data files that I want to load into an xarray dataset. Each row in each file represents a different spatial value of a field in a fixed grid, and every file represents a different point in time. The grid spacing is…
kilojoules
  • 9,768
  • 18
  • 77
  • 149
3
votes
1 answer

NiFi custom processor throws "transfer relationship not specified" exception

In NiFi, I am creating a custom processor that reads multi-row CSV data, converts each row into JSON, and sends it. Below is the custom processor code: package hwx.processors.demo; import…
ashok
  • 1,078
  • 3
  • 20
  • 63
3
votes
5 answers

Locking of HDF files using h5py

I have a whole bunch of code interacting with HDF files through h5py. The code has been working for years. Recently, with a change in Python environments, I am receiving this new error message. IOError: Unable to open file (unable to lock file,…
user2611761
  • 169
  • 1
  • 1
  • 11
3
votes
2 answers

Determine format of a DataFrame in pandas HDF file

There is an HDF file 'file.h5', and the key name of a pandas DataFrame (or a Series) saved into it is 'df'. How can one determine in which format (i.e. ‘fixed’ or ‘table’) 'df' was saved into the file? Thank you for your help!
S.V
  • 2,149
  • 2
  • 18
  • 41
3
votes
1 answer

Loading an HDF dataset into Python, but it's recognized as empty

I am trying to load a large 400x300x60x28 dataset from MATLAB (.mat file) into Python as an HDF file, but every time I try to see what's in the file it says it is empty. Some things I've tried so far: INPUT: …
Nat
  • 61
  • 3