Questions tagged [h5py]

h5py is a NumPy-compatible Python module for handling Hierarchical Data Format (HDF5) files.

Main features

  • Free (BSD licensed)
  • Limited dependencies (Python, NumPy, HDF5 libraries)
  • Includes both a low-level, C-like HDF5 interface and a high-level Python/NumPy-style interface
  • Lets you interact directly with datasets using NumPy metaphors, such as slicing
  • Datatypes are specified using standard NumPy dtype objects
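
As a minimal sketch of the last two features (not part of the original tag description), the following creates a dataset with a standard NumPy dtype and reads parts of it back with slices. The file name `example.h5`, the dataset name `values`, and the choice of the in-memory `core` driver are illustrative assumptions, used here only so that nothing is written to disk.

```python
import h5py
import numpy as np

# Assumption for illustration: an in-memory file. The "core" driver with
# backing_store=False keeps the file in RAM and never writes it to disk.
with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
    # The datatype is given as a standard NumPy dtype.
    dset = f.create_dataset("values", shape=(4, 5), dtype=np.float64)

    # Assign to the whole dataset using NumPy-style indexing.
    dset[...] = np.arange(20, dtype=np.float64).reshape(4, 5)

    # Slicing reads only the selected region and returns NumPy arrays.
    first_row = dset[0, :]
    block = dset[1:3, 2:4]
    stored_dtype = dset.dtype

print(first_row)
print(block)
print(stored_dtype)
```

The slices are taken inside the `with` block because a dataset can no longer be read once its file is closed; the resulting NumPy arrays remain usable afterwards.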

1301 questions
17
votes
5 answers

Python particles simulator: out-of-core processing

Problem description: I'm writing a Monte Carlo particle simulator (Brownian motion and photon emission) in Python/NumPy. I need to save the simulation output (>>10GB) to a file and process the data in a second step. Compatibility with both Windows and…
user2304916
17
votes
3 answers

Can h5py load a file from a byte array in memory?

My Python code is receiving a byte array which represents the bytes of an HDF5 file. I'd like to read this byte array into an in-memory h5py file object without first writing the byte array to disk. This page says that I can open a memory-mapped…
mahonya
16
votes
1 answer

Writing a large hdf5 dataset using h5py

At the moment, I am using h5py to generate HDF5 datasets. I have something like this:

    import h5py
    import numpy as np
    my_data = np.genfromtxt("/tmp/data.csv", delimiter=",", dtype=None, names=True)
    myFile = "/tmp/f.hdf"
    with h5py.File(myFile, "a") as f: …
NinjaGaiden
16
votes
3 answers

Save pandas DataFrame using h5py for interoperability with other hdf5 readers

Here is a sample data frame:

    import pandas as pd
    NaN = float('nan')
    ID = [1, 2, 3, 4, 5, 6, 7]
    A = [NaN, NaN, NaN, 0.1, 0.1, 0.1, 0.1]
    B = [0.2, NaN, 0.2, 0.2, 0.2, NaN, NaN]
    C = [NaN, 0.5, 0.5, NaN, 0.5, 0.5, NaN]
    columns = {'A':A, 'B':B,…
Phil
14
votes
2 answers

h5py setup.py on Mac: hdf5.h file not found

I am building h5py on Mac, following the instructions under "Building against Parallel HDF5" at this link: http://docs.h5py.org/en/latest/build.html

    $ export CC=mpicc
    $ python setup.py configure --mpi
    $ sudo python setup.py build

I get this…
yanggao
14
votes
2 answers

How to resize an HDF5 array with `h5py`

How can I resize an HDF5 array using the h5py Python library? I've tried using the .resize method on an array with chunks set to True. Alas, I'm still missing something.

    In [1]: import h5py
    In [2]: f = h5py.File('foo.hdf5', 'w')
    In [3]: d =…
MRocklin
14
votes
1 answer

Updating h5py Datasets

Does anyone have an idea for updating HDF5 datasets from h5py? Assuming we create a dataset like:

    import h5py
    import numpy
    f = h5py.File('myfile.hdf5')
    dset = f.create_dataset('mydataset',…
George Monet
13
votes
1 answer

pytables writes much faster than h5py. Why?

I noticed that writing .h5 files takes much longer if I use the h5py library instead of the pytables library. What is the reason? This is also true when the shape of the array is known beforehand. Further, I use the same chunk size and no compression…
adku1173
13
votes
2 answers

Saving with h5py arrays of different sizes

I am trying to store about 3000 NumPy arrays using the HDF5 data format. The arrays vary in length from 5306 to 121999 np.float64 values. I am getting an "Object dtype dtype('O') has no native HDF5 equivalent" error since, due to the irregular nature of the data, numpy…
13
votes
1 answer

What is the recommended compression for HDF5 for fast read/write performance (in Python/pandas)?

I have read several times that turning on compression in HDF5 can lead to better read/write performance. I wonder what the ideal settings would be to achieve good read/write performance with: data_df.to_hdf(..., format='fixed', complib=..., complevel=...,…
Mark Horvath
12
votes
2 answers

Open .h5 file in Python

I am trying to read an h5 file in Python. The file can be found at this link and it is called 'vstoxx_data_31032014.h5'. The code I am trying to run is from the book Python for Finance, by Yves Hilpisch, and goes like this:

    import pandas as pd
    h5…
python_enthusiast
12
votes
2 answers

Storing scipy sparse matrix as HDF5

I want to compress and store a humongous SciPy matrix in HDF5 format. How do I do this? I've tried the code below:

    a = csr_matrix((dat, (row, col)), shape=(947969, 36039))
    f = h5py.File('foo.h5','w')
    dset = f.create_dataset("init", data=a, dtype…
Rama
12
votes
1 answer

h5py cannot convert element 0 to hsize_t

I have a boatload of images in an HDF5 file that I would like to load and analyse. Each image is 1920x1920 uint16, and loading all of them into memory crashes the computer. I have been told that others work around that by slicing the image, e.g.…
DonMP
12
votes
2 answers

Visible Deprecation warning...?

I have some data that I'm reading from an h5 file as a NumPy array and am doing some analysis with. For context, the data plots a spectral response curve. I am indexing the data (and a subsequent array I have made for my x axis) to get a specific…
GeoMonkey
12
votes
2 answers

Store datetimes in HDF5 with h5py

How can I store NumPy datetime objects in HDF5 using h5py?

    In [1]: import h5py
    In [2]: import numpy as np
    In [3]: f = h5py.File('foo.hdfs', 'w')
    In [4]: d = f.create_dataset('data', shape=(2, 2), dtype=np.datetime64)
    TypeError: No conversion path…
MRocklin