
When writing a large dataset to a file using parallel HDF5 via h5py and mpi4py (and quite possibly also when using HDF5 and MPI directly from C), I get the following error when using the mpio driver with a single process:

OSError: Can't prepare for writing data (Can't convert from size to size_i)

It seems that the limit on the allowed dataset size is 4 GB, at least when the content is double arrays. Larger datasets work fine when more processes share the workload, or when the write is done on a single CPU without the mpio driver.

Why is this? Are `size` and `size_i` pointer types, and can the former not hold addresses larger than what corresponds to 4 GB of doubles? This error probably won't be a serious problem for me in the end, because I will generally use more than one process, but I would like my code to work even with just a single process.
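
For reference, a minimal sketch of the setup described above (file name, dataset name, and sizes are illustrative, not taken from my actual code):

```python
# Minimal sketch of the setup described above; names and sizes are illustrative.
# Run with, e.g.:  mpiexec -n 1 python write_big.py
from mpi4py import MPI
import h5py
import numpy as np

comm = MPI.COMM_WORLD

N = 600_000_000                     # ~4.8 GB of doubles in total
local_n = N // comm.size            # each rank writes its own slice
data = np.random.random(local_n)

with h5py.File('big.hdf5', 'w', driver='mpio', comm=comm) as f:
    dset = f.create_dataset('data', shape=(N,), dtype='float64')
    start = comm.rank * local_n
    # With a single rank this one write is ~4.8 GB and triggers the OSError;
    # with more ranks each per-rank write stays small enough and succeeds.
    dset[start:start + local_n] = data
```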

Wesley Bland
jmd_dk
  • Just a guess ... MPI's File I/O routines are from MPI-2, which was ratified in the mid '90s. The MPI standard requires the `count` parameter to be of type `int`. I don't think the 64-bit `long int` became available until C99. – eduffy Jan 16 '15 at 13:33
  • As already mentioned by @eduffy, the MPI standard defines the count of data items in the MPI-IO calls to be of type `int`. This is not really a restriction for native MPI applications, since one can easily circumvent it by constructing a derived MPI datatype (which was the rationale behind keeping `int` as the count type in MPI-3). – Hristo Iliev Jan 17 '15 at 18:10
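
To illustrate the derived-datatype trick mentioned in the comment above, here is a rough mpi4py sketch (plain MPI-IO, not HDF5; the file name and sizes are made up). It packs a whole buffer into one contiguous derived type so the count passed to MPI-IO stays small, which is how native MPI codes sidestep the `int` count limit:

```python
# Rough mpi4py sketch of the derived-datatype workaround for int-typed counts.
# Names and sizes are illustrative; this writes raw binary, not HDF5.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD

n = 1_000_000                        # kept small here; the trick matters when
data = np.zeros(n, dtype='float64')  # the element count would exceed 2**31 - 1

# Pack the whole local buffer into a single contiguous derived datatype,
# so the count handed to MPI-IO is just 1 regardless of how large n is.
bigtype = MPI.DOUBLE.Create_contiguous(n)
bigtype.Commit()

fh = MPI.File.Open(comm, 'raw.bin', MPI.MODE_CREATE | MPI.MODE_WRONLY)
fh.Write_at_all(comm.rank * data.nbytes, [data, 1, bigtype])
fh.Close()
bigtype.Free()
```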

1 Answer


I recently faced the same issue, and some digging got me to this point:

https://www.hdfgroup.org/ftp/HDF5/releases/hdf5-1.8.1/src/unpacked/src/H5FDmpio.c

That is where the error is raised. Simply put, the error occurs when the size of the array in bytes is greater than 2 GB.

Digging further got me here: https://www.hdfgroup.org/hdf5-quest.html#p2gb

That page describes the problem and the workarounds.

Please have a look.

Arvind
  • The links above are no longer accessible. Did anybody find a stable workaround for this issue? – GMc Jun 16 '21 at 00:46
  • @GMc I believe that the only fix is to divide your writes into chunks of 2 GB (i.e., playing with hyperslabs and 2 GB blocks). Technically, the HDF5 library could easily create derived datatypes in MPI to write terabytes at once if you want. But, for some reason, HDF5 ignores this fact, and all I have read online is that this is an MPI issue, which it is not. – SRG Mar 28 '22 at 07:31
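
A minimal sketch of the chunked-write approach described in the comment above, assuming h5py with the mpio driver (file name, dataset name, and sizes are illustrative):

```python
# Sketch of the workaround: split each rank's write into slabs below 2 GB so
# that no single MPI-IO call trips the limit. Names and sizes are illustrative.
from mpi4py import MPI
import h5py
import numpy as np

comm = MPI.COMM_WORLD

N = 600_000_000                      # total doubles (~4.8 GB)
local_n = N // comm.size
data = np.random.random(local_n)

max_bytes = 2**31 - 1                # stay strictly below 2 GB per write
step = max_bytes // data.itemsize    # number of doubles per slab

with h5py.File('big.hdf5', 'w', driver='mpio', comm=comm) as f:
    dset = f.create_dataset('data', shape=(N,), dtype='float64')
    offset = comm.rank * local_n
    for i in range(0, local_n, step):
        j = min(i + step, local_n)
        dset[offset + i:offset + j] = data[i:j]
```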