I have some HDF5 data that was created with PyTables. The data is very large (an array of 3973850000 x 8 double-precision values), but with PyTables compression it can easily be stored.
I want to access this data using Fortran. I do:
PROGRAM HDF_READ
USE HDF5
IMPLICIT NONE
CHARACTER(LEN=100), PARAMETER :: filename = 'example.h5'
CHARACTER(LEN=100), PARAMETER :: dsetname = 'example_dset.h5'
INTEGER :: error
INTEGER(HID_T) :: file_id
INTEGER(HID_T) :: dset_id
INTEGER(HID_T) :: space_id
INTEGER(HSIZE_T), DIMENSION(2) :: data_dims, max_dims
DOUBLE PRECISION, DIMENSION(:,:), ALLOCATABLE :: dset_data
!Initialize Fortran interface
CALL h5open_f(error)
!Open an existing file
CALL h5fopen_f(filename, H5F_ACC_RDONLY_F, file_id, error)
!Open a dataset
CALL h5dopen_f(file_id, dsetname, dset_id, error)
!Get dataspace ID
CALL h5dget_space_f(dset_id, space_id, error)
!Get dataspace dims
CALL h5sget_simple_extent_dims_f(space_id, data_dims, max_dims, error)
!Create array to read into
ALLOCATE(dset_data(data_dims(1), data_dims(2)))
!Get the data
CALL h5dread_f(dset_id, H5T_NATIVE_DOUBLE, dset_data, data_dims, error)
!Close the dataset, dataspace, file and Fortran interface
CALL h5dclose_f(dset_id, error)
CALL h5sclose_f(space_id, error)
CALL h5fclose_f(file_id, error)
CALL h5close_f(error)
END PROGRAM HDF_READ
However, this creates an obvious problem: the array cannot be allocated at such a large size, because as double-precision floats it is greater than the system memory.
What is the best method for accessing this data? My current thought is some sort of chunking method, or is there a way to keep the array on disk? Does HDF5 have methods for dealing with data this large? I have read around but can find nothing pertaining to my case.
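For reference, here is a minimal sketch of the kind of chunking I have in mind: reading the dataset one block of rows at a time through hyperslab selections instead of all at once. The block size (nrows) is an arbitrary value I would tune, and I believe the HDF5 Fortran interface reports the dimensions in reverse order, so the 3973850000 x 8 array written from Python would appear as 8 x 3973850000 here. Is something like this the intended approach?
PROGRAM HDF_READ_BLOCKS
USE HDF5
IMPLICIT NONE
CHARACTER(LEN=100), PARAMETER :: filename = 'example.h5'
CHARACTER(LEN=100), PARAMETER :: dsetname = 'example_dset.h5'
INTEGER(HSIZE_T), PARAMETER :: nrows = 1000000 !rows per block - an arbitrary size to tune
INTEGER :: error
INTEGER(HID_T) :: file_id, dset_id, file_space, mem_space
INTEGER(HSIZE_T), DIMENSION(2) :: data_dims, max_dims, offset, count
DOUBLE PRECISION, DIMENSION(:,:), ALLOCATABLE :: buf
!Initialize Fortran interface and open the file and dataset
CALL h5open_f(error)
CALL h5fopen_f(filename, H5F_ACC_RDONLY_F, file_id, error)
CALL h5dopen_f(file_id, dsetname, dset_id, error)
!Get the file dataspace and its dimensions (8 x 3973850000 as seen from Fortran)
CALL h5dget_space_f(dset_id, file_space, error)
CALL h5sget_simple_extent_dims_f(file_space, data_dims, max_dims, error)
!Buffer holding all columns for one block of rows
ALLOCATE(buf(data_dims(1), nrows))
count = (/ data_dims(1), nrows /)
CALL h5screate_simple_f(2, count, mem_space, error)
offset = (/ 0_HSIZE_T, 0_HSIZE_T /)
DO WHILE (offset(2) < data_dims(2))
   !The last block may be smaller than nrows
   count(2) = MIN(nrows, data_dims(2) - offset(2))
   !Select the block in the file and in memory, then read it
   CALL h5sselect_hyperslab_f(file_space, H5S_SELECT_SET_F, offset, count, error)
   CALL h5sselect_hyperslab_f(mem_space, H5S_SELECT_SET_F, (/ 0_HSIZE_T, 0_HSIZE_T /), count, error)
   CALL h5dread_f(dset_id, H5T_NATIVE_DOUBLE, buf, count, error, &
                  mem_space_id=mem_space, file_space_id=file_space)
   !... process buf(:, 1:count(2)) here ...
   offset(2) = offset(2) + count(2)
END DO
!Close everything
CALL h5sclose_f(mem_space, error)
CALL h5sclose_f(file_space, error)
CALL h5dclose_f(dset_id, error)
CALL h5fclose_f(file_id, error)
CALL h5close_f(error)
END PROGRAM HDF_READ_BLOCKS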