MPI_FILE_READ && little endian on Bluegene

Question

I need to read (and write) a little-endian binary file. I am writing my Fortran code on a PC using the Intel Fortran compiler and Intel MPI. I/O works fine on the PC, but the final goal is to run the program on a Blue Gene/P. The Blue Gene/P (XL Fortran compiler) is big-endian. When I need non-parallel I/O operations (like Fortran READ and WRITE), I use

call SETRTEOPTS('ufmt_littleendian=8')

Unfortunately, when I need parallel I/O, for example MPI_FILE_READ, "SETRTEOPTS('ufmt_littleendian=8')" is ignored. I am setting the view with:

call MPI_FILE_SET_VIEW(ifile, offset, MPI_FLOAT, MPI_FLOAT, 'native', MPI_INFO_NULL, ierr)

What should I do? I don't want to create my own DATAREP. Is there any other way? Speed is very important.

You could try using the `external32` representation on both the PC and the BGP systems. — Hristo Iliev, Oct 15 '14 at 15:57
@HristoIliev, he could try it but it won't do anything. He'd have to build his own MPICH on the Blue Gene system -- not impossible, but I'd not go there for a first step. — Rob Latham, Oct 15 '14 at 20:05
If you read the data successfully, only the byte order is wrong, can't you just swap the bytes? I do that all the time. Or convert the file to big-endian first. — Vladimir F Героям слава, Oct 16 '14 at 08:52
@VladimirF: the byteswapping is actually the easy part here. Isn't the bigger challenge how to deal with the padding (if any) that the PC Fortran compiler and the BG Fortran compiler will each expect between Fortran records? — Rob Latham, Oct 16 '14 at 14:09
@RobLatham Quite possibly, I almost exclusively work with stream files to avoid such complications and that's why I didn't think about that. — Vladimir F Героям слава, Oct 16 '14 at 14:24

score 1 · Accepted Answer · answered Oct 15 '14 at 20:08

You need to use parallel-netcdf or HDF5. The learning curve for Parallel-HDF5 is a bit steep but you will get a self describing portable file format. It will help you down the road in ways you do not yet understand.

Performance overhead is negligible. Pnetcdf has a bit of an edge if you have lots of tiny datasets, but that's a rather pathological situation.

Some applications do byteswapping, but since you mentioned Fortran, you will have to be very careful that the PC Fortran compiler and the Blue Gene Fortran compiler agree exactly on how much (if any) record padding each puts in its Fortran output. Bleah.
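
For reference, the byteswapping route mentioned here and in the comments could look roughly like the sketch below. It is not taken from the question or from any library: the module name `byteswap_sketch` and the functions `swap_i32` and `swap_r32` are made up for illustration, and it assumes the file holds raw 4-byte values with no Fortran record markers.

```fortran
module byteswap_sketch
  use, intrinsic :: iso_fortran_env, only: int32, real32
  implicit none
contains

  ! Reverse the four bytes of a 32-bit integer (hypothetical helper).
  elemental function swap_i32(x) result(y)
    integer(int32), intent(in) :: x
    integer(int32) :: y
    y = 0_int32
    call mvbits(x,  0, 8, y, 24)   ! lowest byte  -> highest byte
    call mvbits(x,  8, 8, y, 16)
    call mvbits(x, 16, 8, y,  8)
    call mvbits(x, 24, 8, y,  0)   ! highest byte -> lowest byte
  end function swap_i32

  ! Reverse the byte order of a 32-bit real by swapping its bit pattern.
  elemental function swap_r32(x) result(y)
    real(real32), intent(in) :: x
    real(real32) :: y
    y = transfer(swap_i32(transfer(x, 0_int32)), 0.0_real32)
  end function swap_r32

end module byteswap_sketch
```

Because both functions are elemental, a whole buffer filled by MPI_FILE_READ can be converted in one statement, e.g. `buf = swap_r32(buf)`. This only addresses byte order; any record padding written by sequential unformatted I/O still has to be handled separately.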

Rob Latham


Is that really going to help him in reading the particular file he has? – Vladimir F Героям слава Oct 16 '14 at 08:49
Because @Vasily is both writing and reading the data, I presumed he had some control over the file format. – Rob Latham Oct 16 '14 at 14:10

score 0 · Answer 2 · answered Jan 12 '15 at 06:56

You can try adding "call setrteopts('ufmt_littleendian=8')" to your program instead of setting the environment variable.

Alternatively, you can instruct the Intel compiler to generate big endian data files. See the CONVERT= specifier in the OPEN statement, or the convert compiler option.

Conversion (either from big-endian to little-endian on the BG/P side, or from little-endian to big-endian on the Intel side) has a runtime performance cost. So if you want the read side (BG/P) to be as fast as possible, creating big-endian data files on the write side (Intel) is best.
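
A minimal sketch of the write side, assuming a compiler that accepts the non-standard CONVERT= specifier in OPEN (Intel Fortran and gfortran both do). The module name, the subroutine name `write_big_endian`, and the file name passed to it are placeholders, not anything from the question. Stream access is used here on the assumption that the file should contain only raw data, with no record markers, so that MPI-IO on the BG/P can read it directly with the 'native' representation.

```fortran
module big_endian_writer_sketch
  use, intrinsic :: iso_fortran_env, only: real32
  implicit none
contains

  ! Write a real array as raw big-endian bytes (hypothetical helper).
  subroutine write_big_endian(filename, field)
    character(len=*), intent(in) :: filename
    real(real32),     intent(in) :: field(:)
    integer :: u

    ! CONVERT= is a compiler extension, not standard Fortran.
    ! ACCESS='stream' writes the data with no record-length markers.
    open(newunit=u, file=filename, form='unformatted', access='stream', &
         convert='big_endian', status='replace', action='write')
    write(u) field
    close(u)
  end subroutine write_big_endian

end module big_endian_writer_sketch
```

On the PC this would be called as, for example, `call write_big_endian('field_be.bin', field)`. The byte-order conversion then happens once at write time rather than on every read on the BG/P.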
