I've got an old and messy Fortran program that uses MPI. It has one small module written in C, which tries to determine the largest allocatable block of memory by calling malloc() iteratively with growing sizes until it returns null, and then returns the largest successful allocation size to the Fortran program.
When I compile it using gfortran, it works well, but when I try to use mpif90, the last malloc() causes a segfault instead of returning null.
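For reference, a probe of the kind described above could be sketched roughly as follows. This is not the actual module: the names largest_alloc and maxalloc_, the by-reference long long result, and the 1 TiB cap are assumptions made for illustration, and the trailing underscore assumes gfortran's default name mangling on a 64-bit system.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: requests step, 2*step, 3*step, ... bytes until
   malloc() fails or the next request would exceed limit. Returns the
   largest size that was allocated successfully (0 if none). */
size_t largest_alloc(size_t step, size_t limit) {
    size_t best = 0;
    if (step == 0)
        return 0;
    /* limit - best >= step guarantees best + step cannot overflow */
    while (limit - best >= step) {
        void *p = malloc(best + step);
        if (p == NULL)
            break;              /* allocation failed: best is the answer */
        free(p);
        best += step;
    }
    return best;
}

/* Hypothetical Fortran-callable wrapper. Fortran passes arguments by
   reference, so the result comes back through a pointer. The step of
   2 GiB matches the example in this post; the 1 TiB cap is an
   arbitrary assumption and requires a 64-bit size_t. */
void maxalloc_(long long *result) {
    *result = (long long)largest_alloc((size_t)0x80000000u, (size_t)1 << 40);
}
```

On the Fortran side this would be used as "call maxalloc(n)" with n declared as integer(8), assuming integer(8) and long long are both 64 bits wide. Note that a successful malloc() only means the request was accepted, not that the memory can actually be touched.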
Here's the smallest illustrative example, with no actual MPI code. File main.f:
      program test
      complex(8) :: sig(256000000) ! Just allocating some big array in fortran
      sig(1) = 0.d0 ! and now wondering how much space is left?
      call bigalloc
      end
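File bigalloc.c:
#include <stdlib.h>
#include <stdio.h>

/* Called from Fortran as "call bigalloc" (gfortran appends the underscore). */
void bigalloc_(void) {
    size_t step = 0x80000000;   /* grow the request by 2 GiB each time */
    size_t size = 0;
    int failed = 0;
    void* p;
    do {
        size += step;
        p = malloc(size);
        if (p) {
            free(p);
            printf("Allocated %zu...\n", size);   /* %zu is the size_t format */
        } else {
            printf("So, that's our limit\n");
            failed = 1;
        }
    } while (!failed);
}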
Compile and run using just gfortran (works as expected):
~$ gcc -c bigalloc.c -o bigalloc.o && gfortran -o main main.f bigalloc.o && ./main
Allocated 2147483648...
Allocated 4294967296...
So, that's our limit
Compile with MPI and run (fails):
~$ gcc -c bigalloc.c -o bigalloc.o && mpif90 -o main main.f bigalloc.o && ./main
Allocated 2147483648...
Allocated 4294967296...
Segmentation fault
Replacing gcc with mpicc changes nothing here. When main is also written in C and compiled using mpicc, everything is OK as well. So the problem only shows up with Fortran.
The output of mpif90 -show is given below. The problem depends solely on the presence of the -lopen-pal option.
$ mpif90 -show
gfortran -I/usr/include/openmpi/1.2.4-gcc/64 -I/usr/include/openmpi/1.2.4-gcc -m64 -pthread -I/usr/lib64/openmpi/1.2.4-gcc -L/usr/lib64/openmpi/1.2.4-gcc -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
It seems that, at link time, MPI substitutes the standard malloc with its own version from PAL, which doesn't behave properly when an allocation fails. Is there a way of getting around it (e.g. by somehow linking my bigalloc.c with glibc statically)?
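One way to check the substitution guess would be to ask the dynamic loader which shared object the malloc symbol actually resolves to. The following is only a sketch under assumptions: it relies on the glibc extensions dladdr() and RTLD_DEFAULT (so it needs _GNU_SOURCE and, on older glibc, -ldl, which the mpif90 link line above already includes), it only works for a dynamically linked executable, and the names malloc_provider and whichmalloc_ are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for dladdr() and RTLD_DEFAULT */
#endif
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: returns the path of the object file that the
   global symbol "malloc" resolves to, or NULL if it cannot be found. */
const char *malloc_provider(void) {
    Dl_info info;
    /* Look up malloc in the default global search order, which is the
       order that decides which definition wins at run time. */
    void *addr = dlsym(RTLD_DEFAULT, "malloc");
    if (addr != NULL && dladdr(addr, &info) != 0 && info.dli_fname != NULL)
        return info.dli_fname;
    return NULL;
}

/* Hypothetical Fortran-callable wrapper (gfortran default mangling). */
void whichmalloc_(void) {
    const char *name = malloc_provider();
    printf("malloc comes from: %s\n", name ? name : "(unknown)");
}
```

Calling "call whichmalloc" from main.f in both builds would then show whether the gfortran build reports the C library while the mpif90 build reports something else, such as the open-pal library.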