When I call:
MPI_File_open(k_space_communicator, "TB_schro_BS.dat",
              MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &bs_out);
MPI_File_close(&bs_out);
k_space_communicator works fine, as it is used in many other functions without problems. bs_out is declared as MPI_File bs_out; and TB_schro_BS.dat is a file that is deleted before this call if it already exists.
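For context, here is a minimal sketch of how the call sits in my code (the function name and the error checking are illustrative additions, not the real code; since MPI's default error handler for files is MPI_ERRORS_RETURN, checking the return code seems worthwhile):

#include <mpi.h>
#include <cstdio>

extern MPI_Comm k_space_communicator;  // built elsewhere, as in my code

void write_band_structure()  // hypothetical name for illustration
{
    MPI_File bs_out;

    // Collective over k_space_communicator: every rank in the
    // communicator must make this call.
    int rc = MPI_File_open(k_space_communicator, "TB_schro_BS.dat",
                           MPI_MODE_CREATE | MPI_MODE_WRONLY,
                           MPI_INFO_NULL, &bs_out);
    if (rc != MPI_SUCCESS) {
        char msg[MPI_MAX_ERROR_STRING];
        int len = 0;
        MPI_Error_string(rc, msg, &len);
        std::fprintf(stderr, "MPI_File_open failed: %s\n", msg);
        MPI_Abort(k_space_communicator, rc);
    }

    // Also collective over the same communicator.
    MPI_File_close(&bs_out);
}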
I get no immediate errors or hangs (all processes make it out of this function), and the code appears to run perfectly until later in the program, when it randomly hangs as I try to delete another object.
This only occurs when the number of processes is not a factor of the number of calculations I am doing. However, if I comment this line of code out, there are no hangs no matter how many processes I run (fewer than the number of calculations, obviously).
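To make the factor observation concrete, here is a hypothetical sketch (names and structure are illustrative; I cannot post the real loop) of the kind of uneven work split I mean. If a collective call such as MPI_File_open were tied to a per-calculation iteration, the final sweep would reach it on only a subset of k_space_communicator:

#include <mpi.h>

// Hypothetical striping of n_calcs calculations over the ranks of a
// communicator; run_all and its empty body are illustrative only.
void run_all(MPI_Comm comm, int n_calcs)
{
    int rank = 0, n_procs = 0;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &n_procs);

    for (int calc = rank; calc < n_calcs; calc += n_procs) {
        // ... per-calculation work would go here ...
        // When n_procs does not divide n_calcs, the final sweep runs on
        // only (n_calcs % n_procs) ranks, so any collective placed here
        // would be called by a subset of the communicator while the
        // remaining ranks never make the matching call.
    }
}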
The most confusing thing is that when I traced where the hangs occur, I found they happen between the end of the object's destructor and the point immediately after the call that deletes the object, where there is no extra code.
Finally, there is one error output before the hang that I couldn't quite decipher; it occurs between one and three times per run for a 4-process run:
*** glibc detected *** mpiuf-nemo: corrupted double-linked list: 0x000000000192a640 ***
I am working in C++ using MPICH2 on Ubuntu. Sorry I cannot be more specific, as I cannot release details about the code, but I will be happy to try to answer any further questions you may have. Sorry if I have missed anything out; I am a little flustered by this problem.