I wrote a very simple MPI program in Fortran to experiment with parallel programming. All it does is compute the sum 1+2+3+...+N, with the work split across multiple processes. It works! But here is the weird thing: it only works if I leave a certain write statement in the code. If I remove it or comment it out, I get a segfault after recompiling. Why is that? Is there some kind of latency involved, so that the sum cannot be done directly after the receive? A simple output statement should not, in my mind, alter the program so that a SEGFAULT suddenly occurs. I tried several combinations of N and number of processes, but it always seems to come down to the output. Enlighten me :-)
The line in question is marked with !!! HERE
-I compile with: mpif90 mpi_test.f90 -g
-I then execute with: mpirun -n 4 ./a.out
program mpi_test
implicit none
include 'mpif.h'
!------------------------------------------------------------------------------
integer,dimension(MPI_STATUS_SIZE) ::status
integer ::my_rank,mpi_size, error_mpi
integer ::dest,my_start,my_end,my_summ=0,i,bigsum=0,N = 50000
double precision ::starttime,endtime
!------------------------------------------------------------------------------
call MPI_INIT(error_mpi) ! Initialize MPI
call MPI_COMM_SIZE(MPI_COMM_WORLD, mpi_size, error_mpi) ! get no of THREADS
call MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, error_mpi) ! distribute ranks
call cpu_time(starttime) ! call counter for benchm.
!--------------------------------START-----------------------------------------
my_start =((my_rank)*(N/mpi_size))+1 ! calculate where to start/end the
my_end = my_start+(N/mpi_size)-1 ! summation in each thread
do i =my_start,my_end !summ up the partial sum given to thread
my_summ = my_summ +i
end do
!-----------------------------------------------------------------------------
if(my_rank .NE. 0 ) then !send result to master Thread
call MPI_SEND(my_summ,1,MPI_INT,0,5,MPI_COMM_WORLD,error_mpi)
end if
!-----------------------------------------------------------------------------
if(my_rank .EQ. 0) then
bigsum= my_summ !first sum part is that of master
do i=1,mpi_size-1 !receive summation parts from threads and add
call MPI_RECV(my_summ,1,MPI_INT,i,5,MPI_COMM_WORLD,status)
!!! HERE
!write(*,*)'Master received sum:',my_summ,' from ',i, 'with status:',status
!!! HERE
bigsum = bigsum+my_summ
end do
end if
!-----------------------------------------------------------------------------
call MPI_BARRIER(MPI_COMM_WORLD,error_mpi)
if(my_rank .EQ. 0) then !compare output to simple serial calculation
call cpu_time(endtime)
write(*,*) 'the big sum is:',bigsum, 'parallel time:', dble(endtime-starttime),'sec.'
call cpu_time(starttime)
bigsum = 0
do i = 1,N
bigsum = bigsum+i
end do
call cpu_time(endtime)
write(*,*) 'the big sum is:',bigsum, 'serial time:', dble(endtime-starttime),'sec.'
end if
!-------------------------------END--------------------------------------------
call MPI_BARRIER(MPI_COMM_WORLD,error_mpi) !wait for every thread then Finalize
call MPI_FINALIZE(error_mpi)
!------------------------------------------------------------------------------
end program
The working output (write output left inside):
Master received sum: 234381250 from 1 with status: [...]
Master received sum: 390631250 from 2 with status: [...]
Master received sum: 546881250 from 3 with status: [...]
the big sum is: 1250025000 parallel time: 1.3100000000000264E-004 sec.
the big sum is: 1250025000 serial time: 1.3700000000000170E-004 sec.
And the output if the write() is commented out:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x7F173CBA87D7
#1 0x7F173CBA8DDE
#2 0x7F173C800D3F
#3 0x7F173CF181B7
#4 0x400E84 in mpi_test at mpi_test.f90:29
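For what it's worth, the range splitting itself can be checked without MPI. The following is only a minimal serial sketch, not part of the failing program: the program name partition_check and the constant nprocs (standing in for mpi_size) are made up for illustration, and it assumes N is evenly divisible by the number of processes, since otherwise the integer division N/nprocs drops the last few terms.

```fortran
program partition_check
  use iso_fortran_env, only: int64
  implicit none
  ! Hypothetical stand-ins: nprocs plays the role of mpi_size,
  ! and the loop over rank replaces the separate MPI processes.
  integer, parameter :: N = 50000, nprocs = 4
  integer :: rank, i, my_start, my_end, part
  integer(int64) :: total, expected

  total = 0
  do rank = 0, nprocs - 1
    ! Same start/end formulas as in the MPI program above
    my_start = rank*(N/nprocs) + 1
    my_end   = my_start + (N/nprocs) - 1
    part = 0
    do i = my_start, my_end
      part = part + i
    end do
    write(*,*) 'rank', rank, 'range', my_start, my_end, 'partial sum', part
    total = total + part
  end do

  ! Closed form N*(N+1)/2, evaluated in 64-bit to avoid overflowing
  ! the intermediate product in a default integer
  expected = int(N, int64) * (N + 1) / 2
  write(*,*) 'total:', total, ' expected:', expected
end program partition_check
```

Compiled with a plain Fortran compiler (no mpif90 needed), the partial sums for ranks 1 to 3 should agree with the values the master prints in the working output above, which suggests the arithmetic is fine and the problem lies in the communication part.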