I am new to MPI programming. I have to compare three codes: a sequential version, an OpenMP version and an MPI version. These three codes (not the real codes, just examples) are given below.
The sequential code
program no_parallel
   implicit none
   integer, parameter :: dp = selected_real_kind(15,307)
   integer :: i, j
   real(kind = dp) :: time1, time2
   real(kind = dp), dimension(1000000) :: a
   !Initialisation
   do i = 1, 1000000
      a(i) = sqrt( dble(i) / 3.0d+0 )
   end do
   call cpu_time( time1 )
   do j = 1, 1000
      do i = 1, 1000000
         a(i) = a(i) + sqrt( dble(i) )
      end do
   end do
   call cpu_time( time2 )
   print *, a(1000000)
   print *, 'Elapsed real time = ', time2 - time1, 'second(s)'
end program no_parallel
The OpenMP code
program openmp
   use omp_lib, only: omp_get_wtime
   implicit none
   integer, parameter :: dp = selected_real_kind(15,307)
   integer :: i, j
   real(kind = dp) :: time1, time2
   real(kind = dp), dimension(1000000) :: a
   !Initialisation
   do i = 1, 1000000
      a(i) = sqrt( dble(i) / 3.0d+0 )
   end do
   time1 = omp_get_wtime()
   !$omp parallel
   do j = 1, 1000
      !$omp do schedule( runtime )
      do i = 1, 1000000
         a(i) = a(i) + sqrt( dble(i) )
      end do
      !$omp end do
   end do
   !$omp end parallel
   time2 = omp_get_wtime()
   print *, a(1000000)
   print *, 'Elapsed real time = ', time2 - time1, 'second(s)'
end program openmp
The MPI code
program MPI
   implicit none
   include "mpif.h"
   integer, parameter :: dp = selected_real_kind(15,307)
   integer :: ierr, num_procs, my_id, destination, tag, source, i, j
   integer :: stat(MPI_STATUS_SIZE)   !status array required by MPI_RECV
   real(kind = dp) :: time1, time2
   real(kind = dp), dimension(1000000) :: a
   call MPI_INIT ( ierr )
   call MPI_COMM_RANK ( MPI_COMM_WORLD, my_id, ierr )
   call MPI_COMM_SIZE ( MPI_COMM_WORLD, num_procs, ierr )
   !Initialisation
   do i = 1, 1000000
      a(i) = sqrt( dble(i) / 3.0d+0 )
   end do
   destination = 0
   tag         = 999
   source      = 3    !with 4 processes, rank 3 updates a(1000000)
   time1 = MPI_Wtime()
   do j = 1, 1000
      !Cyclic distribution: rank my_id handles i = 1+my_id, 1+my_id+num_procs, ...
      do i = 1 + my_id, 1000000, num_procs
         a(i) = a(i) + sqrt( dble(i) )
      end do
   end do
   call MPI_BARRIER ( MPI_COMM_WORLD, ierr )
   !Transfer the last element from the rank that computed it to rank 0
   if( my_id == source ) then
      call MPI_SEND ( a(1000000), 1, MPI_DOUBLE_PRECISION, destination, tag, MPI_COMM_WORLD, ierr )
   end if
   if( my_id == destination ) then
      call MPI_RECV ( a(1000000), 1, MPI_DOUBLE_PRECISION, source, tag, MPI_COMM_WORLD, stat, ierr )
   end if
   time2 = MPI_Wtime()
   if( my_id == 0 ) then
      print *, a(1000000) !, 'from ID =', my_id
      print *, 'Elapsed real time = ', time2 - time1, 'second(s)'
   end if
   call MPI_FINALIZE ( ierr )
end program MPI
I compiled these codes with Intel Fortran Compiler 17.0.3 using the -O0 optimisation flag. Both the OpenMP and MPI codes were run on a 4-core Haswell desktop. The measured times for the sequential, OpenMP and MPI codes were 8.08 s, 2.1 s and 3.2 s respectively. I was expecting the OpenMP and MPI timings to be almost the same, but they were not. My questions:
1. Regarding the MPI code, if I want to print out the result of a(1000000), is it possible to do that in a smarter way, without the call MPI_SEND and call MPI_RECV pair?
2. Do you have any idea which parts of the MPI code could still be optimised?
3. With regard to source in the MPI code, is it possible to determine it automatically? In this case it is easy for me, since the number of processes is 4, so a(1000000) must be owned by rank 3. A minimal sketch of what I have in mind is given below.
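The following is only a sketch of that idea under the same cyclic distribution as in my MPI code (the program name owner_print and the variable owner are made up for this example): since rank my_id handles i = 1 + my_id, 1 + my_id + num_procs, ..., the rank that updates a(n) is mod(n - 1, num_procs), so that rank could print the value directly and the send/receive pair would not be needed. I have not tested this beyond the 4-process case.

!Sketch only: owner of a(n) under the cyclic distribution i = 1 + my_id, step num_procs,
!is mod(n - 1, num_procs); that rank prints a(n) itself, so no MPI_SEND/MPI_RECV is needed.
program owner_print
   implicit none
   include "mpif.h"
   integer, parameter :: dp = selected_real_kind(15,307)
   integer, parameter :: n = 1000000
   integer :: ierr, num_procs, my_id, owner, i, j
   real(kind = dp), dimension(n) :: a
   call MPI_INIT ( ierr )
   call MPI_COMM_RANK ( MPI_COMM_WORLD, my_id, ierr )
   call MPI_COMM_SIZE ( MPI_COMM_WORLD, num_procs, ierr )
   !Initialisation
   do i = 1, n
      a(i) = sqrt( dble(i) / 3.0d+0 )
   end do
   !Same cyclic work distribution as in the MPI code above
   do j = 1, 1000
      do i = 1 + my_id, n, num_procs
         a(i) = a(i) + sqrt( dble(i) )
      end do
   end do
   !Rank owner computed a(n), so it prints the result directly
   owner = mod( n - 1, num_procs )
   if( my_id == owner ) print *, a(n), 'printed by rank', my_id
   call MPI_FINALIZE ( ierr )
end program owner_print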
Thank you in advance.