I have coded a matrix-vector multiplication with the matrix stored in CRS (compressed row storage) format. Here is my code:
call cpu_time(time3)
!$OMP PARALLEL DO
do o = 1, b2
   do p = row(o), row(o+1)-1
      ergcrs(o) = ergcrs(o) + val(p)*v(col(p))
   end do
end do
!$OMP END PARALLEL DO
call cpu_time(time4)
It works with and without OpenMP, but when OpenMP is enabled the measured time is higher than without it.
Two questions: 1. Why is my code slower when I try to run it in parallel? 2. Can I also run the second (inner) do loop as a parallel do loop?
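One thing that might be worth checking is the timer itself: with many compilers, cpu_time reports CPU time summed over all threads, so the number can grow under OpenMP even when the loop finishes sooner on the wall clock. Below is a minimal sketch of the same loop timed with omp_get_wtime() from omp_lib instead. The 3x3 matrix, the double precision types, and the per-row accumulator acc are assumptions made up for illustration and are not part of my real code.

```fortran
program crs_spmv_sketch
  use omp_lib
  implicit none
  ! Hypothetical 3x3 example matrix in CRS format (illustration only):
  !   [ 1 0 2 ]
  !   [ 0 3 0 ]
  !   [ 4 0 5 ]
  integer, parameter :: b2 = 3
  integer :: row(b2+1) = [1, 3, 4, 6]      ! start index of each row in val/col
  integer :: col(5)    = [1, 3, 2, 1, 3]   ! column index of each stored value
  double precision :: val(5) = [1d0, 2d0, 3d0, 4d0, 5d0]
  double precision :: v(b2)  = [1d0, 1d0, 1d0]
  double precision :: ergcrs(b2), acc, t0, t1
  integer :: o, p

  t0 = omp_get_wtime()            ! wall-clock time, not summed CPU time
  !$OMP PARALLEL DO PRIVATE(p, acc)
  do o = 1, b2
     acc = 0d0                    ! thread-private accumulator for row o
     do p = row(o), row(o+1) - 1
        acc = acc + val(p) * v(col(p))
     end do
     ergcrs(o) = acc              ! each row is written by exactly one thread
  end do
  !$OMP END PARALLEL DO
  t1 = omp_get_wtime()

  print *, 'result:      ', ergcrs
  print *, 'elapsed (s): ', t1 - t0
end program crs_spmv_sketch
```

With v set to all ones, the result should be the row sums 3, 3 and 9. With gfortran this would be built with the -fopenmp flag. For a matrix this small the threading overhead will dominate, so the timing only becomes meaningful for a realistically large b2.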
The same slowdown happens if I run this test program:
program test
   use omp_lib
   implicit none
   integer :: i, n = 1000000000, counter = 0
   real :: time1, time2
   call cpu_time(time1)
   !$OMP PARALLEL DO
   do i = 1, n
      counter = counter + 1
   end do
   !$OMP END PARALLEL DO
   call cpu_time(time2)
   print *,time2-time1
end program test
Is my problem that I store the results in a shared variable (ergcrs(o) in the first code snippet and counter in the second)?
If so, is there a way to fix it?
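For the test program, one candidate fix would be the OpenMP REDUCTION clause, which gives every thread its own copy of counter and adds the copies together at the end of the loop. The sketch below is only an assumption about what a fix could look like, not a confirmed solution. It also switches the timing to omp_get_wtime(), and the variable names t0 and t1 are introduced here for illustration.

```fortran
program test_reduction
  use omp_lib
  implicit none
  integer :: i, n = 1000000000, counter = 0
  double precision :: t0, t1

  t0 = omp_get_wtime()                    ! wall-clock start
  ! Each thread increments a private copy of counter;
  ! the copies are summed into the shared counter after the loop.
  !$OMP PARALLEL DO REDUCTION(+:counter)
  do i = 1, n
     counter = counter + 1
  end do
  !$OMP END PARALLEL DO
  t1 = omp_get_wtime()                    ! wall-clock end

  print *, 'counter:     ', counter
  print *, 'elapsed (s): ', t1 - t0
end program test_reduction
```

Without the reduction, all threads update the same counter at once, so the final value is not guaranteed to equal n. In the matrix-vector loop each ergcrs(o) is only touched by the thread that owns row o, so the shared array by itself should not cause the same kind of conflict there.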
Thank you very much for your help!