There are many things to consider when doing such tests. First, you have to clearly define what you are comparing. For such a simple test, you should also deactivate optimization; most major compilers accept the option -O0 for that. Otherwise, the compiler will notice that you never use the computed value and may not even run your loop, because it is useless.
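As an alternative to switching optimization off, the loop result can be made observable so that the compiler has to keep the loop. The following is only a minimal sketch: the program name sqrt_observable and the chained update s = sqrt(s + r0) are illustrative choices of mine, not part of the original program. It times a dependent chain of one addition and one square root per iteration, so its numbers are not directly comparable with the ones reported here.

```fortran
program sqrt_observable
  implicit none
  integer, parameter :: n = 10**8
  integer :: i
  real :: r0, s, start, finish

  call random_number(r0)
  s = 0.0

  call cpu_time(start)
  do i = 1, n
    ! Each iteration depends on the previous value of s, so the
    ! square root is not loop-invariant and stays inside the loop.
    s = sqrt(s + r0)
  end do
  call cpu_time(finish)

  ! Printing s makes the result of the loop observable, so the
  ! compiler cannot discard the loop as dead code.
  print '("SQRT chain: Time = ",f6.3," seconds, s = ",es12.5)', finish - start, s
end program sqrt_observable
```

Because every iteration waits for the previous one, this variant reflects the latency of the operations rather than their throughput.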
To cut it short, I modified your program slightly to get this:
program sqtest
  implicit none
  real r0, r1, r2, s
  integer i,n
  real :: start, finish
  n=10**9
  call random_number(r0)
  call random_number(r1)
  call random_number(r2)
  call cpu_time(start)
  do i = 1,n
    s = sqrt(r0)
  enddo
  call cpu_time(finish)
  print '("SQRT: Time = ",f6.3," seconds.")',finish-start
  call cpu_time(start)
  do i = 1,n
    s = r1+r2
  enddo
  call cpu_time(finish)
  print '("Addition: Time = ",f6.3," seconds.")',finish-start
end program
It gives me the following results on my system:
ifort 13, n = 10^8
SQRT: Time = 0.378 seconds
Addition: Time = 0.202 seconds
ifort 13, n = 10^9
SQRT: Time = 3.460 seconds
Addition: Time = 1.857 seconds
gfortran (GCC) 4.9, n = 10^8
SQRT: Time = 0.385 seconds
Addition: Time = 0.191 seconds
gfortran (GCC) 4.9, n = 10^9
SQRT: Time = 3.529 seconds
Addition: Time = 1.733 seconds
pgf90 14, n = 10^8
SQRT: Time = 0.380 seconds
Addition: Time = 0.058 seconds
pgf90 14, n = 10^9
SQRT: Time = 3.438 seconds
Addition: Time = 0.520 seconds
Note that I measure the CPU time inside the code with cpu_time. For the numbers to be meaningful, you should run each case many times and either compute the average or pick the minimum. The minimum is the closest to what your system can achieve under optimal conditions.
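Repeating the measurement and keeping the minimum can be sketched as follows. This is an illustration only: the program name sqtest_min, the repeat count nrep = 5 and the smaller n = 10**8 are assumptions of mine, and like the program above it is meant to be compiled with -O0.

```fortran
program sqtest_min
  implicit none
  integer, parameter :: n = 10**8      ! iterations per timed run
  integer, parameter :: nrep = 5       ! hypothetical number of repeats
  integer :: i, k
  real :: r0, s, start, finish, t, tmin, tsum

  call random_number(r0)
  tmin = huge(tmin)                    ! start above any measurable time
  tsum = 0.0

  do k = 1, nrep
    call cpu_time(start)
    do i = 1, n
      s = sqrt(r0)
    end do
    call cpu_time(finish)
    t = finish - start
    tmin = min(tmin, t)                ! best run seen so far
    tsum = tsum + t                    ! accumulated for the mean
  end do

  print '("SQRT: min = ",f6.3," s, mean = ",f6.3," s over ",i0," runs")', &
        tmin, tsum/nrep, nrep
end program sqtest_min
```

A large gap between the minimum and the mean suggests that other activity on the machine disturbed some of the runs.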
You will also see that the results are compiler dependent. pgf90 clearly gives better results on the addition. I removed float(i)* from the square root. With that factor in place, gfortran and pgf90 run very fast (~2.6 seconds for n = 10^9) while ifort runs very slowly (~7.3 seconds for n = 10^9). This suggests that gfortran and pgf90 are somehow choosing a different, faster path there; maybe they do some optimization even though I disabled it?