I'm just playing around with OpenMP. Look at these code fragments:
#pragma omp parallel
{
    for (i = 0; i < n; i++)
    {
        /* do something */
    }
}
and
for (i = 0; i < n; i++)
{
    #pragma omp parallel
    {
        /* do something */
    }
}
Why is the first one much slower (by a factor of around 5) than the second one? In theory I expected the first one to be faster, because the parallel region is created only once rather than n times as in the second. Can someone explain this to me?
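One thing that may explain part of the confusion: in the first fragment nothing divides the loop among the threads, so every thread in the team executes all n iterations. A worksharing directive such as `#pragma omp for` is what splits the iterations. The sketch below counts the iterations actually executed in each case. The helper names `count_replicated` and `count_shared` are hypothetical, and the loop index is declared inside the region so that each thread has its own copy; this is an illustration under those assumptions, not the code from the fragments above.

```c
#include <stdio.h>

/* Hypothetical helper: loop inside a bare parallel region.
   No worksharing directive, so every thread runs all n iterations. */
long count_replicated(int n, int *threads_seen)
{
    long total = 0;
    int threads = 0;
    #pragma omp parallel
    {
        long local = 0;
        for (int i = 0; i < n; i++)   /* i is private: declared inside the region */
            local++;
        #pragma omp atomic
        total += local;               /* add this thread's iteration count */
        #pragma omp atomic
        threads++;                    /* count how many threads took part */
    }
    *threads_seen = threads;
    return total;
}

/* Hypothetical helper: same region, but "omp for" divides the
   n iterations among the threads of the team. */
long count_shared(int n)
{
    long total = 0;
    #pragma omp parallel
    {
        long local = 0;
        #pragma omp for
        for (int i = 0; i < n; i++)
            local++;
        #pragma omp atomic
        total += local;
    }
    return total;
}
```

With T threads, `count_replicated(n, &t)` should return n * T, while `count_shared(n)` should return n. If the file is compiled without OpenMP support the pragmas are ignored and both return n.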
The code I want to parallelise has the following structure:
for (i = 0; i < n; i++)              // not parallelizable
{
    for (j = i + 1; j < n; j++)      // will be parallelized
    {
        /* do something */
    }
    for (j = i + 1; j < n; j++)      // will be parallelized
        for (k = i + 1; k < n; k++)
        {
            /* do something */
        }
}
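For a structure like this, one common pattern is to open a single parallel region around the sequential outer loop and put `#pragma omp for` on each inner loop. Every thread then steps through the same values of i, and the implicit barrier at the end of each `omp for` keeps the two inner loops in order. The sketch below uses a made-up workload (the function `sweep` and the arrays `v` and `m` are assumptions for illustration only) in place of the "do something" bodies.

```c
#include <stdio.h>

/* Hypothetical workload with the same loop structure:
   v has n elements, m is an n-by-n matrix stored row by row. */
void sweep(int n, double *v, double *m)
{
    #pragma omp parallel
    {
        /* Sequential outer loop: every thread runs all values of i. */
        for (int i = 0; i < n; i++) {
            /* First inner loop, iterations divided among the threads. */
            #pragma omp for
            for (int j = i + 1; j < n; j++)
                v[j] += 1.0;
            /* Implicit barrier here: v is complete before it is read. */

            /* Second inner loop nest, outer j iterations divided. */
            #pragma omp for
            for (int j = i + 1; j < n; j++)
                for (int k = i + 1; k < n; k++)
                    m[j * n + k] += v[j];
            /* Implicit barrier here: m is updated before the next i. */
        }
    }
}
```

Starting from zero-filled arrays, `v[j]` ends up equal to j, and `m[j * n + k]` ends up equal to s * (s + 1) / 2 with s = min(j, k), regardless of the number of threads. Whether this layout is actually faster than opening a region per inner loop depends on the runtime and on how much work each iteration does.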
I made a simple program to measure the time and reproduce my results.
#include <stdio.h>
#include <omp.h>

void test( int n)
{
    int i ;
    double t_a = 0.0, t_b = 0.0 ;

    t_a = omp_get_wtime() ;
    /* variant 1: parallel region outside the for-loop */
    #pragma omp parallel
    {
        for(i=0;i<n;i++)
        {
        }
    }

    t_b = omp_get_wtime() ;
    /* variant 2: parallel region inside the for-loop */
    for(i=0;i<n;i++)
    {
        #pragma omp parallel
        {
        }
    }

    /* elapsed time since t_a and since t_b, in milliseconds */
    printf( "directive outside for-loop: %lf\n", 1000*(omp_get_wtime()-t_a)) ;
    printf( "directive inside for-loop: %lf \n", 1000*(omp_get_wtime()-t_b)) ;
}
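
int main(void)
{
    int i, n ;
    double t_1 = 0.0, t_2 = 0.0 ;

    printf( "n: " ) ;
    scanf( "%d", &n ) ;

    t_1 = omp_get_wtime() ;
    /* variant 1: parallel region outside the for-loop */
    #pragma omp parallel
    {
        for(i=0;i<n;i++)
        {
        }
    }

    t_2 = omp_get_wtime() ;
    /* variant 2: parallel region inside the for-loop */
    for(i=0;i<n;i++)
    {
        #pragma omp parallel
        {
        }
    }

    /* elapsed time since t_1 and since t_2, in milliseconds */
    printf( "directive outside for-loop: %lf\n", 1000*(omp_get_wtime()-t_1)) ;
    printf( "directive inside for-loop: %lf \n", 1000*(omp_get_wtime()-t_2)) ;

    /* repeat the same measurement inside a function */
    test(n) ;
    return 0 ;
}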
When I run it with different values of n, I always get different results (times in milliseconds):
n: 30000
directive outside for-loop: 0.881884
directive inside for-loop: 0.073054
directive outside for-loop: 0.049098
directive inside for-loop: 0.011663
n: 30000
directive outside for-loop: 0.402774
directive inside for-loop: 0.071588
directive outside for-loop: 0.049168
directive inside for-loop: 0.012013
n: 30000
directive outside for-loop: 2.198740
directive inside for-loop: 0.065301
directive outside for-loop: 0.047911
directive inside for-loop: 0.012152
n: 1000
directive outside for-loop: 0.355841
directive inside for-loop: 0.079480
directive outside for-loop: 0.013549
directive inside for-loop: 0.012362
n: 10000
directive outside for-loop: 0.926234
directive inside for-loop: 0.071098
directive outside for-loop: 0.023536
directive inside for-loop: 0.012222
n: 10000
directive outside for-loop: 0.354025
directive inside for-loop: 0.073542
directive outside for-loop: 0.023607
directive inside for-loop: 0.012292
How can this difference be explained?
Results with your version:
Input n: 1000
[2] directive outside for-loop: 0.331396
[2] directive inside for-loop: 0.002864
[2] directive outside for-loop: 0.011663
[2] directive inside for-loop: 0.001188
[1] directive outside for-loop: 0.021092
[1] directive inside for-loop: 0.001327
[1] directive outside for-loop: 0.005238
[1] directive inside for-loop: 0.001048
[0] directive outside for-loop: 0.020812
[0] directive inside for-loop: 0.001188
[0] directive outside for-loop: 0.005029
[0] directive inside for-loop: 0.001257
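One thing worth checking in the timing program above is where the timestamps are read. Both printed values are computed after both loops have finished, so the "outside" figure covers the two variants together. The very first parallel region in a process may also include the cost of starting the worker threads, depending on the OpenMP runtime. The sketch below is one way to separate these effects; `now_seconds` and `time_variants` are hypothetical helpers, not part of OpenMP, and none of the numbers listed above were produced with it.

```c
#include <stdio.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Seconds from omp_get_wtime(); falls back to CPU time from clock()
   when the file is built without OpenMP support. */
static double now_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Hypothetical helper: times the two variants separately, in ms. */
void time_variants(int n, double *outside_ms, double *inside_ms)
{
    /* Warm-up region, so that thread start-up cost (if any)
       is not charged to either variant. */
    #pragma omp parallel
    { }

    double t0 = now_seconds();
    #pragma omp parallel
    {
        /* private i: each thread runs all n iterations */
        for (int i = 0; i < n; i++) {
        }
    }
    double t1 = now_seconds();   /* read right after the first variant */

    for (int i = 0; i < n; i++) {
        #pragma omp parallel
        { }
    }
    double t2 = now_seconds();   /* read right after the second variant */

    *outside_ms = 1000.0 * (t1 - t0);
    *inside_ms  = 1000.0 * (t2 - t1);
}
```

Because the loop bodies and regions are empty, an optimizing compiler may remove some of this work entirely, so the resulting numbers say little about real code. Putting a small amount of actual work into the loop body would probably give a more meaningful comparison.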