I have prepared a simple piece of code for testing. This is the most important part:
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            for (int j = 0; j < 100000; j++)
                for (int i = 0; i < 1000; i++) a1[i] = 1;
        }
        #pragma omp section
        {
            for (int j = 0; j < 100000; j++)
                for (int i = 0; i < 1000; i++) a2[i] = 1;
        }
    }
I compiled the program with the MinGW compiler and the results were as I expected. Since I am going to use a Linux-only computer, I also compiled the code on Linux (on the same machine), using the gcc 4.7.2 and Intel 12.1.0 compilers. The performance of the program dropped significantly: it is slower than the sequential version (omp_set_num_threads(1)).
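For reference, this is roughly how the comparison can be timed. It is only a minimal sketch, not my full program: the element type `int`, the global arrays, and the helper names `now_seconds`, `time_sections`, and `checksum` are assumptions made for illustration. The timer falls back to `clock()` when the code is built without OpenMP.

```c
#include <assert.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define N 1000              /* array length, as in the snippet above */

static int a1[N];           /* assumed element type: int */
static int a2[N];

/* Wall-clock seconds; uses clock() (CPU time) when built without OpenMP. */
static double now_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Runs the two sections and returns the elapsed time in seconds. */
static double time_sections(int repeats)
{
    assert(repeats > 0);    /* a zero-iteration run would time nothing */

    double start = now_seconds();
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            for (int j = 0; j < repeats; j++)
                for (int i = 0; i < N; i++) a1[i] = 1;
        }
        #pragma omp section
        {
            for (int j = 0; j < repeats; j++)
                for (int i = 0; i < N; i++) a2[i] = 1;
        }
    }
    return now_seconds() - start;
}

/* Sums both arrays so the stores have an observable result. */
static long checksum(void)
{
    long sum = 0;
    for (int i = 0; i < N; i++) sum += a1[i] + a2[i];
    return sum;
}
```

Calling `time_sections(100000)` once after `omp_set_num_threads(1)` and once with two threads gives the two numbers being compared. With gcc, OpenMP is enabled by the `-fopenmp` flag. Note that an optimizing compiler may remove part of the repeated stores, so the measured times depend on the optimization level.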
I have also tried using arrays that are private to each thread, but the effect is similar.
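The private-array variant looks roughly like the sketch below. The names `fill_private` and `run_private_sections` and the element type `int` are assumptions for illustration. Each section works on its own stack-local array and returns a sum, so that the stores are not trivially unused.

```c
#include <assert.h>

#define N 1000              /* array length, as in the snippet above */

/* Repeatedly fills a stack-local array and returns the sum of its elements. */
static long fill_private(int repeats)
{
    assert(repeats > 0);

    int local[N] = {0};     /* private to the calling thread */
    for (int j = 0; j < repeats; j++)
        for (int i = 0; i < N; i++) local[i] = 1;

    long sum = 0;
    for (int i = 0; i < N; i++) sum += local[i];
    return sum;
}

/* Runs one fill per section; each section writes only its own result variable. */
static long run_private_sections(int repeats)
{
    long s1 = 0, s2 = 0;
    #pragma omp parallel sections
    {
        #pragma omp section
        { s1 = fill_private(repeats); }
        #pragma omp section
        { s2 = fill_private(repeats); }
    }
    return s1 + s2;
}
```

Here `run_private_sections(100000)` corresponds to the workload above, with `a1` and `a2` replaced by one local array per section.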
Can anyone suggest an explanation?