I've been trying to understand how an OpenMP parallel for loop works when combined with critical sections and ordered directives. There are a few code samples that I find confusing:
1. An OpenMP parallel for loop is used to initialize the array s with the loop index i and the thread ID. No ordered directives or critical sections are used.
#include <stdio.h>
#include <omp.h>

#define N 10
#define CHUNKSIZE 1

int main(int argc, char* argv[])
{
    int i, chunk = CHUNKSIZE;
    char s[N][22];

    #pragma omp parallel for shared(s,chunk) private(i) schedule(static, chunk)
    for (i = 0; i < N; ++i)
    {
        int tid = omp_get_thread_num();
        sprintf(s[i], "%d:%d", i, tid);
        printf("i: %d tid: %d\n", i, tid);
    }

    puts("\nArray initialization order:");
    for (i = 0; i < N; ++i)
        puts(s[i]);
}
It prints the following:
i: 7 tid: 7
i: 4 tid: 4
i: 5 tid: 5
i: 6 tid: 6
i: 0 tid: 0
i: 8 tid: 0
i: 3 tid: 3
i: 1 tid: 1
i: 2 tid: 2
i: 9 tid: 1
Array initialization order:
0:0
1:1
2:2
3:3
4:4
5:5
6:6
7:7
8:0
9:1
I can't figure out why s contains the i indices (the first number) in strict sequence despite the absence of ordered directives, while printf("i: %d tid: %d\n", i, tid) shows them in a different order.
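To check when each slot is actually written, as opposed to where it ends up in s, one option might be to hand out a ticket from a shared counter at the start of every iteration. The following is only a minimal sketch under my own assumptions: record_execution_order and when are hypothetical names that are not part of the code above, and it relies on #pragma omp atomic capture, which needs OpenMP 3.1 or later.

```c
/* Hypothetical helper (not part of the original program).
 * Fills when[i] with the position (0..n-1) at which iteration i
 * took its ticket, i.e. the order in which iterations really ran. */
void record_execution_order(int n, int *when)
{
    int counter = 0;   /* shared ticket counter */
    int i;

    #pragma omp parallel for shared(when, counter) private(i) schedule(static, 1)
    for (i = 0; i < n; ++i)
    {
        int ticket;

        /* Atomically read the counter and increment it. */
        #pragma omp atomic capture
        ticket = counter++;

        /* Each iteration writes only to its own slot, so no race here. */
        when[i] = ticket;
    }
}
```

Calling this from main with an int array of N elements and printing when[i] for each i would show the position at which iteration i ran. The array index alone cannot show that, because every iteration writes to its own slot regardless of timing.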
2. Adding the ordered clause to the omp parallel for directive doesn't seem to change anything unless an omp ordered directive is also put inside the loop body.
#pragma omp parallel for shared(s,chunk) private(i) schedule(static, chunk) ordered
for (i = 0; i < N; ++i)
{
    int tid = omp_get_thread_num();
    sprintf(s[i], "%d:%d", i, tid);
    printf("i: %d tid: %d\n", i, tid);
}
This produces the same result as before: sprintf(s[i], "%d:%d", i, tid) initializes the array in strict sequence of i, whereas printf("i: %d tid: %d\n", i, tid) prints i in an arbitrary order.
#pragma omp parallel for shared(s,chunk) private(i) schedule(static, chunk) ordered
for (i = 0; i < N; ++i)
{
    int tid = omp_get_thread_num();
    sprintf(s[i], "%d:%d", i, tid);
    #pragma omp ordered
    printf("i: %d tid: %d\n", i, tid);
}
Now everything happens in the sequence of i:
i: 0 tid: 0
i: 1 tid: 1
i: 2 tid: 2
i: 3 tid: 3
i: 4 tid: 4
i: 5 tid: 5
i: 6 tid: 6
i: 7 tid: 7
i: 8 tid: 0
i: 9 tid: 1
Array initialization order:
0:0
1:1
2:2
3:3
4:4
5:5
6:6
7:7
8:0
9:1
Again, I don't understand why omp ordered has to be placed inside the loop body to enforce the order of the prints, whereas the array initialization doesn't need it.
3. A critical section is used to ensure that only one thread at a time executes the loop body:
#pragma omp parallel for shared(s,chunk) private(i) schedule(static, chunk) ordered
for (i = 0; i < N; ++i)
    #pragma omp critical
    {
        int tid = omp_get_thread_num();
        sprintf(s[i], "%d:%d", i, tid);
        printf("i: %d tid: %d\n", i, tid);
    }
Again, this prints i in an arbitrary order and initializes s in strict order of i:
i: 1 tid: 1
i: 4 tid: 4
i: 3 tid: 3
i: 2 tid: 2
i: 5 tid: 5
i: 0 tid: 0
i: 7 tid: 7
i: 6 tid: 6
i: 8 tid: 0
i: 9 tid: 1
Array initialization order:
0:0
1:1
2:2
3:3
4:4
5:5
6:6
7:7
8:0
9:1
This is totally bewildering since, in my understanding, the critical section must guarantee that the sprintf and printf statements are executed by the same thread without any interruptions.
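To separate the two properties I may be mixing up here, "one thread at a time" and "iterations in order of i", something like the following could measure both inside the critical section. This is a sketch under my own assumptions rather than part of the program above: run_with_critical, when and max_inside are hypothetical names introduced only for illustration.

```c
/* Hypothetical helper (not part of the original program).
 * Runs n iterations with the whole body inside a critical section.
 * when[i]      receives the position (0..n-1) at which iteration i entered.
 * *max_inside  receives the largest number of threads seen inside at once. */
void run_with_critical(int n, int *when, int *max_inside)
{
    int inside = 0;   /* threads currently inside the critical section */
    int peak = 0;     /* highest value of inside observed so far */
    int next = 0;     /* next ticket to hand out */
    int i;

    #pragma omp parallel for shared(when, inside, peak, next) private(i) schedule(static, 1)
    for (i = 0; i < n; ++i)
    {
        /* All accesses to the shared counters happen inside the
         * critical section, so they are serialized. */
        #pragma omp critical
        {
            ++inside;
            if (inside > peak)
                peak = inside;
            when[i] = next++;
            --inside;
        }
    }

    *max_inside = peak;
}
```

If max_inside comes back as 1 while when is not simply 0, 1, 2, ..., that would suggest the critical section does keep threads from overlapping but does not decide which iteration gets in first.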
Any help to clear this up will be highly appreciated.