Parallel OpenMP loop with continue as a break alternative

Question

I'm referring to this question: Parallel OpenMP loop with break statement

The code suggested here:

volatile bool flag=false;

#pragma omp parallel for shared(flag)
for(int i=0; i<=100000; ++i)
{    
    if(flag) continue;
    if(element[i] ...)
    {
          ...
          flag=true;
    }
}

What are the advantages of using continue? Is it faster than doing the following:

volatile bool flag=false;

#pragma omp parallel for shared(flag)
for(int i=0; i<=100000; ++i)
{    
    if(!flag)
    {
        if(element[i] ...)
        {
              ...
              flag=true;
        }
    }
}

I could imagine that the first implementation could be favored on stylistic grounds as people tend to find "flatter" code, i.e. fewer levels of indentation, easier to read. — paleonix, Mar 10 '22 at 19:31
Alternatively use a `taskgroup` and have `cancel taskgroup`. — Victor Eijkhout, Mar 10 '22 at 20:28
@VictorEijkhout AFAICT cancellation depends on the `OMP_CANCELLATION=true` environment variable; you can't enable it within the program. Besides, you can use `#pragma omp cancel for` and not define any `taskgroup`. — Yann Vernier, Mar 11 '22 at 17:30
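For readers curious what the `#pragma omp cancel for` route mentioned in the comments might look like, here is a minimal sketch. The function name `contains_with_cancel` and the `target` parameter are hypothetical, introduced only for illustration. The flag is an `int` updated with `#pragma omp atomic write`, which is an assumption made to keep the example free of data races. When cancellation is not enabled (for example, `OMP_CANCELLATION` is unset), the `cancel` directive has no effect and the loop simply visits every iteration, so the result should be the same either way.

```cpp
#include <vector>

// Hypothetical helper: returns true if `target` occurs anywhere in `element`.
bool contains_with_cancel(const std::vector<int>& element, int target)
{
    int found = 0;  // shared flag, only ever changed from 0 to 1
    const int n = static_cast<int>(element.size());

    #pragma omp parallel for shared(found)
    for (int i = 0; i < n; ++i)
    {
        // Threads check here whether cancellation has been requested.
        #pragma omp cancellation point for

        if (element[i] == target)
        {
            #pragma omp atomic write
            found = 1;

            // Requests cancellation of the loop; ignored unless
            // cancellation is enabled in the runtime.
            #pragma omp cancel for
        }
    }
    // The implicit barrier at the end of the parallel region makes the
    // write to `found` visible here.
    return found != 0;
}
```

Calling `contains_with_cancel(data, 2000)` on a `std::vector<int>` named `data` would report whether 2000 is present. Whether the loop actually stops early depends on the runtime setting the comments describe.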

score 2 · Accepted Answer · answered Mar 10 '22 at 19:19

After compilation, the two versions are identical, at least for the trivial case.

Without continue

With Continue

If you compare the resulting assembly, there is no difference between the two. I have taken the liberty of adding a junk condition that halts before 2000.
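To make the comparison easy to reproduce, the two forms can be written as complete functions and compiled with and without `-fopenmp`. This is only a sketch, not the code behind the original comparison: the function names and the `target` parameter are hypothetical, and the flag is an `int` accessed through `#pragma omp atomic read` / `write` (an assumption, chosen so the example avoids the data race discussed in the other answer).

```cpp
#include <vector>

// Variant 1: skip the rest of the body with `continue` once the flag is set.
bool found_with_continue(const std::vector<int>& element, int target)
{
    int flag = 0;  // shared flag, only ever changed from 0 to 1
    const int n = static_cast<int>(element.size());

    #pragma omp parallel for shared(flag)
    for (int i = 0; i < n; ++i)
    {
        int seen;
        #pragma omp atomic read
        seen = flag;
        if (seen) continue;

        if (element[i] == target)
        {
            #pragma omp atomic write
            flag = 1;
        }
    }
    return flag != 0;
}

// Variant 2: the same logic with the work nested inside `if (!seen)`.
bool found_with_nested_if(const std::vector<int>& element, int target)
{
    int flag = 0;
    const int n = static_cast<int>(element.size());

    #pragma omp parallel for shared(flag)
    for (int i = 0; i < n; ++i)
    {
        int seen;
        #pragma omp atomic read
        seen = flag;
        if (!seen)
        {
            if (element[i] == target)
            {
                #pragma omp atomic write
                flag = 1;
            }
        }
    }
    return flag != 0;
}
```

Both functions should return the same value for any input. Comparing their generated assembly (for example with `-S`) is one way to check the claim for a particular compiler and optimization level.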

Niteya Shah

score 1 · Answer 2 · answered Mar 11 '22 at 17:47

As pointed out by @Niteya, it does not really matter which one you use; in practice they are the same. I would like to point out, however, that you have a race condition in your code. According to the OpenMP memory model:

if at least one thread reads from a memory unit and at least one thread writes without synchronization to that same memory unit,(...), then a data race occurs. If a data race occurs then the result of the program is unspecified.

To correct it, you have to use atomic read/write operations on `flag`. Your code should look something like this:

#pragma omp parallel for shared(flag)
for(int i=0; i<=100000; ++i)
{        
    bool tmp_flag;
    #pragma omp atomic read acquire
    tmp_flag=flag;
    if(!tmp_flag)
    {
        if(element[i]  == 2000)
        {
            #pragma omp atomic write release
            flag=true;
        }
    }
}
