I just wrote a small OpenMP test, and it does not always work correctly:

#include <omp.h>
int main() {
  int i,j=0;
#pragma omp parallel
  for(i=0;i<1000;i++)
  {
#pragma omp barrier
    j+= j^i;
  }
  return j;
}

Writing to j from all threads is incorrect in this example, BUT

  • the only consequence should be a nondeterministic value of j

  • instead, the program freezes.

Compiled with gcc-4.3.1 -fopenmp a.c -o gcc -static

Run on a 4-core x86 Core2 Linux server with $ ./gcc, it sometimes freezes (roughly 1 freeze in every 4-5 quick runs).

Strace:

[pid 13118] futex(0x80d3014, FUTEX_WAKE, 1) = 1
[pid 13119] <... futex resumed> )       = 0
[pid 13118] futex(0x80d3020, FUTEX_WAIT, 251, NULL <unfinished ...>
[pid 13119] futex(0x80d3014, FUTEX_WAKE, 1) = 0
[pid 13119] futex(0x80d3020, FUTEX_WAIT, 251, NULL                       
                        <freeze>

Why do I have a freeze (deadlock)?

fduff
osgx

3 Answers

Try making i private so each thread has its own copy.

Now that I have more time, I will try to explain. By default, variables in OpenMP are shared. There are a few cases where a default makes a variable private, but a parallel region is not one of them (so High Performance Mark's response is wrong). Your original program has two race conditions: one on i and one on j. The problem is the one on i. Each thread executes the loop some number of times, but since every thread modifies i, the number of iterations any one thread executes is indeterminate. All threads in the team must reach the barrier for it to be satisfied, so when the threads do not execute it the same number of times, some thread ends up waiting at a barrier that the others never reach, and the program hangs.

Since the OpenMP spec clearly states (OMP spec V3.0, section 2.8.3 barrier Construct) that "the sequence of worksharing regions and barrier regions encountered must be the same for every thread in a team", your program is non-compliant and as such can have indeterminate behavior.
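
As a minimal sketch of the fix (not the poster's exact benchmark), the loop below marks i as private so that every thread runs the full set of iterations and meets its teammates at each barrier. The function name run_barrier_loop and the per-thread counter my_iters are made up for illustration, and the racy update of j is left out.

```c
#include <stdio.h>

/* Hypothetical helper: runs `rounds` barrier iterations in a parallel region.
   Returns the number of threads that did NOT execute exactly `rounds`
   iterations (expected to be 0 when i is private). Assumes rounds >= 0. */
int run_barrier_loop(int rounds)
{
    int i;
    int mismatches = 0;   /* shared; only updated inside the critical section */

#pragma omp parallel private(i)
    {
        int my_iters = 0; /* declared inside the region, so private per thread */

        for (i = 0; i < rounds; i++) {
#pragma omp barrier
            my_iters++;
        }

#pragma omp critical
        {
            if (my_iters != rounds)
                mismatches++;
        }
    }
    return mismatches;
}
```

Called as run_barrier_loop(1000) from main and built with gcc -fopenmp, this should return 0, because each thread now reaches the barrier exactly 1000 times.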

ejd
  • -1: i is private, the OpenMP standard mandates this just so that each thread controls its own share of the iterations. – High Performance Mark Dec 22 '10 at 23:08
  • From the OpenMP v3.0 spec, section 2.9.1.1 Data-sharing Attribute Rules for Variables Referenced in a Construct: "The loop iteration variable(s) in the associated for-loop(s) of a for or parallel for construct is(are) private." This program does NOT use a worksharing for construct, so the loop index is shared. And if you still don't believe it, print out the address of i. – ejd Dec 23 '10 at 20:04
  • @High Performance Mark, YES, the iteration counter is private, but only for `#pragma omp for`. I have no `omp for` pragma, so i is shared – osgx Jan 08 '11 at 23:13
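
The address check suggested in the comments can be sketched like this. The helper name threads_with_own_i is hypothetical; it prints the address of i as seen by each thread in a plain parallel region (no for construct) and counts how many threads see an address different from the one outside the region.

```c
#include <stdio.h>

/* Hypothetical helper: returns how many threads see a different &i than the
   code outside the parallel region. With i shared this is expected to be 0. */
int threads_with_own_i(void)
{
    int i = 0;
    void *outer = (void *)&i;  /* address of i before entering the region */
    int differing = 0;

#pragma omp parallel
    {
        void *seen = (void *)&i;  /* address of i as this thread sees it */

#pragma omp critical
        {
            printf("&i = %p\n", seen);
            if (seen != outer)
                differing++;
        }
    }
    return differing;
}
```

If i is shared, every thread prints the same address and the function returns 0. Adding private(i) to the parallel pragma should make the printed addresses differ.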

You're trying to add to the same location from multiple threads. You can't do what you're trying to do in parallel. If you want to do a sum in parallel, you need to divide it into smaller pieces and collect them afterwards.
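
One way to read "divide it into smaller pieces and collect them afterwards" is an OpenMP reduction. The sketch below sums the integers 0..n-1. Note that it is a plain sum, not the j += j^i recurrence from the question, whose steps depend on each other and cannot be split this way. The name parallel_sum is an assumption for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: sums 0 + 1 + ... + (n - 1) using a reduction.
   Each thread accumulates into its own private copy of `total`, and
   OpenMP combines the copies when the loop ends. */
long parallel_sum(int n)
{
    long total = 0;
    int i;  /* loop variable of a `parallel for` construct, so it is private */

#pragma omp parallel for reduction(+:total)
    for (i = 0; i < n; i++)
        total += i;

    return total;
}
```

For example, parallel_sum(1000) gives 499500 regardless of the number of threads.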

Update by a5b: the idea is right, but it points at the wrong part of the code. The real problem is that the i variable is modified by more than one thread.

osgx
arsenm
  • ARSEN M, please read the whole question. The SUM is incorrect, but the program must still terminate! And it doesn't. Modifying a variable from several threads in parallel does not cause a deadlock – osgx Dec 20 '10 at 23:59
  • What you have here is fundamentally incorrect. It looks like you're trying to lock around it, which is wrong also, but you're also doing that wrong. You cannot expect this to not deadlock. You're having all the threads wait for nothing to happen before the sum. – arsenm Dec 21 '10 at 00:30
  • Okay, if I delete the `j+= j^i;` line, the behaviour is the same – osgx Dec 21 '10 at 01:12
  • I'm measuring the omp barrier speed. All threads wait on the barrier at every NEXT iteration. – osgx Dec 21 '10 at 01:13

@ejd, If I mark i as private, will my program be compliant?

Sorry - I just saw this question. Technically if you mark variable "i" as private your program will be OpenMP compliant. HOWEVER, there is still a race condition on "j" and while your program is compliant (because there are valid cases to have race conditions), the value of "j" is unspecified (according to the OpenMP spec).

In one of your previous answers you said that you were trying to measure the speed of the barrier implementation. There are several "benchmarks" that you might want to look at that have published results for a variety of OpenMP constructs. One was written by Mark Bull (EPCC, University of Edinburgh), another (Sphinx) comes from Lawrence Livermore National Labs (LLNL), and the third (Parkbench) comes from a Japanese Computing Partnership. They may offer you some guidance.
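
For a quick homemade measurement before turning to those suites, a rough sketch might look like the following. The names now_seconds and seconds_per_barrier are hypothetical, and the timing is crude: it also includes the cost of starting and joining the thread team, which the published benchmarks take care to separate out.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock time in seconds. Falls back to clock() when the file is
   compiled without OpenMP support. */
static double now_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Hypothetical helper: average time per barrier over `rounds` iterations.
   The loop counter is private, so every thread hits every barrier. */
double seconds_per_barrier(int rounds)
{
    double start, elapsed;
    int i;

    start = now_seconds();
#pragma omp parallel private(i)
    {
        for (i = 0; i < rounds; i++) {
#pragma omp barrier
        }
    }
    elapsed = now_seconds() - start;

    return rounds > 0 ? elapsed / rounds : 0.0;
}
```

Using a large value for rounds makes the one-time cost of creating the team small relative to the barriers themselves.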

ejd
  • Why does your user ID keep changing? You use several Stack Overflow logins named "ejd", each with a different number in the link (http://stackoverflow.com/users/578711/ejd & http://stackoverflow.com/users/551576/ejd) – osgx Jan 19 '11 at 16:42
  • The first time I didn't register. The second time I registered, but for some reason the web site didn't combine all of the ids - just some of them. Unfortunately I have no idea how to fix it. – ejd Jan 19 '11 at 20:44