
I am trying to parallelize a large program written by a third party. I cannot disclose the code, but I will give the closest example of what I wish to do, based on the code below. As you can see, since the "parallel" clause is INSIDE the while loop, the threads are created and destroyed on every iteration, which is costly. Note that I cannot move the initializers etc. outside the "while" loop.

--Base code

void funcPiece0()
{
    // many lines and branches of code
}


void funcPiece1()
{
    // also many lines and branches of code
}

void funcCore()
{
    funcInitThis();
    funcInitThat();

#pragma omp parallel
    {
#pragma omp sections
        {
#pragma omp section
            {
                funcPiece0();
            }//omp section
#pragma omp section
            {
                funcPiece1();
            }//omp section
        }//omp sections
    }//omp parallel

}

int main()
{

    funcInitThis();
    funcInitThat();
    while(1)
    {
        funcCore();
    }

}

What I seek to do is to avoid the per-iteration creation/destruction and perform it only once at the start/end of the program. I tried many variations of placing the "parallel" clause; the closest in spirit is the version below (ONLY ONE thread creation/destruction per program run), but it failed with an "illegal access" in the initializing functions:

void funcPiece0()
{
    // many lines and branches of code
}


void funcPiece1()
{
    // also many lines and branches of code
}

void funcCore()
{
    funcInitThis();
    funcInitThat();

//#pragma omp parallel
//  {
#pragma omp sections
        {
#pragma omp section
            {
                funcPiece0();
            }//omp section
#pragma omp section
            {
                funcPiece1();
            }//omp section
        }//omp sections
//  }//omp parallel

}

int main()
{

    funcInitThis();
    funcInitThat();

    while(1)
    {
        funcCore();
    }

}

--

Any help would be highly appreciated! Thanks!

user598208
  • Note that the OpenMP website has a forum where you can post questions about OpenMP -- check it out at http://openmp.org/forum/ –  Nov 21 '11 at 04:26

2 Answers


OpenMP creates its worker threads only once, at startup; the parallel pragma does not spawn new threads each time it is entered. How did you determine that threads are being spawned?

crazyjul
  • Thank you for answering. My program hung, and when I paused it in the debugger I saw many worker threads executing the function containing the "pragma omp parallel". Where are the threads "supposed" to be spawned? – user598208 Nov 16 '11 at 01:00
  • 1
    The thread are started when your program starts ( or the first time are needed, depending on the implementation ). Pause your program anywhere else, and you'll notice the threads are still there – crazyjul Nov 16 '11 at 08:25
  • While it seems this is true where I tested (both `std::this_thread::get_id()` and `omp_get_thread_num()` show repeating values for repeated execution of the parallel region), this contradicts the official documentation, wherein it says that the `parallel` construct "creates threads". https://www.openmp.org/spec-html/5.0/openmpse14.html – oarfish Oct 14 '20 at 16:03

This can be done! The key is to move the loop inside one single parallel region, and to make sure that every thread makes exactly the same decision about whether to repeat. I've used shared variables and synchronized just before the loop condition is checked.

So this code:

initialize();
while (some_condition) {
  #pragma omp parallel
  {
     some_parallel_work();
  }
}

can be transformed into something like this:

#pragma omp parallel
{
  #pragma omp single
  {
    initialize();  //if initialization cannot be parallelized
  }
  while (some_condition_using_shared_variable) {
    some_parallel_work();
    update_some_condition_using_shared_variable();
    #pragma omp flush
  }
}

The most important thing is to be sure that every thread makes the same decision at the same points in your code.

As a final thought: essentially you are trading the overhead of creating/destroying threads (every time a #pragma omp parallel region begins/ends) for synchronization overhead in the threads' decision making. Synchronization should usually be faster, but there are so many parameters at play that this may not always hold.

Jason