
How do I implement the Sum_EREW algorithm in C++ using OpenMP, or in which application can I implement it?

for i = 1 to log(n) do
    forall j where 1 <= j <= n/2 do in parallel
        if (2j modulo 2**i) = 0 then
            A[2j] <- A[2j] + A[2j - 2**(i-1)]
        endif
    endforall
endfor
user3666197
Hole Whel
1. Prove that for all `i` the iterations of the inner loop are independent. 2. Use a simple `omp parallel for` for the inner loop. – Victor Eijkhout Mar 21 '22 at 21:26

2 Answers


If n is constant, then so are all values of i and j, and therefore all values of 2j modulo 2**i.

Then you can:

  • generate the list of index values

  • detect conflicts where the element one iteration reads (the third component, A[2j - 2**(i-1)]) is the same element another iteration writes (the first component, A[2j])

  • add markers at those index points to split the work into independent segments

  • use OpenMP to compute the independent segments, and use the normal serial path at the marker points.

If n is constant throughout the lifetime of the app, this setup runs only once at startup, and every later run of the algorithm reuses the already-generated segments.
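The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Update`, `Level`, `is_independent`, `build_schedule` and `run_schedule` are hypothetical names, A is 1-based with n + 1 elements, and a level that fails the check is simply run serially instead of being split into finer segments.

```cpp
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical record of one update: A[dst] += A[src].
struct Update { std::size_t dst, src; };

// All updates of one outer iteration i, plus the result of the conflict check.
struct Level {
    std::vector<Update> updates;
    bool independent = false;
};

// True if no update reads an element that another update of the same level writes.
bool is_independent(const std::vector<Update>& updates) {
    std::unordered_set<std::size_t> written;
    for (const Update& u : updates) written.insert(u.dst);
    for (const Update& u : updates)
        if (written.count(u.src) != 0) return false;
    return true;
}

// Built once for a fixed n: step plays the role of 2**i in the pseudocode.
std::vector<Level> build_schedule(std::size_t n) {
    std::vector<Level> schedule;
    for (std::size_t step = 2; step <= n; step *= 2) {
        Level level;
        for (std::size_t j = step; j <= n; j += step)
            level.updates.push_back({j, j - step / 2});
        level.independent = is_independent(level.updates);
        schedule.push_back(std::move(level));
    }
    return schedule;
}

// Reused on every run: parallel for independent levels, serial otherwise.
void run_schedule(const std::vector<Level>& schedule, std::vector<long>& A) {
    for (const Level& level : schedule) {
        const long count = static_cast<long>(level.updates.size());
        #pragma omp parallel for if(level.independent)
        for (long k = 0; k < count; ++k)
            A[level.updates[k].dst] += A[level.updates[k].src];
    }
}
```

For this particular algorithm every level passes the check, because writes go to multiples of 2**i while reads come from odd multiples of 2**(i-1), so the precomputed schedule mainly saves recomputing the indices.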

huseyin tugrul buyukisik

I think your algorithm can be rewritten into a more efficient one by removing the branch inside the innermost loop:

for i = 1 to log (n) do
    for j = 2**i to n step 2**i in parallel do
       A[j] <- A[j] + A[j - 2**(i-1)]
    endfor
endfor

which could look something like this in C/C++ using OpenMP:

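const size_t MAX_I_TO_PARALLELIZE = 4;   // find the optimal number in your case

// A is indexed 1..n as in the pseudocode, so it must hold n + 1 elements.
// The loop covers i = 1 .. floor(log2(n)); log(n) would be the natural
// logarithm, so the power of two is tested directly instead.
for (size_t i = 1; (size_t(1) << i) <= n; i++)
{
   const size_t pow2i   = size_t(1) << i;
   const size_t pow2im1 = size_t(1) << (i - 1);

   // Later levels have few iterations, so only the first ones run in parallel.
   #pragma omp parallel for if(i < MAX_I_TO_PARALLELIZE)
   for (size_t j = pow2i; j <= n; j += pow2i)
      A[j] += A[j - pow2im1];
}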
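To try the loop in isolation, it could be wrapped in a small function such as the sketch below. The name `sum_erew` and the copy into a 1-based buffer are assumptions made for this illustration, and the total only ends up in A[n] when n is a power of two. A signed loop counter is used so that the code also compiles with OpenMP implementations that reject unsigned loop variables.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wrapper: returns the sum of all values.
// Assumes values.size() is a power of two (or zero).
long sum_erew(const std::vector<long>& values) {
    const std::size_t n = values.size();
    if (n == 0) return 0;

    // Copy into a 1-based array A[1..n] to match the pseudocode indexing.
    std::vector<long> A(n + 1, 0);
    for (std::size_t k = 0; k < n; ++k) A[k + 1] = values[k];

    // step plays the role of 2**i, half the role of 2**(i-1).
    for (std::size_t step = 2; step <= n; step *= 2) {
        const std::size_t half = step / 2;
        const long last = static_cast<long>(n / step);

        // Iterations touch disjoint elements: each writes A[m*step]
        // and reads A[m*step - half], which no iteration writes.
        #pragma omp parallel for
        for (long m = 1; m <= last; ++m) {
            const std::size_t j = static_cast<std::size_t>(m) * step;
            A[j] += A[j - half];
        }
    }
    return A[n];
}
```

Calling `sum_erew({1, 2, 3, 4, 5, 6, 7, 8})` should give 36. Compile with your compiler's OpenMP flag (for example `-fopenmp` with GCC or Clang); without it the pragma is ignored and the function runs serially with the same result.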
Laci