During a lecture my professor gave us the following loop:

for (int i = 0; i < 100; i++) {
    a[i] = a[i] + b[i];
    b[i + 1] = c[i] + d[i];
}

He pointed out a dependency between iterations of the loop: line 3 writes b[i+1], which line 2 reads as b[i] in the next iteration. Therefore we can't run the iterations of the loop in parallel.

He then gave us this unrolled version:

a[1] = a[1] + b[1];
for (int i = 0; i < 98; i++) {
    b[i+1] = c[i] + d[i];
    a[i+1] = a[i] + b[i];
}
b[99] = c[99] + d[99];

He claims that each iteration of the loop can now be run in parallel. The problem I see is that line 3 still writes what becomes b[i] on line 4 of the next iteration, so we still can't run the iterations in parallel.

Am I right in saying that? If so, is there a properly unrolled version of the first loop where each iteration can be parallelized?

Joe P

1 Answer

I guess you made a mistake writing down the unrolled version your professor gave. To be equivalent to the first algorithm, it should read like this:

a[0] = a[0] + b[0];
for (int i=0 ; i<99 ; ++i) {
    b[i+1] = c[i] + d[i];
    a[i+1] = a[i+1] + b[i+1];
}
b[100] = c[99] + d[99];

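In this version, you can see that the dependency problem is gone: each iteration reads only the b[i+1] it has just computed, so no iteration uses a value written by another iteration.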
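If you want to convince yourself that shifting the loop this way does not change the result, here is a minimal sketch that runs both versions on the same data and compares the arrays. The function names, the `N` macro, and the test values are my own choices for illustration. I am assuming `int` arrays and that `b` has 101 elements, since both versions write `b[100]`. The final statement is written as `b[100] = c[99] + d[99]`, which is what the last iteration of the original loop computes.

```c
#include <string.h>

#define N 100

/* Original loop: the write to b[i + 1] is read as b[i] by the next iteration. */
void original_loop(int a[N], int b[N + 1], const int c[N], const int d[N]) {
    for (int i = 0; i < N; i++) {
        a[i] = a[i] + b[i];
        b[i + 1] = c[i] + d[i];
    }
}

/* Shifted loop: each iteration only reads the b[i + 1] it has just written. */
void shifted_loop(int a[N], int b[N + 1], const int c[N], const int d[N]) {
    a[0] = a[0] + b[0];              /* first statement of original iteration 0 */
    for (int i = 0; i < N - 1; i++) {
        b[i + 1] = c[i] + d[i];
        a[i + 1] = a[i + 1] + b[i + 1];
    }
    b[N] = c[N - 1] + d[N - 1];      /* second statement of original iteration N - 1 */
}

/* Returns 1 if both versions leave a and b in the same state, 0 otherwise. */
int loops_agree(void) {
    int a1[N], a2[N], b1[N + 1], b2[N + 1], c[N], d[N];

    /* Arbitrary test data, identical for both runs. */
    for (int i = 0; i < N; i++) {
        a1[i] = a2[i] = 3 * i + 1;
        c[i] = 7 * i - 5;
        d[i] = i * i;
    }
    for (int i = 0; i <= N; i++) {
        b1[i] = b2[i] = 2 * i - 11;
    }

    original_loop(a1, b1, c, d);
    shifted_loop(a2, b2, c, d);

    return memcmp(a1, a2, sizeof a1) == 0 && memcmp(b1, b2, sizeof b1) == 0;
}
```

Calling `loops_agree()` from your own `main` should return 1. This only checks equivalence on one data set and does not itself run anything in parallel; it is meant as a sanity check on the index shifting, not as a proof.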

François Févotte