I have serial code that looks something like this:

sum = a;
sum += b;
sum += c;
sum += d;

I would like to parallelize it into something like this:

temp1 = a + b     and at the same time     temp2 = c + d
sum = temp1 + temp2

How do I do this using the Intel Parallel Studio tools?

Thanks!!!

skaffman
N.M

1 Answer

Assuming that all variables are of integral or floating-point types, it makes no sense to parallelize this code in the sense of running it on different threads or cores: the overhead would be far higher than any benefit. The parallelism that applies to this example is at the level of multiple computation units and/or vectorization on a single CPU. Optimizing compilers are sophisticated enough nowadays to exploit this automatically, without code changes; if you wish, however, you can use temporary variables explicitly, as in the second part of the question.
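To make the two forms concrete, here is a minimal sketch of the serial chain and the version with explicit temporaries. The function names `sum_serial` and `sum_paired` and the choice of `double` are assumptions for illustration; the question does not state the types.

```cpp
// Hypothetical helpers; the question does not name functions or types.

// Serial form: each addition depends on the result of the previous one.
double sum_serial(double a, double b, double c, double d) {
    double sum = a;
    sum += b;
    sum += c;
    sum += d;
    return sum;
}

// Paired form: temp1 and temp2 do not depend on each other, so a CPU
// with more than one arithmetic unit is free to compute them together.
double sum_paired(double a, double b, double c, double d) {
    double temp1 = a + b;
    double temp2 = c + d;
    return temp1 + temp2;
}
```

One caveat: for floating-point operands the two forms can differ in the last bits because of rounding, so a compiler will generally not reorder the serial chain on its own unless it is allowed to relax floating-point semantics. For integers the forms are interchangeable.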

If you are asking just out of curiosity: Intel Parallel Studio provides several ways to parallelize code. For example, you can use the Cilk keywords together with C++11 lambda functions:

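#include <cilk/cilk.h>
...
temp = cilk_spawn [=]{ return a+b; }();  // a+b may run in parallel with the code below
sum = c+d;                               // computed by the current strand in the meantime
cilk_sync;                               // wait until the spawned a+b has finished
sum += temp;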

Don't expect any performance gain from this (see above) unless you use classes with a computationally heavy overloaded operator+.
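As a rough illustration of that last case, the sketch below uses a hypothetical `BigVector` type whose `operator+` does element-wise work. It swaps the Cilk keywords for standard C++11 `std::async`, purely so that it builds without the Cilk runtime; this is not the Cilk API, and the type and the function `sum_four` are assumptions for illustration.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical type: its operator+ is costly enough that running two
// additions on different threads might pay off for large inputs.
struct BigVector {
    std::vector<double> data;
};

// Element-wise addition. Assumes both operands have the same length.
BigVector operator+(const BigVector& x, const BigVector& y) {
    BigVector r;
    r.data.resize(x.data.size());
    for (std::size_t i = 0; i < x.data.size(); ++i) {
        r.data[i] = x.data[i] + y.data[i];
    }
    return r;
}

BigVector sum_four(const BigVector& a, const BigVector& b,
                   const BigVector& c, const BigVector& d) {
    // Start a + b on another thread (the role cilk_spawn plays above).
    std::future<BigVector> temp =
        std::async(std::launch::async, [&a, &b] { return a + b; });
    // Meanwhile, compute c + d on the calling thread.
    BigVector sum = c + d;
    // get() blocks until a + b is ready (the role of cilk_sync).
    return sum + temp.get();
}
```

Whether this beats the serial version depends on how expensive `operator+` is relative to the cost of starting a thread; for small operands the serial code will most likely win.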

Alexey Kukanov
  • Thanks! When you say 'Optimizing compilers are sophisticated enough nowadays to exploit this automatically', what exactly does that mean? Do compilers automatically generate something similar to SIMD instructions? – N.M Jul 29 '11 at 15:21
  • I meant that compilers can generate code that uses multiple arithmetic units and/or vector (SIMD) units if they are available on the target processor. To put it differently, a good optimizing compiler can recognize that the source code can be split into independent portions that can execute in parallel, and so it may generate code that exploits the parallelism of the target processor. – Alexey Kukanov Aug 12 '11 at 11:59