I am trying to learn OpenMP and want to study speed-up using OpenMP. For this purpose, I have written the following small program:
#include <vector>
#include <cmath>

int main() {
    static const unsigned int testDataSize = 1 << 28;
    std::vector<double> a (testDataSize), b (testDataSize);
    for (int i = 0; i < testDataSize; ++i) {
        a [i] = static_cast<double> (23 ^ i) / 1000.0;
    }
    b.resize(testDataSize);
    #pragma omp parallel for
    for (int i = 0; i < testDataSize; ++i) {
        b [i] = std::pow(a[i], 3) * std::exp(-a[i] * a[i]);
        b [i] += std::pow(a[i], 5) * std::exp(-a[i] * a[i]);
        b [i] += std::pow(a[i], 7) * std::exp(-a[i] * a[i]);
        b [i] += std::pow(a[i], 9) * std::exp(-a[i] * a[i]);
        b [i] += std::pow(a[i], 11) * std::exp(-a[i] * a[i]);
        b [i] += std::pow(a[i], 13) * std::exp(-a[i] * a[i]);
        b [i] += std::pow(a[i], 15) * std::exp(-a[i] * a[i]);
        b [i] += std::pow(a[i], 17) * std::exp(-a[i] * a[i]);
        b [i] += std::pow(a[i], 19) * std::exp(-a[i] * a[i]);
        b [i] += std::pow(a[i], 21) * std::exp(-a[i] * a[i]);
    }
    return 0;
}
I compiled the above code both with and without the -std=c++11 flag. What I notice is that with -std=c++11 my code runs roughly 6 times slower than without it. I am using -O3 and gcc version 4.9.2 on a Debian Linux system. Furthermore, when I compare the execution times without OpenMP, I see a similar slowdown. Thus, it looks to me like the problem lies with -std=c++11 and not with OpenMP.
In detail, I obtain the following execution times (as measured using the Linux time command):
Compiled with OpenMP and -std=c++11: 35.262s
Compiled with OpenMP only: 5.875s
Compiled with -std=c++11 only: 2m12s
Compiled with neither OpenMP nor -std=c++11: 23.757s
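To rule out a mix-up between the four binaries, a helper along the following lines could report how each one was actually built. This is only a sketch: build_config is a name made up for illustration and is not part of the test program above. It relies on the standard __cplusplus macro and on _OPENMP, which the compiler defines only when OpenMP support is switched on (with gcc, via -fopenmp).

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of the original program): returns a short
// description of how this translation unit was compiled.
std::string build_config() {
    std::ostringstream out;
    // __cplusplus identifies the language standard selected at compile time.
    out << "__cplusplus=" << __cplusplus;
#ifdef _OPENMP
    // _OPENMP is only defined when the compiler's OpenMP support is enabled.
    out << " _OPENMP=" << _OPENMP;
#else
    out << " _OPENMP=undefined";
#endif
    return out.str();
}
```

Printing build_config() at the start of main should, as far as I know, show 199711 for __cplusplus in gcc 4.9.2's default mode and 201103 with -std=c++11.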
Why is execution so much slower when using -std=c++11?
Any help or suggestion is greatly appreciated!
I have accepted what, in my humble opinion, is the best answer. Following up on oLen's answer, I have written my own pow(double, int) function, given below:
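// Computes base^exp by repeated squaring; assumes exp >= 0.
double my_pow(double base, int exp) {
    double result = 1.0;
    while (exp) {
        if (exp & 1)        // lowest bit set: multiply the current power in
            result *= base;
        exp >>= 1;          // move on to the next bit of the exponent
        base *= base;       // square the base for the next bit
    }
    return result;
}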
I am not sure whether this is the most efficient way to calculate an integer power of a base number. Using this function, however, I get exactly the same execution times whether I compile with or without -std=c++11, which is fully in line with oLen's answer.
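On the efficiency question, one possible alternative for this particular loop body is sketched below. It is not taken from any of the answers, and the names body_with_my_pow and body_incremental are made up for illustration. Since the exponents run 3, 5, ..., 21, each power can be obtained from the previous one by multiplying with x*x, and exp(-x*x) needs to be evaluated only once per element. The copy of my_pow is included only to keep the sketch self-contained; the assert makes explicit my assumption that the exponent is never negative. Because the operations are reordered, the two variants may differ in the last few bits.

```cpp
#include <cassert>
#include <cmath>

// Copy of my_pow from above; the assert states the assumption exp >= 0.
double my_pow(double base, int exp) {
    assert(exp >= 0);
    double result = 1.0;
    while (exp) {
        if (exp & 1)
            result *= base;
        exp >>= 1;
        base *= base;
    }
    return result;
}

// Reference: the same ten terms as the loop body, computed with my_pow.
double body_with_my_pow(double x) {
    const double e = std::exp(-x * x);
    double sum = 0.0;
    for (int p = 3; p <= 21; p += 2)
        sum += my_pow(x, p) * e;
    return sum;
}

// Hypothetical alternative: step from x^3 up to x^21 by multiplying with
// x^2, and apply the common factor exp(-x^2) once at the end.
double body_incremental(double x) {
    const double x2 = x * x;
    double power = x2 * x;            // x^3
    double sum = 0.0;
    for (int k = 0; k < 10; ++k) {    // ten terms: x^3, x^5, ..., x^21
        sum += power;
        power *= x2;
    }
    return sum * std::exp(-x2);
}
```

In the test program this would amount to replacing the ten assignments with b[i] = body_incremental(a[i]);. I have not timed this variant, so whether it is actually faster is an open question.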