I am following Tim Mattson's lectures on OpenMP to learn how some parallel programming concepts are implemented. I was trying to observe the running-time behavior of a parallel program that computes the value of PI using 3x10^8 steps.
Here is the code,
#include <omp.h>
#include <stdio.h>

static long num_steps = 300000000;
double step;
#define PAD 8 // tried 50 too; padding to keep each thread's sum on its own cache line
#define NUM_THREADS 4

int main()
{
    int i, nthreads;
    double pi, sum[NUM_THREADS][PAD];
    double ts, te;
    ts = omp_get_wtime();
    step = 1.0/(double) num_steps;
    omp_set_num_threads(NUM_THREADS);
    #pragma omp parallel
    {
        int i, id, nthrds;
        double x;
        id = omp_get_thread_num();
        nthrds = omp_get_num_threads();
        if (id == 0) nthreads = nthrds;
        // cyclic distribution: thread id takes steps id, id+nthrds, id+2*nthrds, ...
        for (i = id, sum[id][0] = 0.0; i < num_steps; i = i + nthrds) {
            x = (i + 0.5) * step;
            sum[id][0] += 4.0/(1.0 + x*x);
        }
    }
    for (i = 0, pi = 0.0; i < nthreads; i++)
        pi += sum[i][0] * step;
    te = omp_get_wtime();
    printf("%.10f\n", pi);
    printf("%f\n", te - ts);
    return 0;
}
I was on Ubuntu 14.04 LTS running on a dual-core machine; a call to omp_get_num_procs()
returned 2. The running time was essentially random, ranging from 1.31 seconds to 4.46 seconds, whereas the serial program took 2.31 seconds almost every time.
I tried creating 1, 2, 3, 4, and up to 10 threads. The running time varies a lot in every case, though the average is smaller with more threads. I wasn't running any other applications.
Can anyone explain why the running time varies so much?
How can I measure the run time accurately? The lecturer gives running times for his computer that look consistent, and he was also using a dual-core processor.