My multithreaded C program runs the following routine:
#include <pthread.h>
#include <stdio.h>

#define NUM_LOOP 500000000

long long sum = 0;  /* shared by all threads, deliberately unsynchronized */

void *add_offset(void *n) {
    int offset = *(int *)n;
    for (int i = 0; i < NUM_LOOP; i++)
        sum += offset;
    pthread_exit(NULL);
}
Of course, sum should be updated while holding a lock, but before getting to that I have an issue with the running time of this simple program.
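For reference, a minimal sketch of what the lock-protected update could look like. The names add_offset_locked, sum_lock, locked_sum, run_locked_demo and DEMO_LOOPS are made up for illustration and are not part of the program above, and the loop count is reduced so the sketch finishes quickly. This is only one way to do it, assuming a plain pthread mutex around each update.

```c
#include <pthread.h>

/* Hypothetical, much smaller than NUM_LOOP so the demo finishes quickly. */
#define DEMO_LOOPS 1000000

/* Illustrative names, not taken from the original program. */
static long long locked_sum = 0;
static pthread_mutex_t sum_lock = PTHREAD_MUTEX_INITIALIZER;

/* Same loop as add_offset, but every update happens while holding sum_lock. */
static void *add_offset_locked(void *n) {
    int offset = *(int *)n;
    for (int i = 0; i < DEMO_LOOPS; i++) {
        pthread_mutex_lock(&sum_lock);
        locked_sum += offset;
        pthread_mutex_unlock(&sum_lock);
    }
    return NULL;
}

/* Runs two threads concurrently with offsets +1 and -1 and returns the final sum. */
long long run_locked_demo(void) {
    pthread_t tid1, tid2;
    int offset1 = 1, offset2 = -1;

    locked_sum = 0;
    pthread_create(&tid1, NULL, add_offset_locked, &offset1);
    pthread_create(&tid2, NULL, add_offset_locked, &offset2);
    pthread_join(tid1, NULL);
    pthread_join(tid2, NULL);
    return locked_sum;
}
```

Calling run_locked_demo() from main and printing the result with printf("sum = %lld\n", ...) should give a consistent sum, since the two offsets cancel once the updates no longer race. It says nothing about the timing question, which is about the unlocked version.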
When the main function is (Single Thread):
int main(void) {
    pthread_t tid1;
    int offset1 = 1;
    pthread_create(&tid1, NULL, add_offset, &offset1);
    pthread_join(tid1, NULL);
    printf("sum = %lld\n", sum);
    return 0;
}
The output and running time are:
sum = 500000000
real 0m0.686s
user 0m0.680s
sys 0m0.000s
When the main function is (Multi Threaded Sequential):
int main(void) {
    pthread_t tid1;
    int offset1 = 1;
    pthread_create(&tid1, NULL, add_offset, &offset1);
    pthread_join(tid1, NULL);  /* first thread finishes before the second starts */
    pthread_t tid2;
    int offset2 = -1;
    pthread_create(&tid2, NULL, add_offset, &offset2);
    pthread_join(tid2, NULL);
    printf("sum = %lld\n", sum);
    return 0;
}
The output and running time are:
sum = 0
real 0m1.362s
user 0m1.356s
sys 0m0.000s
So far the program runs as expected. But when the main function is (Multi Threaded Concurrent):
int main(void) {
    pthread_t tid1;
    int offset1 = 1;
    pthread_create(&tid1, NULL, add_offset, &offset1);
    pthread_t tid2;
    int offset2 = -1;
    pthread_create(&tid2, NULL, add_offset, &offset2);  /* both threads now run at once */
    pthread_join(tid1, NULL);
    pthread_join(tid2, NULL);
    printf("sum = %lld\n", sum);
    return 0;
}
The output and running time are:
sum = 166845932
real 0m2.087s
user 0m3.876s
sys 0m0.004s
The erroneous value of sum, caused by the lack of synchronization, is not the issue here; the running time is. The concurrent run takes 2.087s of real time, against 1.362s for the sequential run, and its user time (3.876s) is almost three times the sequential 1.356s. This is the opposite of what I would expect from concurrent execution on a multicore CPU.
Please explain what might be the problem here.