I have a C program with the following overall structure:
while (err > tol) {
    func_A();
    func_B();
    func_C();
    func_Par();
}
These functions modify some global variables, and that is how they are connected. In func_Par(), three threads are created, all running the same function, Threads_Func(). Based on the thread number, Threads_Func() uses the following code to set the CPU affinity of each thread:
pthread_t curThread = pthread_self();
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
CPU_SET(thread_number, &cpuset);
pthread_setaffinity_np(curThread, sizeof(cpu_set_t), &cpuset);
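For reference, the pinning step above can be written as a self-contained helper that also checks the return value, since pthread_setaffinity_np() returns an error number rather than setting errno. This is only a sketch: pin_current_thread() and allowed_cpu_count() are hypothetical names, not part of the program described here, and the second helper is merely one possible way to inspect which CPUs a given thread (for example the main thread running func_A) is currently allowed to use.

```c
#define _GNU_SOURCE            /* needed for cpu_set_t and the *_np calls */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: pin the calling thread to a single CPU.
 * Returns 0 on success, or the error number reported by pthread. */
int pin_current_thread(int thread_number)
{
    assert(thread_number >= 0 && thread_number < CPU_SETSIZE);

    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(thread_number, &cpuset);

    int rc = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
    if (rc != 0)
        fprintf(stderr, "pthread_setaffinity_np(cpu %d): %s\n",
                thread_number, strerror(rc));
    return rc;
}

/* Hypothetical helper: number of CPUs the calling thread may run on,
 * or -1 if the affinity mask cannot be read. */
int allowed_cpu_count(void)
{
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    if (pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset) != 0)
        return -1;
    return CPU_COUNT(&cpuset);
}
```

Calling allowed_cpu_count() from the main thread before func_A() would show whether the main thread's own mask has changed between iterations; it is a diagnostic, not a fix.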
Here is the strange behaviour that I cannot explain. I am measuring the CPU time for func_A, func_B and func_C, and here are the results (all in microseconds):
With CPU affinity set in Threads_Func():
func_A: 439197
func_B: 61129
func_C: 400482
func_Par: 2488662
Without CPU affinity set in Threads_Func():
func_A: 226677
func_B: 30922
func_C: 242516
func_Par: 4843463
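The measurement code itself is not shown here. As an illustration only, per-function CPU times in microseconds like the ones above could be collected roughly as follows, assuming clock_gettime() with CLOCK_PROCESS_CPUTIME_ID is used. The names cpu_time_us() and time_call_us() are hypothetical. Note that this clock sums CPU time over all threads of the process, so for a multi-threaded function such as func_Par the reported value can differ from wall-clock time.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: CPU time consumed by the whole process so far,
 * in microseconds, or -1 if the clock cannot be read. */
int64_t cpu_time_us(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Hypothetical helper: CPU time (microseconds) spent in one call of fn. */
int64_t time_call_us(void (*fn)(void))
{
    int64_t start = cpu_time_us();
    fn();
    int64_t end = cpu_time_us();
    return end - start;
}
```

With these helpers, time_call_us(func_A) would give one sample for func_A per loop iteration. Swapping in CLOCK_MONOTONIC would measure elapsed wall-clock time instead, which is a useful cross-check when comparing the two configurations.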
As you can see, although the functions are executed in sequential order, setting the CPU affinity makes func_A, func_B and func_C take roughly 1.7 to 2 times as long, even though it roughly halves the time spent in func_Par. I am trying to figure out what I should do to set the CPU affinity (to get the performance improvement in func_Par) while avoiding the performance degradation in the other functions.
FYI, I am compiling the code with gcc using the -O0 flag to make sure that the compiler does not reorder anything. Moreover, I am using a quad-core processor and the OS is Ubuntu Linux.
Any help is appreciated. Thanks in advance for your help.