I'm currently comparing a serial versus a parallel implementation of some code on a quad-core processor. One of the things I'd like to understand/measure is how the serial code performs when it is running on a single core.

When I compile the serial code, I use gcc's -O3 option, and at first I noticed the serial code wasn't doing too badly. However, when I run another compute-intensive process on one of the other cores, the serial version's performance drops.

Here are some numbers:

Total Time elapsed: 1s, 233ms <- only serial code is running
Total Time elapsed: 1s, 238ms <- only serial code is running
Total Time elapsed: 2s, 128ms <- serial code running while other code runs on another core
Total Time elapsed: 2s, 220ms <- serial code running while other code runs on another core

I am guessing there may be background processes running on one of the four cores. But as best I can gather, running two processes on a quad-core processor shouldn't saturate all four cores.

What I'm wondering is whether there is reason to believe that some step in the O3 process allows the code to take advantage of the quad-core setup, or, perhaps more precisely, why the supposed "serial version" performs better when other cores are available. I was trying to understand the GCC documentation, and I gathered there were some references to threading. But I don't really get it and was wondering if someone could help me understand precisely what O3 might or might not do to take advantage of more than one core.

For what it is worth, I am using an Intel(R) Core(TM) i7-3820 CPU @ 3.60GHz and am running Linux Mint 13.

Thanks

dsolimano
user1790399

1 Answer

-O3 does not do anything to take advantage of more than one core.

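You are seeing the effects of the shared resources on your processor: memory bandwidth and cache. The cores share the last-level cache and the memory controller, so a second compute-intensive process competes with the serial code for both, even though it runs on a different core.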
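If you want to rule out the scheduler moving the serial program between cores, one option is to pin it to a single core before timing it. The following is only a minimal sketch, assuming Linux with glibc; the helper names `pin_to_first_allowed_cpu` and `elapsed_ms` are made up for illustration and are not part of your code or of GCC.

```c
#define _GNU_SOURCE   /* exposes sched_setaffinity and the CPU_* macros in glibc */
#include <sched.h>
#include <time.h>

/* Hypothetical helper: restrict the calling process to the first CPU
   it is currently allowed to run on.
   Returns that CPU's index on success, or -1 on failure. */
int pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed;
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) {
            cpu_set_t one;
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            /* pid 0 means "the calling thread" */
            if (sched_setaffinity(0, sizeof(one), &one) != 0)
                return -1;
            return cpu;
        }
    }
    return -1;
}

/* Hypothetical helper: milliseconds between two timestamps
   taken with clock_gettime(CLOCK_MONOTONIC, ...). */
double elapsed_ms(struct timespec start, struct timespec end)
{
    return (end.tv_sec - start.tv_sec) * 1000.0
         + (end.tv_nsec - start.tv_nsec) / 1.0e6;
}
```

Call `pin_to_first_allowed_cpu()` once at the start of `main`, take a `clock_gettime(CLOCK_MONOTONIC, ...)` timestamp before and after the serial work, and pass both to `elapsed_ms`. If the pinned run still slows down while another process is busy on a different core, that is consistent with contention for the shared cache and memory bandwidth rather than with -O3 using extra cores. Running the unmodified binary under `taskset -c 0` should have a similar pinning effect.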

Yann Ramin