
When writing C programs with code that can be executed in parallel, we typically use the -O flags to optimize the code.

gcc -Olevel [options] [source files] [object files] [-o output file]

In large projects, we usually split the code into several files. My question, for which I've found no answer, is this:

Does the program's performance drop at all, due to the fact that we split the code into files and the O flags don't have enough information to optimize any further? Is there such a possibility?

Jason
codingEnthusiast
  • If you split it into files included with `#include`, it _cannot_ make any difference whatsoever. If you split it into components that are compiled separately, I still seriously doubt it: Compiler optimizations are very local. But I can't say this for a fact, so this is not an answer... – alexis Oct 30 '15 at 18:58
  • I have split a personal project into files and the performance has dropped a lot. I asked a question about that specifically but I got no answer, so I deleted it. Now I'm not saying this happened because of the code splitting, but I want to understand whether it is possible, because to me it seems improbable. – codingEnthusiast Oct 30 '15 at 19:28
  • *"I'm not saying this happened because of the code splitting "*. Did you just split your project into separate files adding only the necessary headers, and minimal necessary linker details, or did you change other conditions too? – Weather Vane Oct 30 '15 at 19:35
  • Since my project has to do with various implementations of bitonic sort, I split each implementation into a .c file with its respective header file used for the declarations. Here is the problem: when the main file includes the header files and the .c files are compiled separately through a makefile, I get worse performance than I would if I had directly included the .c files. I have changed nothing in the files, I've just added the header files with extern declarations of the globals I had to use and function declarations. – codingEnthusiast Oct 30 '15 at 19:49

2 Answers


When you break code into separate files that are compiled separately, each file becomes its own translation unit, and the compiler generally can't optimize across translation units.

Take for example a constant defined in one translation unit but referenced in a number of others. All of the calculations that reference the constant have to be performed at run-time since the constant can't be folded into them at compile time.

Link-time optimization (-flto) is one way around the limitation.
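The constant example can be sketched in a single file, roughly as follows. This is only an illustration under stated assumptions: the names `scale_extern`, `SCALE_LOCAL`, `scaled_extern`, and `scaled_local` are hypothetical, and the definition of `scale_extern` sits at the bottom of the file only so the snippet compiles on its own. In a split project that definition would live in a different .c file.

```c
#include <assert.h>

/* As it would appear in a shared header: a declaration only, no value.
   A translation unit that sees just this line cannot know the value. */
extern const int scale_extern;

/* Value visible in this translation unit, so the compiler is free to
   fold it into expressions at compile time. */
enum { SCALE_LOCAL = 4 };

/* Accepted because SCALE_LOCAL is a compile-time constant. The same check
   on scale_extern would not compile, since its value is not a constant
   expression here. */
static_assert(SCALE_LOCAL == 4, "SCALE_LOCAL is known at compile time");

/* When the definition lives in another translation unit (and LTO is off),
   the multiplication has to use a value loaded at run time. */
int scaled_extern(int x)
{
    return x * scale_extern;
}

/* The multiplier is known here, so the compiler may simplify the
   expression, for example into a shift. */
int scaled_local(int x)
{
    return x * SCALE_LOCAL;
}

/* Hypothetical placement: in a split project this definition would be in
   another .c file. It is here only to keep the sketch self-contained. */
const int scale_extern = 4;
```

Both functions return the same result; the difference is only in what the optimizer can know when it compiles each one. With GCC, -flto is passed at both the compile step and the link step, for example `gcc -O2 -flto a.c b.c -o prog` (file names hypothetical), so that the link-time optimizer can see across the two files.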

Jason
  • Exactly what I was looking for. Thanks for the simple explanation. – codingEnthusiast Oct 30 '15 at 20:29
  • @naltipar No problem, I'm glad I could help. – Jason Oct 30 '15 at 20:38
  • I was wondering whether I should accept this answer or mine, since -flto wasn't available for the purpose I needed it, but generally it was what I was looking for. Finally, I'll accept this answer once and for all and let my answer be a complementary one. – codingEnthusiast Nov 06 '15 at 20:19
  • @naltipar Either way is fine. If you find regressions between LTO and SCU, it would help to report them though since LTO is generally meant to supersede SCU. Unfortunately, it's only available in more recent versions of GCC and Clang. – Jason Nov 06 '15 at 20:40

Single Compilation Unit

Just to complement @Jason's answer, I'd like to post another technique to avoid the limitation that arises when splitting files.

It's called Single Compilation Unit (SCU):

The Single Compilation Unit technique uses pre-processor directives to "glue" different translation units together at compile time rather than at link time. This reduces the overall build time, due to eliminating the duplication, but increases the incremental build time (the time required after making a change to any single source file that is included in the Single Compilation Unit), due to requiring a full rebuild of the entire unit if any single input file changes.

The whole project, even when split into files, can be optimized as if all parts of the program were visible to the compiler at once, without requiring the user to merge the files back together.

How to apply it?

Usually, the project contains a file with main that includes the header file of each split file:

main.c

#include "sub-program-1.h"
#include "sub-program-2.h"
...
#include "sub-program-n.h"
//rest of code

where each of those .h files corresponds to its respective .c file, which is compiled on its own (possibly through a makefile).

In order to apply SCU, we remove the includes mentioned above and instead create a new file (let's call it SCU.c) with the following contents:

SCU.c

#include "sub-program-1.c"
#include "sub-program-2.c"
...
#include "sub-program-n.c"
#include "main.c"
//no more code in this file

And to compile the whole project, we just compile SCU.c.
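As a rough picture of what the compiler sees after preprocessing SCU.c, the sketch below pastes two hypothetical file bodies into one file. The functions `square` and `sum_of_squares` are invented for illustration and are not taken from the project above. Because the callee's definition ends up in the same translation unit as the caller, the optimizer is free to inline it.

```c
#include <assert.h>
#include <stddef.h>

/* ---- contents that would come from #include "sub-program-1.c" ---- */

/* Hypothetical helper. After gluing, its body is visible to every
   caller that appears later in the unit. */
int square(int x)
{
    return x * x;
}

/* ---- contents that would come from #include "main.c" ---- */

/* Hypothetical caller. With square() defined in the same translation
   unit, the compiler may inline the call inside the loop. */
int sum_of_squares(const int *values, size_t count)
{
    assert(values != NULL || count == 0); /* basic precondition check */

    int total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += square(values[i]);
    }
    return total;
}
```

One thing to watch for with this approach: since all files now share one translation unit, `static` functions and globals that reuse the same name in different .c files will clash and need to be renamed.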

Community
codingEnthusiast