
My professor mentioned that GCC can be run with `-flto`. I am wondering why the intermediate representation (GIMPLE, in GCC's case) is needed.

Why is the assembly not sufficient?

He mentioned that this allows the compiler, at link time, to see what the code was generated from rather than just the generated code itself, and to generate new code if necessary.

If this is the only reason, why would that be necessary? Isn't the code already optimized (assuming you are using `-flto` correctly and passing the same flags)?
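
For concreteness, here is the kind of two-file setup I have in mind (the file names, function, and numbers are just placeholders I made up):

```c
/* util.c -- compiled on its own */
int scale(int x)
{
    return 2 * x + 1;
}

/* main.c -- only sees the declaration of scale(), not its body */
int scale(int x);

int main(void)
{
    /* Without LTO, the compiler must emit a real call here.
       With LTO (e.g. `gcc -O2 -flto -c util.c main.c` followed by
       `gcc -O2 -flto util.o main.o`), GIMPLE for scale() is kept in the
       object files, so at link time the call can be inlined and the
       whole expression folded down to a constant. */
    return scale(20);
}
```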

  • The intermediate representations (for just about all compilers) typically contain a lot more information than plain assembly. – Some programmer dude Feb 27 '23 at 23:38
  • Yes @Someprogrammerdude, but why does it need that information for link-time optimization? – cspurposesonly Feb 27 '23 at 23:40
  • For the same reason that it needs that for non-link-time optimization: assembly is not a good representation to perform analysis and transformations on. – harold Feb 27 '23 at 23:46
  • @harold I had thought that the primary purpose of LTO is to expose the whole program to the compiler so it can inline and do such things across modules; why would it ever need to perform additional code generation? – cspurposesonly Feb 27 '23 at 23:48
  • Take the example of inlining: how would the LTO step be able to know which functions are good candidates for inlining? That's *very* hard and time-consuming. Why not let the compiler, which has much more information about the code (especially the original code), pass that information along in the intermediate representation? – Some programmer dude Feb 27 '23 at 23:51
  • The real strength of inlining comes from creating new opportunities for optimizations, not so much from basically removing a call/ret pair (that does matter too though). Inlining without applying the usual optimizations would be a huge waste of the opportunity. – harold Feb 27 '23 at 23:56
  • What sorts of optimizations would that be, @harold? I didn’t know such existed. Thank you btw, you answered my original question. – cspurposesonly Feb 27 '23 at 23:59
  • @cspurposesonly inlining is an optimization on its own. And it's impossible to inline functions from another compilation unit, because they were compiled separately; only at link time can they be optimized. – phuclv Feb 28 '23 at 01:15
  • @cspurposesonly: [Common subexpression elimination](https://en.wikipedia.org/wiki/Common_subexpression_elimination) and [constant folding](https://en.wikipedia.org/wiki/Constant_folding) for starters. If `foo` calls `a = bar(x+3)`, and `bar(int z)` does `return z+4;`, then inlining alone will only get you the asm equivalent of `a=x+3+4`, with two add instructions still needed. You now have to run constant folding again to get to `x+7` with one add (see the code sketch after these comments). – Nate Eldredge Feb 28 '23 at 02:13
  • In that example, perhaps a [peephole optimization](https://en.wikipedia.org/wiki/Peephole_optimization) could recognize two consecutive add instructions and combine them, but in general this could become very hard. – Nate Eldredge Feb 28 '23 at 02:14
  • For a similar reason: when you do an optimized build, you do it from the source code, not from the debug-mode binary. – Peter Cordes Feb 28 '23 at 11:13
  • @phuclv many things are possible. A compiler can stash away info to let a link-time optimizer insert the code inline, if that is an improvement, or not, if it is an anti-optimization. Early-90s compilers (including plan9's) explored this area, so it is only appropriate that it resurfaces 20+ years later. – mevets Mar 01 '23 at 07:06
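
To make Nate Eldredge's inlining example from the comments concrete, here is a sketch of the two translation units (the bodies of `foo` and `bar` follow the comment; the file split and surrounding code are assumed for illustration):

```c
/* bar.c */
int bar(int z)
{
    return z + 4;
}

/* foo.c */
int bar(int z);

int foo(int x)
{
    /* Compiled separately, the compiler only sees "call bar" and must keep
       the add for x + 3. With -flto, the linker hands the stored GIMPLE back
       to the compiler, bar() can be inlined, and re-running constant folding
       turns x + 3 + 4 into x + 7: one add instruction instead of two plus a
       call/ret pair. */
    int a = bar(x + 3);
    return a;
}
```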

0 Answers