int f(int x, int y) {
    return 20 * (x - 10) + 50 * (x + 5);
}

int f_expected(int x, int y) {
    return 70 * x + 50;
}

The generated code is:

f(int, int):
        lea     eax, [rdi-50+rdi*4]
        add     edi, 5
        imul    edi, edi, 50
        lea     eax, [rdi+rax*4]
        ret
f_expected(int, int):
        imul    eax, edi, 70
        add     eax, 50
        ret

I expect f to be compiled to f_expected. I tried -O3 and -Ofast on GCC 7. Which flag am I looking for, exactly (if any)? clang and icc produce the expected code under -O3.

For reference, clang code:

f(int, int):
        imul    eax, edi, 70
        add     eax, 50
        ret

f_expected(int, int):
        imul    eax, edi, 70
        add     eax, 50
        ret
Gabriel Garcia
  • 3
    "I expect f to be compiled to f_expected" Why? –  Mar 31 '17 at 23:50
  • I have some other code where this type of optimization provably (I profiled, hand-optimized, etc.) results in faster execution. Also, clang has no problem with it. Is GCC particularly bad at these types of optimizations, or am I simply using it wrong? – Gabriel Garcia Mar 31 '17 at 23:54
  • You can't expect two different compilers to produce the same optimised (or non-optimised) code. –  Apr 01 '17 at 00:00
  • 4
    Somewhat mysteriously, GCC gets it right if you use `-fwrapv` - yes, a flag that inhibits some optimizations. – harold Apr 01 '17 at 00:05
  • @harold fantastic. This is what I'm looking for. Care to add it as an answer so that I can accept it? – Gabriel Garcia Apr 01 '17 at 00:09
  • 3
    Preferably not since it's nowhere close to the whole story. As it is it's just a completely random finding, it's a solution that shouldn't have worked. It's more evidence of a bug in GCC than it is a solution. – harold Apr 01 '17 at 00:19
  • 1
    @harold I'm not much into GCC or compiler development, but the wording of the [docs](https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html) sounds perfectly fine to me: "This flag enables some optimizations and disables others". Without much background, this feels natural to me: either **A**, we are or are not assuming some math behaviour, or **B**, we are assuming one of two different math behaviours. I'm not sure which is the case here, but B would point more towards the *turn on some; turn off others* kind of thing. – sascha Apr 01 '17 at 11:35
  • @sascha yes but this optimization is valid either way, it really shouldn't be disabled. I suspect it isn't even really disabled, but some other rewrite gets in the way somehow and ruins the pattern so the optimization that is supposed to take care of this doesn't recognize the expression anymore, but I'm not that into GCC internals.. – harold Apr 01 '17 at 11:44
  • @harold Okay, thanks for the insights. Interesting to see what compilers do today. This kind of optimization would probably scare me if I had to implement crypto with side-channel attacks in mind (meaning: the result is still valid, but the change in behaviour is not always desired). But that's not my expertise either, and compiler optimizations in general seem scary in that setting. – sascha Apr 01 '17 at 11:47
  • Consider this code under GCC (https://godbolt.org/g/Mm7l0m) and Clang (https://godbolt.org/g/cTtbRx). – Gabriel Garcia Apr 01 '17 at 17:16
  • 1
    Try -30678338 with `-fsanitize=undefined`. The optimization is valid, but can only be done at a low level. There are sometimes discussions of enabling `-fwrapv` automatically during the last optimization passes. – Marc Glisse Nov 04 '18 at 11:27

1 Answer


This happens because GCC avoids introducing a signed integer overflow that didn't exist in the original expression (signed overflow is undefined behavior in C). The folded form 70 * x + 50 can overflow in the 70 * x step for inputs where the original expression does not overflow at all. You can tell GCC to treat signed overflow as wrapping by adding -fwrapv to CFLAGS, but this brings other issues (e.g. some loops can no longer be optimized).

$ gcc tmp.c -S -o- -O2
    ...
    leal    -50(%rdi,%rdi,4), %eax
    movl    $50, %edx
    addl    $5, %edi
    imull   %edx, %edi
    leal    (%rdi,%rax,4), %eax
    ret
$ gcc tmp.c -S -o- -O2 -fwrapv
    ...
    movl    %edi, %eax
    movl    $70, %edx
    imull   %edx, %eax
    addl    $50, %eax
    ret
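
To make the overflow concern concrete, here is a minimal sketch that redoes both forms in 64-bit arithmetic and checks whether every intermediate value would still fit in an int. The helper names (fits_int, original_in_range, folded_in_range) are made up for illustration, and the sketch assumes int is narrower than 64 bits.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* True if an exactly computed value is representable as int. */
static bool fits_int(int64_t v) { return v >= INT_MIN && v <= INT_MAX; }

/* Checks every intermediate of 20 * (x - 10) + 50 * (x + 5). */
bool original_in_range(int x) {
    int64_t a = (int64_t)x - 10;
    int64_t b = 20 * a;
    int64_t c = (int64_t)x + 5;
    int64_t d = 50 * c;
    return fits_int(a) && fits_int(b) && fits_int(c) && fits_int(d)
        && fits_int(b + d);
}

/* Checks every intermediate of the folded form 70 * x + 50. */
bool folded_in_range(int x) {
    int64_t m = 70 * (int64_t)x;
    return fits_int(m) && fits_int(m + 50);
}
```

For x = -30678338 (the value suggested in the comments on the question) and a 32-bit int, original_in_range returns true while folded_in_range returns false: 70 * x is -2147483660, which is below INT_MIN, even though the final result -2147483610 is representable. On x86 the wrapped imul still yields the right answer, which is why the fold is safe at the machine level but not as a C-level rewrite.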

Clang is somehow able to figure out that UB isn't present in the original code, so this may be a missed optimization in GCC (which I encourage you to report to their Bugzilla).

yugr
  • Clang doesn't have to figure out that UB isn't present, it just has to make asm that produces the correct result for all cases where the C abstract machine doesn't encounter UB. It doesn't have to break in cases that are UB, unless you used `-fsanitize=undefined`. Since binary integer math *does* wrap safely on normal CPUs, including x86 which it's compiling for in this case, it Just Works to simplify. (What old version of GCC did you compile with where it uses `mov $70, %edx` / `imul %edx, %eax` instead of `imul $70, %eax, %eax`, though?) – Peter Cordes Jun 22 '22 at 10:08
  • But yes, GCC is over-cautious without `-fwrapv`, treating the target asm like it had the same no-signed-int-overflow rules as ISO C. – Peter Cordes Jun 22 '22 at 10:11
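
If changing flags for the whole build is not an option, one source-level alternative is to do the arithmetic in unsigned, where wraparound is well defined, so the compiler is free to reassociate without -fwrapv. This is a sketch, not something taken from the thread: f_unsigned is a hypothetical name, and whether a given GCC version actually folds it to a single imul is worth checking on Compiler Explorer rather than assuming.

```c
/* Hypothetical variant of f: same arithmetic, carried out in unsigned
   so that intermediate wraparound is defined behavior. */
int f_unsigned(int x) {
    unsigned ux = (unsigned)x;
    unsigned r = 20u * (ux - 10u) + 50u * (ux + 5u);
    /* Converting an out-of-range unsigned value to int is
       implementation-defined in ISO C; GCC documents it as
       reduction modulo 2^N, which gives the expected result here. */
    return (int)r;
}
```

For every x where the original f is free of overflow, f_unsigned returns the same value on such an implementation; for example, f_unsigned(-30678338) gives -2147483610 with a 32-bit int.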