I compiled the following C program to assembly to see which instructions it uses. The C code is:

int add(int num1, int num2) {
    int num3 = num1 + num2;
    return num3;
}

My thought on what the instructions "should be" (from my very limited knowledge of asm) would be:

  1. Load (two 4-byte int variables into memory).
  2. Add (two memory locations), and -
  3. Store the sum at a third memory location.
  4. Return the value and halt execution.

When compiling this, I was surprised at all the mov operations it does:

add:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-20], edi
        mov     DWORD PTR [rbp-24], esi
        mov     edx, DWORD PTR [rbp-20]
        mov     eax, DWORD PTR [rbp-24]
        add     eax, edx
        mov     DWORD PTR [rbp-4], eax
        mov     eax, DWORD PTR [rbp-4]
        pop     rbp
        ret

Could someone walk me through the asm code and explain why it uses mov so frequently? Here's an example of it: here.

David542
  • Did you use any compiler flags like `-O3`? – Ackdari Apr 06 '20 at 05:04
  • @Ackdari not to my knowledge, but I've used godbolt to compile it, so it might've thrown some flags in there (you can check the link above). – David542 Apr 06 '20 at 05:10
  • If you go to the box marked "Compiler options" and type in `-O3`, you see that the code generated changes to `lea eax, [rdi+rsi]`. – David Wohlferd Apr 06 '20 at 05:15
  • Above the asm view is a text field for compiler flags; if you enter `-O3` into that (i.e. all optimizations), the asm code becomes much more minimalistic. – Ackdari Apr 06 '20 at 05:15

1 Answer

This looks like code that was specifically compiled at the lowest possible optimization setting. What it's doing is storing the arguments (which are in the registers edi and esi) into stack memory (rbp - offset), then reloading them into registers eax and edx to add them, storing the result back into int num3 in stack memory at [rbp - 4], then reloading to return the result in eax.

A lot of this is redundant and it could just as easily be:

add:
    add edi, esi
    mov eax, edi
    ret

Or, in practice, what gcc -O1 or higher actually emits:

add:
    lea    eax, [rdi+rsi]
    ret

The gcc/clang default is -O0, which anti-optimizes for consistent debugging: variables are kept in memory between C statements instead of in registers. See "Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?" for details.
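
To make the store-and-reload pattern described above visible in the C source itself, here is a minimal sketch. It is only an illustration, not what the compiler does internally: the name add_spilled and the use of volatile are my own assumptions. Declaring the locals volatile forces every access to go through memory, so even an optimizing build should emit stores and reloads similar to the -O0 listing in the question.

```c
#include <stdio.h>

/* Hypothetical variant of add(): volatile is used here only to force
   every variable access through memory, even when optimizing. */
int add_spilled(int num1, int num2) {
    volatile int a = num1;      /* store the first argument to the stack   */
    volatile int b = num2;      /* store the second argument to the stack  */
    volatile int num3 = a + b;  /* reload both, add, store the sum         */
    return num3;                /* reload the sum into the return register */
}

/* Small helper for trying it out: prints the sum of 2 and 3. */
void demo_add_spilled(void) {
    printf("%d\n", add_spilled(2, 3));
}
```

Compiling this next to the plain add() at -O1 or higher and comparing the two listings should show which mov instructions come purely from keeping variables in memory.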

Peter Cordes
Govind Parmar
  • Thanks for this explanation. Additionally, why are the pointer offsets 20 and 24 if an `int` is just 4 bytes long? – David542 Apr 06 '20 at 05:08
  • @David542 A stack frame contains several pieces of information: the arguments passed in, but also the address to return to, a pointer to the parent's stack frame, and some other data. The -20 and -24 are offsets from the frame pointer (rbp) to the slots holding the function arguments. Look here: http://stffrdhrn.github.io/software/embedded/openrisc/2018/06/08/gcc_stack_frames.html – Jerry Jeremiah Apr 06 '20 at 05:36
  • `gcc` with `-O3` seems to be even more concise: It just does `leal (%rdi,%rsi), %eax` followed by `ret`. – Tom Karzes Apr 06 '20 at 05:38
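
On the offset question raised in the comments, one way to see where a given compiler actually places the variables is to print their addresses relative to the frame address. This is a rough sketch under assumptions: the names add_show_offsets and offset_from_frame are hypothetical, and `__builtin_frame_address(0)` is a GCC/Clang builtin, not standard C. The numbers printed depend on the compiler, flags, and target, so they will not necessarily be -20, -24, and -4.

```c
#include <stdint.h>
#include <stdio.h>

/* Byte distance from the frame address to p (negative means below it). */
static long offset_from_frame(const void *frame, const void *p) {
    return (long)((intptr_t)p - (intptr_t)frame);
}

/* Hypothetical variant of add() that reports where its variables live.
   Taking their addresses forces the compiler to give them stack slots. */
int add_show_offsets(int num1, int num2) {
    int num3 = num1 + num2;
    void *frame = __builtin_frame_address(0);  /* GCC/Clang extension */

    printf("num1 at frame%+ld\n", offset_from_frame(frame, &num1));
    printf("num2 at frame%+ld\n", offset_from_frame(frame, &num2));
    printf("num3 at frame%+ld\n", offset_from_frame(frame, &num3));
    return num3;
}
```

The slots need not be packed 4 bytes apart or start right below the frame pointer; the compiler is free to choose the layout, which is why the offsets in the question look larger than the size of an `int`.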