Mov and add are doing nothing for some reason

Question

I got this code from my university professor, so I'm fairly certain that the code works, but for me the output is always 0.

I tried it on Windows and on an Ubuntu virtual machine, but the result is the same.

I'm compiling using mingw:

gcc test.c test.s

This is the C code:

#include <stdio.h>

int func(int a, int b);

int main()
{   
    int a, b;
    scanf("%d %d", &a, &b);
    printf("%d\n", func(a, b));
    return 0;
}

And this is assembly:

.intel_syntax noprefix

.text

    .globl _func

_func:

    enter 0,0

    mov eax, edi
    add eax, esi
    leave
    ret

For inputs 2 and 3 it should output 5, but it's always 0.

@Serge C names correspond to assembly names with an underscore prefix. — Barmar, Jan 26 '19 at 02:39
@Barmar That is true on some but not all platforms. In particular it is _not_ true on Linux. I think it _is_ true on MacOS, but I'm not 100% sure about that, and I have no idea whether or not it is true in Windows. In any case, if that was OP's problem, they would be getting a link error, not incorrect output. — zwol, Jan 26 '19 at 03:03
You need to dump the symbol names from the image and check whether the 'func'/'_func' names are defined. objdump or nm could help on Linux. — Serge, Jan 26 '19 at 03:16
If that code was given to you exactly, then I can infer from registers EDI and ESI being the first and second parameters that this code is designed for the x86-64 System V ABI. The `_` on the front suggests this code was likely being used on MacOS with macho64 objects (underscores are needed on that platform). It should assemble and run properly on Linux with a 64-bit GCC compiler if you get rid of the underscores and compile and link it with `gcc test.c test.s`. This will not work with a 32-bit compiler. — Michael Petch, Jan 26 '19 at 03:47
This code will not compile and run properly on Windows with MinGW as the 64-bit calling convention there is quite different. — Michael Petch, Jan 26 '19 at 03:48
Are you using x86 or x86-64? Also, `mov eax, edi; add eax, esi` can be simplified to `lea eax, [edi + esi]` — phuclv, Jan 26 '19 at 09:02
@phuclv: I'd say "optimized", not simplified. But anyway, `lea eax, [rdi + rsi]` if you're going to use LEA on x86-64. There's never a reason to use an address-size prefix with LEA, the default operand-size is 32-bit and that's sufficient to truncate the result. (Plus, that won't assemble in 32-bit code, so it would also rule out 32-bit calling-conventions. So would using something other than the very slow `enter` instruction to make a stack frame.) — Peter Cordes, Jan 26 '19 at 11:29
@PeterCordes I can see using `enter` and `leave` in an assembly course for beginners, because it'd be one fewer x86 weirdness to explain to people who don't even really grok registers yet. (I think a _better_ approach would be to start with a RISC, but I've never actually tried to teach assembly. There's an argument for using the CPU that everyone has conveniently to hand. But then there's also an argument for teaching the calling convention that the compiler actually uses, so disassembly dumps make sense. `¯\_(ツ)_/¯`) — zwol, Jan 26 '19 at 14:44
@zwol: Either way it's "magic boilerplate" at first; I think it makes the most sense to teach a RISCy subset of x86 at first, skipping magic fancy instructions like ENTER, LEAVE, and LOOP. dec ecx/jnz makes it super obvious that you can't use the same loop counter for both inner and outer loops, and that there's no magic. There are several SO questions about how `loop` works or problems with attempted nested use of `loop`, and I hope at least half of those would have not been asked if they'd been taught dec [e]cx/jnz as their magic looping construct which can soon be understood. — Peter Cordes, Jan 26 '19 at 15:07
But I haven't tried to teach people either, other than on SO, and it amazes me how clueless many of the questions are. (Not this one, I mean debugging questions where using a debugger at all, or looking at the manual for an instruction, would have made it obvious.) Anyway, y86 is a toy architecture RISCy version of x86, with different mnemonics for mov immediate to register vs. mov memory to register. So definitely many people agree there's merit to the idea. I think using ENTER is like introducing REP SCAS: totally unnecessary complexity for beginners. — Peter Cordes, Jan 26 '19 at 15:11

score 3 · Accepted Answer · answered Jan 26 '19 at 03:18

This part of the assembly language ...

    mov eax, edi
    add eax, esi

... is one correct way to add the first two int arguments to a function and return the result, if the first two int arguments to the function are in registers edi and esi. On x86 Linux, that is true if and only if the program is compiled with the "64-bit ABI", which may or may not be what your Ubuntu virtual machine does by default. As far as I know, it is never true on Windows, regardless of ABI.

Use of the Linux 64-bit ABI, however, is not consistent with the rest of the assembly language: in particular, on Linux (all varieties) a C function named func corresponds to an assembly language procedure named func, not _func. But this should have caused your program to fail to link (the gcc command would have produced an error message saying "undefined reference to `func'"), not to produce incorrect output.

I advise you to go back to the professor who gave you the assembly code and ask them what operating system and ABI it's supposed to work with, and how to adapt it to the computer(s) you have convenient access to.

(You may not have encountered the term "ABI" before. It stands for Application Binary Interface, and it's a set of rules for how the low-level details of things like procedure calls work. For instance, the "x86-64 ELF ABI" (also known as the x86-64 System V ABI), which is what Linux uses, says that the first two integer arguments to a function call are placed in the DI and SI registers (in that order) before the CALL instruction, and an integer return value will be found in AX after it returns. But the x86-64 Windows ABI says that the first two integer arguments to a function are placed in the CX and DX registers instead, and the x86-32 ELF ABI says they go on the stack. Everyone agrees on integer return values appearing in AX though.)

It would likely work on MacOS where Macho64 requires the `_` and is also using the x86-64 System V ABI. — Michael Petch, Jan 26 '19 at 03:49
I think you could use `__attribute__((sysv_abi)) int func(int a, int b);` on MinGW64 to call this function with the x86-64 System V ABI, but there's no way that can happen "by accident" on Windows. There's probably not even a command-line option to set the default calling convention to SysV instead of x64 Windows, because that would affect library calls like `printf`. — Peter Cordes, Jan 26 '19 at 11:33
