
I was reading assembly tutorials and got stuck on stack operations and function calls. As said here, when function A calls function B, it passes the first 4 arguments in registers (by value or by pointer), and the remaining ones are passed via the stack. Also, the caller has to allocate 32 bytes on the stack for the callee to store those 4 register values. But when I have this simple code:

void foo()
{
    int a = 0;
    a++;
}

int WinMain(HINSTANCE, HINSTANCE, LPSTR, int)
{   
    foo();
    return 0;
}

The disassembly:

WinMain:
mov         dword ptr [rsp+20h],r9d  
mov         qword ptr [rsp+18h],r8  
mov         qword ptr [rsp+10h],rdx  
mov         qword ptr [rsp+8],rcx  
sub         rsp,28h  
call        foo (013F321000h)  
xor         eax,eax
add         rsp,28h  
ret  

foo:
sub         rsp,18h  
mov         dword ptr [rsp],0  
mov         eax,dword ptr [rsp]  
inc         eax  
mov         dword ptr [rsp],eax  
add         rsp,18h  
ret

And the question is: why does the compiler allocate more memory than it actually needs for stack frames? For example, in WinMain, when it calls foo, no parameters are passed via the stack, so why does it allocate 40 bytes instead of the 32 bytes of shadow space? The same goes for foo: it needs only 4 bytes to store the int variable, and another 4 are probably used for alignment, but what are the other 16 bytes used for?

I use VS2017; the code is built as Win64 Debug, with optimisation, intrinsics and JMC disabled.
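
To make the convention described above concrete, here is a minimal sketch in C. It covers integer and pointer arguments only, and the helper names `arg_register` and `arg_slot_offset` are hypothetical, made up for illustration. It shows which register carries each of the first four arguments and where each argument's 8-byte slot sits relative to the callee's RSP on entry.

```c
#include <stddef.h>

/* Hypothetical helper: name of the register that carries integer/pointer
   argument `index` (0-based) under the Windows x64 convention, or NULL
   when the argument is passed on the stack instead. */
const char *arg_register(size_t index)
{
    static const char *const regs[] = { "rcx", "rdx", "r8", "r9" };
    return index < 4 ? regs[index] : NULL;
}

/* Hypothetical helper: offset from the callee's RSP on entry (where RSP
   points at the return address) to the 8-byte slot for argument `index`.
   Slots 0..3 are the 32-byte shadow space the caller reserved; slots 4
   and up hold the arguments the caller actually wrote to the stack. */
size_t arg_slot_offset(size_t index)
{
    return 8 + 8 * index;   /* skip the 8-byte return address */
}
```

For indices 0 to 3 the offsets are 8, 10h, 18h and 20h, which are the addresses the WinMain prologue above stores rcx, rdx, r8 and r9d to before it adjusts RSP.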

Peter Cordes
Poseydon
    To maintain stack alignment, so RSP is 16-byte aligned every call, like it was aligned before a call to WinMain pushed a return address. As for `foo`, it's un-optimized code; don't expect it to be efficient. There are some existing Q&As about this for GCC/clang (Linux/Mac calling convention, not Windows); google `site:stackoverflow.com allocate extra stack space x86-64 stack alignment` – Peter Cordes Jun 14 '20 at 16:26
    OK, I missed that the stack has to be 16-byte aligned, but then a follow-up question: if I modify `foo` so that it takes 5 parameters of 4 bytes each, so the last one has to be passed via the stack, then there is a `sub rsp, 38h` instruction before `call foo`. It allocates 56 bytes on the stack: 32 of them are shadow space and 4 are for parameter passing, so it needs only 4 more bytes to align the stack to a 16-byte boundary, which is 40 bytes total. So is this another piece of unoptimized code, or do I have to do the same in my own assembly? – Poseydon Jun 14 '20 at 16:43
    Stack args are passed in 8-byte stack slots, regardless of how narrow they are. (Or a struct is padded to a multiple of 8.) If you're hand-writing asm that calls a function, you only need to make sure that RSP is 16-byte aligned before a call, and that it's ok for the callee to step on the 32 bytes above its return address (RSP before the call), e.g. by dumping its register args into that "home space" to form a contiguous array of args. (variadic functions actually do that.) – Peter Cordes Jun 14 '20 at 17:01
    Appending another 16 bytes will keep the same alignment. Maybe it is there to leave a red zone for the caller? In Windows x86-64 the red zone size is 0, but you never know, the compiler might have bugs. –  Jun 17 '20 at 05:27
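
The arithmetic discussed in these comments can be sketched in C. This is an illustration under stated assumptions, not what MSVC actually computes: the helper names `min_frame_size` and `keeps_alignment` are hypothetical, locals are counted as raw bytes with no padding of their own, and the function is assumed to be entered through a `call` made with RSP 16-byte aligned.

```c
#include <stddef.h>

enum {
    SHADOW_SPACE = 32,  /* home space for rcx, rdx, r8, r9 */
    SLOT_SIZE    = 8,   /* every stack argument occupies one 8-byte slot */
    RET_ADDR     = 8    /* pushed by the call that entered this function */
};

/* Hypothetical helper: smallest N for `sub rsp, N` in a function that
   makes calls. `max_call_args` is the largest argument count among the
   functions it calls; `local_bytes` is the space its own locals need.
   The return address plus N must come out to a multiple of 16. */
size_t min_frame_size(size_t max_call_args, size_t local_bytes)
{
    size_t stack_args = max_call_args > 4 ? max_call_args - 4 : 0;
    size_t n = SHADOW_SPACE + stack_args * SLOT_SIZE + local_bytes;
    /* Round (n + return address) up to 16, then take the address back out. */
    return ((n + RET_ADDR + 15) & ~(size_t)15) - RET_ADDR;
}

/* Nonzero when `sub rsp, n` leaves RSP 16-byte aligned in a function
   that was entered via `call`. */
int keeps_alignment(size_t n)
{
    return (n + RET_ADDR) % 16 == 0;
}
```

Under these assumptions `min_frame_size(0, 0)` is 28h, matching WinMain in the question. For a five-argument callee the minimum is also 28h (32 bytes of shadow space plus one 8-byte slot), so the 38h reported above is more than strictly required but still passes `keeps_alignment`, as does the 18h used in `foo`.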

0 Answers