Calling convention for variadic function

Question

When you initialize a variadic list, you use the macro va_start and pass it the list name followed by the last fixed parameter before the variadic arguments start, because "the last fixed parameter neighbors the first variable one", and somehow this helps identify the length / position of the variable arguments on the stack (I say "somehow" because I don't understand how).

Using the cdecl calling convention (meaning parameters are pushed onto the stack from right to left), how is the last fixed parameter before the va list useful in identifying the list length? If, for example, that argument is the integer 3 and the variable arguments also contain a 3, how does the callee know that the variadic list doesn't end there, given that there is another 3 (the fixed parameter) where it should end? E.g. f(int a, int b, ...) called as f(1, 3, 1, 2, 3).

The other way around, there is the guardian "style", where you add, for example, a NULL pointer at the end of the variadic args when calling a function. Again: how is that NULL useful if it is the first thing pushed onto the stack? Shouldn't the NULL be pushed between the fixed and variable parts of the arguments? E.g. f(int a, int b, ...) called as f(a, b, NULL, param1, param2).


Marco Bonelli · Answer 1 · 2020-11-05T14:24:20.917

If I understand your doubts correctly, what you are basically asking is: how does a variadic function figure out where its variadic arguments start if all the arguments are pushed to the stack with no additional information?

As you already noted, the arguments are pushed on the stack in reverse order of declaration: this means that void f(int a, ...) called as f(1, 2, 3) pushes first 3, then 2, and finally 1 before calling.

So how do you find the start of the variadic arguments?

You always know:

Where the top of the stack is.
How many parameters are required (fixed) before the variable ones.

Therefore, pushing the values in reverse order is the easiest way to know where the variable argument list starts. Right after the return address you will always find a fixed number of values (equal to the number of required (fixed) arguments), followed by all the variable arguments (if any). This makes calculating the start of the argument list possible regardless of the number of arguments passed, without the need to pass additional information anywhere else. In other words, the offset of the start of the variadic arguments from the top of the stack is always the same, since it only depends on the number of required parameters.

An example will make this clearer. Let's assume a function defined as:

int f(int n, ...) {
    // ...
}

Then, compile the call f(2, 123, 456). Under cdecl, this produces:

push 456
push 123
push 2
call f

When f starts, it will find the stack in the following state:

--- lower addresses ----
[ return address ] <-- esp
[ 2              ]
[ 123            ]
[ 456            ]
--- higher addresses ---

Now it's very easy for f to know where the argument list starts, knowing that n was the last "fixed" (non-variadic) parameter: it only has to calculate esp + 4 + 4. That is: add a fixed amount (4) to esp to skip the saved return address, then add 4 for each fixed parameter (nb: this assumes sizeof(int) == 4). The arguments sit at higher addresses than esp because the stack grows downwards, so the offsets are positive. Doing so, you end up with the address of the first variadic parameter.
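The same arithmetic can be sketched in portable C by modelling the stack as an array of 4-byte slots. This is only a simulation under the assumptions of the diagram above (32-bit cdecl, int-sized arguments); the names first_variadic_offset and variadic_arg are made up for illustration, and real code should use va_start / va_arg instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: every stack slot is 4 bytes, as on 32-bit x86 with int arguments. */
enum { SLOT_SIZE = 4 };

/* Byte offset from esp (on entry to the callee) to the first variadic argument.
   It depends only on the number of fixed parameters. */
size_t first_variadic_offset(size_t fixed_params)
{
    return SLOT_SIZE                  /* skip the saved return address */
         + SLOT_SIZE * fixed_params;  /* skip each fixed parameter     */
}

/* Read the i-th variadic argument from a simulated stack.
   esp[0] stands for the return address; higher indices are higher addresses. */
int32_t variadic_arg(const int32_t *esp, size_t fixed_params, size_t i)
{
    assert(esp != NULL);
    size_t first_slot = first_variadic_offset(fixed_params) / SLOT_SIZE;
    return esp[first_slot + i];
}
```

For the call f(2, 123, 456), the simulated stack would be { return address, 2, 123, 456 }, and variadic_arg(stack, 1, 0) reads the slot at esp + 8.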

This works for any number of variadic arguments:

; f(5, 1, 2, 3, 4, 5)      --- lower addresses ----
push 5                     [ return address ] <-- esp
push 4                     [ 5              ]
push 3                     [ 1              ]
push 2                     [ 2              ]
push 1                     [ 3              ]
push 5                     [ 4              ]
call f                     [ 5              ]
                           --- higher addresses ---

Now imagine the opposite scenario, in which arguments are pushed in the opposite order (left to right). You would end up with f(2, 123, 456) compiling to:

; f(2, 123, 456)     --- lower addresses ----
push 2               [ return address   ] <-- esp
push 123             [ 456              ]
push 456             [ 123              ]
call f               [ 2                ]
                     --- higher addresses ---

And f(5, 1, 2, 3, 4, 5) compiling to:

; f(5, 1, 2, 3, 4, 5)      --- lower addresses ----
push 5                     [ return address ] <-- esp
push 1                     [ 5              ]
push 2                     [ 4              ]
push 3                     [ 3              ]
push 4                     [ 2              ]
push 5                     [ 1              ]
call f                     [ 5              ]
                           --- higher addresses ---

Now where does the argument list start? It's impossible to tell only based on the value of the stack pointer (ESP) and the number of required arguments, because the offset from the top of the stack is no longer the same, but varies with the number of variadic arguments. In order to figure it out, you would either have to do some math with the base pointer (EBP, assuming your function even uses it since it's not required), or pass some additional information.
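To make the difference concrete, here is a small simulation of both push orders. It is a sketch, not real calling-convention code: the stack is a plain array in which slot 0 plays the role of the return address at esp, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build the stack image the callee would see when arguments are pushed
   right to left (cdecl): the first argument ends up next to the return address. */
void build_right_to_left(int32_t *stack, const int32_t *args, size_t argc)
{
    assert(stack != NULL && args != NULL);
    stack[0] = 0; /* placeholder for the return address */
    for (size_t i = 0; i < argc; i++)
        stack[1 + i] = args[i];
}

/* Same, but for a hypothetical left-to-right convention:
   the first argument is pushed first and ends up farthest from esp. */
void build_left_to_right(int32_t *stack, const int32_t *args, size_t argc)
{
    assert(stack != NULL && args != NULL);
    stack[0] = 0; /* placeholder for the return address */
    for (size_t i = 0; i < argc; i++)
        stack[1 + i] = args[argc - 1 - i];
}

/* Slot (relative to esp) that holds argument number `index`, counting from 0. */
size_t slot_right_to_left(size_t index)
{
    return 1 + index;       /* independent of the total argument count */
}

size_t slot_left_to_right(size_t index, size_t argc)
{
    return argc - index;    /* needs the total argument count */
}
```

With one fixed parameter, the first variadic argument is argument number 1. Right to left it is always in slot 2; left to right it is in slot 2 for f(2, 123, 456) but in slot 5 for f(5, 1, 2, 3, 4, 5), which the callee cannot know without extra information.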

When the variable arguments are pushed onto the stack, how does the function know where they end?

That is not something that the calling convention establishes. The programmer has to come up with a way to determine how many variadic arguments are present based on the non-variadic ones (or something else). For example, in my examples above I simply pass n as the first parameter; the printf family of functions figures it out from the conversion specifiers in the format string (e.g. %d, %s); the syscall function figures it out based on the syscall number (first argument); and so on.
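Two common ways of doing this in standard C are sketched below: a count passed as the fixed parameter, and a NULL sentinel at the end of the list. The function names sum_ints and count_strings are invented for this example. Note that the sentinel is the last argument in source order, so under cdecl it is pushed first and lands at the highest address; the callee walks the arguments from the lowest address upwards and therefore reaches it last.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Count-based: the fixed parameter n says how many ints follow. */
int sum_ints(int n, ...)
{
    assert(n >= 0);
    va_list ap;
    int total = 0;

    va_start(ap, n);              /* n is the last fixed parameter */
    for (int i = 0; i < n; i++)
        total += va_arg(ap, int); /* read the next variadic int */
    va_end(ap);
    return total;
}

/* Sentinel-based: strings are read until a NULL pointer is found.
   The caller must pass (char *)NULL as the last argument. */
size_t count_strings(const char *first, ...)
{
    va_list ap;
    size_t count = 0;

    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        count++;
    va_end(ap);
    return count;
}
```

For instance, sum_ints(2, 123, 456) adds up the two variadic values, and count_strings("a", "b", (char *)NULL) stops at the sentinel after two strings. In both cases nothing in the calling convention marks the end of the list; the callee trusts the caller to follow the agreed protocol.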

When the variable arguments are pushed onto the stack, how does the function know where they end? Is it that you find the first parameter following the variadic part and then know that the list has just ended? Why is that parameter so special? Isn't it seen as a normal value pushed onto the stack? Thank you for the answer. — Cătălina Sîrbu, Nov 05 '20 at 06:32
@CătălinaSîrbu: "Why is that parameter so special? Isn't it seen as a normal value pushed onto the stack?" - `va_start` accepts the **name** of a parameter, not a **value**. Knowing a parameter's name makes it possible to obtain the **position** (address) of that parameter on the stack. Because successive function parameters are passed next to each other on the stack, knowing the position of one parameter is enough to compute the position of the next one (in declaration order). Thus, knowing the position of the **last named** parameter makes it possible to compute the position of the **first variadic** parameter. — Tsyvarev, Nov 05 '20 at 10:36
