Specifically in the context of the SysV x86-64 ABI

If I have a struct with only two fields, such as:

typedef struct {
    void *foo;
    void *bar;
} foobar_t;

And I pass it to a function with a definition like so:

foobar_t example_function(foobar_t example_param);

The ABI seems to say that each eightbyte field should be passed as INTEGER to the function, therefore rdi == foo and rsi == bar. Similarly, when returning we should be able to use rax and rdx, since we don't need a memory pointer in rdi. If example_function is trivially defined as:

foobar_t example_function(foobar_t example_param) {
    return example_param;
}

A valid assembly implementation, ignoring prologue and epilogue, would be:

example_function:
   mov rax, rdi
   mov rdx, rsi
   ret

Conceivably, a mentally-deficient compiler could fill the struct with NO_CLASS padding and make that assembly invalid somehow. I'm wondering if it's written down anywhere that a struct with only two eightbyte fields must be handled this way.

The larger context to my question is that I'm writing a simple C11 task switcher for my own edification. I'm basing it largely on boost.context and this is exactly how boost passes two-field structs around. I want to know if it's kosher under all circumstances or if boost is cheating a little.

John Bollinger
nickelpro
  • There is no rule that a compiler must conform to a target's ABI; it is the choice of the compiler author(s) what calling convention they create or adopt. – old_timer Jun 06 '19 at 18:26
  • Anything you see a particular compiler and version doing is specific to that compiler and version. The ABI definition could change, or the compiler could change for any other reason in the next release. Probably not, but there are no guarantees. – old_timer Jun 06 '19 at 18:28
  • If you want to take advantage of a particular behavior then you need to test and validate supported toolchains for your product/software and limit it to those. – old_timer Jun 06 '19 at 18:30
  • Ok, sure, but this is specifically asked in the context of a compiler conforming to the SysV x86-64 ABI. Perhaps I should be more clear, but I'm asking if I'm misinterpreting the ABI spec and there is a guarantee there, or if Boost is just relying on "typical" but unsafe behavior. – nickelpro Jun 06 '19 at 18:32
  • Inside the ABI, there is no wiggle room. The struct padding is specified exhaustively by the ABI, so it should be as you say. – fuz Jun 06 '19 at 18:49
  • if it conforms then it conforms – old_timer Jun 06 '19 at 18:50
  • The alignment requirement on page 15 of the spec is what I was missing, "Structures and unions assume the alignment of their most strictly aligned component. Each member is assigned to the lowest available offset with the appropriate alignment." So if the fields are just two eightbytes, this guarantees the layout will just use two INTEGERS. – nickelpro Jun 06 '19 at 18:57

2 Answers

Agreement between compilers on struct layout, and on how structs are passed by value as function args, is a key part of an ABI. Otherwise they couldn't call each other's functions.

Hand-written asm is no different from compiler-generated asm; it doesn't have to come from the same version of the same compiler to interoperate properly. This is why stable and correct ABIs are such a big deal.

Compatibility with hand-written asm is fairly similar to compatibility with machine code that was compiled a long time ago and has been sitting in a binary shared library for years. If it was correct then, it's correct now. Unless the structs have changed in the source, newly compiled code can call and be called by the existing instructions.


If a compiler doesn't match the standard as-written, it's broken.

Or maybe more accurately, if it doesn't match gcc, it's broken. And if the standard wording doesn't describe what gcc/clang/ICC do, then the standard document is broken.

If you had a compiler for x86-64 System V that passes a 2x void* struct any way other than in 2 registers, that compiler is broken, not your hand-written asm.

(Assuming there aren't a lot of earlier args that use up the arg-passing registers before we get to the struct arg.)
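
To illustrate the caveat about earlier args, here is a minimal sketch. The names pick_bar and demo_late_struct_arg are hypothetical, and the register assignments in the comment are my reading of the x86-64 System V argument-passing rules, not something the C code itself can observe. The code behaves the same on any target; only the comment is ABI-specific.

```c
#include <assert.h>

typedef struct {
    void *foo;
    void *bar;
} foobar_t;

/* Five integer arguments come first. Under x86-64 System V they take
 * %rdi, %rsi, %rdx, %rcx and %r8, leaving only %r9 free. The struct
 * needs two INTEGER registers, so the whole struct is passed on the
 * stack; it is never split between a register and memory. */
void *pick_bar(long a, long b, long c, long d, long e, foobar_t s) {
    (void)a; (void)b; (void)c; (void)d; (void)e;
    return s.bar;
}

/* Hypothetical demo: the call works the same from C either way;
 * only hand-written asm would need to know where s lives. */
void demo_late_struct_arg(void) {
    int x = 0, y = 0;
    foobar_t s = { &x, &y };
    assert(pick_bar(1, 2, 3, 4, 5, s) == (void *)&y);
}
```

Hand-written asm for a function like this would have to read the struct from the stack rather than from %rdi and %rsi.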

Peter Cordes

The ABI seems to say that each eightbyte field should be passed as INTEGER to the function, therefore rdi == foo and rsi == bar.

Agreed. For "global" functions accessible from multiple compilation units, the argument structure is broken up into two eightbyte pieces, the first completely filled by foo, and the second completely filled by bar. These are classified as INTEGER, and therefore passed in %rdi and %rsi, respectively.
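
The classification above rests on the layout the ABI fixes for the struct. As a minimal sketch, C11 can check that layout at compile time. The helper name layout_is_two_words is hypothetical. The checks are written relative to sizeof(void *) so they compile on any target; on x86-64 System V a pointer is 8 bytes, so they amount to "exactly two eightbytes, no padding".

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignof */
#include <stddef.h>   /* offsetof */

typedef struct {
    void *foo;
    void *bar;
} foobar_t;

/* Each member sits at the lowest suitably aligned offset, and the
 * struct takes the alignment of its most strictly aligned member. */
static_assert(offsetof(foobar_t, foo) == 0, "foo starts the struct");
static_assert(offsetof(foobar_t, bar) == sizeof(void *), "bar directly follows foo");
static_assert(sizeof(foobar_t) == 2 * sizeof(void *), "no trailing padding");
static_assert(alignof(foobar_t) == alignof(void *), "struct aligned like its members");

/* Hypothetical runtime helper mirroring the compile-time checks. */
int layout_is_two_words(void) {
    return offsetof(foobar_t, bar) == sizeof(void *)
        && sizeof(foobar_t) == 2 * sizeof(void *);
}
```

If any of these assertions failed to compile, the assumption behind the two-register passing would not hold for that toolchain.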

Similarly, when returning we should be able to use rax and rdx, since we don't need a memory pointer in rdi.

I don't follow your point about %rdi, but I agree that the members of the return value are returned in %rax and %rdx.

A valid assembly implementation, ignoring prologue and epilogue, would be: [...]

Agreed.

Conceivably, a mentally-deficient compiler could fill the struct with NO_CLASS padding and make that assembly invalid somehow. I'm wondering if it's written down anywhere that a struct with only two eightbyte fields must be handled this way.

A compiler that produces code conforming to the SysV x86-64 ABI will use the registers already discussed for passing the argument and returning the return value. Such a compiler is of course not obligated to implement the function body exactly as you describe, but I'm not seeing your concern. Yes, these details are written down. Although the specific case you present is not explicitly described in the ABI specification you linked, all of the behavior discussed above follows from that specification. That's the point of it.

A compiler that produces code (for a global function) that behaves differently is not mentally-deficient, it is non-conforming.
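
As a rough sketch of how this can be exercised, the three-instruction body from the question can be linked against compiler-generated C and called like any other function. This version assumes a GNU-compatible toolchain (gcc or clang) on x86-64 Linux with the default AT&T assembler dialect; on any other target it falls back to a plain C body. The helper name roundtrip_ok is hypothetical.

```c
#include <assert.h> /* static_assert (C11) */

typedef struct {
    void *foo;
    void *bar;
} foobar_t;

/* Declared in C; the body is supplied below. */
foobar_t example_function(foobar_t example_param);

#if defined(__x86_64__) && defined(__LP64__) && defined(__linux__) && defined(__GNUC__)
/* The hand-written body relies on the struct being exactly two eightbytes. */
static_assert(sizeof(foobar_t) == 16, "expected two eightbytes");

/* AT&T-syntax equivalent of the body from the question:
 * foo arrives in %rdi, bar in %rsi; the result leaves in %rax:%rdx. */
__asm__(
    ".pushsection .text\n"
    ".globl example_function\n"
    ".type example_function, @function\n"
    "example_function:\n"
    "    movq %rdi, %rax\n"
    "    movq %rsi, %rdx\n"
    "    ret\n"
    ".size example_function, .-example_function\n"
    ".popsection\n"
);
#else
/* Portable fallback so the sketch still builds on other targets. */
foobar_t example_function(foobar_t example_param) { return example_param; }
#endif

/* Hypothetical helper: returns 1 if both members survive the round trip. */
int roundtrip_ok(void) {
    int a = 0, b = 0;
    foobar_t in = { &a, &b };
    foobar_t out = example_function(in);
    return out.foo == (void *)&a && out.bar == (void *)&b;
}
```

The C caller is compiled with no knowledge of how example_function is implemented, which is the situation the ABI is meant to cover.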

The larger context to my question is that I'm writing a simple C11 task switcher for my own edification. I'm basing it largely on boost.context and this is exactly how boost passes two-field structs around. I want to know if it's kosher under all circumstances or if boost is cheating a little.

It would take me more analysis than I'm prepared to expend to determine exactly what Boost is doing in the code you point to. Note that it is not what you present in your example_function. But it is reasonable to suppose that Boost is at least attempting to implement its function calls according to the ABI.

John Bollinger
  • My point about %rdi comes from page 24 of the spec: "If the type has class MEMORY, then the caller provides space for the return value and passes the address of this storage in %rdi as if it were the first argument to the function." My question was, now that I understand it more fully, can a compiler twist itself into classifying this struct as MEMORY by unaligning elements or padding its size to greater than two eightbytes? The answer is no, because page 15 strictly specifies how elements will be aligned and padded. Thank you for your answer. – nickelpro Jun 06 '19 at 19:19