Why does GCC generate a strange way to move the stack pointer?

Question

I have observed that GCC's C++ compiler generates the following assembler code:

sub    $0xffffffffffffff80,%rsp

This is equivalent to

add    $0x80,%rsp

i.e. remove 128 bytes from the stack.

Why does GCC generate the first sub variant and not the add variant? The add variant seems far more natural to me than exploiting an underflow.

This occurred only once in a quite large code base, and I have no minimal C++ example that triggers it. I am using GCC 7.5.0.

Nate Eldredge · Accepted Answer · 2021-12-22T18:10:17.827

Try assembling both and you'll see why.

   0:   48 83 ec 80             sub    $0xffffffffffffff80,%rsp
   4:   48 81 c4 80 00 00 00    add    $0x80,%rsp

The sub version is three bytes shorter.

This is because the add and sub immediate instructions on x86 have two forms. One takes an 8-bit sign-extended immediate, and the other a 32-bit sign-extended immediate. See https://www.felixcloutier.com/x86/add; the relevant forms are (in Intel syntax) add r/m64, imm8 and add r/m64, imm32. The 32-bit form needs four immediate bytes instead of one, so it is three bytes larger.

The number 0x80 can't be represented as an 8-bit signed immediate; since the high bit is set, it would sign-extend to 0xffffffffffffff80 instead of the desired 0x0000000000000080. So add $0x80, %rsp would have to use the 32-bit form add r/m64, imm32. On the other hand, 0xffffffffffffff80 would be just what we want if we subtract instead of adding, and so we can use sub r/m64, imm8, giving the same effect with smaller code.
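The sign-extension argument above can be checked with a few lines of Python. This is only a sketch of the arithmetic, not of anything GCC does internally; the helper names (sign_extend, add64, sub64) and the example rsp value are made up for illustration.

```python
MASK64 = (1 << 64) - 1  # 64-bit registers wrap modulo 2**64


def sign_extend(value, bits):
    """Treat the low `bits` bits of value as two's complement and widen to 64 bits."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):  # high bit set: the value is negative
        value -= 1 << bits
    return value & MASK64


def add64(reg, imm):
    """Result of a 64-bit add, ignoring flags."""
    return (reg + imm) & MASK64


def sub64(reg, imm):
    """Result of a 64-bit sub, ignoring flags."""
    return (reg - imm) & MASK64


rsp = 0x00007FFFFFFFE000  # arbitrary example value (an assumption)

imm8 = sign_extend(0x80, 8)
print(hex(imm8))                             # 0xffffffffffffff80
print(sub64(rsp, imm8) == add64(rsp, 0x80))  # True: sub imm8 moves rsp up by 128
print(add64(rsp, imm8) == sub64(rsp, 0x80))  # True: add imm8 would move it down instead
```

The last line shows why add with the 8-bit encoding is not an option here: the byte 0x80 sign-extends to -128, so it would subtract 128 rather than add it.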

I wouldn't really say it's "exploiting an underflow". I'd just interpret it as sub $-0x80, %rsp. The compiler is just choosing to emit 0xffffffffffffff80 instead of the equivalent -0x80; it doesn't bother to use the more human-readable version.

Note that 0x80 is actually the only possible number for which this trick is relevant; it's the unique nonzero 8-bit number that is its own negative mod 2^8. Any smaller number can just use add, and any larger number has to use 32 bits anyway. In fact, 0x80 is the only reason that we couldn't just omit sub r/m, imm8 from the instruction set and always use add with negative immediates in its place. I guess a similar trick does come up if we want to do a 64-bit add of 0x0000000080000000; sub will do it, but add can't be used at all, as there is no imm64 version; we'd have to load the constant into another register first.
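To see that 0x80 really is the only 8-bit case, here is a small brute-force sketch. The function shortest_form is a hypothetical selection rule written for this illustration; it is not GCC's actual logic, only the "smallest immediate that works" idea described above.

```python
def fits_signed(value, bits):
    """True if value is representable as a `bits`-bit two's complement immediate."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def shortest_form(delta):
    """Pick the smallest immediate encoding that adds `delta` to a 64-bit register.

    Returns (mnemonic, immediate width in bits), or None when no immediate
    form works and the constant must be loaded into a register first.
    """
    for bits in (8, 32):
        if fits_signed(delta, bits):
            return ("add", bits)
        if fits_signed(-delta, bits):
            return ("sub", bits)
    return None


# Every positive delta up to 511 for which sub with imm8 beats add
tricks = [d for d in range(1, 1 << 9) if shortest_form(d) == ("sub", 8)]
print(tricks)                     # [128]

print(shortest_form(0x7F))        # ('add', 8)
print(shortest_form(0x81))        # ('add', 32)
print(shortest_form(0x80000000))  # ('sub', 32): add has no encoding for this
print(shortest_form(0x80000001))  # None: needs a register
```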

Re: omit `sub`-immediate: MIPS actually does do that, despite using 2's complement imm16. But that's because MIPS doesn't have a FLAGS register; x86 does, and `add $-1, %reg` produces a different carry-flag output from `sub $1, %reg`. (x86's CF acts as a borrow for subtract, rather than a !borrow, so it's actually *always* different in the -1 / +1 case.) — Peter Cordes, Dec 23 '21 at 04:48
Also, x86 opcodes that date back to 8086 follow some patterns, so it was probably simpler to have those execute as sub-immediate instead of NOP on 8086 which didn't have a #UD exception. Intel could have chosen not to document that instruction and reserve the opcode for future use, but with `salc` that didn't happen until amd64. — Peter Cordes, Dec 23 '21 at 04:50
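The carry-flag point from the first comment can be sketched the same way. The model below assumes the usual x86 definitions (CF after add is the carry out of bit 63; CF after sub is the borrow, i.e. set when the first operand is smaller as an unsigned number); the function names are made up for illustration.

```python
MASK64 = (1 << 64) - 1


def add_result_cf(a, b):
    """64-bit add: returns (result, CF), where CF is the carry out of bit 63."""
    total = (a & MASK64) + (b & MASK64)
    return total & MASK64, total >> 64


def sub_result_cf(a, b):
    """64-bit sub: returns (result, CF), where CF is the borrow (a < b unsigned)."""
    a &= MASK64
    b &= MASK64
    return (a - b) & MASK64, int(a < b)


for reg in (0, 1, 2, 0x80, MASK64):
    r_add, cf_add = add_result_cf(reg, -1)  # add $-1, %reg
    r_sub, cf_sub = sub_result_cf(reg, 1)   # sub $1, %reg
    # Same register result, opposite carry flag
    print(hex(reg), r_add == r_sub, cf_add, cf_sub)
```

In every row the results match and the two CF values differ, which is why add with a negated immediate is not a drop-in replacement for sub when the flags are consumed afterwards.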
