I wanted to check whether GCC puts the extra code it generates for running destructors when exceptions are thrown into a cold section of the binary, to keep those instructions away from the "happy path" and avoid adding to instruction cache pressure. So I made the example below (Godbolt link), where GCC can't tell whether an exception is going to be thrown. Sure enough, I see a separate buzz(int) [clone .cold.0]: label in the assembly. But there's this curious artifact above it:

.L9:
        mov     edi, OFFSET FLAT:.LC0
        xor     eax, eax
        call    printf
        mov     eax, ebx
        add     rsp, 16
        imul    eax, ebx
        add     eax, ebx
        pop     rbx
        ret
        mov     rbx, rax
        jmp     .L3
buzz(int) [clone .cold.0]:
.L3:
        mov     eax, DWORD PTR [rsp+12]
        ...

Notice the ret instruction is followed by a mov and a jmp into the cold area. But instructions after ret that aren't a jump target (no label) should never run. I thought maybe they could be there for instruction alignment purposes, but ending with a jump into the cold section seems like too weird a coincidence. What's going on? Why are these instructions here? Maybe the protocol for how unwinding works requires them to be there, and the exception unwind table makes this spot a jump target that just happens to be unlabeled? But if that's the case, why not jump straight to the cold section?

C++ code:

#include <stdio.h>

extern int global;

struct Foo
{
    Foo(int x)
    : x(x)
    {}

    ~Foo() { 
        if(x % global == 0) {
            printf("Wow."); 
        }
    }
    int x;
};

void bar(Foo& f); // compiler can't see impl, has to assume could throw

// Type your code here, or load an example.
int buzz(int num) {
    {
       Foo foo(3);
       bar(foo);
    }
    return num * num + num;
}
Joseph Garvin
  • 7
    godbolt filtered out its label `.L5` thinking it is unused but it is in fact referenced in the `.gcc_except_table` (which is also hidden due to the directives filtering). That code is part of the exception handling. – Jester Apr 29 '23 at 21:45
  • 1
    For alignment, GCC always just uses `.p2align`, sometimes with a 3rd operand to specify a padding limit. GAS expands that to long-NOP instructions (at least inside code sections, vs. zeros in data sections if you leave the fill pattern as the default). GCC doesn't even track sizes, so it doesn't know where it is relative to an alignment boundary. – Peter Cordes Apr 29 '23 at 22:08
  • 2
    So in general, you can definitely rule out a guess like `mov rbx, rax` being for alignment. It's either a weird missed optimization, or like this case something does reference it somehow. But it would have to be via a label, since GCC wouldn't be able to calculate an offset from another label; it doesn't track instruction lengths. So remember to be suspicious of Godbolt's filtering of "unused" labels. – Peter Cordes Apr 29 '23 at 22:08
  • @PeterCordes both good points, thanks – Joseph Garvin Apr 29 '23 at 22:17
  • @Jester Ah that makes more sense. Is there a reason that `.gcc_except_table` couldn't point directly to `.L3`? And then the `mov`/`jmp` after `ret` could be omitted. – Joseph Garvin Apr 29 '23 at 22:18
  • I wondered the same thing myself. Presumably the `mov` could be at the top of `.L3` and the `jmp` could be omitted entirely, unless `.L3` has to be contiguous with the main part of the function for some reason. – Peter Cordes Apr 29 '23 at 22:26
  • Trying with clang, it appears to be missing the hot/cold splitting optimization, and it still puts code after ret, but the extra jump is gone. – Joseph Garvin Apr 30 '23 at 15:09
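
To see the path that the comments describe as "part of the exception handling" actually run, here is a minimal sketch. It assumes a hypothetical `bar` that always throws (the question never shows its body), a definition of `global` (the question only declares it `extern`), and two names that are not in the original: a `destructor_runs` counter and a `buzz_unwinds` helper. Because `bar` is visible in the same translation unit here, the generated assembly will likely differ from the question's; the sketch only illustrates that `~Foo()` runs during unwinding, not the exact code layout.

```cpp
#include <cstdio>
#include <stdexcept>

int global = 3;           // assumption: the question only declares `extern int global`
int destructor_runs = 0;  // hypothetical counter, added to observe the cleanup

struct Foo {
    explicit Foo(int x) : x(x) {}
    ~Foo() {
        ++destructor_runs;
        if (x % global == 0) {
            std::printf("Wow.\n");
        }
    }
    int x;
};

// Hypothetical implementation: always throws, so the cleanup path is taken.
void bar(Foo&) {
    throw std::runtime_error("bar failed");
}

int buzz(int num) {
    {
        Foo foo(3);
        bar(foo);  // throws; the unwinder transfers control to cleanup code that runs ~Foo()
    }
    return num * num + num;
}

// Hypothetical helper: returns true if buzz exited via an exception.
bool buzz_unwinds(int num) {
    try {
        buzz(num);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Calling `buzz_unwinds(4)` from a `main` should print "Wow." once and return `true`: the `ret` on the normal path is never reached, and the destructor runs only through the cleanup code that the unwinder locates via the exception table.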
