I have divided the whole question into smaller ones:

  1. Which algorithms can GDB use to reconstruct stack traces?
  2. How does each stack trace reconstruction algorithm work at a high level, and what are its advantages and disadvantages?
  3. What meta-information does the compiler need to emit into the program for each algorithm to work?
  4. Which g++ compiler switches enable or disable each algorithm?
user389238
  • http://en.wikipedia.org/wiki/Call_stack, compiler adds debugging info mapping code offsets to source lines. – Nikolai Fetissov Dec 03 '10 at 19:33
  • That wikipedia article does not go into details about other algorithms than simple "store RBP into stack when calling another function". For example that algorithm will not work anymore if -fomit-frame-pointer is set at compile time. What about stack unwind descriptor? How is that one used to reconstruct stacktrace? Are there any other algorithms? – user389238 Dec 03 '10 at 19:48
  • With `-fomit-frame-pointer` GDB is pooped. – Nikolai Fetissov Dec 03 '10 at 19:50
  • No GDB is not pooped with -fomit-frame-pointer. Try BT command in GDB when the program you were debugging crashed - it still generates correct backtrace. I guess that this has to do something with unwind descriptor which still allows to reconstruct stacktrace. – user389238 Dec 03 '10 at 20:03
  • From `man gcc`: "-fomit-frame-pointer Don't keep the frame pointer in a register for functions **that don't need one**. This avoids the instructions to save, set up and restore frame pointers; it also makes an extra register available in many functions. It also makes debugging impossible on some machines." The leaf function gets pooped. – Nikolai Fetissov Dec 03 '10 at 20:07
  • Exactly on some machines debugging becomes useless with -fomit-frame-pointer. But there are also other stack reconstruction algorithms that don't need frame pointers to reconstruct stacktrace. Try g++ -g3 -fomit-frame-pointer main.cpp and then debug your program – user389238 Dec 03 '10 at 20:20
  • @NikolaiFetissov: Non-leaf functions can still omit the frame pointer. The `.eh_frame` section holds unwind metadata from `.cfi` directives to make this possible, mapping program-counter addresses to stack-pointer offsets from the return address. If you look at the asm output of an optimized non-leaf function for x86-64, you won't see `push %rbp` / `leave` unless it uses `alloca` or whatever. – Peter Cordes May 27 '19 at 22:36
  • Peter, thanks, good to know – Nikolai Fetissov May 28 '19 at 21:42

1 Answer

In pseudocode, you can think of the stack as an array of packed stack frames, where each frame is a variable-size data structure that could be expressed like this:

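#include <cstddef>
#include <cstdint>

// One stack frame, lowest address first (stack growing downward).
template <std::size_t N>
struct stackframe {
    std::uintptr_t contents[N];  // locals, spilled registers, outgoing arguments
#ifndef OMIT_FRAME_POINTER
    void *nextfp;                // saved frame pointer: address of the caller's frame
#endif
    void *retaddr;               // return address into the caller
};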

The problem is that every function has a different N: frame sizes vary.

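The compiler knows the frame sizes and, when generating debugging information, will usually emit them as part of it. All the debugger then needs to do is take the current program counter, look up the enclosing function in the symbol table, and use that to look up the frame size in the debugging information. Add the frame size to the stack pointer and you reach the beginning of the next frame. On ELF systems this is typically the DWARF call frame information in the `.debug_frame` or `.eh_frame` section, which records the offsets per program-counter range rather than per function.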
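As a rough illustration of the frame-size lookup, here is a minimal C++ sketch over a simulated stack. The `FunctionInfo` table, the addresses, and the one-word-per-slot layout are all made up for the example; it is not how GDB stores or reads this data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Word = std::uintptr_t;

// Toy stand-in for "symbol table + debug info" (hypothetical, not a GDB type).
struct FunctionInfo {
    std::string name;
    Word start;               // first code address of the function
    Word end;                 // one past the last code address
    std::size_t frame_words;  // words the function keeps below its return address
};

const FunctionInfo* find_function(const std::vector<FunctionInfo>& table, Word pc) {
    for (const FunctionInfo& f : table)
        if (pc >= f.start && pc < f.end) return &f;
    return nullptr;
}

// Simulated stack: index = address, sp = lowest word of the current frame.
// Assumed frame layout: frame_words words of contents, then the return address.
std::vector<std::string> backtrace_by_frame_size(
        const std::vector<FunctionInfo>& table,
        const std::vector<Word>& stack, Word pc, std::size_t sp) {
    std::vector<std::string> names;
    while (const FunctionInfo* f = find_function(table, pc)) {
        names.push_back(f->name);
        const std::size_t ra_slot = sp + f->frame_words;  // skip this frame's contents
        if (ra_slot >= stack.size()) break;               // ran off the simulated stack
        pc = stack[ra_slot];  // continue in the caller...
        sp = ra_slot + 1;     // ...whose frame starts just above the return address
    }
    return names;
}

// Usage: leaf() called from helper() called from main().
void demo_backtrace_by_frame_size() {
    const std::vector<FunctionInfo> table = {
        {"main", 0x100, 0x200, 1},
        {"helper", 0x200, 0x300, 2},
        {"leaf", 0x300, 0x400, 0},
    };
    const std::vector<Word> stack = {
        0x250,   // [0] leaf: return address into helper
        11, 22,  // [1..2] helper: two words of contents
        0x150,   // [3] helper: return address into main
        33,      // [4] main: one word of contents
        0,       // [5] main: return address outside any known function
    };
    const auto names = backtrace_by_frame_size(table, stack, 0x310, 0);
    assert((names == std::vector<std::string>{"leaf", "helper", "main"}));
}
```

Note that no frame pointer appears anywhere in the walk; the table alone tells the unwinder how far to skip.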

With the frame-size method you do not need frame linkage, so backtracing works fine even with -fomit-frame-pointer. If you do have frame linkage, on the other hand, walking the stack is just following a linked list, because the function prologue initializes the frame pointer in each new stack frame to point to the previous one.
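The linked-list walk over saved frame pointers could look roughly like the following C++ sketch. It runs on a simulated stack (a vector of words indexed by address); the `kEndOfChain` sentinel and the "saved frame pointer, then return address" slot layout are assumptions for the example, loosely modelled on x86.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Word = std::uintptr_t;

// Marks the outermost frame in this simulation (hypothetical sentinel).
constexpr Word kEndOfChain = ~Word{0};

// Simulated stack: one element per machine word, index = address.
// Assumed layout at every frame pointer fp:
//   stack[fp]     saved frame pointer of the caller
//   stack[fp + 1] return address into the caller
std::vector<Word> walk_frame_pointers(const std::vector<Word>& stack, Word fp) {
    std::vector<Word> return_addresses;  // innermost caller first
    while (fp != kEndOfChain) {
        if (fp >= stack.size() || fp + 1 >= stack.size()) break;  // out of bounds
        return_addresses.push_back(stack[fp + 1]);
        const Word caller_fp = stack[fp];
        // The stack grows downward, so a caller's frame must sit at a higher
        // address; anything else means the chain is corrupt.
        if (caller_fp != kEndOfChain && caller_fp <= fp) break;
        fp = caller_fp;
    }
    return return_addresses;
}

// Usage: three nested frames, innermost at the lowest address.
void demo_frame_pointer_walk() {
    const std::vector<Word> stack = {
        0,            // [0] local of the innermost function
        3,            // [1] saved fp -> frame at index 3
        0x1010,       // [2] return address into the middle function
        6,            // [3] saved fp -> frame at index 6
        0x2020,       // [4] return address into the outer function
        0,            // [5] local of the outer function
        kEndOfChain,  // [6] outermost frame: no caller frame
        0x3030,       // [7] return address into the program entry code
    };
    const std::vector<Word> trace = walk_frame_pointers(stack, 1);
    assert((trace == std::vector<Word>{0x1010, 0x2020, 0x3030}));
}
```

The sanity check on `caller_fp` matters in practice: a corrupted stack can otherwise send the walk into a loop.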

If you have neither frame size information nor frame pointers, but still have a symbol table, you can backtrace by reverse engineering the frame sizes from the binary itself. Start with the program counter, look up the function it belongs to in the symbol table, and disassemble that function from its start. Isolate every operation between the beginning of the function and the program counter that modifies the stack pointer (anything that pushes to the stack or allocates stack space). Summing these gives the frame size of the current function at that point. Add it to the stack pointer (assuming the usual downward-growing stack) and you should, on most architectures, find the last word written to the stack before the function was entered, which is usually the return address into the caller. Repeat as necessary.
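To make the prologue-analysis idea concrete, here is a small C++ sketch that uses a toy instruction set instead of a real disassembler. The `Op` and `Insn` types are hypothetical, and the sketch assumes straight-line code between function entry and the program counter, which real functions do not guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction set: only the stack effect of each instruction matters here.
enum class Op { Push, Pop, SubSp, AddSp, Other };

struct Insn {
    Op op;
    std::ptrdiff_t words;  // operand of SubSp/AddSp, ignored otherwise
};

// Number of words the function has placed on the stack when execution is
// about to run fn[pc_index]. Assumes straight-line code (no branches).
std::ptrdiff_t frame_words_at(const std::vector<Insn>& fn, std::size_t pc_index) {
    assert(pc_index <= fn.size());
    std::ptrdiff_t depth = 0;
    for (std::size_t i = 0; i < pc_index; ++i) {
        switch (fn[i].op) {
            case Op::Push:  depth += 1; break;
            case Op::Pop:   depth -= 1; break;
            case Op::SubSp: depth += fn[i].words; break;  // allocate stack space
            case Op::AddSp: depth -= fn[i].words; break;  // release stack space
            case Op::Other: break;                        // no stack effect
        }
    }
    return depth;
}

// Usage: the offset from sp to the return address changes as the function runs.
void demo_frame_words_at() {
    const std::vector<Insn> fn = {
        {Op::Push, 0},   // save a register
        {Op::SubSp, 4},  // allocate four words of locals
        {Op::Other, 0},  // function body
        {Op::AddSp, 4},  // epilogue: release locals
        {Op::Pop, 0},    // restore the register
    };
    assert(frame_words_at(fn, 0) == 0);  // at entry: return address is at sp
    assert(frame_words_at(fn, 2) == 5);  // in the body: return address is at sp + 5
    assert(frame_words_at(fn, 4) == 1);  // after AddSp: only the pushed register remains
}
```

The result depends on where inside the function the program counter sits, which is why the scan has to stop at the program counter rather than cover the whole function.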

Finally, you can analyze the stack contents heuristically. Isolate all words on the stack that point into executable mappings of the process address space (and could therefore be return addresses), then play a what-if game: look up the memory each one points to, disassemble the instruction just before it, and check whether it is a call instruction of some sort. If so, check whether it really called the function of the 'next' frame, and whether you can build an uninterrupted call sequence this way. This works to a degree even if the binary is completely stripped (although all you get in that case is a list of return addresses). I don't think GDB employs this technique, but some low-level embedded debuggers do. On x86 this is very difficult because instruction lengths vary, so you cannot easily step backward through an instruction stream; on RISC architectures with fixed instruction lengths, such as ARM, it is much simpler.
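A very reduced version of such a stack scan might look like the C++ sketch below. It only checks one instruction form, the 5-byte x86 `call rel32` (opcode `0xE8`), and it works on a simulated code region; `CodeRegion` and the addresses are invented for the example, and the sketch does not verify the call target or chain the frames together.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Word = std::uintptr_t;

// A chunk of executable memory in the simulated process (hypothetical type).
struct CodeRegion {
    Word base;                        // address of bytes[0]
    std::vector<std::uint8_t> bytes;  // raw machine code
};

// True if the five bytes just before addr could be an x86 `call rel32`
// (opcode 0xE8 plus a 32-bit displacement). False positives are possible.
bool looks_like_return_address(const CodeRegion& code, Word addr) {
    constexpr std::size_t kCallLen = 5;
    if (addr < code.base) return false;
    const Word offset = addr - code.base;
    if (offset < kCallLen || offset > code.bytes.size()) return false;
    return code.bytes[offset - kCallLen] == 0xE8;
}

// Keep every stack word that passes the check, in stack order.
std::vector<Word> scan_for_return_addresses(const CodeRegion& code,
                                            const std::vector<Word>& stack) {
    std::vector<Word> candidates;
    for (Word w : stack)
        if (looks_like_return_address(code, w)) candidates.push_back(w);
    return candidates;
}

// Usage: one real call site, one in-range decoy, two unrelated words.
void demo_scan_for_return_addresses() {
    // nop; call rel32 (displacement 0); six nops
    const CodeRegion code{0x1000, {0x90, 0xE8, 0, 0, 0, 0,
                                   0x90, 0x90, 0x90, 0x90, 0x90, 0x90}};
    const std::vector<Word> stack = {
        0x7fff0000,  // points outside the code region
        0x1006,      // just after the call: plausible return address
        0x100A,      // inside the code, but not preceded by a call
        42,          // plain data
    };
    assert((scan_for_return_addresses(code, stack) == std::vector<Word>{0x1006}));
}
```

A stale pointer left on the stack can pass this check too, so the result is a list of candidates rather than a verified backtrace.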

There are some holes that sometimes make simple, or even complex and exhaustive, implementations of these algorithms fall over, such as tail-recursive functions, inlined code, and so on. The GDB source code might give you some more ideas:

https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/frame.c

GDB employs a variety of such techniques.

FrankH.
  • I should say that nothing of this is particularly specific to C++. C++ specifics come into play when unwinding exception stacks, but that's not something the debugger does (if you crash in a `throw()` you haven't unwound the exception stack yet, and if you crash in a `catch {}` then you have done so; if you crash in the process of unwinding then you have a compiler bug ... and debugging those is left to the true gods ...). – FrankH. Dec 07 '10 at 12:29
  • I just found another source that suggests that .eh_frame (specific to C++) might be used in GDB to rewind stack: http://sourceware.org/gdb/papers/unwind.html – user389238 Dec 08 '10 at 17:42
  • x86 `call rel32` is always 5 bytes long so it's easy to check a possible call-site for that. x86 GDB backtracing for functions without `.eh_frame` metadata could be able to scan dwords or qwords on the stack looking at pointers as possible code addresses. I think GDB *does* do this to a limited extent; because it can still backtrace through a hand-written asm function with no `.cfi` directives if it only does one `push`, and pushing a dummy register that happens to hold another pointer can lead to extra functions appearing in the backtrace. – Peter Cordes May 27 '19 at 22:33
  • sourceware link is broken – Dan M. Sep 29 '21 at 14:26
  • changed the sourceware link to git (cvsweb must've been deprecated there for nearly as long as the answer is old ...) – FrankH. Sep 30 '21 at 16:17