How does stack unwinding work on Cortex-M devices?

Question

I can't find a good explanation for this anywhere. I'm using a Cortex-M4 device (specifically, an STM32L4 Discovery board) and I'm trying to write a tool that prints the call stack for me (specifically, I want the last 10 function calls, i.e., the corresponding program counter values). Obviously this can be done: when I debug the board (using VS Code with the cortex-debug extension, connected to the MCU through a J-Link) I can see the full call stack and even the values that were passed to each function on the stack. However, I can't understand how this is done (how cortex-debug/VS Code gets the backtrace).

When looking at the assembly code I don't see any frame pointer (I see the code uses R7 in some weird way, but I can't backtrace using it). So I thought I might need to use the unwind tables, but when I check whether the ELF contains those sections (.ARM.exidx and .ARM.extab) using arm-none-eabi-readelf --unwind test.elf, I see that there are no unwind sections. So how does VS Code unwind the stack in this case? Is there some special hardware related to the J-Link that keeps that trace somehow, so that the J-Link GDB server can access it? Is there any good source on how these things work? Thanks

score 0 · Answer 1 · answered Jan 10 '22 at 13:00


As far as my limited knowledge of this subject goes, the debugger can show you the call stack while debugging because it relies on data in the .elf file (typically the DWARF call-frame information in the .debug_frame section), from which it "knows" how big each stack frame is, where the return address was saved, etc. In other words, it uses both data read from the MCU and data embedded in the .elf file (only a portion of which is converted to the binary loaded into the MCU).
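To make the idea concrete, here is a minimal sketch of table-driven unwinding over a stack snapshot. It is not how GDB is implemented: the `unwind_rec` layout and the `unwind_snapshot` function are hypothetical names made up for illustration, and the sketch assumes every function saves LR on the stack and has a fixed frame size (real call-frame information describes this per instruction).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-function unwind record. A real debugger
   derives equivalent facts from the debug info in the .elf file. */
typedef struct {
    uint32_t start;       /* first address of the function */
    uint32_t end;         /* one past the last address of the function */
    uint32_t frame_words; /* words the function occupies on the stack */
    uint32_t lr_slot;     /* word offset from SP where LR was saved */
} unwind_rec;

/* Linear search for the record covering pc; NULL if pc is unknown. */
static const unwind_rec *find_rec(const unwind_rec *tab, size_t n, uint32_t pc)
{
    for (size_t i = 0; i < n; i++)
        if (pc >= tab[i].start && pc < tab[i].end)
            return &tab[i];
    return NULL;
}

/* Walk a stack snapshot. stack[0] is the word at the current SP.
   Stores up to max program counter values in out and returns the count. */
size_t unwind_snapshot(const unwind_rec *tab, size_t n,
                       const uint32_t *stack, size_t stack_words,
                       uint32_t pc, uint32_t *out, size_t max)
{
    assert(out != NULL);
    size_t sp = 0, count = 0;

    while (count < max) {
        out[count++] = pc;
        const unwind_rec *r = find_rec(tab, n, pc);
        if (r == NULL || r->frame_words == 0 ||
            r->lr_slot >= r->frame_words ||
            sp + r->frame_words > stack_words)
            break;                          /* no usable info: stop here */
        pc = stack[sp + r->lr_slot] & ~1u;  /* saved LR, Thumb bit cleared */
        sp += r->frame_words;               /* pop this frame */
    }
    return count;
}
```

The debugger does the same kind of walk, except that the table comes from the .elf file on the host and the stack words are read from the target over the debug probe.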

This also means that you cannot, for example, get a backtrace "from the code" (with _Unwind_Backtrace) unless you include the necessary information in the binary loaded into the MCU (the -funwind-tables option).
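For the "from the code" route, a rough sketch using `_Unwind_Backtrace` from GCC's `<unwind.h>` could look like the following. The names `bt_state`, `bt_callback` and `capture_backtrace` are made up for this example. It assumes the program is built with unwind tables (for instance -funwind-tables) and that the linker script keeps the unwind sections; without them the walk typically ends after the first frame or two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unwind.h>

/* Hypothetical helper state: output buffer plus a running count. */
typedef struct {
    uintptr_t *pcs;
    size_t max;
    size_t count;
} bt_state;

/* Called by the unwinder once per frame, innermost frame first. */
static _Unwind_Reason_Code bt_callback(struct _Unwind_Context *ctx, void *arg)
{
    bt_state *st = arg;
    if (st->count >= st->max)
        return _URC_END_OF_STACK;   /* any non-zero code stops the walk */
    st->pcs[st->count++] = (uintptr_t)_Unwind_GetIP(ctx);
    return _URC_NO_REASON;          /* keep walking */
}

/* Fill pcs with up to max program counter values; returns how many. */
size_t capture_backtrace(uintptr_t *pcs, size_t max)
{
    assert(pcs != NULL);
    bt_state st = { pcs, max, 0 };
    _Unwind_Backtrace(bt_callback, &st);
    return st.count;
}
```

Calling `capture_backtrace(buf, 10)` would give the "last 10 program counter values" asked about in the question, as far as the unwind tables cover the active frames. The addresses can then be mapped back to function names on the host, for example with arm-none-eabi-addr2line.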

J_S

This indeed makes some sense, but it means that the debugger (the GDB server) has to do some pretty "ugly" operations, such as reading the code from the .elf file, figuring out the amount of stack that the current function uses, then finding the beginning of the stack frame, then the saved LR, and so on. It just feels unintuitive that this process is that complicated. Can anybody confirm that this is indeed what happens? – fptech20 Jan 10 '22 at 13:11
I don't really think it is ugly. It moves the complexity of stack analysis from the runtime (no frame pointer) to the debugger. – Guillaume Petitjean Jan 10 '22 at 15:18
