Proper gdb backtrace from RISC-V trap / HardFault

Question

I'm using a RISC-V (rv32imac_zicsr) chip and have trouble debugging traps/hardfaults with gdb (riscv-none-elf-gcc toolchain v12.2.0-1 from xPack). With ARM chips, gdb's bt lists some, but not all, of the functions leading up to the hardfault.

Assume we have these functions in the error stack: [ HardFault_Handler, Erroneous_Function, Calling_Function ]

ARM GDB would by default show the following backtrace: [ HardFault_Handler, Erroneous_Function ]

Assuming a simple RISC-V HardFault_Handler which would work similarly in ARM:

void HardFault_Handler(void) __attribute__((naked));
void HardFault_Handler(void)
{
    __asm("EBREAK;");
}

gdb will show the backtrace [ HardFault_Handler, Calling_Function ], followed by "Backtrace stopped: frame did not save the PC".

gdb seems to be able to read only the address in pc (HardFault_Handler) and the address in ra, the return address at the time of the trap (Calling_Function). This skips Erroneous_Function completely, because its address is in mepc.

Additionally, with the (naked) attribute the stack pointer is not modified when entering HardFault_Handler, so sp still points at the stack of Erroneous_Function, and inspecting Calling_Function would not be valid either (the same holds without the naked attribute).

So I tested the following HardFault_Handler:

void HardFault_Handler(void) __attribute__((naked));
void HardFault_Handler(void)
{
    __asm(
        "csrr ra, mepc;"
        "EBREAK;"
    );
}

With this, the backtrace looks like this: [ HardFault_Handler, Erroneous_Function ]

And thanks to the (naked) attribute, Erroneous_Function IS valid and can be inspected, but all the information about Calling_Function stored in ra is lost. I can of course store it temporarily (possibly discarding some relevant registers in Erroneous_Function), but I have no way to easily inspect the stack. This does not seem to be possible on ARM either, but since the information IS available, I wonder if this is something that has to be done in GDB or if there is a way to make it better. For example, I would not mind having only [ Erroneous_Function, Calling_Function ] in the backtrace, but this would require EBREAK to trigger delayed, that is, after mret has written the PC and the PC is back on the erroneous instruction. I do not know of a way to make EBREAK trigger one instruction late, or anything like it.
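To illustrate the "store it temporarily" idea, here is a minimal sketch that saves the original ra to memory before overwriting it with mepc, without permanently clobbering a general-purpose register. The struct `trap_record` and the global `last_trap` are hypothetical names made up for this example, and the sketch assumes that mscratch is not used by anything else (an RTOS or SDK trap entry may use it) and that the target is RV32. It does not give gdb a full backtrace; it only keeps the lost address around for manual inspection.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of the pre-trap state (not part of any SDK). */
struct trap_record {
    uint32_t mepc; /* faulting instruction, inside Erroneous_Function */
    uint32_t ra;   /* return address at trap time, inside Calling_Function */
};

/* Global so the handler can reach it by symbol name and gdb can print it. */
volatile struct trap_record last_trap __attribute__((used));

/* The stores in the handler hard-code these offsets. */
_Static_assert(offsetof(struct trap_record, mepc) == 0, "mepc must be at offset 0");
_Static_assert(offsetof(struct trap_record, ra) == 4, "ra must be at offset 4");

#if defined(__riscv) && (__riscv_xlen == 32)
void HardFault_Handler(void) __attribute__((naked));
void HardFault_Handler(void)
{
    __asm(
        "csrw mscratch, t0\n"  /* park t0; ASSUMES mscratch is otherwise unused */
        "la   t0, last_trap\n" /* t0 = &last_trap */
        "sw   ra, 4(t0)\n"     /* last_trap.ra = original ra */
        "csrr ra, mepc\n"      /* ra = mepc, so gdb unwinds into Erroneous_Function */
        "sw   ra, 0(t0)\n"     /* last_trap.mepc = mepc */
        "csrr t0, mscratch\n"  /* restore t0 */
        "ebreak\n"
    );
}
#endif
```

Once the EBREAK halts the core, `p/x last_trap` in gdb shows both addresses, and `info symbol last_trap.ra` or `list *last_trap.ra` should map the saved return address back to Calling_Function. Only t0 is borrowed, and it is restored before the breakpoint, so the registers of Erroneous_Function other than ra remain as they were at the trap.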

This serves both as a reference for others who want gdb to at least return a working backtrace (since I didn't find anything like it for RISC-V), and as a request for input:

What is the best way to get a better backtrace in a trap on RISC-V processors (preferably with rv32imac)? Thanks!

Note: (naked) does preserve the SP, but it seems gdb (or the debugging tool I use, VS Code with cortex-debug) can also handle the case where the handler modifies it. So the second code snippet with a regular interrupt("machine") attribute also allows me to inspect the faulty function. - Seneral, Jan 13 '23 at 19:48
