
Our current project uses FreeRTOS, and I added --use_frame_pointer to Keil uVision's ARMGCC compiler options. After loading the firmware onto an STM32F104 chip and running it, it crashes. Without --use_frame_pointer, everything works. The hard fault handler shows that faultStackAddress is 0x40FFFFDC, which points to a reserved area. Does anyone have any idea what causes this error? Thanks a lot.

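#if defined(__CC_ARM)
__asm void HardFault_Handler(void)
{
   TST lr, #4                    ; bit 2 of EXC_RETURN selects the stack that was in use
   ITE EQ
   MRSEQ r0, MSP                 ; bit clear: main stack
   MRSNE r0, PSP                 ; bit set: process stack
   B __cpp(Hard_Fault_Handler)   ; r0 carries the stack pointer as the first argument
}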
#else
void HardFault_Handler(void)
{
   __asm("TST lr, #4");
   __asm("ITE EQ");
   __asm("MRSEQ r0, MSP");
   __asm("MRSNE r0, PSP");
   __asm("B Hard_Fault_Handler");
}
#endif

void Hard_Fault_Handler(uint32_t *faultStackAddress)
{

}
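
The pointer passed to Hard_Fault_Handler is the stack pointer at exception entry, so the eight words that the Cortex-M3 stacks automatically (r0-r3, r12, lr, pc, xPSR) can normally be read from it. A minimal sketch of decoding that frame, assuming the pointer is valid (which it is not in the case above, where it points at 0x40FFFFDC); the StackedFrame type and decodeStackedFrame function are hypothetical names, not part of the original code:

```c
#include <stdint.h>

/* Registers pushed automatically on Cortex-M3 exception entry,
   lowest address first. Hypothetical type, for illustration only. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, psr;
} StackedFrame;

/* Copy the stacked registers out of the faulting stack.
   Assumes faultStackAddress points at readable memory. */
static StackedFrame decodeStackedFrame(const uint32_t *faultStackAddress)
{
    StackedFrame f;
    f.r0  = faultStackAddress[0];
    f.r1  = faultStackAddress[1];
    f.r2  = faultStackAddress[2];
    f.r3  = faultStackAddress[3];
    f.r12 = faultStackAddress[4];
    f.lr  = faultStackAddress[5];  /* return address of the interrupted function */
    f.pc  = faultStackAddress[6];  /* address of the faulting instruction */
    f.psr = faultStackAddress[7];
    return f;
}
```

Calling this from Hard_Fault_Handler and inspecting f.pc in the debugger would show where execution was when the fault was taken, provided the stack pointer itself has not been corrupted.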

I stepped through the code line by line, and the crash happened in the function below, in FreeRTOS's port.c, after I called vTaskDelete(NULL):

void vPortYieldFromISR( void )
{
    /* Set a PendSV to request a context switch. */
    portNVIC_INT_CTRL_REG = portNVIC_PENDSVSET_BIT;
}

But this does not seem to be the root cause, because the crash still happened after I removed vTaskDelete(NULL).

[update on Jan 8] sample code

#include "FreeRTOSConfig.h"
#include "FreeRTOS.h"
#include "task.h"
#include <stm32f10x.h>

void crashTask(void *param)
{

    unsigned int i = 0;
    /* halt the hardware. */
    while(1)
    {
         i += 1;
    }
    vTaskDelete(NULL);
}
void testCrashTask()
{
    xTaskCreate(crashTask, (const signed char *)"crashTask",  configMINIMAL_STACK_SIZE,  NULL,  1,  NULL);    
}

void Hard_Fault_Handler(unsigned int *faultStackAddress);

/* The fault handler implementation calls a function called Hard_Fault_Handler(). */
#if defined(__CC_ARM)
__asm void HardFault_Handler(void)
{
   TST lr, #4
   ITE EQ
   MRSEQ r0, MSP
   MRSNE r0, PSP
   B __cpp(Hard_Fault_Handler)
}
#else
void HardFault_Handler(void)
{
   __asm("TST lr, #4");
   __asm("ITE EQ");
   __asm("MRSEQ r0, MSP");
   __asm("MRSNE r0, PSP");
   __asm("B Hard_Fault_Handler");
}
#endif

void Hard_Fault_Handler(unsigned int *faultStackAddress)
{
    int i = 0;
    while(1)
    {
        i += 1;
    }
}

void nvicInit(void)
{

    NVIC_PriorityGroupConfig(NVIC_PriorityGroup_4);
    #ifdef  VECT_TAB_RAM                            
    NVIC_SetVectorTable(NVIC_VectTab_RAM, 0x0);     
    #else
    NVIC_SetVectorTable(NVIC_VectTab_FLASH, 0x0);  
    #endif
}

int main()
{
    nvicInit();

    testCrashTask();
    vTaskStartScheduler();

}

/* For now, the IDLE task has 88 left of its stack depth. If you want to add
   function calls here, you should increase it. */
void vApplicationIdleHook(void)
{   /* ATTENTION: functions called from here must not block */
    //workerProbe();
}

void debugSendTraceInfo(unsigned int taskNbr)
{
}

When the crash happens, the Keil MDK IDE reports the fault information below in HardFault_Handler. I looked up the STKERR error, which mainly means that the stack pointer is corrupted. But I really have no idea why it is corrupted. Without --use_frame_pointer, everything works fine.

[image: fault report shown by Keil MDK]
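
For reference, STKERR is one bit of the BusFault status byte inside the Cortex-M3 Configurable Fault Status Register (CFSR). It is set when the hardware's automatic register push on exception entry hits a bus error, which would be consistent with a stack pointer aimed at a reserved area. A minimal sketch of testing for it, assuming the register value has already been read (on target, typically through the CMSIS SCB->CFSR field); the macro and function names are hypothetical:

```c
#include <stdint.h>

/* BusFault status bits as they appear in the 32-bit CFSR
   (the BusFault byte occupies bits 8..15). Hypothetical macro names. */
#define CFSR_IBUSERR      (1UL << 8)   /* instruction bus error */
#define CFSR_PRECISERR    (1UL << 9)   /* precise data bus error */
#define CFSR_IMPRECISERR  (1UL << 10)  /* imprecise data bus error */
#define CFSR_UNSTKERR     (1UL << 11)  /* fault while unstacking on exception return */
#define CFSR_STKERR       (1UL << 12)  /* fault while stacking on exception entry */
#define CFSR_BFARVALID    (1UL << 15)  /* BFAR holds a valid fault address */

/* Returns nonzero if the CFSR value reports a stacking bus fault. */
static int cfsrHasStackingError(uint32_t cfsr)
{
    return (cfsr & CFSR_STKERR) != 0;
}
```

A stacking fault says that the stack pointer was already bad when the exception was taken; it does not by itself say what corrupted it.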

[update on Jan 13]
I investigated further. The crash seems to be caused by FreeRTOS's default TimerTask. If I comment out xTimerCreateTimerTask() in the vTaskStartScheduler() function (tasks.c), the crash does not happen.
Another odd thing: if I debug it, step into the TimerTask's portYIELD_WITHIN_API() function call, and then resume the application, it does not crash. So my guess is that the crash is timing-dependent, but I could not find its root cause.
Any thoughts? Thanks.

artless noise
bettermanlu
  • Do you perhaps have assembly code that explicitly uses R11? – Clifford Dec 10 '15 at 07:34
  • If I don't call certain FreeRTOS functions, such as creating tasks, there is no hard fault. So I think R11 is not the root cause. Actually, I also tried another project that does not use R11, and it also crashed. I am not sure whether this is due to some compilation compatibility issue. – bettermanlu Dec 10 '15 at 08:58
  • I also tried to increase FreeRTOS's task stack size, but it still crashed. – bettermanlu Dec 10 '15 at 09:02
  • R7 is typically the frame pointer in Thumb ABIs, although in this case I would also suspect stack corruption - some saved FP gets trashed, then eventually gets loaded back into SP when returning through that frame, causing unrelated things to fall apart; without frame pointers, the stack layout is slightly different, and the rogue write hits something less critical like padding bytes or a finished-with local variable. – Notlikethat Dec 10 '15 at 09:48
  • @Notlikethat : On Thumb-2 used on Cortex-M, it is R11 not R7. – Clifford Dec 10 '15 at 11:20
  • All we can do with this is offer debugging advice, rather than answer your question. As such, unless you can post complete (and short) code that exhibits the behaviour, it may not be possible to answer. – Clifford Dec 10 '15 at 11:29
  • Added very simple code and the fault error info. – bettermanlu Jan 08 '16 at 09:35
  • @Clifford Could you take a look at my newly added information? Thanks a lot. – bettermanlu Jan 13 '16 at 10:54

1 Answer


I ran into a similar problem in my project. It looks like armcc --use_frame_pointer tends to generate broken function epilogues. An example of the generated code:

; function prologue
stmdb   sp!, {r3, r4, r5, r6, r7, r8, r9, r10, r11, lr}
add.w   r11, sp, #36

; ... actual function code ...

; function epilogue
mov     sp, r11
        ; <--- imagine an interrupt happening here
sub     sp, #36
ldmia.w sp!, {r3, r4, r5, r6, r7, r8, r9, r10, r11, pc}

This code actually seems to break the constraint from AAPCS section 5.2.1.1:

A process may only access (for reading or writing) the closed interval of the entire stack delimited by [SP, stack-base – 1] (where SP is the value of register r13).

Now, on Cortex-M3, when an exception/interrupt arrives, a partial register set is automatically pushed onto the current process's stack before jumping into the exception handler. If an exception is raised between the mov and the sub, that partial register set will overwrite the registers stored by the function prologue's stmdb instruction, thus corrupting the state of the caller function.
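
The effect can be illustrated with a small host-side model of the stack. This is only a sketch under simplifying assumptions: the stack is an array of words indexed downward, the exception frame is exactly eight words with no alignment padding, and the function body allocates four words of locals. All names and marker values are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define STACK_WORDS 32
#define SAVED_REGS  10   /* r3-r11 and lr, as pushed by the stmdb above */
#define FRAME_WORDS 8    /* r0-r3, r12, lr, pc, xPSR pushed on exception entry */

/* Model of hardware stacking: eight words are written just below sp. */
static void pushExceptionFrame(uint32_t *stack, unsigned sp)
{
    unsigned i;
    for (i = 1; i <= FRAME_WORDS; ++i)
        stack[sp - i] = 0xDEADBEEFu;
}

/* Returns how many of the saved registers are clobbered when an exception
   arrives between the mov and the sub (interruptAfterMov != 0) or only
   after the sub (interruptAfterMov == 0). */
static unsigned clobberedSavedRegs(int interruptAfterMov)
{
    uint32_t stack[STACK_WORDS];
    unsigned sp = STACK_WORDS;   /* word index; the stack grows toward 0 */
    unsigned i, r11, clobbered = 0;

    memset(stack, 0, sizeof stack);

    /* prologue: stmdb sp!, {r3-r11, lr} */
    sp -= SAVED_REGS;
    for (i = 0; i < SAVED_REGS; ++i)
        stack[sp + i] = 0x1000u + i;   /* one marker per saved register */
    r11 = sp + 9;                      /* add.w r11, sp, #36 (36 bytes = 9 words) */

    sp -= 4;                           /* function body: assumed locals */

    sp = r11;                          /* epilogue: mov sp, r11 */
    if (interruptAfterMov)
        pushExceptionFrame(stack, sp); /* sp is restored on exception return */
    sp -= 9;                           /* epilogue: sub sp, #36 */
    if (!interruptAfterMov)
        pushExceptionFrame(stack, sp);

    /* The ldmia would now reload the saved registers from stack[sp..sp+9]. */
    for (i = 0; i < SAVED_REGS; ++i)
        if (stack[sp + i] != 0x1000u + i)
            ++clobbered;
    return clobbered;
}
```

In this model, clobberedSavedRegs(1) reports eight clobbered words (the saved r4-r11), while clobberedSavedRegs(0) reports none, because the frame then lands below the saved registers.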

Unfortunately, there doesn't seem to be any easy solution. None of the optimization settings seems to fix this code, even though it looks easy to fix (by merging the two instructions into sub sp, r11, #36). It seems that --use_frame_pointer is too broken to work on Cortex-M3 with multi-threaded code, at least on ARMCC 5.05u1; I didn't have the chance to check other versions.

If using a different compiler is an option for you, arm-none-eabi-gcc -fno-omit-frame-pointer seems to emit saner function epilogues, though.

kFYatek