Background
I have a custom bare-metal mutex primitive for the STM32F7 (Arm Cortex-M7), written with the LDREX and STREX instructions as described in Arm's Barrier Litmus Tests and Cookbook. I use it to guard critical sections in my code.
It seems to work well across multiple IRQs with differing priorities, and it avoids priority-inversion deadlock with a timeout loop. If the lock spins 100 times on acquisition, I assume it's held by a lower-priority context that the current IRQ has preempted, so I break out with an error return flag and return from the IRQ. The downside is that the critical section for that IRQ instance never runs.
Question
I'm wondering if I missed anything in the CMSIS HAL or STM32F7 reference/programming manual that would allow me (or help me) to easily pause execution in the blocked higher priority IRQ, switch context to the lower priority one, finish execution and free the lock, then return to the higher priority one?
Solutions I've considered/tried
- Switching to an RTOS is the obvious answer, but it's not an option: this is an existing codebase that I don't entirely own.
- I read the sections on "Exception entry and return", etc, and can maybe do it manually with the stack. Seems complex though and I'd have to keep track of which context actually holds the lock.
- Ditch the timeout loop and use a separate timer, checking for priority inversion by examining the stack, and raising the lower priority IRQ to allow it to complete and stop blocking the higher priority IRQ. (Complex, and approaching the territory of just writing a scheduler.)
- Using the WFE and SEV instructions to give up context. I'm not 100% sure, but I don't think this will behave the way I'd want it to, and it seems to be intended more for multiprocessor systems?
- Accept that this is as good as it gets without significantly more effort.
Mutex code and Usage
Compiled with gcc-arm-none-eabi using -mcpu=cortex-m7 -mfpu=fpv5-d16 -mfloat-abi=hard -mthumb.
/* Returns 0 if the lock was acquired, 1 if the spin loop timed out. */
static inline int acquireLock(unsigned int *lock)
{
    unsigned int tempStore = 0;
    unsigned int lockFlag = 1;
    unsigned int timeout = 0;
    unsigned int result = 0;
    __asm__ volatile( //
        "Loop1%=: \n\t" // label for main spinlock loop
        // Lock acquisition spin loop.
        "add %[tim], %[tim], #1 \n\t" // add 1 to timeout counter
        "ldrex %[ts], %[lock] \n\t" // read lock's current state
        "cmp %[ts], #0 \n\t" // check if 0 (lock is available)
        "it eq \n\t" // only try to store if lock is clear
        "strexeq %[ts], %[lf], %[lock] \n\t" // try to grab lock if it is available
        // Loop exit logic block.
        "cmp %[ts], #0 \n\t" // check we got the lock?
        "beq Loop2%= \n\t" // if we got lock, quit loop
        "cmp %[tim], #100 \n\t" // else, check timeout counter
        "bgt Loop2%= \n\t" // quit loop if timeout > 100
        "b Loop1%= \n\t" // else go back to start of spin loop
        // Check and set return value (success) of lock acquisition
        "Loop2%=: \n\t" // label for loop exit
        "cmp %[ts], #0 \n\t" // check if we got lock (vs timeout)
        "ite eq \n\t" // conditional store of return value
        "moveq %[res], #0 \n\t" // return 0 if we got lock
        "movne %[res], #1 \n\t" // else return 1 if we timed out
        "dmb \n\t" // mem barrier for later RWMs
        : [ lock ] "+m"(*lock), [ ts ] "+l"(tempStore), [ tim ] "+l"(timeout),
          [ res ] "=l"(result)
        : [ lf ] "l"(lockFlag)
        : "memory");
    return result;
}
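For comparison, the same bounded spin can be sketched in portable C11 atomics, which makes the logic easier to exercise off-target. This is only a sketch under assumptions: the name acquireLockC11, the LOCK_SPIN_LIMIT macro, and the switch from unsigned int to atomic_uint are hypothetical and not part of the code above. With GCC targeting Cortex-M7, the compare-exchange should lower to an LDREX/STREX sequence with a barrier, but that is worth confirming in the disassembly.

```c
#include <stdatomic.h>

/* Hypothetical bound, chosen to mirror the "timeout > 100" exit in the asm. */
#define LOCK_SPIN_LIMIT 100u

/* Returns 0 if the lock was taken, 1 if the spin limit was exceeded. */
static inline int acquireLockC11(atomic_uint *lock)
{
    for (unsigned int spins = 0; spins <= LOCK_SPIN_LIMIT; ++spins) {
        unsigned int expected = 0u; /* the lock is free when it holds 0 */
        /* Weak CAS may fail spuriously, much like a failed STREX; just retry. */
        if (atomic_compare_exchange_weak_explicit(lock, &expected, 1u,
                                                  memory_order_acquire,
                                                  memory_order_relaxed)) {
            return 0; /* lock acquired */
        }
    }
    return 1; /* gave up: the lock is presumably held by a preempted context */
}
```

The acquire ordering on success plays the role of the trailing dmb in the asm version: accesses inside the critical section cannot be reordered before the lock is taken.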
An example of usage, in an IRQ context:
if (acquireLock(&lock) == 0)
{
    something_critical++;
    releaseLock(&lock);
}
else
{
    return;
}
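The usage above calls releaseLock, whose body is not shown here. A minimal sketch of what a matching release could look like, assuming the usual cookbook pattern of a barrier followed by a plain store of 0, is given below. The name releaseLockSketch and the atomic_uint type are assumptions for illustration and may differ from the real implementation.

```c
#include <stdatomic.h>

/* Hypothetical release: make the critical section's writes visible,
 * then mark the lock as free by storing 0. */
static inline void releaseLockSketch(atomic_uint *lock)
{
    /* Release ordering keeps earlier accesses from moving past the store. */
    atomic_store_explicit(lock, 0u, memory_order_release);
}
```

No exclusive store is needed on release because only the current holder ever writes 0 to the lock.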
Deadlock Example
If it helps, here's a backtrace of the deadlock when I disable the timeout counter in the spinlock loop. You can see TIM6 preempted execution of TIM7 while it was in the process of releasing the lock (but hadn't completed yet).