TL;DR - WFE is wait for event. Events and the global monitor are orthogonal concepts. They have synergy when used together, but are completely separate.
Query- if a particular exclusive transaction is successful, should I signal a clear on the global monitor?
No, in this case the transaction succeeded and the global monitor is cleared automatically. WFE is different from the global monitor for exclusive access. SEV is send an event. It is not the global monitor.
To clear the global monitor, use clrex. An ldrex reserves the global monitor and an strex commits the global monitor if successful. The monitor itself tracks the 'global' state of memory. Each CPU/core can have a different working copy of the memory to update. Normally, the strex will fail if another core has reserved and committed the same memory. It is normal to re-issue an ldrex to retrieve the updated memory copy when strex fails.
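The reserve/commit/retry pattern above can be sketched in portable C. This is a minimal illustration, not the exact instruction sequence: it assumes C11 atomics, which compilers for ARMv7 targets typically lower to an ldrex/strex loop, and the function name atomic_add_retry is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: atomically add `delta` to `*counter`.
 * The load plays the role of ldrex, the compare-exchange the role of strex. */
static uint32_t atomic_add_retry(_Atomic uint32_t *counter, uint32_t delta)
{
    uint32_t old = atomic_load_explicit(counter, memory_order_relaxed);

    /* The weak variant may fail, much like strex. On failure `old` is
     * refreshed with the current value, which mirrors re-issuing ldrex. */
    while (!atomic_compare_exchange_weak_explicit(counter, &old, old + delta,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed)) {
        /* retry with the updated copy */
    }
    return old + delta;
}
```

The loop only exits once the commit succeeds, so a competing update from another core costs a retry rather than a lost write.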
An issue arises when a core supports pre-emption and/or interrupts. One context on the core may issue an ldrex and then be pre-empted by a successful ldrex/strex pair. When the context returns, the prior ldrex is no longer reserving anything and the result of the strex is undefined. In this case, the OS (or interrupt code) must issue a clrex to force the original paired strex to fail and retry. For a Cortex-M, the system often does an implicit clrex on a return from interrupt, but you need to read your system documentation. For some Cortex-A systems, you need an explicit clrex (and the same when switching between normal/secure worlds).
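As a rough sketch of where such a clrex might live, assuming a GCC/Clang toolchain with inline assembly: the task structure, current_task, and switch_to below are hypothetical stand-ins for whatever your OS actually uses, and the architecture guard is an assumption that lets the code build as a no-op on a non-ARM host.

```c
#include <stddef.h>

struct task { int id; };           /* hypothetical task descriptor */
static struct task *current_task;  /* hypothetical scheduler state */

/* Drop any outstanding exclusive reservation on this core. */
static inline void clear_exclusive_monitor(void)
{
#if defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
    __asm__ volatile("clrex" ::: "memory");
#else
    /* Not an ARM target with clrex: nothing to clear. */
#endif
}

/* Hypothetical context-switch hook. Clearing the monitor here forces a
 * pending strex in the resumed context to fail and retry. */
static void switch_to(struct task *next)
{
    clear_exclusive_monitor();
    current_task = next;
}
```

Whether this is needed at all depends on the core; check the architecture manual for your part before adding it.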
What is your use case for using WFE/SEV with ldrex/strex? I think it needs to be a simple flag as opposed to some lock-free data structure. I guess WFE/SEV could augment plain ldrex/strex for fairness between cores.
Specifically, it is valuable for a semaphore (simple flag). The 'semTake()' will do a WFE to sleep when it fails. The 'semGive()' will issue a SEV to wake all sleepers in a 'semTake()'. If a core has gained access to the semaphore, having the other cores sleep will result in a faster ldrex/strex to put the semaphore back into its free state, as well as saving power on the blocked cores. (Rosetta, vxWorks/Posix: semGive/sem_post, semTake/sem_wait. The vxWorks names seem best for a binary semaphore or mutex depending on pedantics... but these are the ARM primitives).
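A minimal sketch of such a binary semaphore, under some assumptions: the flag is updated with C11 atomics (typically lowered to ldrex/strex on ARMv7), WFE/SEV are issued through GCC/Clang inline assembly, and on other targets the macros fall back to no-ops so the take loop simply spins. The names bin_sem_t, sem_try_take, sem_take and sem_give are hypothetical and are not the vxWorks or POSIX APIs.

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
#define CPU_WFE() __asm__ volatile("wfe" ::: "memory")
/* dsb makes the releasing store visible before the event is sent. */
#define CPU_SEV() __asm__ volatile("dsb sy\n\tsev" ::: "memory")
#else
#define CPU_WFE() ((void)0)  /* non-ARM host: plain spin */
#define CPU_SEV() ((void)0)
#endif

typedef struct { atomic_bool taken; } bin_sem_t;

/* One attempt to claim the flag; returns true on success. */
static bool sem_try_take(bin_sem_t *s)
{
    bool expected = false;
    return atomic_compare_exchange_strong_explicit(&s->taken, &expected, true,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

/* Sleep in WFE until the flag can be claimed. */
static void sem_take(bin_sem_t *s)
{
    while (!sem_try_take(s))
        CPU_WFE();
}

/* Release the flag, then wake every core waiting in WFE. */
static void sem_give(bin_sem_t *s)
{
    atomic_store_explicit(&s->taken, false, memory_order_release);
    CPU_SEV();
}
```

Because WFE can also return for unrelated events, the take loop always re-checks the flag rather than assuming the wake-up means the semaphore is free.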