Are atomic operations alone enough to implement a mutex on x86? I am asking this in relation to out-of-order execution. Apart from atomic access to the integer that records whether the mutex is locked, are there any additional actions that must happen, and why?
1 Answer
The answer depends upon what you mean by "just atomic operations". Properly aligned reads/writes on x86 are atomic, and both Dekker's and Peterson's algorithms for a mutex use only reads and writes. But neither algorithm works correctly without also using (possibly implicit) memory fences. The problem is that both algorithms assume a stronger memory consistency model than x86 has. Specifically, x86 allows a load that follows a store in program order to be performed before that store, provided the two accesses are not to the same address. See here for a detailed example.
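To make the point about fences concrete, here is a rough sketch of Peterson's algorithm for two threads in C++, with every access written as a sequentially consistent `std::atomic` operation. The names `PetersonLock` and `peterson_demo` are invented for illustration and are not taken from any linked example. On x86, compilers typically implement a seq_cst store as XCHG (or as MOV followed by MFENCE), and that is what keeps the later load of the other thread's flag from being performed ahead of the store.

```cpp
#include <atomic>
#include <thread>

// Hypothetical two-thread Peterson lock. All atomics use the default
// std::memory_order_seq_cst, which supplies the store->load ordering
// that plain x86 MOV stores and loads do not guarantee.
struct PetersonLock {
    std::atomic<bool> want[2] = {{false}, {false}};
    std::atomic<int> turn{0};

    void lock(int me) {
        const int other = 1 - me;
        want[me].store(true);  // announce intent to enter
        turn.store(other);     // give priority to the other thread
        // These loads must not be performed before the stores above.
        while (want[other].load() && turn.load() == other) {
            std::this_thread::yield();
        }
    }

    void unlock(int me) { want[me].store(false); }
};

// Illustrative driver: two threads each add `iters` to a plain counter.
long peterson_demo(int iters) {
    PetersonLock lock;
    long counter = 0;  // protected only by the lock
    auto worker = [&](int me) {
        for (int i = 0; i < iters; ++i) {
            lock.lock(me);
            ++counter;
            lock.unlock(me);
        }
    };
    std::thread t0(worker, 0), t1(worker, 1);
    t0.join();
    t1.join();
    return counter;
}
```

If the two stores in `lock` were weakened to `std::memory_order_release` (plain MOV on x86), each thread's load of the other's flag could be performed early, both could read `false`, and both could enter the critical section together.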
If by "just atomic operations", you include LOCK-prefixed instructions and XCHG (which has an implicit LOCK prefix), then the answer is yes, since said instructions have an implicit memory fence. For example, an XCHG instruction can be used to perform the sequentially consistent store required by Dekker's and Peterson's algorithms, or used to implement the usual test-and-set approach.
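As a minimal sketch of the test-and-set approach, assuming a C++11 compiler: `std::atomic::exchange` on an integer typically compiles to XCHG with a memory operand on x86, so the implicit LOCK supplies both the atomicity and the fence. The names `SpinLock` and `spinlock_demo` are made up for this example.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical test-and-set spinlock. exchange() is expected to become
// an XCHG instruction on x86, which is implicitly LOCKed.
class SpinLock {
    std::atomic<int> locked{0};  // 0 = free, 1 = held

public:
    void lock() {
        // Atomically write 1 and read the old value; an old value of 0
        // means this thread acquired the lock.
        while (locked.exchange(1, std::memory_order_acquire) != 0) {
            std::this_thread::yield();
        }
    }

    void unlock() {
        // A release store is enough here; on x86 it is a plain MOV.
        locked.store(0, std::memory_order_release);
    }
};

// Illustrative driver: `threads` workers each add `iters` to a counter.
long spinlock_demo(int threads, int iters) {
    SpinLock lock;
    long counter = 0;  // protected only by the lock
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

No separate fence instruction appears anywhere: the only ordering the lock needs comes from the exchange in `lock` and from x86's guarantee that stores are not reordered with earlier loads or stores.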
