Let's assume you know you have one processor. Let's also assume that your processor has an atomic instruction, BBSS (branch on bit set and set), that cannot be interrupted: it branches if the bit is already set, falls through if the bit is clear, and leaves the bit set either way.
You can then do your locking using such an instruction:
BBSS DID_NOT_GET_LOCK, #1,LOCK_LOCATION
; Critical Section
; . .. . . . .
MOV #0, LOCK_LOCATION ; End critical section
DID_NOT_GET_LOCK:
Locking is simple to implement on such a single-processor system.
If you add multiple CPUs into the mix, that locking scheme fails miserably. The instruction I described makes at least two memory accesses:
If (Bit is Set) ; Memory test
Goto Destination
Else
Set Bit ; Memory Set
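With multiple processors, the instruction is uninterruptible only on the CPU executing it; nothing stops a second CPU from touching the same memory between the test and the set. Processes running on different CPUs could therefore see the bit as clear at the same time, and each would enter the critical section.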
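
To make that interleaving concrete, here is a minimal sketch in C rather than assembly. It is not the real instruction: `read_bit`, `write_bit` and both `*_entrants` functions are hypothetical names invented for this illustration, and the "two CPUs" are replayed by hand in a single thread so the bad ordering is deterministic. The second function runs the same two attempts through C11's `atomic_flag_test_and_set`, which does the test and the set as one indivisible step.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the non-interlocked instruction: the test and
   the set are two separate memory accesses. */
static bool read_bit(const int *lock) { return *lock != 0; } /* memory test */
static void write_bit(int *lock)      { *lock = 1; }         /* memory set  */

/* Replays one bad interleaving of two CPUs by hand and returns how many
   of them end up inside the critical section. */
static int split_test_and_set_entrants(void) {
    int lock = 0;
    bool a_saw_set = read_bit(&lock); /* CPU A tests: bit is clear        */
    bool b_saw_set = read_bit(&lock); /* CPU B tests before A has set it  */
    write_bit(&lock);                 /* CPU A sets the bit               */
    write_bit(&lock);                 /* CPU B sets the bit as well       */
    return (!a_saw_set) + (!b_saw_set);
}

/* The same two attempts with an indivisible test-and-set: whoever goes
   second sees the bit already set and stays out. */
static int atomic_test_and_set_entrants(void) {
    atomic_flag lock = ATOMIC_FLAG_INIT;
    bool a_saw_set = atomic_flag_test_and_set(&lock);
    bool b_saw_set = atomic_flag_test_and_set(&lock);
    return (!a_saw_set) + (!b_saw_set);
}
```

With the split version both simulated CPUs get in; with the indivisible version only the first one does. That is why a multiprocessor lock needs a test-and-set that is atomic with respect to the other CPUs and not just with respect to interrupts.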