It depends on the implementation: pthread_spin_lock is not officially guaranteed to remain in user space. In practice, on systems that have a CAS (compare-and-swap) instruction (i.e. most commodity SMP systems), it usually does.
Here are the links to the glibc implementations for x86, x86-64, ia64, sparc32, sparc64, PPC, SH4 and the general case, all based on a CAS loop.
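To make the "CAS loop" idea concrete, here is a minimal sketch of a user-space spin lock built on std::atomic. It is not the glibc code; the names SpinLock and spin_counter_demo are made up for illustration. The point is only that acquiring and releasing the lock involves no system call.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spin lock (not glibc's implementation): a plain CAS loop.
class SpinLock {
public:
    void lock() {
        bool expected = false;
        // Atomically flip false -> true; keep retrying until it succeeds.
        while (!locked_.compare_exchange_weak(expected, true,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed)) {
            // On failure, `expected` was overwritten with the observed value.
            expected = false;
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Illustrative helper: `threads` workers each increment a shared counter
// `iterations` times while holding the spin lock.
long spin_counter_demo(int threads, int iterations) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // protected by the spin lock
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Calling spin_counter_demo(4, 10000) should return 40000 if the lock provides mutual exclusion. A production spin lock would typically also add a pause/yield hint inside the loop.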
Similarly, there is no guarantee that a particular std::atomic implementation won't go to the kernel, but in practice, especially when std::atomic<T>::is_lock_free() returns true, it will be implemented in user space with the help of atomic instructions.
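As a rough illustration, the check can be done at run time on a concrete object. The helper names below (int_atomic_is_lock_free, atomic_counter_demo) are hypothetical; the only library calls used are std::atomic<int>::is_lock_free() and fetch_add().

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Reports whether std::atomic<int> is lock-free on this platform.
// When this returns true, operations are expected to compile down to
// atomic instructions rather than to a lock.
bool int_atomic_is_lock_free() {
    std::atomic<int> probe{0};
    return probe.is_lock_free();
}

// Illustrative helper: several threads bump one atomic counter with
// fetch_add, without any explicit lock.
int atomic_counter_demo(int threads, int iterations) {
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}
```

On mainstream x86 and ARM toolchains std::atomic<int> is lock-free, but for larger or unusual types it is worth checking rather than assuming.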
Note also that on modern Linux pthread_mutex_lock is implemented using a futex (a "fast user-space mutex"), so it remains in user space in the non-contended case. malloc will go to the kernel only if there is contention on its internal lock or when more virtual memory needs to be reserved.
Having said that, whether or not a spin lock is the right choice for synchronization is a broader question that depends on more factors than just system calls. As explained in this question, it is useful in true SMP cases, when the lock is held only very briefly. The majority of the performance benefit comes from savings on context switching and scheduling.
Some mutex implementations (the Windows critical section for example) are hybrid: they will spin for a while first and only after that delegate to a syscall-based lock.
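The hybrid idea can be sketched in a few lines. This is not how the Windows critical section is actually implemented; it is a simplified illustration that spins on std::mutex::try_lock() a bounded number of times and then falls back to the blocking lock(). HybridLock, hybrid_counter_demo and the spin limit are all hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical hybrid lock: spin briefly, then block.
class HybridLock {
public:
    explicit HybridLock(int spin_limit) : spin_limit_(spin_limit) {}

    void lock() {
        // Phase 1: optimistic spinning. try_lock() never blocks.
        for (int i = 0; i < spin_limit_; ++i) {
            if (mutex_.try_lock()) return;
        }
        // Phase 2: give up spinning and let the underlying mutex block
        // (which may involve the kernel under contention).
        mutex_.lock();
    }

    void unlock() { mutex_.unlock(); }

private:
    std::mutex mutex_;
    int spin_limit_;
};

// Illustrative helper: HybridLock has lock()/unlock(), so it can be used
// with std::lock_guard.
long hybrid_counter_demo(int threads, int iterations) {
    HybridLock lock(1000);  // spin limit chosen arbitrarily for the sketch
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<HybridLock> guard(lock);
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

The right spin limit depends on how long the lock is typically held; if critical sections are long, spinning only wastes CPU before the inevitable blocking call.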