
According to this question here, using pthread_spin_lock to protect a critical section is dangerous: the thread holding the lock might be preempted by the scheduler out of the blue, and other threads contending on that resource would be left spinning.

Suppose that I decide to switch from pthread_spin_lock to a lock implemented with the atomic built-ins + compare_and_swap idiom: will this improve things, or will I still suffer from the same issue?

Since pthreads seem to offer no way to disable preemption, is there something I can do if I use locks implemented via atomics, or anything I can have a look at?

I am interested in locking a small critical region.

Abruzzo Forte e Gentile
  • `pthread_spin_lock` is most probably implemented with atomic operations; what do you want to achieve? If the things that you want to perform *inside* the critical section are just updates/increments or stuff like that, you certainly should just use the atomic built-ins (or C11 `_Atomic`) to do the operation and do no locking at all. – Jens Gustedt Mar 13 '14 at 17:12
  • I want to lock a critical section with a pthread_spin_lock. According to the URL I inserted, it looks like there is a risk that other threads would be left spinning indefinitely. My question is: what happens if I implement locking of a critical region by using compare_and_swap + atomics? Are other threads at risk of spinning indefinitely because of preemption? – Abruzzo Forte e Gentile Mar 13 '14 at 17:58
  • @Jens: I added some more explanation. I need to lock a small critical section. – Abruzzo Forte e Gentile Mar 13 '14 at 18:00
  • As I said, `pthread_spin_lock` is most probably already implemented with this kind of atomic operation, so that would make no difference at all. Also, `compare_exchange` is not the right tool for this purpose. Use `atomic_flag` or equivalent for spinlocks. They are usually implemented with a test-and-set (TAS) operation. – Jens Gustedt Mar 13 '14 at 18:05
  • @AbruzzoForteeGentile It depends on what your goals are. First of all, don't implement it yourself; you'll get it wrong, and even if you get it right, nothing will be "better". This is normally not dangerous, but very rare corner cases can be hit if you have *LOTS* of threads contending on the same spinlock. If you suspect the locking you do in your application can end up in such corner cases, use a pthread_mutex_t instead. You still need to think about priority-inversion bugs if your threads run at different priorities under a realtime scheduler; normal applications rarely do that, though. – nos Mar 13 '14 at 18:11
  • A pthread_mutex typically has a fast-path test-and-set which has low cost when the mutex is not locked or contended, and which falls back on a slow-path system call that can block the thread with a context switch. Spin locks are best used when you have CPU affinity that prevents the thread that owns a lock from being preempted by another thread that spins on the lock, preventing the owner from releasing it. – pat Mar 13 '14 at 19:31
  • @pat: You should add that as an answer. – caf Mar 14 '14 at 02:24

1 Answer


pthread_mutex_lock typically has a fast path which uses an atomic operation to try to acquire the lock. In the event that the lock is not owned, this can be very fast. Only if the lock is already held does the thread enter the kernel via a system call. The kernel acquires a spin-lock, and then reattempts to acquire the mutex in case it was released since the first attempt. If this attempt fails, the calling thread is added to a wait queue associated with the mutex, and a context switch is performed. The kernel also sets a bit in the mutex to indicate that there is a waiting thread.

pthread_mutex_unlock also has a fast path. If the waiting thread flag is clear, it can simply release the lock. If the flag is set, the thread must enter the kernel via a system call so the waiting thread can be woken. Again, the kernel must acquire a spin lock so that it can manipulate its thread control data structures. In the event that there is no thread waiting after all, the lock can be released by the kernel. If there is a thread waiting, it is made runnable, and ownership of the mutex is transferred without it being released.
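
To make the shape of these two paths concrete, here is a minimal sketch of a futex-based mutex in this style, assuming Linux and C11 atomics, and modeled on the classic design from Ulrich Drepper's paper *Futexes Are Tricky*. The three-state encoding (0 = unlocked, 1 = locked with no waiters, 2 = locked with possible waiters) is illustrative; glibc's real pthread_mutex is considerably more involved:

```c
#define _GNU_SOURCE
#include <stdatomic.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

static atomic_int mtx = 0;  /* 0 = unlocked, 1 = locked, 2 = locked + waiters */

static long futex(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void mutex_lock(void)
{
    int c = 0;
    /* Fast path: one CAS takes the lock when it is free; no system call. */
    if (atomic_compare_exchange_strong(&mtx, &c, 1))
        return;
    /* Slow path: advertise a waiter (state 2), then sleep in the kernel
     * until the lock is released, re-checking after every wakeup. */
    if (c != 2)
        c = atomic_exchange(&mtx, 2);
    while (c != 0) {
        futex(&mtx, FUTEX_WAIT, 2);   /* blocks only while the value is 2 */
        c = atomic_exchange(&mtx, 2);
    }
}

static void mutex_unlock(void)
{
    /* Fast path: if the state was 1, nobody is waiting; just release.
     * If it was 2, enter the kernel to wake one waiting thread. */
    if (atomic_exchange(&mtx, 0) == 2)
        futex(&mtx, FUTEX_WAKE, 1);
}
```

Note how the uncontended lock and unlock are a single atomic operation each; the system calls happen only once the state has reached 2.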

There are many subtle race conditions in this little dance, and hopefully it all works properly.

Since a thread that attempts to acquire a locked mutex is context switched out, it does not prevent the thread that owns the mutex from running, which gives the owner an opportunity to exit its critical section and release the mutex.

In contrast, a thread that attempts to acquire a locked spin-lock simply spins, consuming CPU cycles. This has the potential of preventing the thread that owns the spin-lock from exiting its critical section and releasing the lock. The spinning thread can be preempted when its timeslice has been consumed, allowing the thread that owns the lock to eventually regain control. Of course, this is not great for performance.
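
For comparison, here is a minimal test-and-set spinlock built on C11 `atomic_flag` (the primitive suggested in the comments above). This is a sketch, not a production lock: real spinlocks usually add a read-only inner spin and a CPU pause hint, but the busy-wait loop below is exactly where a waiter burns its cycles, and its whole timeslice if the owner has been preempted:

```c
#include <stdatomic.h>

static atomic_flag lk = ATOMIC_FLAG_INIT;

static void spin_lock(void)
{
    /* test_and_set returns the previous value; keep retrying until we
     * observe it clear, i.e. until we are the one who set it. */
    while (atomic_flag_test_and_set_explicit(&lk, memory_order_acquire))
        ;   /* busy-wait: consumes CPU for the entire wait */
}

static void spin_unlock(void)
{
    atomic_flag_clear_explicit(&lk, memory_order_release);
}
```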

In practice, spin-locks are used where there is no chance that the thread can be preempted while it owns the lock. A kernel may set a per-cpu flag to prevent it from performing a context switch from an interrupt service routine (or it may raise the interrupt priority level to prevent interrupts that can cause context switches, or it may disable interrupts altogether). A user thread can prevent itself from being preempted (by other threads in the same process) by raising its priority. Note that, in a uniprocessor system, preventing the current thread from being preempted eliminates the need for the spin lock. Alternatively, in a multiprocessor system, you can bind threads to cpus (cpu affinity) so that they cannot preempt one another.
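
As a sketch of the affinity approach on Linux, each thread can be bound to its own CPU with the GNU `pthread_setaffinity_np` extension, so that a spinning waiter can never preempt the lock owner (the `pin_to_cpu` helper name is mine, not a standard API):

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Bind the calling thread to a single CPU; returns 0 on success. */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof set, &set);
}
```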

All locks ultimately require an atomic primitive (well, efficient locks do; see here for a counter-example). Mutexes can be inefficient if they are highly contended, causing threads to constantly enter the kernel and be context switched, especially if the critical section is smaller than the kernel overhead. Spin locks can be more efficient, but only if the owner cannot be preempted and the critical section is short. Note that the kernel must still acquire a spin lock when a thread attempts to acquire a locked mutex.

Personally, I would use atomic operations for things like shared counter updates, and mutexes for more complex operations. Only after profiling would I consider replacing mutexes with spin locks (and figure out how to deal with preemption). Note that if you intend to use condvars, you have no choice but to use mutexes.
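
For instance, a shared event counter needs no critical section at all; a minimal sketch with C11 atomics (the GCC `__atomic` built-ins are equivalent):

```c
#include <stdatomic.h>

static atomic_long counter = 0;

static void record_event(void)
{
    /* A single atomic read-modify-write; no lock, no spinning. */
    atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
}
```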

pat