
In Objective-C there are (at least) two approaches to synchronizing concurrent accesses to a shared resource: the older lock-based approach, and the newer approach based on Grand Central Dispatch (GCD), in which all accesses are dispatched to a shared queue via dispatch_sync.

In the Concurrency Programming Guide, section Eliminating Lock-Based Code, it is stated that "the use of locks comes at a cost. Even in the non-contested case, there is always a performance penalty associated with taking a lock."

Is this a valid argument for the GCD approach?

I think it's not for the following reason:

A queue must maintain a list of queued tasks. One or more threads can add tasks to this list via dispatch_sync, and one or more worker threads must remove elements from this list in order to execute the tasks. This list must be guarded by a lock, so a lock needs to be taken there as well.
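
Sketched in C, the implementation I have in mind would look roughly like this (all names are illustrative, not actual GCD internals):

```c
/* Sketch of my mental model: a FIFO task list guarded by a mutex.
 * Illustrative only -- not how GCD is actually implemented. */
#include <pthread.h>
#include <stdlib.h>

typedef struct task {
    void (*fn)(void *);   /* work to execute */
    void *ctx;            /* its argument */
    struct task *next;
} task_t;

typedef struct {
    task_t *head, *tail;
    pthread_mutex_t lock;
} task_queue_t;

void queue_init(task_queue_t *q) {
    q->head = q->tail = NULL;
    pthread_mutex_init(&q->lock, NULL);
}

/* Producers take the lock to append a task... */
void queue_push(task_queue_t *q, void (*fn)(void *), void *ctx) {
    task_t *t = malloc(sizeof *t);
    t->fn = fn; t->ctx = ctx; t->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = t; else q->head = t;
    q->tail = t;
    pthread_mutex_unlock(&q->lock);
}

/* ...and workers take the same lock to remove one (NULL if empty). */
task_t *queue_pop(task_queue_t *q) {
    pthread_mutex_lock(&q->lock);
    task_t *t = q->head;
    if (t) {
        q->head = t->next;
        if (!q->head) q->tail = NULL;
    }
    pthread_mutex_unlock(&q->lock);
    return t;
}
```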

Please tell me if there is some way, which I'm not aware of, for queues to do this without a lock.

UPDATE: Further on in the guide, it is implied that there is something I'm not aware of: "queueing a task does not require trapping into the kernel to acquire a mutex."

How does that work?

Daniel S.
  • This might be interesting: http://stackoverflow.com/questions/17599401/what-advantages-does-dispatch-sync-have-over-synchronized. – Martin R Sep 04 '13 at 11:08

2 Answers


On current releases of OS X and iOS, both pthread mutexes and GCD queues (as well as GCD semaphores) are implemented purely in userspace without needing to trap into the kernel, except when there is contention (i.e. a thread blocking in the kernel waiting for an "unlock").
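
For intuition, the uncontended fast path of such a userspace lock boils down to a single atomic compare-and-swap; a simplified C11 sketch (illustrative only, not the actual pthread or GCD implementation) might look like:

```c
/* Simplified sketch of a userspace lock fast path. A real lock
 * (futex on Linux, ulock on Darwin) only traps into the kernel when
 * the CAS fails and the thread has to block. */
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_int state; } tiny_lock_t;  /* 0 = free, 1 = held */

bool tiny_trylock(tiny_lock_t *l) {
    int expected = 0;
    /* Uncontended path: one compare-and-swap, entirely in userspace. */
    return atomic_compare_exchange_strong(&l->state, &expected, 1);
}

void tiny_unlock(tiny_lock_t *l) {
    atomic_store(&l->state, 0);
    /* A real lock would wake any kernel-blocked waiters here. */
}
```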

The conceptual advantage of GCD queues over locks is that they can be used asynchronously: the asynchronous execution of a "locked" critical section on a queue does not involve any waiting.

If you are just replacing locks with calls to dispatch_sync, you are not really taking full advantage of the features of GCD (though the implementation of dispatch_sync happens to be slightly more efficient, mainly because pthread mutexes have to satisfy additional constraints).

das

There exist lock-free queueing implementations. One reason they are often pooh-poohed is that they are platform-specific, since they rely on the processor's atomic operations (like increment, decrement, compare-and-swap, etc.), and the exact implementation of those varies from one CPU architecture to another. Since Apple is both the OS and hardware vendor, this criticism is far less of an issue for Apple platforms.

The implication from the documentation is that GCD queue management uses one of these lock-free queues to achieve thread safety without trapping into the kernel.

For more information about one possible OS X/iOS lock-free queue implementation, read about these functions:

void  OSAtomicEnqueue( OSQueueHead *__list, void *__new, size_t __offset);
void* OSAtomicDequeue( OSQueueHead *__list, size_t __offset);
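
For a flavor of how such a primitive can work, here is a simplified, portable sketch in the spirit of those functions, written with C11 atomics (illustrative only: this is not Apple's implementation, and a production dequeue must also guard against the ABA problem under concurrent pops, which the naive pop below does not):

```c
/* Lock-free LIFO (stack) sketch using compare-and-swap, in the spirit
 * of OSAtomicEnqueue/OSAtomicDequeue. Simplified: the pop below is
 * ABA-unsafe under concurrent pops; the real API uses platform-specific
 * atomics (e.g. double-wide CAS) to handle that. */
#include <stdatomic.h>
#include <stddef.h>

typedef struct node { struct node *next; } node_t;
typedef struct { _Atomic(node_t *) head; } lifo_t;

void lifo_push(lifo_t *q, node_t *n) {
    node_t *old = atomic_load(&q->head);
    do {
        n->next = old;  /* link to the head we observed */
    } while (!atomic_compare_exchange_weak(&q->head, &old, n));
    /* On CAS failure, 'old' is refreshed and we retry -- no lock taken. */
}

node_t *lifo_pop(lifo_t *q) {
    node_t *old = atomic_load(&q->head);
    while (old && !atomic_compare_exchange_weak(&q->head, &old, old->next))
        ;  /* retry with the refreshed head on contention */
    return old;  /* NULL if the list was empty */
}
```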

It's worth mentioning here that GCD has been (mostly) open-sourced, so if you're truly curious about the implementation of its queues, go forth and use the source, Luke.

ipmcc
  • FWIW OSAtomicEnqueue/OSAtomicDequeue are not what GCD uses internally; that API implements a lock-free LIFO queue (stack), whereas GCD queues are implemented with a lock-free FIFO queue algorithm. – das Sep 05 '13 at 21:29
  • Yup. I'm just pointing out that lock-free queuing implementations exist, and that the GCD source is available. The OSAtomic* stuff is just one implementation that's easy to reference in proving the point. – ipmcc Sep 05 '13 at 23:14