Does pthreads provide any advantages over GCD?

Question

Having recently learned Grand Central Dispatch, I've found multithreaded code to be pretty intuitive (with GCD). I like the fact that no locks are required (and the fact that it uses lockless data structures internally), and that the API is very simple.

Now, I'm beginning to learn pthreads, and I can't help but be a little overwhelmed by the complexity. Thread joins, mutexes, condition variables: none of these things are necessary in GCD, but they account for a lot of API calls in pthreads.

Does pthreads provide any advantages over GCD? Is it more efficient? Are there normal use cases where pthreads can do things that GCD cannot (excluding kernel-level software)?

In terms of cross-platform compatibility, I'm not too concerned. After all, libdispatch is open source, Apple has submitted their closure changes as patches to GCC, clang supports closures, and we're already starting to see some non-Apple implementations of GCD (e.g. FreeBSD). I'm mostly interested in use of the API (specific examples would be great!).

This seems relevant: http://stackoverflow.com/questions/14177689/risk-assessment-using-pthreads-vs-gcd-or-nsthread — user1031420, May 29 '13 at 21:25

kfmfe04 · Answer 1 · 2011-12-30T17:00:36.877

I am coming from the other direction: I started using pthreads in my application, which I recently replaced with C++11's std::thread. Now, I am playing with higher-level constructs like the pseudo-boost threadpool and, even more abstract, Intel's Threading Building Blocks. I would consider GCD to be at the same level as TBB, or even higher.

A few comments:

imho, pthread is not more complex than GCD: at its core, pthread actually contains very few commands (just a handful: using just the ones mentioned in the OP will give you 95%+ of the functionality that you will ever need). Like any lower-level library, it's how you put them together and how you use them that gives you its power. Don't forget that ultimately, libraries like GCD and TBB will call a threading library like pthreads or std::thread.
sometimes, it's not what you use, but how you use it, which determines success vs failure. Proponents of TBB or GCD will tell you about all the benefits of using their libraries, but until you try them out in a real application context, all of it is of theoretical benefit. For example, when I read about how easy it was to use a finely-grained parallel_for, I immediately used it in a task that I thought could benefit from parallelism. Naturally, I, too, was drawn by the fact that TBB would handle all the details of optimal load balancing and thread allocation. The result? TBB took five times longer than the single-threaded version! But I do not blame TBB: in retrospect, this was obviously a misuse of parallel_for. When I read the fine print, I discovered the overhead involved in using parallel_for and posited that in my case, the costs of context-switching and added function calls outweighed the benefits of using multiple threads. So you must profile your case to see which one will run faster. You may have to reorganize your algorithm to incur less threading overhead.
why does this happen? How can pthread or no threads be faster than GCD or TBB? When a designer designs GCD or TBB, he must make assumptions about the environment in which tasks will run. In fact, the library must be general enough that it can handle strange, unforeseen use cases by the developer. These general implementations do not come for free. On the plus side, a library will query the hardware and the current running environment to do a better job of load balancing. Will it work to your benefit? The only way to know is to try it out.
is there any benefit to learning lower-level libraries like std::thread when higher-level libraries are available? The answer is a resounding YES. The advantage of using higher-level libraries is abstraction from the implementation details. The disadvantage of using higher-level libraries is also abstraction from the implementation details. When using pthreads, I am supremely aware of shared state and lifetimes of objects, because if I let my guard down, especially in a medium to large size project, I can very easily get race conditions or memory faults. Do these problems go away when I use a higher-level library? Not really. It seems like I don't need to think about them, but in fact, if I get sloppy with those details, code built on the library will crash just the same. So you will find that if you understand the lower-level constructs, all those libraries actually make sense, because at some point, if you use the lower-level calls, you will be thinking about implementing them yourself. Of course, at that point, it's usually better to use a time-tested and debugged library call.

So, let's break down the possible implementations:

TBB/GCD library calls: greatest benefit is for beginners of threading. They have lower barriers to entry compared to learning lower level libraries. However, they also ignore/hide some of the traps of using multi-threading. Dynamic load balancing will make your application more portable without additional coding on your part.
pthread and std::thread calls: there are actually very few calls to learn, but to use them correctly takes attention to detail and deep awareness of how your application works. If you can understand threads at this level, the APIs of higher-level libraries will certainly make more sense.
single-threaded algorithm: let us not forget the benefits of a simple single-threaded segment. For most applications, a single thread is easier to understand and much less error-prone than multi-threading. In fact, in many cases, it may be the appropriate design choice. The fact of the matter is, a real application goes through various multi-threaded phases and single-threaded phases: there may be no need to be multi-threaded all the time.

Which one is fastest? The surprising truth is, it could be any of the three above. To get the speed benefits of multi-threading, you may need to drastically reorganize your algorithms. Whether or not the benefits outweigh the costs is highly case-dependent.

Oh, and the OP asked about cases where a thread pool is not appropriate. Easy case: if you have a tight loop that does not require many cycles per iteration, using a thread pool may cost more than it gains unless you seriously rework the loop. Also be aware of the overhead of function calls (such as lambdas dispatched through a thread pool) compared with a single tight loop.

For most applications, multi-threading is a kind of optimization, so do it at the right time and in the right places.

score 15 · Accepted Answer · edited Nov 24 '12 at 12:15

That overwhelming feeling that you are experiencing... that's exactly why GCD was invented.

At the most basic level there are threads. pthreads is a POSIX API for threads, so you can write code for any compliant OS and expect it to work. GCD is built on top of threads (although I'm not sure if they actually used pthreads as the API). I believe GCD only works on OS X and iOS; that, in a nutshell, is its main disadvantage.

Note that projects that make heavy use of threads and require high performance implement their own version of thread pools. GCD allows you to avoid (re)inventing the wheel for the umpteenth time.

edited Nov 24 '12 at 12:15 by Constantino Tsarouhas

answered Jan 27 '10 at 04:57 by slebetman

So aside from cross-platform compatibility, there are no disadvantages to using GCD (even speed)? – Mike Jan 27 '10 at 05:01
Actually, speed is one of GCD's advantages over naive usage of pthreads. Even complicated self-implemented thread pools in pthreads, like Apache's, are rarely as good as GCD because GCD has better knowledge of the OS and underlying hardware. – slebetman Jan 27 '10 at 05:06
Pthreads libraries are as fast as the implementers make them. They have to be written specifically for the OS and CPU Architecture (Atomic operations to avoid race conditions are highly architecture dependent). It's not a generic library that will compile on any system. – Chris S Jan 27 '10 at 05:14
@Chris: As mentioned, GCD builds on top of threads; it is not an alternative implementation. It is basically thread pools implemented by the OS, where the OS decides how many threads to spawn and how many of the tasks are actually scheduled rather than threaded. Therefore, no matter how fast pthreads is on a given platform, a solution like GCD would always, at least in theory, be faster. If you try to optimize your threaded app for performance to the point of what Apache2 and NginX have done, you end up emulating GCD but without the help of the OS to manage your pools. – slebetman Jan 27 '10 at 08:25
@slebetman: Once the threads are created the OS would treat them the same as any other thread; it wouldn't matter what library created them. So GCD and pthreads, once created, would run at exactly the same speed. Now if GCD implements its own mutexes, those might be faster or slower than manual pthread mutexes (it would depend mostly on the programming). GCD manages thread creation and cancellation itself, so it may or may not present an optimal solution; the same can be said of pthreads, except that the programmer is responsible for create/cancel with pthreads, and could potentially optimize the app. – Chris S Jan 27 '10 at 17:57
According to Apple's documentation, the real advantage of GCD is the 'management'. It's pretty hard to tweak and optimize a thread pool manually. GCD does it. And Apple's GCD implementation is integrated with the kernel, not built on top of pthreads. – eonil Aug 29 '10 at 17:33
Even compatibility isn't an issue. GCD is an open-source library called `libdispatch`, so you can compile it with every project (at least for UNIX-compliant systems, without modification). – Constantino Tsarouhas Nov 24 '12 at 12:11
Another subtle thing is that GCD threads actually don't implement all of the pthread semantics. There's a bunch of pthread_* calls that are just marked as "don't call this from a GCD queue". I'm not *aware* of any speedup from this (mostly it's just so the thread pool can manage them properly), but a willingness to violate pthread semantics could help in some cases. On the flip side, GCD's priority system is not as fine-grained as pthreads. – Catfish_Man Dec 09 '12 at 19:34

score 4 · Answer 3 · answered Jan 27 '10 at 04:55

GCD is an Apple technology, and not the most cross-platform compatible; pthreads are available on just about everything, from OS X, Linux and Unix to Windows... including this toaster.

GCD is optimized for thread pool parallelism. Pthreads are (as you said) very complex building blocks for parallelism; you are left to develop your own models. I highly recommend picking up a book on the topic if you're interested in learning more about pthreads and different models of parallelism.

answered by Chris S

I'm not too concerned with cross-platform compatibility. After all, GCD is open source and FreeBSD has already adopted it (albeit as an imperfect implementation). I'm mostly concerned with writing parallel programs. Are there cases where thread pool parallelism isn't applicable? I'd be especially interested in specific examples. – Mike Jan 27 '10 at 04:59
FreeBSD is actively porting GCD (actually libdispatch; you can't quite "port" a trademark since it's not code). On the compiler side, Apple have already submitted patches to gcc to support the closure syntax that GCD relies on. – slebetman Jan 27 '10 at 05:01
Thread pool parallelism is great for small/short workloads. Many web servers are built this way because most page requests involve minimal processing. For applications with diverse user-roles (think means of input and output, not people-users) you may want very different threads to handle each. For instance your application may have a disk cache management thread, a user connections thread, and a processing queue thread. This would allow your program to take advantage of multi-core processors. If GCD is doing what you need, stick with it, it's simpler. Pthreads have more options and complexity – Chris S Jan 27 '10 at 05:07
For your example, wouldn't it make more sense to use GCD's event listeners(with the appropriate file descriptors) rather than allocating separate threads? – Mike Jan 27 '10 at 05:11
Well, let's assume for a moment that you only need the three threads mentioned in the example. Using GCD you'd basically have to have a handler figure out which type of function you want to perform, then call the correct function. Also, implementing asynchronous IO might get interesting. – Chris S Jan 27 '10 at 05:25

score 4 · Answer 4 · answered Jan 27 '10 at 05:28

Like any declarative/assisted approach such as OpenMP or Intel TBB, GCD should be very good at embarrassingly parallel problems and will probably easily beat a naïve, manually pthread-ed parallel sort. I would suggest you still learn pthreads though. You'll understand concurrency better, you'll be able to apply the right tool in each particular situation, and, if nothing else, there's a ton of pthread-based code out there, so you'll be able to read "legacy" code.

score 0 · Answer 5 · answered Jul 15 '13 at 18:42

GCD abstracts threads and gives you dispatch queues. It creates threads as it deems necessary, taking into account the number of processor cores available. GCD is open source and is available through the libdispatch library. FreeBSD includes libdispatch as of 8.1. GCD and C Blocks are major contributions from Apple to the C programming community. I would never use any OS that doesn't support GCD.

score 0 · Answer 6 · answered Sep 23 '10 at 19:05

Usual: 1 task per pthread; implementations use mutexes (an OS feature).
GCD: 1 task per block, grouped into queues. One thread per virtual CPU can take a queue and run through all of its tasks without mutexes. This reduces thread management overhead and mutex overhead, which should increase performance.
