I intend to implement a thread pool to manage threads in my project. The basic structure that comes to mind is a queue: some threads produce tasks into this queue, and the threads managed by the pool wait to handle those tasks. I think this is the classic producer-consumer problem. But when I google thread pool implementations on the web, I find that they seldom use this classic model, so my question is: why don't they use this classic model? Does this model have any drawbacks? Why don't they use a full semaphore and an empty semaphore to synchronize?
-
The first couple I found either used a queue (`BlockingQueue`) or used Java's `ExecutorService`, which tends to use a queue internally. What did you actually find? – Thomas W Aug 12 '13 at 02:40
-
Can you elaborate more on what you mean by "seldom"? Where did you see a Pub/Sub design not used? – shazin Aug 12 '13 at 02:42
-
Is this a Java question? – ataravati Aug 12 '13 at 02:48
-
No, it's a C++ question – Run Aug 12 '13 at 02:51
-
I find some implementations just use a mutex: when the producer acquires the lock, it first checks whether the queue is full; if so, it just returns, otherwise it puts a task into the queue, increments the queue size, and releases the mutex. The consumer uses similar logic. Those implementations do not use a full semaphore and an empty semaphore to sync – Run Aug 12 '13 at 03:01
-
Re. your last comment, that does not differ much from the producer-consumer model, except that there is a focus on not blocking the caller for a long time. The caller can opt to wait and retry, run the task in the caller's thread, or just dump the load. If the lib implemented a blocking enqueue, those choices would be taken by the lib, not the user. – David Rodríguez - dribeas Aug 12 '13 at 03:29
2 Answers
If you have multiple threads waiting on a single resource (in this case the semaphores and queue) then you are creating a bottleneck. You are forcing all tasks through one queue, even though you have multiple workers. Logically this might make sense if the workers are usually idle, but the whole point of a thread pool is to deal with a heavily loaded scenario where the workers are kept busy (for maximum throughput). Using a single input queue is particularly bad on a multi-processor system, where all workers read and write the head of the queue whenever they try to get the next task. Even though the lock contention might be low, the queue head pointer still has to be shared/communicated from one CPU cache to another each time it is updated.
Think about the ideal case: all workers are always busy. When a new task is enqueued you want it to be dispatched to the worker that will complete its current/pending task(s) first.
If, as a client, you had a contention-free oracle that could tell you which worker to enqueue a new task to, and each worker had its own queue, then you could implement each worker with its own multi-writer-single-reader queue and always dispatch new tasks to the best queue, thus eliminating worker contention on a single shared input queue. Of course you don't have such an oracle, but this mechanism still works pretty well until a worker runs out of tasks or the queues get imbalanced. "Work stealing" deals with these cases, while still reducing contention compared to the single queue case.
See also: Is Work Stealing always the most appropriate user-level thread scheduling algorithm?

-
Eh? If the workers are busy, they cannot be in contention for the queue. Even with a single queue, a new task will be dequeued by the worker that will complete its current/pending task first. Problems arise when a large number of small CPU-intensive tasks are stuffed onto the one queue, i.e. tasks whose expected CPU runtime is small compared with inter-thread comms latency. If the tasks are CPU-lengthy, or block for extended periods, there is no big problem. – Martin James Aug 12 '13 at 09:18
-
@MartinJames I agree with what you've written. Can you explain how it contradicts what I wrote? You have described a "workers kept busy" scenario too. I didn't distinguish between "small CPU-intensive" tasks and "CPU-lengthy" tasks. I agree that there are workloads where it doesn't matter; maybe the answer could be improved with this info. – Ross Bencina Aug 13 '13 at 03:05
Why there's no Producer and Consumer model implementation

This model is very generic and has many possible implementations; one of them is a queue.

Try the Apache APR Queue: it's documented as a thread-safe FIFO bounded queue.
