
Update 1: It seems like BlockingCollection suggested below is the perfect fit, and I will try to apply that first thing tomorrow. Thank you for the replies.

Update 2: Indeed it seems to perform admirably. Thanks again for all the help.
__

I have a scientific application in C# with two internal tasks that need to communicate:

Tasks: (a) rapidly acquires a large dataset (sometimes larger than memory); (b) processes each entry.

To prevent getting the entire dataset at once, I plan for (a) to take some amount of data at a time, then for (b) to process all of that, and repeat until done. (a) and (b) both have multiple threads, but are in the same executable.

The data comes out of (a) in the form of a list of small, individually processable chunks, so I'm wondering about the speediest strategy to keep this dance going between the two tasks in C# (Windows, .NET Standard). Does anyone have experience that can help with this decision? My plan is either to:

  • Have a list in A that B acquires a lock{} on at, say, 1000 entries, to stop A's threads from adding more data while processing is done.

  • Have A send B an event when the list exceeds, say, 1000 entries, and then have B pause A's threads while it processes.

I'm not a very experienced programmer when it comes to these things, and wonder if anyone has insights that may help, or some terms I can google to clarify? (: Help is appreciated (:
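For reference, the pattern the updates above ended up with can be sketched with BlockingCollection. This is a minimal sketch, assuming a batch limit of 1000 as described; the "dataset" here is just a stand-in range of integers, and AcquireAndProcess is an illustrative name:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// A minimal sketch of the bounded producer/consumer pattern with
// BlockingCollection, assuming a capacity of 1000 as in the question.
class AcquireAndProcess
{
    public static long Run(int totalEntries)
    {
        using (var buffer = new BlockingCollection<int>(boundedCapacity: 1000))
        {
            // (a) the acquirer: Add() blocks once 1000 entries are queued,
            // so memory use stays bounded no matter how fast (a) runs.
            var acquirer = Task.Run(() =>
            {
                for (int i = 0; i < totalEntries; i++)
                    buffer.Add(i);
                buffer.CompleteAdding(); // signal "no more data"
            });

            // (b) the processor: GetConsumingEnumerable() blocks while the
            // buffer is empty and ends once CompleteAdding() has been called.
            long processed = 0;
            foreach (var item in buffer.GetConsumingEnumerable())
                processed++;

            acquirer.Wait();
            return processed;
        }
    }

    static void Main()
    {
        Console.WriteLine(AcquireAndProcess.Run(10_000)); // prints 10000
    }
}
```

The bounded capacity replaces both proposed schemes: neither a manual lock{} nor an explicit "stop adding" event is needed, because Add() itself blocks when the buffer is full.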

knut

2 Answers


You could have a queue (let's call it Q).

For A:

queueLimit = 1000

while A has data {

    // Wait until B removes data from Q and its length decreases
    while Q.length > queueLimit {
        sleep
    }

    // Add data to Q while its length is < 1000
    Q.addData(data)
}

For B:

while Q has data {

    data = getAndRemoveFromQ() // Q's length will decrease
    process data
}

Basically, that's how it works, but you have to implement the locking yourself. Q could be a class with pop() and add() methods.

Or use the Queue Class
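Hand-rolling that locking in C# could look roughly like the sketch below. BoundedQueue is an illustrative name, not a framework type; it uses Monitor.Wait/Pulse instead of a sleep loop so neither side busy-waits:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// A minimal hand-rolled bounded queue (a sketch, not production code).
// Add() blocks the producer while the queue is full; Pop() blocks the
// consumer while it is empty.
class BoundedQueue<T>
{
    private readonly Queue<T> _queue = new Queue<T>();
    private readonly int _limit;

    public BoundedQueue(int limit) { _limit = limit; }

    public void Add(T item)
    {
        lock (_queue)
        {
            while (_queue.Count >= _limit)
                Monitor.Wait(_queue);   // wait until B removes data
            _queue.Enqueue(item);
            Monitor.PulseAll(_queue);   // wake any waiting consumer
        }
    }

    public T Pop()
    {
        lock (_queue)
        {
            while (_queue.Count == 0)
                Monitor.Wait(_queue);   // wait until A adds data
            T item = _queue.Dequeue();
            Monitor.PulseAll(_queue);   // wake any waiting producer
            return item;
        }
    }
}
```

In practice, BlockingCollection (mentioned in the comments below) gives you this behavior out of the box.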


You can use the ConcurrentQueue mechanism for cross-thread data processing!

One thread can push all the work that needs to be done into the queue, while another can loop, waiting until the queue has data.
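A small sketch of that approach, assuming the producer signals completion through a volatile flag (an assumption for this example, not part of ConcurrentQueue itself); note the short Sleep so the consumer doesn't spin at 100% CPU while the queue is empty:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// One task pushes work into a ConcurrentQueue while the main thread
// polls with TryDequeue until the queue drains and the producer is done.
class QueueDemo
{
    public static long Run(int items)
    {
        var queue = new ConcurrentQueue<int>();
        bool producerDone = false;

        var producer = Task.Run(() =>
        {
            for (int i = 0; i < items; i++)
                queue.Enqueue(i);              // push work into the queue
            Volatile.Write(ref producerDone, true);
        });

        long sum = 0;
        while (true)
        {
            if (queue.TryDequeue(out int item))
                sum += item;                   // "process" the entry
            else if (Volatile.Read(ref producerDone))
                break;                         // drained, and no more is coming
            else
                Thread.Sleep(1);               // avoid busy-spinning when empty
        }

        producer.Wait();
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(QueueDemo.Run(100)); // prints 4950 (sum of 0..99)
    }
}
```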

Callum Linington
  • Instead of ConcurrentQueue, a BlockingCollection is even better for this kind of job, and uses a ConcurrentQueue as its internal store by default. – Scott Chamberlain Jul 29 '17 at 17:24
  • Yeah - it is - didn't see the comment about the blocking collection - queue is really easy to implement though! – Callum Linington Jul 29 '17 at 17:26
  • And doing `foreach(var item in collection.GetConsumingEnumerable())` isn't? Also, how are you going to handle not spinning and wasting a CPU at 100% when the queue is empty with just ConcurrentQueue? – Scott Chamberlain Jul 29 '17 at 18:04
  • It is more likely that someone will understand the concept of a queue over a blocking collection! But if your answer is spot on - then post it.... – Callum Linington Jul 29 '17 at 18:17