
I have a list of batches to process, forever.
I want to process each chunk (of 5) in parallel and, when it is done, move on to the next chunk.
For some reason, the code below is not waiting for the chunk to be done and continues even if it is not completed.

while (true)
{
    foreach (string[] urlsArr in chunks)
    { 
        int i = 0;
        foreach (var url in urlsArr)
        {
            ThreadPool.QueueUserWorkItem(x =>
            {
                ProccessUrl(url, config, drivers[i]);
                _resetEvent.Set();
                i++;
            });
        }
        _resetEvent.WaitOne();// this is not really waiting.
    }
}
VMAtm
SexyMF
    If you want to process one by one then don't start all work in parallel. Simply process it in a loop. – usr Apr 06 '15 at 09:53

3 Answers


Have a look at the Semaphore or its slim version, SemaphoreSlim. A semaphore will allow you to always have only 5 threads running. Once any of those running threads finishes, it can pick up new work. This is more efficient, especially if the workload is uneven. Consider the situation where 1 item takes an hour to process and the other 4 take a second: with your chunked approach, those 4 threads will wait for the last one to finish before picking up any other work.

For an example, see Need to understand the usage of SemaphoreSlim.
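A minimal sketch of that approach (illustrative only; `ProcessAllAsync` and the commented-out work call are placeholders, not the asker's actual `ProccessUrl` signature):

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public class SemaphoreExample
{
    // Allow at most 5 work items to run concurrently.
    static readonly SemaphoreSlim _semaphore = new SemaphoreSlim(5);

    public static async Task ProcessAllAsync(IEnumerable<string> urls)
    {
        var tasks = new List<Task>();
        foreach (var url in urls)
        {
            tasks.Add(Task.Run(async () =>
            {
                await _semaphore.WaitAsync();   // waits when 5 items are already running
                try
                {
                    // ProccessUrl(url, ...); // your real work goes here
                }
                finally
                {
                    _semaphore.Release();       // frees a slot for the next item
                }
            }));
        }
        await Task.WhenAll(tasks);              // all items done, regardless of order
    }
}
```

Note there is no chunking here at all: as soon as any slot frees up, the next item starts, so a single slow item never holds up the other four slots.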

In your code the problem is that you only have one wait handle and 5 threads. When any of the 5 running threads finishes work, it sets the wait handle, which allows your outer loop to proceed and start another five threads. By then, the remaining 4 threads from the inner loop could have completed, and any of them could set the wait handle again! Do you see the problem here?

As per Hans's comment, if there is a dependency between work items in a single batch, so that all of the work items must be completed before you can start the next batch, have a look at CountdownEvent.
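A rough sketch of the CountdownEvent approach, kept close to the question's original loop (the commented-out `ProccessUrl` call stands in for the asker's real work):

```csharp
using System.Threading;

public class BatchExample
{
    public static void ProcessInBatches(string[][] chunks)
    {
        foreach (string[] urlsArr in chunks)
        {
            // One countdown per batch, initialized to the batch size. Wait()
            // only returns after every worker has called Signal(), unlike the
            // single AutoResetEvent in the question, which opens after any
            // one worker finishes.
            using (var countdown = new CountdownEvent(urlsArr.Length))
            {
                foreach (var url in urlsArr)
                {
                    var u = url; // capture a copy, not the shared loop state
                    ThreadPool.QueueUserWorkItem(_ =>
                    {
                        try { /* ProccessUrl(u, config, ...); */ }
                        finally { countdown.Signal(); }
                    });
                }
                countdown.Wait(); // proceed only when the whole batch is done
            }
        }
    }
}
```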

oleksii
  • Semaphore/Slim will give him the exact same problem he has now. You need a *reverse* semaphore, it should be signaled when the count reaches zero. CountDownEvent is good for that. – Hans Passant Apr 06 '15 at 10:24

Here is a version with Tasks (async/await):

while (true)
{
    foreach (string[] urlsArr in chunks)
    {
        Task[] tasks = new Task[urlsArr.Length];
        for (int i = 0; i < urlsArr.Length; i++)
        {
            var url = urlsArr[i];
            var driver = drivers[i];
            tasks[i] = Task.Run(() => ProccessUrl(url, config, driver));
        }

        await Task.WhenAll(tasks);
    }
}

Note that it also fixes a problem with the `i` variable in the original code, which was captured by the closures and incremented without synchronization (that part alone could be fixed with Interlocked.Increment).
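For reference, the Interlocked.Increment fix would look roughly like this sketch (the `claimed` array exists only to make the example observable; in the original code the claimed index would be used as `drivers[index]`):

```csharp
using System.Threading;

public class InterlockedExample
{
    // Each queued work item atomically claims a unique index, instead of
    // reading and writing a shared counter without synchronization.
    public static int[] ClaimIndexes(string[] urlsArr)
    {
        int i = -1; // Increment returns the new value, so the first claim is 0
        var claimed = new int[urlsArr.Length];
        using (var done = new CountdownEvent(urlsArr.Length))
        {
            foreach (var url in urlsArr)
            {
                var u = url; // capture a copy of the loop variable
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    int index = Interlocked.Increment(ref i);
                    claimed[index] = index; // e.g. ProccessUrl(u, config, drivers[index]);
                    done.Signal();
                });
            }
            done.Wait();
        }
        return claimed;
    }
}
```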

If your code is not async, you can instead wait for the tasks to finish synchronously (but this blocks the calling thread):

Task.WhenAll(tasks).Wait();
Ventsyslav Raikov

I think you can simplify the entire thing and leverage Parallel.ForEach() to both manage the threads and limit the degree of concurrency to 5.

If you run the following sample code, you will see that the pretend URLs are processed in groups of 5, because the number of concurrent threads is limited to 5.

If you do it like this, you will not need your own chunking logic:

using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApplication2
{
    class Program
    {
        static void Main()
        {
            // Make some pretend URLs for this demo.

            string[] urls = Enumerable.Range(1, 100).Select(n => n.ToString()).ToArray();

            // Use Parallel.ForEach() along with MaxDegreeOfParallelism = 5 to process
            // these using 5 threads maximum:

            Parallel.ForEach(
                urls,
                new ParallelOptions{MaxDegreeOfParallelism = 5},
                processUrl
            );
        }

        static void processUrl(string url)
        {
            Console.WriteLine("Processing " + url);
            Thread.Sleep(1000);
            Console.WriteLine("Processed " + url);
        }
    }
}
Matthew Watson