BlockingCollection with async task

Question

I'm trying to correctly model a multi-thread single-producer/multi-consumer scenario where a consumer can ask the producer to get an item, but the producer needs to do a time consuming operation to produce it (think about executing a query or printing a document).

My goal is to ensure no consumer can concurrently ask the producer to produce an item. In my real world use case the producer is an hardware controller which must ensure that just 1 request at a time is sent to hardware. Other concurrent requests must eventually wait or be rejected (i know how to reject them, so let's concentrate on making them wait).

I want the producer and each consumer to run in different threads.
I cannot get a clean code using only a BlockingCollection. I had to use it together with a SemaphoreSlim otherwise consumers could incur in race conditions.
I think it should work (infact it worked well in all my tests), even if I'm not 100% sure about it.
This is my program:

Producer:

class Producer : IDisposable
{
    //Explicit waiting item => I feel this should not be there
    private SemaphoreSlim _semaphore;

    private BlockingCollection<Task<string>> _collection;

    public Producer()
    {
        _collection = new BlockingCollection<Task<string>>(new ConcurrentQueue<Task<string>>(), 1);
        _semaphore = new SemaphoreSlim(1, 1);
    }

    public void Start()
    {
        Task consumer = Task.Factory.StartNew(() =>
        {
            try
            {
                while (!_collection.IsCompleted)
                {
                    Task<string> current = _collection.Take();
                    current.RunSynchronously(); //Is this bad?

                    //Signal the long running operation has ended => This is what I'm not happy about
                    _semaphore.Release();
                }
            }
            catch (InvalidOperationException)
            {
                Console.WriteLine("Adding was compeleted!");
            }
        });
    }

    public string GetRandomString(string consumerName)
    {
        Task<string> task = new Task<string>(() =>
        {
            //Simulate long running operation
            Thread.Sleep(100);
            return GetRandomString();
        });

        _collection.Add(task);

        //Wait for long running operation to complete => This is what I'm not happy about
        _semaphore.Wait();

        Console.WriteLine("Producer produced {0} by {1} request", task.Result, consumerName);
        return task.Result;
    }

    public void Dispose()
    {
        _collection.CompleteAdding();
    }

    private string GetRandomString()
    {
        var chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
        var random = new Random();
        var result = new string(Enumerable
            .Repeat(chars, 8)
            .Select(s => s[random.Next(s.Length)])
            .ToArray());
        return result;
    }
}

Consumer:

class Consumer
{
    Producer _producer;
    string _name;

    public Consumer(
        Producer producer,
        string name)
    {
        _producer = producer;
        _name = name;
    }

    public string GetOrderedString()
    {
        string produced = _producer.GetRandomString(_name);
        return String.Join(String.Empty, produced.OrderBy(c => c));
    }
}

Console application:

class Program
{
    static void Main(string[] args)
    {
        int consumerNumber = 5;
        int reps = 10;

        Producer prod = new Producer();
        prod.Start();

        Task[] consumers = new Task[consumerNumber];

        for (var cConsumers = 0; cConsumers < consumerNumber; cConsumers++)
        {
            Consumer consumer = new Consumer(prod, String.Format("Consumer{0}", cConsumers + 1));

            Task consumerTask = Task.Factory.StartNew((consumerIndex) =>
            {
                int cConsumerNumber = (int)consumerIndex;
                for (var counter = 0; counter < reps; counter++)
                {
                    string data = consumer.GetOrderedString();
                    Console.WriteLine("Consumer{0} consumed {1} at iteration {2}", cConsumerNumber, data, counter + 1);
                }
            }, cConsumers + 1);

            consumers[cConsumers] = consumerTask;
        }

        Task continuation = Task.Factory.ContinueWhenAll(consumers, (c) =>
        {
            prod.Dispose();
            Console.WriteLine("Producer/Consumer ended");
            Console.ReadLine();
        });

        continuation.Wait();
    }
}

What I'm concerned about is if this is the correct way of approaching the problem or if there is any other best practise you guys can suggest.
I already googled and tried different proposed ideas but every example i tried made the assumption that the producer was able to produce items immediately after they were requested... quite a rare situation in real world applications :)
Any help is greatly appreciated.

Is there any order/logic consumers will ask producer for a new item? Is this kind of load balancing or what you're trying to achieve having multiple concurrent consumers in case of slow producer? I'm aware of "Slow Consumer" problem but not vice versa. Also do not forget to Dispose your semaphore instance — sll, Sep 09 '15 at 09:27
It is not at all clear what are you trying to achieve exactly. Can you clarify a bit more? — Evk, Sep 09 '15 at 09:34
You are right, i tried to explain it in my edit: My goal is to ensure no consumer can concurrently ask the producer to produce an item. In my real world use case the producer is an hardware controller which must ensure that just 1 request at a time is sent to hardware. Other concurrent requests must eventually wait or be rejected (i know how to reject them, so let's concentrate on making them wait). Is it better now? — andreapier, Sep 09 '15 at 09:38
The ActionBlock class from TPL's DataFlow already addresses this. By default it allows only one message to be processed at a time. — Panagiotis Kanavos, Sep 09 '15 at 09:40
@i3arnon I'll have a look, but from what i can understand TPL Dataflow is for .Net >= 4.5. Anything for 4.0? — andreapier, Sep 09 '15 at 09:40
Consumers don't ask producers to create an item - they receive whatever is available. If only one thread runs the producer code, only one item is created at a time. Perhaps you mean something different by producer/consumer? — Panagiotis Kanavos, Sep 09 '15 at 09:49
Well, i think about it as an on-demand producer... But in the end you are probably right... I'm not quite sure this is the best way to go. Still, if someone know how to address this it could be helpful — andreapier, Sep 09 '15 at 09:52
I had a look at Tpl Dataflow... I think that is the right way of handling my situation and not having to write lot of boilerplate (an probably buggy) code. Thanks for the suggestions guys! — andreapier, Sep 09 '15 at 13:30

score 2 · Accepted Answer · answered Sep 09 '15 at 10:29

If I understand you correctly, you want to ensure that only one task at a time is processed by so called "producer". Then with slight modifications to your code you can do it like this:

internal class Producer : IDisposable {
    private readonly BlockingCollection<RandomStringRequest> _collection;

    public Producer() {
        _collection = new BlockingCollection<RandomStringRequest>(new ConcurrentQueue<RandomStringRequest>());
    }

    public void Start() {
        Task consumer = Task.Factory.StartNew(() => {
            try {
                foreach (var request in _collection.GetConsumingEnumerable()) {
                    Thread.Sleep(100); // long work
                    request.SetResult(GetRandomString());
                }
            }
            catch (InvalidOperationException) {
                Console.WriteLine("Adding was compeleted!");
            }
        });
    }

    public RandomStringRequest GetRandomString(string consumerName) {
        var request = new RandomStringRequest();
        _collection.Add(request);
        return request;            
    }

    public void Dispose() {
        _collection.CompleteAdding();
    }

    private string GetRandomString() {
        var chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
        var random = new Random();
        var result = new string(Enumerable
            .Repeat(chars, 8)
            .Select(s => s[random.Next(s.Length)])
            .ToArray());
        return result;
    }
}

internal class RandomStringRequest : IDisposable {
    private string _result;
    private ManualResetEvent _signal;

    public RandomStringRequest() {
        _signal = new ManualResetEvent(false);
    }

    public void SetResult(string result) {
        _result = result;
        _signal.Set();
    }

    public string GetResult() {
        _signal.WaitOne();
        return _result;
    }

    public bool TryGetResult(TimeSpan timeout, out string result) {
        result = null;
        if (_signal.WaitOne(timeout)) {
            result = _result;
            return true;
        }
        return false;
    }

    public void Dispose() {
        _signal.Dispose();
    }
}

internal class Consumer {
    private Producer _producer;
    private string _name;

    public Consumer(
        Producer producer,
        string name) {
        _producer = producer;
        _name = name;
    }

    public string GetOrderedString() {
        using (var request = _producer.GetRandomString(_name)) {
            // wait here for result to be prepared
            var produced = request.GetResult();
            return String.Join(String.Empty, produced.OrderBy(c => c));
        }
    }
}

Note that producer is single-threaded and it uses GetConsumingEnumerable. Also there is no semaphore and no Tasks. Instead, RandomStringRequest is returned to consumer, and while calling GetResult or TryGetResult, it will wait until result is produced by producer (or timeout expires). You may also want to pass CancellationTokens in some places (like to GetConsumingEnumerable).

Thanks, very smart example! I didn't pass any CancellationToken just for clarity — andreapier, Sep 09 '15 at 11:06

score 1 · Answer 2 · answered Sep 09 '15 at 10:23

I think the semaphore is not really needed. All your needs should already be addressed by the Task class and the available concurrent collections.

I tried to create a little example code. Unfortunately I still didn't fully get my head around async/await, so maybe this would also be an area that could help in your case (if your real task are mainly I/O bound and not cpu bound).

But as you already can see, no semaphore or similar is needed. All these things are done by the provided classes (e.g. the call to BlockingCollection.GetConsumingEnumerable()).

So hope this helps a little.

Program.cs

private static void Main(string[] args)
{
    var producer = new Producer();
    var consumer = new Consumer(producer.Workers);

    consumer.Start();
    producer.Start();

    Console.ReadKey();
}

Producer.cs

public class Producer : IDisposable
{
    private CancellationTokenSource _Cts;
    private Random _Random = new Random();
    private int _WorkCounter = 0;
    private BlockingCollection<Task<String>> _Workers;
    private Task _WorkProducer;

    public Producer()
    {
        _Workers = new BlockingCollection<Task<String>>();
    }

    public IEnumerable<Task<String>> Workers
    {
        get { return _Workers.GetConsumingEnumerable(); }
    }

    public void Dispose()
    {
        Stop();
    }

    public void Start()
    {
        if (_Cts != null)
            throw new InvalidOperationException("Producer has already been started.");

        _Cts = new CancellationTokenSource();
        _WorkProducer = Task.Factory.StartNew(() => Run(_Cts.Token));
    }

    public void Stop()
    {
        var cts = _Cts;

        if (cts != null)
        {
            cts.Cancel();
            cts.Dispose();
            _Cts = null;
        }

        _WorkProducer.Wait();
    }

    private String GetRandomString()
    {
        var chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
        var result = new String(Enumerable
            .Repeat(chars, 8)
            .Select(s => s[_Random.Next(s.Length)])
            .ToArray());

        return result;
    }

    private void Run(CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            var worker = StartNewWorker();
            _Workers.Add(worker);
            Task.Delay(100);
        }

        _Workers.CompleteAdding();
        _Workers = new BlockingCollection<Task<String>>();
    }

    private Task<String> StartNewWorker()
    {
        return Task.Factory.StartNew<String>(Worker);
    }

    private String Worker()
    {
        var workerId = Interlocked.Increment(ref _WorkCounter);
        var neededTime = TimeSpan.FromSeconds(_Random.NextDouble() * 5);
        Console.WriteLine("Worker " + workerId + " starts in " + neededTime);
        Task.Delay(neededTime).Wait();
        var result = GetRandomString();
        Console.WriteLine("Worker " + workerId + " finished with " + result);

        return result;
    }
}

Consumer.cs

public class Consumer
{
    private Task _Consumer;
    private IEnumerable<Task<String>> _Workers;

    public Consumer(IEnumerable<Task<String>> workers)
    {
        if (workers == null)
            throw new ArgumentNullException("workers");

        _Workers = workers;
    }

    public void Start()
    {
        var consumer = _Consumer;

        if (consumer == null
            || consumer.IsCompleted)
        {
            _Consumer = Task.Factory.StartNew(Run);
        }
    }

    private void Run()
    {
        foreach (var worker in _Workers)
        {
            var result = worker.Result;
            Console.WriteLine("Received result " + result);
        }
    }
}

Nice example. I guess you could shorten it to `.Take()` method and not to `GetConsumingEnumerable()`. I guess the person who asked the question thinks that function that call `.Take()` will immediately return a new value... He actually try to re-implement the `BlockingCollection` — Alex F, Sep 09 '15 at 10:29
@Jasper I don't think that's actually true... Why do you think I didn't understand the difference between returning a result and returning a Task which will complete with a result? — andreapier, Sep 09 '15 at 10:39

BlockingCollection with async task

2 Answers2