
I am trying to simulate work between two collections asynchronously and in parallel. I have a ConcurrentQueue of customers and a collection of workers. I need each worker to take a Customer from the queue, perform work on it, and, once done, take another customer right away.

I decided to use an event-based approach: each worker performs an action on a customer, the customer holds an event that fires when it is done, and that event would hopefully fire the DoWork method again, so that the workers keep taking customers from the queue in parallel. But I can't figure out how to pass the queue of customers into DoWork from OnCustomerFinished()! The worker obviously shouldn't depend on a queue of customers.

public class Worker
{
    public async Task DoWork(ConcurrentQueue<Customer> cust)
    {
        await Task.Run(async () =>
        {
            if (cust.TryDequeue(out Customer temp))
            {
                await Task.Delay(5000); // simulate 5 seconds of work
                temp.IsDone = true;
            }
        });
    }

    public void OnCustomerFinished()
    {
        // This is where I'm stuck
        DoWork(~HOW TO PASS THE QUEUE OF CUSTOMER HERE?~);
    }
}

// Edit - This is the Customer Class

public class Customer
{
    private bool _isDone = false;

    public EventHandler<EventArgs> CustomerFinished;

    public bool IsDone
    {
        private get { return _isDone; }
        set
        {
            _isDone = value;
            if (_isDone)
            {
                OnCustomerFinished();
            }

        }
    }
    protected virtual void OnCustomerFinished()
    {
        if (CustomerFinished != null)
        {
            CustomerFinished(this, EventArgs.Empty);
        }
    }
}
  • It seems to me the pattern should be: 1. worker tries to get a customer from the queue 2. if there is a customer, do work on them, otherwise "sleep" 3. repeat. – Rufus L Sep 10 '19 at 23:40
  • Producer/consumer should be implemented using `BlockingCollection` and `GetConsumingEnumerable()`. – Peter Duniho Sep 10 '19 at 23:40
  • Hi @PeterDuniho I am not familiar with said class and method, I'll google it now, in the meantime, I just want to understand, is my idea of simulating it with events bad? – Dani Rashba Sep 10 '19 at 23:52
  • You didn't post enough code for anyone to know if it's good or bad. There's no actual queue, no customer object, and it's not clear what you mean by "event", since there are no events in the code either (not a C# event, nor a Windows-style "event handle" object) – Peter Duniho Sep 11 '19 at 00:06
  • @PeterDuniho the event handler that OnCustomerFinished() is subscribed to, is in the Customer class. It would just fire off when a customer is done, that's what the temp.IsDone = true; is for. Adding it here shouldn't make my code any clearer. – Dani Rashba Sep 11 '19 at 00:12
  • _"Adding it here shouldn't make my code any clearer."_ -- you don't get to decide what would or would not make your code any clearer. The Stack Overflow requirement is clear: include a [mcve] in your question. – Peter Duniho Sep 11 '19 at 00:13
  • All that said: why rely on the event at all? If the worker is calling some method on the customer that has to do some work, why not call the method synchronously and just wait for it to return? And if you want asynchronous, why not just await the async operation and enqueue a new work item there? Of course, lacking a good [mcve], it's hard to even see the design you have in mind, never mind offer alternatives. – Peter Duniho Sep 11 '19 at 00:16
  • @PeterDuniho My bad sorry, I've edited my post, I hope it's clearer now. – Dani Rashba Sep 11 '19 at 00:25
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/199270/discussion-between-dani-rashba-and-peter-duniho). – Dani Rashba Sep 11 '19 at 00:33
  • .NET already has pub/sub and worker mechanisms- DataFlow blocks and lately, Channels. What is the *actual* work you want to do? – Panagiotis Kanavos Sep 13 '19 at 12:36

1 Answer


.NET already has pub/sub and worker mechanisms in the form of DataFlow blocks and lately, Channels.

Dataflow

Dataflow blocks from the System.Threading.Tasks.Dataflow namespace are the "old" way (2012 and later) of building workers and pipelines of workers. Each block has an input and/or output buffer. Each message posted to the block is processed by one or more tasks in the background. For blocks with outputs, the output of each iteration is stored in the output buffer.

Blocks can be combined into pipelines similar to a CMD or PowerShell pipeline, with each block running on its own task(s).
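
As a rough sketch of such a pipeline (not part of the original answer; the Order type, its constructor and the ProcessOrder method are assumed here purely for illustration), a TransformBlock can be linked to an ActionBlock so that completion flows through the whole chain:

// Hypothetical sketch: a Customer -> Order pipeline built from linked Dataflow blocks
var createOrders  = new TransformBlock<Customer, Order>(cust => new Order(cust));
var processOrders = new ActionBlock<Order>(order => ProcessOrder(order));

// Propagate completion so completing the first block eventually completes the last one
createOrders.LinkTo(processOrders, new DataflowLinkOptions { PropagateCompletion = true });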

In the simplest case an ActionBlock can be used as a worker:

void ProcessCustomer(Customer customer)
{
    ....
}

var block = new ActionBlock<Customer>(cust => ProcessCustomer(cust));

That's it. There's no need to manually dequeue or poll.

The producer method can start sending customer instances to the block. Each of them will be processed in the background, in the order they were posted:

foreach(var customer in bigCustomerList)
{
    block.Post(customer);
}

When done, eg when the application terminates, the producer only needs to call Complete() on the block and wait for any remaining entries to complete.

block.Complete();
await block.Completion;

Blocks can work with asynchronous methods too.
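
For example (a sketch, not from the original answer; ProcessCustomerAsync is an assumed asynchronous counterpart of ProcessCustomer), the block's delegate can be async, and an ExecutionDataflowBlockOptions can be used to process several customers in parallel:

// Sketch: async delegate, with up to 4 customers processed concurrently
var block = new ActionBlock<Customer>(
    async cust => await ProcessCustomerAsync(cust),
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });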

Channels

Channels are a new mechanism, built into .NET Core 3 and available as a NuGet package for previous .NET Framework and .NET Core versions. The producer writes to a channel using a ChannelWriter and the consumer reads from the channel using a ChannelReader. This may seem a bit strange until you realize it allows some powerful patterns.

The producer could be something like this, eg a producer that "produces" all customers in a list with a 0.5 sec delay:

ChannelReader<Customer> Producer(IEnumerable<Customer> customers,CancellationToken token=default)
{
    //Create a channel that can buffer an infinite number of entries
    var channel=Channel.CreateUnbounded<Customer>();
    var writer=channel.Writer;
    //Start a background task to produce the data
    _ = Task.Run(async ()=>{
        foreach(var customer in customers)
        {
            //Exit gracefully in case of cancellation
            if (token.IsCancellationRequested)
            {
                return;
            }
            await writer.WriteAsync(customer,token);
            await Task.Delay(500);
        }
    },token)
         //Ensure we complete the writer no matter what
         .ContinueWith(t=>writer.Complete(t.Exception));

   return channel.Reader;
}

That's a bit more involved but notice that the only thing the function needs to return is the ChannelReader. The cancellation token is useful for terminating the producer early, eg after a timeout or if the application closes.
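
For example (a small usage sketch; the 30-second timeout is an arbitrary assumption), the caller can pass a token that cancels automatically:

// Sketch: stop producing after 30 seconds, or earlier by calling cts.Cancel()
var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
var reader = Producer(customers, cts.Token);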

When the writer completes, all the channel's readers will also complete.

The consumer only needs that ChannelReader to work:

async Task Consumer(ChannelReader<Customer> reader,CancellationToken token=default)
{
    while(await reader.WaitToReadAsync(token))
    {
       while(reader.TryRead(out var customer))
       {
           //Process the customer
       }
    }
}

Should the writer complete, WaitToReadAsync will return false and the loop will exit.

In .NET Core 3 the ChannelReader supports IAsyncEnumerable through the ReadAllAsync method, making the code even simpler:

async Task Consumer(ChannelReader<Customer> reader,CancellationToken token=default)
{
    await foreach(var customer in reader.ReadAllAsync(token))
    {
           //Process the customer
    }
}

The reader created by the producer can be passed directly to the consumer:

var customers = new []{......};
var reader = Producer(customers);
await Consumer(reader);

Intermediate steps can read from a previous channel reader and publish data to the next, eg an order generator:

ChannelReader<Order> CustomerOrders(ChannelReader<Customer> reader,CancellationToken token=default)
{
    var channel=Channel.CreateUnbounded<Order>();
    var writer=channel.Writer;
    //Start a background task to produce the data
    _ = Task.Run(async ()=>{
        await foreach(var customer in reader.ReadAllAsync(token))
        {
           //Somehow create an order for the customer
           var order=new Order(...);
           await writer.WriteAsync(order,token);
        }
    },token)
         //Ensure we complete the writer no matter what
         .ContinueWith(t=>writer.Complete(t.Exception));

   return channel.Reader;
}

Again, all we need to do is pass the readers from one method to the next:

var customers = new []{......};
var customerReader = Producer(customers);
var orderReader = CustomerOrders(customerReader);
await ConsumeOrders(orderReader);
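
ConsumeOrders isn't defined in the answer; a minimal sketch of what it might look like, mirroring the Consumer method above, could be:

// Hypothetical sketch of the ConsumeOrders method referenced above
async Task ConsumeOrders(ChannelReader<Order> reader, CancellationToken token = default)
{
    await foreach (var order in reader.ReadAllAsync(token))
    {
        // Process the order
    }
}
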
  • Do I see there the new C#8 async enumerable feature? – FCin Sep 13 '19 at 13:35
  • @FCin with a Go Live license for at least a month now. If you check [ReadAllAsync](https://github.com/dotnet/corefx/blob/master/src/System.Threading.Channels/src/System/Threading/Channels/ChannelReader.netcoreapp.cs)'s code you'll see it's very simple, but it simplifies the code *a lot*. Once you start passing ChannelReaders around it's easy to write generic channel methods that can be chained just like LINQ. – Panagiotis Kanavos Sep 13 '19 at 13:40
  • @FCin there's also a [Microsoft.Bcl.AsyncInterfaces](https://www.nuget.org/packages/Microsoft.Bcl.AsyncInterfaces/) NuGet package that adds IAsyncEnumerable to older .NET versions but I haven't checked it yet – Panagiotis Kanavos Sep 13 '19 at 13:42
  • A note about `block.Post(customer)`. To avoid a possible frustration later, when the need for [`BoundedCapacity`](https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.dataflow.dataflowblockoptions.boundedcapacity) arises, it is preferable to feed the block with [`SendAsync`](https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.dataflow.dataflowblock.sendasync) instead of `Post`. `Post` fails to add the element in the block when the limit is reached. `SendAsync` waits until there is empty space again in the block's queue. – Theodor Zoulias Sep 13 '19 at 15:11
  • @TheodorZoulias there's no bound here so there's no need for SendAsync. The channels don't have a bound either. This answer is long enough already without going into specifics for each library. – Panagiotis Kanavos Sep 13 '19 at 15:17
  • We know that there is a collection of customers. Feeding this collection to the `ActionBlock` will result in duplicating this collection inside the block's buffer, which is a waste of resources. This may not be so important if the customers are few, or if the original collection is going to be discarded. Otherwise a `BoundedCapacity` functionality will be much desired. It seems that channels have [`Capacity`](https://learn.microsoft.com/en-us/dotnet/api/system.threading.channels.boundedchanneloptions.capacity) too. – Theodor Zoulias Sep 13 '19 at 17:47