I'm looking for a way to await until all items have been processed by a TPL Dataflow TransformBlock. Sample code:

var transformBlock = new TransformBlock<int, int>(async number =>
{
    await Task.Delay(TimeSpan.FromMilliseconds(300));
    return number * 2;
}, new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = Environment.ProcessorCount
});

foreach (var number in Enumerable.Range(1, 100))
{
    transformBlock.Post(number);
}

transformBlock.Complete();

At this point I have called Complete, which to my understanding signals the TransformBlock to stop accepting new items and to finish processing the data already in its input queue.

But I'm not sure how to await until all processed items are available?

Awaiting the Completion task is not a solution because, as the answer states:

An instance of TransformBlock is not considered "complete" until the following conditions are met:

  1. TransformBlock.Complete() has been called
  2. InputCount == 0 – the block has applied its transformation to every incoming element
  3. OutputCount == 0 – all transformed elements have left the output buffer

One way I found is to await all tasks returned from ReceiveAsync, e.g.

var tasks = new List<Task<int>>();
foreach (var number in Enumerable.Range(1, 100))
{
    transformBlock.Post(number);

    tasks.Add(transformBlock.ReceiveAsync());
}

transformBlock.Complete();

await Task.WhenAll(tasks);

tasks.Select(t => t.GetAwaiter().GetResult())
     .ToList()
     .ForEach(Console.WriteLine);

However, I'm not really sure this is 100 % correct.
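
A possibly simpler way to check this is to drain the output buffer in a loop rather than pairing each Post with a ReceiveAsync call. The following is a minimal sketch, not a definitive implementation: it assumes C# 9 top-level statements and a runtime where System.Threading.Tasks.Dataflow is available (on .NET Framework it comes from a NuGet package). The `results` list and the shortened 50 ms delay are illustrative choices that are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

var transformBlock = new TransformBlock<int, int>(async number =>
{
    // Stand-in for real work; shortened from the 300 ms used in the question.
    await Task.Delay(TimeSpan.FromMilliseconds(50));
    return number * 2;
}, new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = Environment.ProcessorCount
});

foreach (var number in Enumerable.Range(1, 100))
{
    transformBlock.Post(number);
}

transformBlock.Complete();

// Hypothetical collection for the drained results (not in the original code).
var results = new List<int>();

// OutputAvailableAsync returns false once the block has completed
// and no further output will ever become available.
while (await transformBlock.OutputAvailableAsync())
{
    // Pull out everything that is currently buffered.
    while (transformBlock.TryReceive(out int item))
    {
        results.Add(item);
    }
}

// The output buffer has been emptied, so Completion can now finish.
await transformBlock.Completion;

Console.WriteLine(results.Count);
```

Because the loop ends only when the block reports that no more output will arrive, it does not need to know in advance how many items were posted.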

Another option, suggested in this answer, is to add another TPL Dataflow block and propagate completion; that way we can await the transform block's completion and then consume the results from the linked block. But that seems like an overcomplication of the task, and I'm assuming there is a better (built-in or less verbose) way?
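
For comparison, a linked-block variant need not be very verbose. The sketch below is an illustration under stated assumptions rather than the approach from the referenced answer: it links the transform block to an unbounded `BufferBlock<int>`, awaits the transform block's `Completion`, and then reads everything out of the buffer in one call. The names `bufferBlock` and `doubled` and the shortened 50 ms delay are made up for this example, and it assumes C# 9 top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

var transformBlock = new TransformBlock<int, int>(async number =>
{
    // Stand-in for real work; shortened from the 300 ms used in the question.
    await Task.Delay(TimeSpan.FromMilliseconds(50));
    return number * 2;
}, new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = Environment.ProcessorCount
});

// Hypothetical unbounded buffer that accepts every transformed item.
var bufferBlock = new BufferBlock<int>();

// Completion is not propagated here: only the transform block is awaited.
transformBlock.LinkTo(bufferBlock);

foreach (var number in Enumerable.Range(1, 100))
{
    transformBlock.Post(number);
}

transformBlock.Complete();

// Finishes once every item has been transformed and handed to the buffer.
await transformBlock.Completion;

// Take all buffered items at once; fall back to an empty list if there are none.
var doubled = bufferBlock.TryReceiveAll(out var items)
    ? items.ToList()
    : new List<int>();

Console.WriteLine(doubled.Count);
```

The transform block can complete here because its output buffer is emptied into the linked block as items are produced, which is the third condition in the quoted list.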

Michael
  • Why use a block if you want to process "all" messages and read them all at once? Why not use, e.g., `Parallel.ForEach` or PLINQ in this case? Linking to another block, e.g. a BufferBlock, isn't overcomplication. Blocks are meant to work in pipelines, and a block that can't push its messages to the next one can't be considered complete. In any case, you don't need to *propagate* completion, just use `LinkTo(someBufferBlock)` – Panagiotis Kanavos Jul 29 '19 at 12:44
  • Awaiting the tasks returned by `ReceiveAsync` is wrong anyway. The actual thing you need to measure is the number of items read, not the tasks used to read those items. If you *know* you have 100 items, you can just decrement a counter. – Panagiotis Kanavos Jul 29 '19 at 12:46
  • It's not clear to me at all where these items are going *to*. They have to leave the block somehow - either to another block or by a consumer pulling them out. Either way, you can use `Completion` to tell when all items have left the `TransformBlock`. – Stephen Cleary Jul 29 '19 at 13:43
  • @PanagiotisKanavos using `Parallel.ForEach` is not that convenient, because I need to process all those items and use the results. Granted, I could set a degree of parallelism on `Parallel.ForEach` and add processed items to some concurrent collection. PLINQ fits better in this case, but I needed some kind of throttling of processing, so I fell back to using `TransformBlock`. But I see the point now: something has to get items out of the output queue, and using another block might be more feasible. – Michael Jul 29 '19 at 13:48
  • @PanagiotisKanavos could you please elaborate a bit more on why using `ReceiveAsync` is wrong? – Michael Jul 29 '19 at 13:50
  • Expanding on the comment from Stephen Cleary: if you want to wait for completion of everything in the `TransformBlock`, call `Complete()` and then `await block.Completion` – JSteward Jul 29 '19 at 15:39
  • If you can't `await` the `Completion` task because items are left in the input buffer, you could link it to a null target, but the better solution would be to just use an `ActionBlock` to do your transform. – JSteward Jul 29 '19 at 15:52
  • @JSteward if you call `Complete()` and then `await block.Completion`, it will never finish; see my post on when a block is considered completed. `OutputCount` will never reach 0, since I'm not consuming items from the output queue (which I believe was the point of Stephen's comment). – Michael Jul 30 '19 at 07:24
  • @Michael yup, that was what I was talking about in my second comment: if you don't need items to flow anywhere further in a dataflow, then you should use an `ActionBlock` – JSteward Jul 30 '19 at 15:08
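
The `ActionBlock` suggestion from the comments above can be sketched as follows. This is a minimal illustration under assumptions, not code posted by any commenter: the delegate stores each result in a `ConcurrentBag<int>` (the name `results` and the shortened 50 ms delay are hypothetical), so the block has no output buffer and `Completion` can be awaited directly. It assumes C# 9 top-level statements.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical thread-safe collection; the delegate writes results here itself.
var results = new ConcurrentBag<int>();

var actionBlock = new ActionBlock<int>(async number =>
{
    // Stand-in for real work; shortened from the 300 ms used in the question.
    await Task.Delay(TimeSpan.FromMilliseconds(50));
    results.Add(number * 2);
}, new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = Environment.ProcessorCount
});

foreach (var number in Enumerable.Range(1, 100))
{
    actionBlock.Post(number);
}

actionBlock.Complete();

// An ActionBlock has no output to drain, so this finishes once all items are processed.
await actionBlock.Completion;

Console.WriteLine(results.Count);
```

One trade-off of this design is that a `ConcurrentBag<int>` does not keep the results in input order, so a different collection or an index carried with each item would be needed if order matters.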
