My Dataflow pipeline starts with a BatchBlock, and several Tasks post items into it. The BatchBlock propagates data to the next block on a Timer, using the TriggerBatch() method.

You can assume that no batch ever reaches the (very high) batch size given when the BatchBlock was created, i.e. each triggered batch can be a different size.

Just before triggering the BatchBlock, I would like to remove all duplicate items from the batch that is about to be propagated to the next block in the pipeline. Is there a way I can do that?
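
For concreteness, here is a minimal sketch of the kind of setup I mean. The item type (int), the batch size, the timer interval, the number of producers, and the downstream printBlock are all placeholders chosen for illustration, not my real code. It only reproduces the problem: the printed batches can contain duplicates.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Program
{
    static async Task Main()
    {
        // Batch size is deliberately huge (placeholder value), so in practice
        // batches are only emitted when TriggerBatch() is called.
        var batchBlock = new BatchBlock<int>(100_000);

        // Hypothetical next block in the pipeline; this is where I would
        // like to receive batches without duplicates.
        var printBlock = new ActionBlock<int[]>(batch =>
            Console.WriteLine($"Batch of {batch.Length}: {string.Join(", ", batch)}"));

        batchBlock.LinkTo(printBlock, new DataflowLinkOptions { PropagateCompletion = true });

        // The timer flushes whatever has accumulated so far
        // (the 100 ms interval is an arbitrary choice).
        using var timer = new Timer(_ => batchBlock.TriggerBatch(), null,
            TimeSpan.FromMilliseconds(100), TimeSpan.FromMilliseconds(100));

        // Several producer tasks post overlapping values into the same block.
        var producers = new Task[3];
        for (int p = 0; p < producers.Length; p++)
        {
            producers[p] = Task.Run(async () =>
            {
                for (int i = 0; i < 10; i++)
                {
                    batchBlock.Post(i % 4); // values repeat, so duplicates are expected
                    await Task.Delay(20);
                }
            });
        }

        await Task.WhenAll(producers);

        // Completing the block emits any remaining items as a final batch.
        batchBlock.Complete();
        await printBlock.Completion;
    }
}
```

Because each batch depends on when the timer fires, the batch sizes and contents vary from run to run.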