TPL Dataflow BatchBlock: check for elements in the input buffer

Question

My Dataflow pipeline starts with a BatchBlock, which I would like to trigger with the TriggerBatch() method. My batch sizes are variable, so when creating the BatchBlock I set a rather high BatchSize that I don't really expect to be reached.

Now I would like to call TriggerBatch() only when the BatchBlock actually holds some elements that could in fact be triggered into a batch. Is there a way to check whether the internal buffer of the BatchBlock is non-empty? If not, could someone please suggest an alternative? The BatchBlock is fed from multiple threads that Post() data into it. I would like a mechanism that waits until there are elements present in the BatchBlock before triggering it. Basically, I do not want my TriggerBatch() call to be fruitless. Unfortunately, using a timer is not an option for me.

At this point I need a third person's opinion, as I have been stuck on this for quite a while now.

*What* is the desired behaviour? When and why should the batch be triggered? It isn't hard to create a custom block, in fact you could probably adapt the first sliding window example from [the MSDN article on creating custom blocks](https://msdn.microsoft.com/en-us/library/hh228606(v=vs.110).aspx) but you need to decide what the trigger conditions are — Panagiotis Kanavos, Oct 01 '15 at 10:03
@PanagiotisKanavos The desired behaviour is somewhat like in this question http://stackoverflow.com/questions/32717337/data-propagation-in-tpl-dataflow-pipeline-with-batchblock-triggerbatch. I did think about a custom block, but that is also getting quite difficult for me to conceptualize. I have gone through the MSDN article that you linked but somehow could not relate my requirements to it. The only way I could think of doing it was to maintain a list of incoming inputs with the help of a TransformBlock (placed just before this BatchBlock) and clear the list every time the TriggerBatch() — Ricky, Oct 01 '15 at 13:54
^on the BatchBlock is successful. From the outside I could check whether the list has any items in it before calling TriggerBatch(). But then again this would also depend on the condition that the TriggerBatch() was successful, and there is no way of knowing that. — Ricky, Oct 01 '15 at 13:55
@Ricky the linked question doesn't answer anything - you'll notice it doesn't have any answers either. It may describe what you think is the *solution*, but you forgot to mention *what the problem is*. What do you want your block to do? When do you want it to transmit its buffered data? Obviously not only when it reaches the batch size, that's already done by the BatchBlock. Every X seconds? Send the messages received in the last X seconds? Those *older* than X seconds? If you can't describe that in *two lines*, you haven't understood the requirement. — Panagiotis Kanavos, Oct 02 '15 at 08:59
@PanagiotisKanavos I think I did clearly specify the problem in these lines: 'I want to trigger the BatchBlock to send a group of items to the TransformBlock only when one of my ActionBlocks is available for operation. Till then the BatchBlock should just keep buffering elements and not pass them on to the TransformBlock. My batch sizes are variable. As BatchSize is mandatory, I do have a really high upper limit for the BatchBlock batch size; however, I really don't wish to reach that limit. I would like to trigger my batches depending on the availability of the ActionBlocks performing the said task' — Ricky, Oct 05 '15 at 09:13
@PanagiotisKanavos I did say that using a timer is not a solution for me, so X seconds and older than X seconds are not really the questions I am interested in. So in _two lines_: 'The batch should be triggered from an `ActionBlock` at the end of the pipeline, as the last thing the `ActionBlock` does'. I hope I could make the requirement clear now. — Ricky, Oct 05 '15 at 09:24
Worse. Now it sounds like you are confusing a batch block with a buffer block. A batch block doesn't *buffer* messages, it creates one big batch of them and sends them downstream, e.g. as a single array. What you describe though is the behaviour of any block with a BoundedCapacity limit, e.g. BufferBlock or ActionBlock. This *will* stop upstream steps if the block is busy. The question would make sense if you *didn't* want to send individual buffered messages downstream, but wanted to send an array of the buffered messages to be processed by the next block — Panagiotis Kanavos, Oct 05 '15 at 09:40
@PanagiotisKanavos As far as I know, you are completely wrong in saying that a BatchBlock _doesn't buffer_ messages, since it in fact does. Otherwise, how would you explain _creating one big batch of them_ if it is not _storing/buffering_ them in its _internal buffer_? For example, even the TransformBlock has two _internal buffers_, one input and one output (see http://blog.stephencleary.com/2012/09/introduction-to-dataflow-part-2.html). I honestly do not understand your confusion here. I suggest reading the question in its entirety before commenting. I have clearly mentioned I need batches of inputs — Ricky, Oct 05 '15 at 11:09
@PanagiotisKanavos where did you get the notion that I need individual buffered messages? I don't! I need batches of inputs or in other words arrays of inputs which a `Batchblock` is perfect for. I have not mentioned once in either of my questions that I need single inputs. So for me the question does make sense. — Ricky, Oct 05 '15 at 11:15
So *I'm* dumb, but both your questions haven't attracted any other answers. Perhaps the MS employees that typically answer dataflow questions are on vacation, even though they did comment once. Or you are trying to phrase the question in the wrong terms. The question in a single line seems to be: *"I want to create a block that will buffer incoming messages as long as subsequent steps are blocked, then emit them in a single batch when the downstream block is unblocked"*. — Panagiotis Kanavos, Oct 05 '15 at 11:40
@PanagiotisKanavos Thank you for your version of my question. Since you do seem to understand the question now, do you have a possible solution in mind? Also, I don't think not having answers really reflects on anything, I did ask the same question to Stephen Cleary himself and he advised me to ask it on stackoverflow, so there is a slight possibility of it being a thought provoking question which may/may not be straightforward. It would be great if you could use your knowledge to answer the question rather than framing it in a different way! — Ricky, Oct 05 '15 at 14:44

score 0 · Answer 1 · edited May 23 '17 at 11:45

Using DataflowBlock.Encapsulate, you can create a custom block in which you manage the input buffer yourself and push batches of any size based on your own conditions: https://stackoverflow.com/a/36437112/94853
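
As a rough illustration of that idea (not the code from the linked answer), here is a minimal sketch. BatchBlock exposes OutputCount for batches that have already been formed, but no public count of items still waiting, so the sketch keeps its own list. The names `ManualBatcher`, `PendingCount` and `TryTriggerBatch` are hypothetical and not part of TPL Dataflow; only `ActionBlock`, `BufferBlock` and `DataflowBlock.Encapsulate` come from the library.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper: collects posted items and emits them as one array on demand.
public sealed class ManualBatcher<T>
{
    private readonly object _gate = new object();
    private readonly List<T> _pending = new List<T>();
    private readonly BufferBlock<T[]> _output = new BufferBlock<T[]>();

    // The encapsulated block: link it into the pipeline like any propagator.
    public IPropagatorBlock<T, T[]> Block { get; }

    public ManualBatcher()
    {
        // The input side only stores items; nothing flows downstream yet.
        var input = new ActionBlock<T>(item =>
        {
            lock (_gate) { _pending.Add(item); }
        });

        // When the input completes, flush what is left and complete the output.
        input.Completion.ContinueWith(_ =>
        {
            TryTriggerBatch();
            _output.Complete();
        });

        Block = DataflowBlock.Encapsulate(input, _output);
    }

    // Number of items stored so far and not yet emitted.
    public int PendingCount
    {
        get { lock (_gate) { return _pending.Count; } }
    }

    // Emits all stored items as one batch.
    // Returns false when there was nothing to emit.
    public bool TryTriggerBatch()
    {
        T[] batch;
        lock (_gate)
        {
            if (_pending.Count == 0) return false;
            batch = _pending.ToArray();
            _pending.Clear();
        }
        return _output.Post(batch);
    }
}
```

With something like this, the final ActionBlock in the pipeline could call `TryTriggerBatch()` as the last thing it does and learn from the return value whether a batch was actually emitted. One caveat: items pass through the inner ActionBlock asynchronously, so `PendingCount` can lag briefly behind a successful `Post()`.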

answered Apr 05 '16 at 21:37

Loren Paulsen
