
I have a flow set up as follows:

    _publisherQueue = CreateBuffer();
    var batchingBlock = CreateBatchBlock(options.BatchSize);
    var debounceBlock = CreateDebounceBlock(options.DebounceInterval, batchingBlock.TriggerBatch);
    var publishBlock = CreatePublishBlock();
    var groupByTopicBlock = CreateGroupByTopicBlock(publishBlock);

    _publisherQueue.LinkTo(debounceBlock, new DataflowLinkOptions { PropagateCompletion = true });
    debounceBlock.LinkTo(batchingBlock, new DataflowLinkOptions { PropagateCompletion = true });
    batchingBlock.LinkTo(groupByTopicBlock, new DataflowLinkOptions { PropagateCompletion = true });

where:

  • CreateDebounceBlock returns a TransformBlock (with a timer that triggers the BatchBlock)
  • CreateGroupByTopicBlock returns an ActionBlock whose Action triggers the Action block returned by CreatePublishBlock

I cannot dispose of the links because this flow should live for the entire lifetime of the program (in this case, a Windows Service).

I have noticed that every time I post to _publisherQueue (which is a BufferBlock), some memory is allocated, which is normal. However, after the message has gone through the whole flow, the allocated memory is not released.

This is worrisome because this is a long-running process that will accept inputs at random intervals.

This is my first attempt at using TPL, so most probably I am not disposing of something properly. However, I'm not sure what I need to dispose of, since I need these structures to remain alive throughout the lifetime of the program.

VMAtm
Jonny
  • What do you mean by "process is finished" and how are you measuring that memory is still allocated? – Peter Ritchie Dec 12 '16 at 19:14
  • @PeterRitchie Using the diagnostic tools in VS. 'Process is Finished' means that I added something to the buffer, it got through the whole flow and got Published. At which point I expect memory that was allocated to execute the flow to be released – Jonny Dec 12 '16 at 19:19
  • Why? If you don't tell the flow that it's finished with `Complete()`, there is no reason to release anything. You may post another message in the next nanosecond. Flows are expected to handle a lot of messages, especially when used in services. – Panagiotis Kanavos Dec 13 '16 at 17:23
  • The memory you see in the Diagnostic Tools window is most likely available for garbage collection. It *won't* be collected unless the GC runs (shown as an orange marker). Or you may have a memory leak. Take two snapshots in the memory tab to see which new objects survived and where they were allocated – Panagiotis Kanavos Dec 13 '16 at 17:27
  • @PanagiotisKanavos Thanks for your comments. Actually, the Complete method is what I am most curious about. I am not sure when to call Complete. I agree that I could post another message in the next nanosecond, so I can only call the Complete method when the program is shutting down, if I understand correctly what the Complete method does. – Jonny Dec 13 '16 at 17:50
  • @jonny that's the challenge of dataflow: creating a protocol where "complete" is specific and detectable. – Peter Ritchie Dec 13 '16 at 18:51
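
The two checks suggested in the comments above can be sketched roughly as follows: call `Complete()` on the head block only at shutdown, wait for the tail block's `Completion`, and then compare the managed heap size before and after a forced collection to see whether the remaining memory was just uncollected garbage. This is a minimal sketch, not the asker's pipeline: `FlowMemoryCheck`, `RunAsync`, the two-block flow and the 1 KB payloads are all hypothetical stand-ins.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class FlowMemoryCheck
{
    // Hypothetical stand-in for the real flow: a BufferBlock linked to an ActionBlock.
    public static async Task<(long before, long after, int processed)> RunAsync(int messageCount)
    {
        int processed = 0;

        var queue = new BufferBlock<byte[]>();
        // Default MaxDegreeOfParallelism is 1, so the counter is updated by one message at a time.
        var publish = new ActionBlock<byte[]>(payload => { processed++; });
        queue.LinkTo(publish, new DataflowLinkOptions { PropagateCompletion = true });

        for (int i = 0; i < messageCount; i++)
        {
            queue.Post(new byte[1024]); // assumed payload size, for illustration only
        }

        // In a service this belongs in the shutdown path: no more input is accepted after it.
        queue.Complete();
        // Completion propagates through the link; wait until the last block has drained.
        await publish.Completion;

        // Heap size as currently reported, without forcing a collection.
        long before = GC.GetTotalMemory(forceFullCollection: false);
        // Heap size after asking the GC to collect first.
        long after = GC.GetTotalMemory(forceFullCollection: true);

        return (before, after, processed);
    }
}
```

If the second number drops back close to the baseline, the memory was garbage waiting for a collection. If it keeps growing across runs, comparing two profiler snapshots, as suggested above, should show which objects survive.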

1 Answer


I have concerns about this part:

CreateGroupByTopicBlock returns an ActionBlock whose Action triggers the Action block returned by CreatePublishBlock

This looks like a closure, which can easily lead to memory leaks: the compiler turns it into an internal class that stores every captured reference in its fields, so those objects stay reachable for as long as the delegate does. You should investigate your application with a memory profiler (either the built-in VS profiler or an external one, such as dotTrace) to see whether any objects are being held by reference inside this closure, and, if so, rewrite your logic to avoid unnecessary closures.
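
To illustrate what the closure holds on to, here is a minimal sketch that inspects a delegate's compiler-generated target object through reflection. `ClosureInspection`, `CreateForwarder` and `HoldsReferenceTo` are hypothetical names, and `CreateForwarder` is only a guess at the shape of `CreateGroupByTopicBlock`, whose real body is not shown in the question.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Threading.Tasks.Dataflow;

public static class ClosureInspection
{
    // Hypothetical shape of the forwarding action: the lambda captures publishBlock,
    // so the compiler stores it in a field of a generated closure class.
    public static Action<string> CreateForwarder(ActionBlock<string> publishBlock)
    {
        return message => publishBlock.Post(message);
    }

    // Returns true if the delegate's closure object has a field referencing the candidate.
    public static bool HoldsReferenceTo(Delegate handler, object candidate)
    {
        object closure = handler.Target;
        if (closure == null)
        {
            return false;
        }

        return closure.GetType()
            .GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
            .Any(field => ReferenceEquals(field.GetValue(closure), candidate));
    }
}
```

Capturing the publish block is expected here, since it has to live as long as the flow does. What a profiler snapshot should reveal is whether the closure also captures something that is meant to be short-lived, such as a per-message object.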

VMAtm