When using System.Threading.Tasks.Dataflow, if I link block a to block b, will the link keep b alive? Or do I need to keep a reference to b around to prevent it from being collected?

internal class SomeDataflowUser
{
    public SomeDataflowUser()
    { 
        _a = new SomeBlock();
        var b = new SomeOtherBlock();
        _a.LinkTo(b);
    }

    public void ReactToIncomingMessage( Something data )
    {    
        // might b be collected here?
        _a.Post( data );
    }

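    // Post needs an ITargetBlock<T> and LinkTo needs an ISourceBlock<T>,
    // so the field has to be a propagator block to support both calls.
    private IPropagatorBlock<Something, Something> _a;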
}
Haukinger
  • Is there something keeping `a` alive? `a` and `b` appear to be uninitialized local variables here; what are they? Is there some reason why you're not trusting the garbage collector to do its job correctly? What are you really asking here? – Eric Lippert Mar 22 '18 at 20:45
  • @Eric: My question is about the inner workings of dataflow: does linking prevent garbage collection? – Haukinger Mar 23 '18 at 07:42
  • @Lasse: "link a to b" means calling `LinkTo` on `a` with parameter `b` – Haukinger Mar 23 '18 at 07:45
  • Those are managed objects. The garbage collector's only job is to correctly manage their lifetimes. Trust the GC. – Eric Lippert Mar 23 '18 at 14:40
  • @Eric: I trust the GC, but I don't know whether `LinkTo` establishes a link that the GC recognizes or whether it creates a weak reference. That's not completely impossible; think of `PropertyObserver` or Prism's `EventAggregator`, neither of which keeps the subscriber alive. Dataflow might well consider only blocks I hold an explicit reference to as active. – Haukinger Mar 23 '18 at 14:48

3 Answers

You're confusing variables with variable contents. They can have completely different lifetimes.

Local variable b is no longer a root of the GC once control leaves the block. The object that b referred to is a managed object, and the GC will keep it alive at least as long as it's reachable from a root.

Now, note that the GC is allowed to treat local variables as dead before control leaves the block. If you have:

var a = whatever;
a.Foo(); 
var b = whatever;
// The object referred to by `a` could be collected here. 
b.Foo();
return;

This can happen because, for example, the jitter might decide that b can reuse the local storage of a, since their usages do not overlap. There is no requirement that the object referred to by a stay alive as long as a is in scope.

This can cause issues if the object has a destructor with a side effect that you need to delay until the end of the block; this particularly happens when the destructor calls into unmanaged code. In that case, call GC.KeepAlive on the object at the end of the block to keep it alive until then.
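
As a rough sketch of that keep-alive pattern: the class `NativeHandleOwner` and the method `DoWorkThatNeedsTheHandle` are hypothetical names made up for illustration; only `GC.KeepAlive`, `GC.Collect` and `GC.WaitForPendingFinalizers` are real framework calls. Whether the finalizer would actually run early without the keep-alive depends on the runtime and build configuration.

```csharp
using System;

// Hypothetical wrapper around an unmanaged resource, for illustration only.
internal sealed class NativeHandleOwner
{
    ~NativeHandleOwner()
    {
        // Imagine this releases an unmanaged handle that other code still uses.
        Console.WriteLine("Finalizer ran");
    }
}

internal static class KeepAliveDemo
{
    private static void Main()
    {
        var owner = new NativeHandleOwner();

        // Work that depends on the unmanaged resource but never mentions `owner`.
        DoWorkThatNeedsTheHandle();

        // Without this call, the jitter may treat `owner` as dead right after
        // its last use, so the finalizer could run while the work is in progress.
        GC.KeepAlive(owner);
    }

    // Hypothetical stand-in for code that uses the unmanaged resource.
    private static void DoWorkThatNeedsTheHandle()
    {
        // Force a collection to simulate GC pressure in the middle of the work.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine("Work finished");
    }
}
```

`GC.KeepAlive` does nothing at run time; its only effect is that the reference counts as used up to that point, so the object stays reachable until the call.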

Eric Lippert
  • I suppose, this looks different for fields? For local variables one has to use the likes of `GC.KeepAlive`, of course, to keep them uncollected. I realize my example is not chosen too well and probably complicating things. – Haukinger Mar 23 '18 at 15:00
  • @Haukinger: Instance fields live as long as the object containing them lives. – Eric Lippert Mar 23 '18 at 15:03
  • Exactly, so my question becomes: do I need a field for every block? – Haukinger Mar 23 '18 at 15:22

In addition to @Eric's great explanation of GC behavior, I want to address the special case related to TPL Dataflow. You can easily see how LinkTo behaves with a simple test. Notice that nothing, to my knowledge, is holding on to b except for its link to a.

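using System;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
using NUnit.Framework;

[TestFixture]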
public class BlockTester
{

    private int count;

    [Test]
    public async Task Test()
    {
        var inputBlock = BuildPipeline();
        var max = 1000;
        foreach (var input in Enumerable.Range(1, max))
        {
            inputBlock.Post(input);
        }
        inputBlock.Complete();

        //No reference to block b
        //so we can't await b completion
        //instead we'll just wait a second since
        //the block should finish nearly immediately
        await Task.Delay(TimeSpan.FromSeconds(1));
        Assert.AreEqual(max, count);
    }

    public ITargetBlock<int> BuildPipeline()
    {
        var a = new TransformBlock<int, int>(x => x);
        var b = new ActionBlock<int>(x => count = x);
        a.LinkTo(b, new DataflowLinkOptions() {PropagateCompletion = true});
        return a;
    }
}
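
The test above never forces a collection, so here is a more direct check, offered as a sketch rather than a definitive proof. It tracks b through a `WeakReference`, forces a GC while only a is referenced, and then asks whether b is still alive. The names `LinkKeepAliveDemo`, `BuildLinkedPair` and `TargetSurvivesCollection` are mine, and the check relies on my understanding that the source block stores its linked targets as ordinary strong references.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks.Dataflow;

internal static class LinkKeepAliveDemo
{
    // NoInlining so the local `b` cannot linger in the caller's stack frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static (BufferBlock<int> Source, WeakReference Target) BuildLinkedPair()
    {
        var a = new BufferBlock<int>();
        var b = new ActionBlock<int>(x => Console.WriteLine($"Received {x}"));
        a.LinkTo(b);

        // Only `a` and a weak reference to `b` leave this method.
        return (a, new WeakReference(b));
    }

    public static bool TargetSurvivesCollection()
    {
        var (source, weakTarget) = BuildLinkedPair();

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        bool alive = weakTarget.IsAlive;

        // Make sure `a` itself counts as reachable during the collections above.
        GC.KeepAlive(source);
        return alive;
    }

    private static void Main()
    {
        Console.WriteLine($"Target alive after GC: {TargetSurvivesCollection()}");
    }
}
```

If the link is a strong reference, this should report that the target is alive, which would mean you do not need a separate field for b as long as a stays reachable.
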
JSteward

Yes, linking a dataflow block is enough to prevent it from being garbage collected. Not only that, but even with no references whatsoever, by just having work to do, the block stays alive until its work is done. Here is a runnable example:

using System;
using System.Threading;
using System.Threading.Tasks.Dataflow;

public static class Program
{
    static void Main(string[] args)
    {
        StartBlock();
        Thread.Sleep(500);
        for (int i = 5; i > 0; i--)
        {
            Console.WriteLine($"Countdown: {i}");
            Thread.Sleep(1000);
            GC.Collect();
        }
        Console.WriteLine("Shutting down");
    }

    static void StartBlock()
    {
        var block = new ActionBlock<int>(item =>
        {
            Console.WriteLine("Processing an item");
            Thread.Sleep(1000);
        });
        for (int i = 0; i < 10; i++) block.Post(i);
    }
}

Output:

Processing an item
Countdown: 5
Processing an item
Countdown: 4
Processing an item
Countdown: 3
Processing an item
Countdown: 2
Processing an item
Countdown: 1
Processing an item
Shutting down
Press any key to continue . . .

As long as there is a foreground thread still alive in the process, the block keeps going.

Theodor Zoulias