Deadlock from combination of async/await, CancellationTokenSource, Threading.Timer

Question

In the process of writing a wrapper to arbitrary cancellable code that's supposed to run while a certain condition holds true (which has to be verified regularly), I came across interesting behavior in the interaction between CancellationTokenSource, Threading.Timer, and async/await generated code.

In a nutshell, what it looks like is that if you have some cancellable Task that you're awaiting on and then you cancel that Task from a Timer callback, the code that follows the cancelled task executes as part of the cancel request itself.

In the program below, if you add tracing you'll see that execution of the Timer callback blocks in the cts.Cancel() call, and that the code after the awaited task that gets cancelled by that call executes in the same thread as the cts.Cancel() call itself.

The program below is doing the following:

Create cancellation token source for cancelling work that we're going to simulate;
create timer that will be used to cancel work after it has started;
program timer to go off 100ms from now;
start said work, just idling for 200ms;
- timer callback kicks in, cancelling the Task.Delay "work", then sleeping for 500ms to demonstrate that timer disposal waits for this;
verify that work gets cancelled as expected;
cleanup timer, making sure that the timer does not get invoked after this point, and that if it was already running we block here waiting for it to complete (pretend there was more work afterwards that would not work properly if it the timer callback was running at the same time).

namespace CancelWorkFromTimer
{
    using System;
    using System.Diagnostics;
    using System.Threading;
    using System.Threading.Tasks;

    class Program
    {
        static void Main(string[] args)
        {
            Stopwatch sw = Stopwatch.StartNew();
            bool finished = CancelWorkFromTimer().Wait(2000);
            Console.WriteLine("Finished in time?: {0} after {1}ms; press ENTER to exit", finished, sw.ElapsedMilliseconds);
            Console.ReadLine();
        }

        private static async Task CancelWorkFromTimer()
        {
            using (var cts = new CancellationTokenSource())
            using (var cancelTimer = new Timer(_ => { cts.Cancel(); Thread.Sleep(500); }))
            {
                // Set cancellation to occur 100ms from now, after work has already started
                cancelTimer.Change(100, -1);
                try
                {
                    // Simulate work, expect to be cancelled
                    await Task.Delay(200, cts.Token);
                    throw new Exception("Work was not cancelled as expected.");
                }
                catch (OperationCanceledException exc)
                {
                    if (exc.CancellationToken != cts.Token)
                    {
                        throw;
                    }
                }
                // Dispose cleanly of timer
                using (var disposed = new ManualResetEvent(false))
                {
                    if (cancelTimer.Dispose(disposed))
                    {
                        disposed.WaitOne();
                    }
                }

                // Pretend that here we need to do more work that can only occur after
                // we know that the timer callback is not executing and will no longer be
                // called.

                // DO MORE WORK HERE
            }
        }
    }
}

The simplest way of making this work as I was expecting it to work when I first wrote it is to use cts.CancelAfter(0) instead of cts.Cancel(). According to documentation, cts.Cancel() will run any registered callbacks synchronously, and my guess is that in this case, with the interaction with async/await generated code, all code that's after the point where the cancellation took place is running as part of that. cts.CancelAfter(0) decouples the execution of those callbacks from its own execution.

Has anyone run into this before? In a case like this, is cts.CancelAfter(0) the best option to avoid the deadlock?

If you eliminate the `using(var disposed = ...)` block it works as expected. How sure are you that this (second) dispose of the timer is necessary? — H H, May 18 '13 at 13:54
@HenkHolterman: In this example it wouldn't matter but if there was code after it that couldn't be run safely until we knew that the timer callback was no longer running and wouldn't be called again, then that block is the way to ensure that ([MSDN docs](http://msdn.microsoft.com/en-us/library/b97tkt95.aspx)) — observer, May 18 '13 at 21:47

score 1 · Accepted Answer · answered May 18 '13 at 18:28

1

This behavior is because an async method's continuation is scheduled with TaskContinuationOptions.ExecuteSynchronously. I ran into a similar issue and blogged about it here. AFAIK, that's the only place this behavior is documented. (As a side note, it's an implementation detail and could change in the future).

There are a few alternative approaches; you'll have to decide which one is best.

First, is there any way the timer could be replaced by CancelAfter? Depending on the nature of the work after the cts is cancelled, something like this might work:

async Task CleanupAfterCancellationAsync(CancellationToken token)
{
  try { await token.AsTask(); }
  catch (OperationCanceledException) { }
  await Task.Delay(500); // remainder of the timer callback goes here
}

(using AsTask from my AsyncEx library; it's not hard to build AsTask yourself if you prefer)

Then you could use it like this:

var cts = new CancellationTokenSource();
var cleanupCompleted = CleanupAfterCancellationAsync(cts.Token);
cts.CancelAfter(100);
...
try
{
  await Task.Delay(200, cts.Token);
  throw new Exception("Work was not cancelled as expected.");
}
catch (OperationCanceledException exc) { }
await cleanupCompleted;
...

Or...

You could replace the Timer with an async method:

static async Task TimerReplacementAsync(CancellationTokenSource cts)
{
  await Task.Delay(100);
  cts.Cancel();
  await Task.Delay(500); // remainder of the timer callback goes here
}

Used as such:

var cts = new CancellationTokenSource();
var cleanupCompleted = TimerReplacementAsync(cts);
...
try
{
  await Task.Delay(200, cts.Token);
  throw new Exception("Work was not cancelled as expected.");
}
catch (OperationCanceledException exc) { }
await cleanupCompleted;
...

Or...

You could just kick off the cancellation in a Task.Run:

using (var cancelTimer = new Timer(_ => { Task.Run(() => cts.Cancel()); Thread.Sleep(500); }))

I don't like this solution as well as the others because you still end up with synchronous blocking (ManualResetEvent.WaitOne) inside an async method which isn't recommended.

answered May 18 '13 at 18:28

Stephen Cleary

437,863
77
675
810

@StephenClearly: Your alternatives would work well for my sample (which I may have oversimplified a bit too much), but I have a few more restrictions for the real code where I found this problem: the timer is not just for cancellation but it's for periodically renewing ownership of a resource that the being done depends on (here replaced with the `Task.Delay` call). If it's unable to renew ownership in the resource it has to abort the work as soon as possible. When the work completes without being be cancelled, I want to stop timer and any renewal attempts from it, then release the resource. – observer May 18 '13 at 22:04
(cont.) For the sake of simplicity of implementation I think I can get away with the assumption that the release of the resource would always win over the renewal. What caught me by surprise was seeing all code after `Task.Delay` being executed as part of the `Cancel` call that cancelled it. I think I understand why now, which is why I think I'll explore using `CancelAfter(0)` from the callback instead of `Cancel` (which I know by itself _fixes_ the issue in the sample) and probably looking into using `ThreadPool.RegisterWaitForSingleObject` to wait for Dispose(). – observer May 18 '13 at 22:15
(cont.) Of course I'd take into account your great comments about `ThreadPool.RegisterWaitForSingleObject` in [this thread](http://social.msdn.microsoft.com/Forums/en-US/async/thread/f3f062cb-ac66-4abd-86b0-8d82193ec404/). :) – observer May 18 '13 at 22:17
I would recommend `Task.Run` over `CancelAfter(0)` because you never know if a future framework update will "optimize" `CancelAfter(0)` into a plain `Cancel()`. Followup question: is it possible to restructure your logic so that the resource is only acquired when needed? .NET 4.5 adds `async`-compatible methods to the `SemaphoreSlim` type, so now `SemaphoreSlim` can be used to share resources between `async` and synchronous code. – Stephen Cleary May 18 '13 at 23:06
There are also a lot of other `async` techniques available, but right now they're spread out in different resources (I've got a lot of them on my blog, but not all of them). If you could collect semantic requirements (and post them in a different question), then we might find a better solution. – Stephen Cleary May 18 '13 at 23:10
`CancelAfter(0)` cannot be changed like that since plain `Cancel` is supposed to block while registered callbacks execute; for `CancelAfter` that's not the case. I'll try to rework this with those other techniques... It still feels weird that the `Cancel` as used in the sample leads to all code after the `Task.Delay` being executed in that call (because internally it's being treated as if it's part of a registered callback to Cancel). Makes async/await feel kind of tacked-on rather than a proper language feature if your code can get this behavior out of nowhere. – observer May 19 '13 at 00:21
I can't find where `CancelAfter` is guaranteed to not block. The `Cancel` behavior matches `TaskCompletionSource`, and it only causes a problem if you're mixing asynchronous code with synchronous blocking (which you're not really supposed to do). I do agree it is frustrating, though, and when I first ran across this I [reported it to Connect as a bug](https://connect.microsoft.com/VisualStudio/feedback/details/769322). I didn't think they'd fix it though b/c it would decrease `async` performance for everyone. – Stephen Cleary May 19 '13 at 00:32

Deadlock from combination of async/await, CancellationTokenSource, Threading.Timer

1 Answers1