TL;DR: It's not quite like that. The `async` itself doesn't "send" anything to the thread pool. All it does is run continuations until they stop. And if one of those continuations decides to continue on a new thread - well, that's when thread switching happens.
Let's set up a small example to illustrate what happens:
```fsharp
open System
open System.Threading

let log str = printfn $"{str}: thread = {Thread.CurrentThread.ManagedThreadId}"

let f = async {
    log "1"
    let! x = async { log "2"; return 42 }
    log "3"
    do! Async.Sleep(TimeSpan.FromSeconds(3.0))
    log "4"
}

log "starting"
f |> Async.StartImmediate
log "started"
Console.ReadLine() |> ignore
```
If you run this script, it will print `starting`, then `1`, `2`, `3`, then `started`, then wait 3 seconds, and then print `4`, and all of them except `4` will have the same thread ID. You can see that everything up to `Async.Sleep` is executed synchronously on the same thread, but at that point the async execution pauses and the main program continues, printing `started` and then blocking on `ReadLine`. By the time `Async.Sleep` wakes up and wants to continue execution, the original thread is already blocked on `ReadLine`, so the async computation gets to continue running on a new one.
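The exact thread IDs will differ from run to run, but the shape of the output is something like this (IDs here are purely illustrative):

```
starting: thread = 1
1: thread = 1
2: thread = 1
3: thread = 1
started: thread = 1
4: thread = 6      <- printed ~3 seconds later, on a thread-pool thread
```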
What's going on here? How does all this work?
First, the async computation is structured in "continuation-passing style". It's a technique where a function doesn't return its result to the caller, but instead calls another function, passing the result as a parameter.
Let me illustrate with an example:
// "Normal" style:
let f x = x + 5
let g x = x * 2
printfn "%d" (f (g 3)) // prints 11
// Continuation-passing style:
let f x next = next (x + 5)
let g x next = next (x * 2)
g 3 (fun res1 -> f res1 (fun res2 -> printfn "%d" res2))
This is called "continuation-passing" because the `next` parameters are called "continuations" - i.e. they're functions that express how the program continues after calling `f` or `g`. And yes, this is exactly what `Async.FromContinuations` means.
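In fact, a CPS-style function like the `f` just above can be wrapped into a real `Async` value with `Async.FromContinuations`. A tiny sketch (the `fAsync` name is just for illustration):

```fsharp
// a sketch: turning the CPS-style `f` from above into an Async<int>
let fAsync x =
    Async.FromContinuations(fun (onSuccess, _onError, _onCancel) ->
        // just forward the result to the success continuation;
        // the error and cancellation continuations go unused in this toy example
        f x onSuccess)

// usage: `fAsync 6 |> Async.RunSynchronously` evaluates to 11
```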
This may seem very silly and roundabout on the surface, but what it allows is for each function to decide when, how, or even whether its continuation happens. For example, our `f` function from above could be doing something asynchronous instead of just plainly returning the result:

```fsharp
let f x next = httpPost "http://calculator.com/add5" x next
```

Coding it in continuation-passing style allows such a function to avoid blocking the current thread while the request to `calculator.com` is in flight. What's wrong with blocking the thread, you ask? I'll refer you to the original answer that prompted your question in the first place.
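That `httpPost` is hypothetical, of course. If you want a crude stand-in you can actually run, something like the following would do - a background thread plays the part of the in-flight request, so the caller's thread returns immediately and the continuation fires later, from that other thread:

```fsharp
open System.Threading

// hypothetical stand-in for httpPost: "sends" x off, and later passes the
// "response" to the continuation from a background thread
let httpPost (url: string) (x: int) (next: int -> unit) =
    let worker =
        Thread(fun () ->
            Thread.Sleep 200       // simulated network latency
            next (x + 5))          // the "response" arrives: invoke the continuation
    worker.IsBackground <- true
    worker.Start()
```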
Second, when you write those `async { ... }` blocks, the compiler gives you a little help. It takes what looks like a step-by-step imperative program and "unrolls" it into a series of continuation-passing calls. The "breaking" points for this unrolling are all the constructs that end with a bang - `let!`, `do!`, `return!`.
The above `async` block, for example, would look something like this (F#-ish pseudocode):
```fsharp
let return42 onDone =
    log "2"
    onDone 42

let f onDone =
    log "1"
    return42 (fun x ->
        log "3"
        Async.Sleep (3 seconds) (fun () ->
            log "4"
            onDone ()
        )
    )
```
Here, you can plainly see that the `return42` function simply calls its continuation right away, thus making the whole thing from `log "1"` to `log "3"` completely synchronous, whereas the `Async.Sleep` function doesn't call its continuation right away, instead scheduling it to be run later (in 3 seconds) on the thread pool. That's where the thread switching happens.
And here, finally, lies the answer to your question: in order to have the `async` computation jump threads, your callback passed to `Async.FromContinuations` should do anything but call the success continuation immediately.
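For example, here's a minimal sketch (it reuses the `log` helper from the first snippet; `delayedFortyTwo` is just an illustrative name) where the callback hands the success continuation to the thread pool instead of calling it inline - everything after the `let!` then resumes on whichever thread-pool thread picks up that work item:

```fsharp
open System.Threading

// the callback does NOT call the success continuation right away;
// it schedules it on the thread pool instead
let delayedFortyTwo =
    Async.FromContinuations(fun (onSuccess, _onError, _onCancel) ->
        ThreadPool.QueueUserWorkItem(fun _ -> onSuccess 42) |> ignore)

async {
    log "before"            // runs on the starting thread
    let! x = delayedFortyTwo
    log $"after, x = {x}"   // typically reports a different thread ID
}
|> Async.StartImmediate
```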
A few notes for further investigation
- The `onDone` technique in the above example is technically called "monadic bind", and indeed in real F# programs it's represented by the `async.Bind` method. This answer might also be of help in understanding the concept.
- The above is a bit of an oversimplification. In reality the `async` execution is more complicated than that. Internally it uses a technique called a "trampoline", which in plain terms is just a loop that runs a single thunk on every turn; crucially, the running thunk can also "ask" the loop to run another thunk, and if it does, the loop will do so, and so on, until some thunk finally doesn't ask for another one, at which point the whole thing stops (there's a tiny sketch of this idea after this list).
- I specifically used `Async.StartImmediate` to start the computation in my example, because `Async.StartImmediate` will do just what it says on the tin: it will start running the computation immediately, right there. That's why everything ran on the same thread as the main program. There are many alternative starting functions in the `Async` module. For example, `Async.Start` will start the computation on the thread pool. The lines from `log "1"` to `log "3"` will still all happen synchronously, without thread switching between them, but they will happen on a different thread from `log "starting"` and `log "started"`. In this case the thread switch happens before the `async` computation even starts, so it doesn't count.
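To illustrate the trampoline idea from the notes above - this is not F#'s actual implementation, just a toy sketch of the general shape - each thunk either finishes or hands back the next thunk to run, and a plain loop keeps things going without growing the call stack:

```fsharp
// a toy trampoline: a "step" either completes or hands back the next step to run
type Step =
    | Done
    | Next of (unit -> Step)

let runTrampoline (start: unit -> Step) =
    let mutable current = start
    let mutable finished = false
    while not finished do
        match current () with
        | Done -> finished <- true
        | Next next -> current <- next

// usage: counts down without deepening the stack on each "recursive" step
let rec countdown n () =
    if n = 0 then Done
    else
        printfn "tick %d" n
        Next (countdown (n - 1))

runTrampoline (countdown 3)
```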