Using Task.WhenAll for multiple async and fake async methods

Question

A colleague of mine has refactored our controller methods so that all of our IO operations, including the synchronous ones, are encapsulated in separate tasks and then all those tasks are executed in parallel via Task.WhenAll. I can definitely understand the idea: we use more threads but all of our IO operations (and we can have quite a few) are executed with the speed of the slowest one, but I'm still unsure whether it is a correct path to go. Is it a valid approach or I'm missing something? Is the cost of using more threads going to be noticeable in a typical ASP.Net website application? Here is some sample code

public async Task<ActionResult> Foo() {
    var dataATask = _dataARepository.GetDataAsync();
    var dataBTask = Task.Run(_dataBRepository.GetData());
    await Task.WhenAll(dataATask, dataBTask);
    var viewModel = new ViewModel(dataATask.Result, dataBTask.Result);
    return View(viewModel);
}

Because it happens to be this way and refactoring it is way outside the scope of the task. — AndreySarafanov, Dec 18 '18 at 00:40
this is where your problem really is. `async` is only effective when chained all the way down to the IO operation. Wrapping a sync method in a `Task.Run` will be even worse that just calling it inline because you are still blocking a thread, but now you also add the overhead of the state machine. — zaitsman, Dec 18 '18 at 00:41
My understanding might be lacking. I realize that I won't be able to reuse the thread to serve another request while a `Task.Run` operation is running but will launching multiple syncronous operations with `await Task.WhenAll` increase the speed with which a single request is returned? — AndreySarafanov, Dec 18 '18 at 00:46
@Jacob Task.Run always schedules work on threadpool. [MSDN article](https://learn.microsoft.com/en-us/dotnet/standard/asynchronous-programming-patterns/task-based-asynchronous-pattern-tap) you are talking about is not about Task.Run ... — Alexei Levenkov, Dec 18 '18 at 00:47
The thing is, Task.Run will block _another_ thread, I do realize it. The thing is to serve the request faster. I will sacrifice other threads by blocking them. — AndreySarafanov, Dec 18 '18 at 00:50

score 4 · Accepted Answer · answered Dec 18 '18 at 01:02

In general you code is ok - it will consume more threads and a bit more CPU than original, but unless your site takes heavy load it would unlikely significantly impact overall performance. Obviously you need to measure it yourself for your particular load (including some sort of stress level load of 5-10x regular traffic).

Wrapping synchronous method in Task.Run is not best practice (see Should I expose asynchronous wrappers for synchronous methods?). It may work for you as long as trading extra threads for such behavior is acceptable for your case.

If you only have one synchronous operation left you can instead keep it synchronous and await the rest at the end of synchronous step saving that extra thread:

var dataATask = _dataARepository.GetDataAsync();
var dataBTaskResult = _dataBRepository.GetData();
await Task.WhenAll(dataATask); // or just await dataATask if you have only one.
var viewModel = new ViewModel(dataATask.Result, dataBTaskResult);

I understand it the same way, but I'd love to have some understanding of the amount of threads in a threadpool and some theoretical idea of the load when this no longer speeds up the site. I realize it should be pretty low or it wouldn't be considered such a bad practice. I _can_ do the testing but it might just be time lost for checking a nonsensical idea (which it most likely is). — AndreySarafanov, Dec 18 '18 at 01:14
@AndreySarafanov number of thread in the threadpool - a lot (https://stackoverflow.com/a/6000891/477420), but if you use the most of them (i.e. you consume 10 threads per request for your async-sync wrappers and there is enough load) there will be not enough threads to *continue* execution of requests. You should be able to do back of envelope calculation on how many threads your site is going to use at normal/peak load (don't forget that each operation takes time) and decide if you have enough threads. — Alexei Levenkov, Dec 18 '18 at 01:32

score 1 · Answer 2 · answered Dec 18 '18 at 07:45

Consider these two approaches.

Original:

public async Task<ActionResult> Foo() 
{
    var dataATask = _dataARepository.GetDataAsync();
    var dataBTask = Task.Run(_dataBRepository.GetData());
    await Task.WhenAll(dataATask, dataBTask);
    var viewModel = new ViewModel(dataATask.Result, dataBTask.Result);
    return View(viewModel);
}

^This version will create a new thread to call _dataBRepository.GetData(). The additional thread will block until the call is complete. While waiting for the additional thread to complete, the main thread will yield control back to the ASP.NET pipeline, where it may handle some other user's request.

Different:

public async Task<ActionResult> Foo() 
{
    var dataATask = _dataARepository.GetDataAsync();
    var dataBResult = _dataBRepository.GetData();
    await dataATask;
    var viewModel = new ViewModel(dataATask.Result, dataBResult);
    return View(viewModel);
}

^This version does not spin up a separate thread for dataBRepository.GetData(). But it does block the main thread.

So your choice is:

Spin up another thread, just so you can yield the main thread to some other task.
Hang on to the main thread. If some other task needs a thread, it'll have to spin up its own.

In both cases you are using one thread at a time (for the most part). In both cases the transaction will complete in the time required by the slower of the two back-end transactions. But the original option spins up a new thread and yields the current one. This seems like extra work that is not needed.

On the other hand, if the action requires two or more synchronous back-end transactions, the original option would complete faster, because they could run in parallel. Assuming that they support it. If the back-end transactions are database transactions and use the same database, they are likely to block each other, in which case you still won't see any benefit.

I intentionally simplified the example, I'm talking about multiple sync tasks, should've stated that clearer. They are often db transactions but from different enough ares not to block each other. And I do understand what's happening behind the curtains, my question was mostly about practical usefulness of this approach. the thread overhead might simply outshadow the speed boost. — AndreySarafanov, Dec 18 '18 at 09:54
And obviously we should make more of our IO async but that's quite a lot of work for future times. — AndreySarafanov, Dec 18 '18 at 09:55
Well you can have as many threads as you want... by adding servers. I take it this is load balanced? — John Wu, Dec 18 '18 at 10:08
I don't think I deserve the sarcasm. But your answer is that it's unreasonable in most real environments, I presume. Thx for it. — AndreySarafanov, Dec 18 '18 at 10:14
No you don't, and I wasn't being sarcastic. The point is that there are competing concerns. You will get maximum number of transactions/users without spinning up multiple threads, so if you have sparse hardware, that could be the way to go. But if you have a beefy data center and you just want transactions to finish more quickly, maybe spinning up new threads for parallel work is just what you need. — John Wu, Dec 19 '18 at 01:53

score 1 · Answer 3 · answered Dec 18 '18 at 10:03

1

Besides what Alexei Levenkov mentioned about exposing synchronous wrappers for asynchronous methods, using Task.Run on an ASP.NET application will do more harm than good. Each Task.Run will cause 2 thread pool scheduling and context switches with no benefit whatsoever.

answered Dec 18 '18 at 10:03

Paulo Morgado

14,111
3
31
59

I believe there's some benefit and I've described it clear enough. The harm might be much bigger though, yes. – AndreySarafanov Dec 18 '18 at 10:05
1

@AndreySarafanov, there's no benefit. There are always better ways of doing it, depending on the use case. – Paulo Morgado Dec 18 '18 at 13:59

score 0 · Answer 4 · answered Dec 18 '18 at 11:26

Normally it is not useful to start a different thread to do some synchronous stuff, if all you do is wait until this thread is finished.

Compare:

async Task<int> ProcessAsync()
{
    var task = Task.Run(() => DoSomeHeavyCalculations());
    int calculationResult = await task;
    return calculationResult;
}

with

int ProcessAsync()
{
    return DoSomeHeavyCalculations();
}

Apart from that the async function uses more brainpower, it would limit re-usage of the function: only async callers can use it. Let your caller decide whether he wants to call you asynchronously or not. If he also has nothing to do but wait, he can let his caller decide, etc.

By the way, this is exactly what GetData does: it doesn't force callers like you to be asynchronous, it gives you the freedom to call it synchronous or use Task.Run to call it asynchronous.

However, if you function has something else to do during the heavy calculations, then it might be useful:

async Task<int> ProcessAsync()
{
    var task = Task.Run(() => DoSomeHeavyCalculations());

    // while the calculations are being performed, do something else:
    DoSomethingElse();

    return await task;
}

In your example: if there is only one "heavy calculation", do this yourself. If there are several heavy calculations that you can consider using Task.Run. But don't just order other threads to do something without doing anything yourself.

Using Task.WhenAll for multiple async and fake async methods

4 Answers4