
I understand both asynchronous and multithreaded programming; I have done both and can do them with ease. However, one thing still bugs me: why is the general consensus that async performs better than multithreading? (added: I'm talking about the case where either approach is viable and you get to make a choice)

At first glance the reason seems clear: fewer threads, less work for the OS scheduler, less memory wasted on stack space. But... I don't feel like these arguments hold water. Let's look at them individually:

  1. Less work for the OS scheduler. True, but does it mean less work in total? There are still N tasks running in parallel, and SOMEBODY has to switch between them. It seems to me we've simply taken the work away from the OS kernel and started doing it in our own userland code, but the total amount of work that needs to be done hasn't changed. Where then does the efficiency come from?
  2. Less memory wasted for stack space. Or is it?
    • First of all, I don't know about other OSes, but at least on Windows the stack space for a thread isn't committed all at once. The virtual memory addresses are reserved, but actual memory is committed only when it's needed.
    • And even if it were committed, it wouldn't matter much, because merely allocating memory doesn't slow your program down. Not unless you run out of it - and modern computers have enough memory for thousands of stacks, especially servers.
    • And even if the stacks DO get committed and DO end up causing a memory shortage, most stacks will only be used a bit at the start (if your program is flirting with a stack overflow, you've got bigger problems to worry about). Which means that it will be possible to page most of them out anyway.
    • The real problem with large memory usage is that the CPU cache gets thrashed. When you've got lots of data scattered all over the place that you need, and the CPU cache can't keep up and has to fetch things from main RAM again and again - that's when things get slow. But async programming doesn't help with this in any way. If anything, it actively uses more memory: instead of a lean stack frame we now have separate Task objects allocated on the heap, essentially one per stack frame, containing state, local variables, callback references and everything. Plus they're fragmented all over the address space, which gives the CPU cache even more headaches, because pre-fetching will be useless.

So... which elephant in the room have I missed?

Vilx-
    fashion. These fads for "oooh, shiny and new" come all the time, and then go just as quickly. One of my favourite quotes is "nothing sensible ever goes out of fashion" - meaning the right way of doing things will always be the right way, regardless of what everyone thinks is "better" this week. – gbjbaanb Mar 17 '17 at 12:24
  • @gbjbaanb - That could be it, but I'm not so sure. A lot of really smart guys are saying that this is better, so perhaps there's something more to it. I'd like to give them the benefit of doubt at least. – Vilx- Mar 17 '17 at 12:25
  • Async operation and multithreading are orthogonal concepts, even if the former can use the latter in many implementations. – Luca Mar 17 '17 at 12:53
  • @Luca - Yes, but in many cases you could solve your problems either way. For example, the typical case today is the parallel processing of incoming HTTP requests. And it seems to me that writing multithreaded code would actually be **easier** in this case. There aren't many shared resources there and synchronization is rarely necessary, so with the "1 thread per request" approach you can almost always write that code in the synchronous mindset, which is a lot easier than the asynchronous one. – Vilx- Mar 17 '17 at 13:02
  • define "async"? – David Haim Mar 17 '17 at 13:10
  • One reason could be that threads used to be much more expensive than they are today. While searching for an article I once read about this (thread per connection vs async), I came across [this interesting read](http://highscalability.com/blog/2013/5/13/the-secret-to-10-million-concurrent-connections-the-kernel-i.html). – Danny_ds Mar 17 '17 at 13:12
  • _Is_ the overall consensus that async is better performing? async/await (at least in C# land) is _easier to write_ than multithreading (so you get better "programmer" performance), because it takes away most of the pain points, but is it better performing? I don't know... – James Thorpe Mar 17 '17 at 13:14
  • @DavidHaim - I'm talking about the `async`/`await` concept that is becoming ever more popular in various programming languages these days. C#, ecmascript (javascript), typescript, probably others that I don't follow. – Vilx- Mar 17 '17 at 13:15
  • @JamesThorpe - Well, I don't have any specific links to give you, but as much as I've seen this talked about, this is the direction in which everyone has been leaning. As for writing code - perhaps, but I'd need to ponder about that a bit. The only pain point I see about multithreading is the need for synchronization, but async doesn't take that away as such - it only does that when you "emulate" multithreading with async on a single thread. Although that might be the typical scenario for desktop applications, so you might have a point there. – Vilx- Mar 17 '17 at 13:20
  • sorry, it's not clear what you are asking about. is it about "task per thread vs. thread pool" or "async/await vs. how we did it before async/await"? – Andriy Tylychko Mar 17 '17 at 14:32
  • Honestly, `async/await` was introduced because these *were* sync programming languages and some cases just need promises/await to execute correctly. One reason for multithreading / async code is HDD seeks or API calls, which sync code would otherwise block on. Async isn't preventing the code from running multithreaded, just that one function (at least in EcmaScript). – Randy Apr 10 '17 at 17:25
  • This is a good example of how fine the line is between 'opinion-based' or not. It has a [post on Meta](https://meta.stackoverflow.com/questions/347744/why-was-this-reopen-audit-question-not-considered-primarily-opinion-based) because it was used as an audit question on the reopen queue as an example of 'properly open', which is interesting considering it has now been closed. – Kelly S. French Apr 11 '17 at 13:52
  • @KellyS.French - I read the meta post, then read it again, and I still don't understand it. What _IS_ that "audit" thing? :D Anyways, I can see why it was marked as "too broad" and I'm not upset about it. Much valuable discussion was done anyway for which I'm satisfied. – Vilx- Apr 11 '17 at 14:55
  • This is a very valid, valuable and educated question, stating facts and looking for facts. It's a shame that the community has decided to close it. – Saeb Amini Nov 07 '18 at 23:03
  • The question might be focused on a particular language, as each lang implements async/threads differently, which determines performance. Here's some benchmark for Python: https://testdriven.io/blog/concurrency-parallelism-asyncio/ – Marcin Wojnarski Dec 18 '21 at 15:56

2 Answers


why is the general consensus that async is better performing than multithreading? (added: I'm talking about the case where either approach is viable and you get to make a choice)

On the server side, async lets you make maximum use of threads. Why have one thread handle a single connection when it can handle hundreds? On the server side, it's not an "async vs threads" scenario - it's an "async and threads" scenario.

On the client side - where either approach is truly viable - it doesn't matter as much. So what if you spin up an extra unnecessary thread? It's just not that big of a deal, even for mobile apps these days. While technically async can be more efficient, especially on a memory- and battery-constrained device, at this point in history it's not terribly important. However, even on the client side, async has a tremendous benefit in that it allows you to write serial code rather than mucking around with callbacks.
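To make that last point concrete, here is a minimal sketch (the `Download` helper is hypothetical, standing in for any real I/O call) of the same two-step operation written with explicit continuations and with `async`/`await`:

```csharp
using System;
using System.Threading.Tasks;

class SerialVsCallbacks
{
    // Continuation style: each step nests inside the previous one's callback.
    static void LoadWithCallbacks(Action<string> onDone)
    {
        Download("a").ContinueWith(t1 =>
            Download(t1.Result).ContinueWith(t2 =>
                onDone(t2.Result)));
    }

    // Async style: the same logic reads top-to-bottom like synchronous code.
    static async Task<string> LoadAsync()
    {
        string first = await Download("a");
        string second = await Download(first);
        return second;
    }

    // Stand-in for real asynchronous I/O.
    static Task<string> Download(string input) =>
        Task.FromResult(input + "!");

    static async Task Main()
    {
        Console.WriteLine(await LoadAsync()); // prints "a!!"
    }
}
```

Both versions do the same work; the async version is simply the serial-looking one.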

There are still N tasks running in parallel, SOMEBODY has to switch between them.

No. I/O tasks as used by async do not "run" anywhere, and do not need to be "switched" to. On Windows, I/O tasks use IOCPs underneath, and I/O tasks do not "run" - they only "complete", which happens as the result of a system interrupt. More info in my blog post "There Is No Thread".

Where then does the efficiency come from?

The word "efficiency" is tricky. For example, an asynchronous HTTP server handler will actually respond more slowly than a synchronous handler. There's overhead to setting up the whole async thing with callbacks, etc. However, that slowdown AFAICT is unmeasurably small, and asynchronous code allows that server to handle more simultaneous requests than a synchronous server ever could (in real-world tests, we're talking 10x as a conservative estimate). Furthermore, asynchronous code is not limited by the thread injection rate of the thread pool, so asynchronous server code responds faster to sudden changes in load, reducing the number of request timeouts as compared to a synchronous server in the same scenario. Again, this is due to "async and threads", not "async instead of threads".

A few years ago, Node.js was heralded as an incredibly efficient server - based on real-world measurements. At the time, most ASP.NET apps were synchronous (writing asynchronous apps was quite hard before async, and companies knew it was cheaper to just pay for more server hardware). Node.js, in fact, only has one server thread that ever runs your app. It was 100% asynchronous, and that's where it got its scalability benefits from. ASP.NET took note of this, and ASP.NET Core (among other changes) made its entire stack asynchronous.

Stephen Cleary
  • Although it's strictly true that for sudden changes in load spawning threads would be slower, for most real-world cases that delay would be negligible. Two decades ago spawning a thread used to be a much heavier operation than it is now. With common server hardware and not-an-ancient Linux kernel, you can easily spawn 5,000 threads in about 0.1 sec. What would usually matter a lot more is _what_ those threads are going to be doing while they're running. – at54321 Jan 15 '23 at 13:05
  • @at54321: The slowness in sudden scaling doesn't come from spawning threads; it comes from the (deliberately limited) thread pool injection rate. On .NET at least - not sure about other runtimes. – Stephen Cleary Jan 15 '23 at 13:16

(In this answer I will talk about .NET, as it was the first technology to come out with async/await.)

We use threads to parallelize CPU-bound tasks, and we use asynchronous IO to parallelize IO-bound tasks.

CPU-wise:
We all know that a thread per task is wrong. We don't want too many threads, because the context switches will freeze the entire system; we don't want too few of them, because we want the tasks to finish as soon as possible. Of course, we're looking at some sort of thread pool.
ThreadPool was the default way of scheduling asynchronous work in the pre-Task epoch, but the thread pool had one sore problem: it was really hard to know when the asynchronous work was finished, and what its result or exception was.
Then came the Task. Not only does a task schedule a delegate on the thread pool; when the task is done, you can get the result or exception and continue working from them with Task.ContinueWith.
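A minimal sketch of the difference: with the raw ThreadPool you have to hand-roll your own signalling to observe a result, while with Task the result (or exception) travels with the task itself:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ThreadPoolVsTask
{
    static void Main()
    {
        // Pre-Task era: fire-and-forget. Getting the result back requires
        // ad-hoc plumbing - here a shared field plus a ManualResetEvent.
        int poolResult = 0;
        var done = new ManualResetEvent(false);
        ThreadPool.QueueUserWorkItem(_ => { poolResult = 21 * 2; done.Set(); });
        done.WaitOne();
        Console.WriteLine(poolResult); // 42

        // Task era: the result is part of the Task, and ContinueWith
        // lets you keep working from it.
        Task<int> task = Task.Run(() => 21 * 2)
                             .ContinueWith(t => t.Result + 1);
        Console.WriteLine(task.Result); // 43
    }
}
```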

IO-wise:
We all know that thread-per-connection is a bad thing. If we want our optimized server to serve millions of requests per second, we can't just spawn a new thread for each new connection - our system will suffocate on context switching. So we use asynchronous IO. In the pre-Task epoch we used BeginRead/EndRead and BeginWrite/EndWrite, which were error-prone and just painful to work with - we had to use the terrible paradigm of event-driven programming.
Then came the Task. We can initiate an asynchronous IO action and receive the result or exception with Task.ContinueWith. It made asynchronous IO much easier to work with.
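The Task-based pattern can be sketched like this (a MemoryStream stands in for a real socket or file; on real I/O streams the pending read does not block a thread):

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

class AsyncIoSketch
{
    static async Task Main()
    {
        // Stand-in for a real connection; with sockets or files the
        // ReadAsync call completes via the OS, not a blocked thread.
        using var stream = new MemoryStream(Encoding.UTF8.GetBytes("hello"));
        var buffer = new byte[16];

        // Stream.ReadAsync returns a Task<int> that completes when the
        // read finishes - replacing the old BeginRead/EndRead pair.
        int bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length);

        Console.WriteLine(Encoding.UTF8.GetString(buffer, 0, bytesRead)); // hello
    }
}
```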

Task is the glue which bridges asynchronous CPU tasks with asynchronous IO tasks. With one interface, we can schedule an asynchronous function and get the result with Task.ContinueWith. No wonder programming with Tasks became so popular.

But Task.ContinueWith is highly unreadable and unwritable.
Basically, chaining a task to a task to a task... is a headache, as Node.js developers complain (even in JS, async/await will be standardized sometime in the future). async/await comes to the rescue here. Basically, the C# compiler does some neat voodoo behind the scenes: in a nutshell, it takes everything that comes after an await and packages it into a state machine which is invoked when everything before the await is done. The compiler takes synchronous-looking code (annotated with async/await) and does the ContinueWith for you.
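A rough sketch of what that rewrite amounts to (the real generated state machine is more involved, but this is the idea): the code after an await becomes a continuation that runs when the awaited task completes.

```csharp
using System;
using System.Threading.Tasks;

class AwaitDesugared
{
    static Task<int> GetNumberAsync() => Task.FromResult(20);

    // What you write:
    static async Task<int> WithAwait()
    {
        int n = await GetNumberAsync();
        return n + 1; // everything after the await...
    }

    // ...is morally equivalent to this hand-written continuation:
    static Task<int> WithContinueWith()
    {
        return GetNumberAsync().ContinueWith(t => t.Result + 1);
    }

    static async Task Main()
    {
        Console.WriteLine(await WithAwait());        // 21
        Console.WriteLine(await WithContinueWith()); // 21
    }
}
```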

So, why use async/await + Task instead of a multi-threaded code?

  1. async/await is the easiest way of getting the asynchronous result or exception. (And believe me, I've written asynchronous code in C++, C#, Java and Javascript; async/await is a paradise in that field.)
  2. async/await works both with CPU-bound tasks and IO-bound tasks - the same interface for two different but similar fields.
  3. If you want asynchronous IO, threads will not help you anyway.
  4. A Task is an IThreadPoolItem anyway and is scheduled on the .NET thread pool; async/await just takes the chaining hell away. Back to point one -> multi-threaded code.
  5. Tasks + async/await synchronize the code flow for you. Most developers are not systems developers; they do not know the hidden costs of synchronization objects and techniques. In most cases, the implementation provided by the framework is faster than what the average developer can come up with. Of course, if you really try, you can write something extremely customized for your needs and hence more performant, but that doesn't apply to most developers.
  6. Depending on your programming language, await can be faster than a callback. Gor Nishanov is the original (Microsoft) developer who proposed standardizing await in C++. In his 2015 lecture, he shows that the C++ version of await is actually more performant than callback-style asynchronous networking IO. (switch to 39:30)

Regarding your specific questions:

Less work for the OS scheduler. True

False. async/await compiles to a state machine; a task continuation invokes that state machine when it's done, and a task runs on the thread pool anyway. async/await yields the same amount of scheduling as multi-threaded code / queuing thread pool work. It's the simplicity you get which matters.

Less memory wasted for stack space. Or is it?

False. Again, async/await compiles to a state machine. When invoked on task completion, it will use the same amount of stack memory for local variables. The continuation will run on a thread anyway (usually a thread-pool thread), so that argument is invalid.

Why is async considered better performing than multithreading?

When your code is CPU-bound, there will be little difference between Tasks + async/await and pure multi-threaded code. In IO-bound code, multithreading gives the worst throughput you can have - Tasks + async/await will blow away any IO-bound thread pool you can write on your own. Threads don't scale. Usually (especially on the server side) you have both: you read some data from a connection (IO), then continue processing it on the CPU (JSON parsing, calculations etc.) and write the result back to the connection (IO again). Tasks + async/await are faster in this case than pure multi-threaded code.
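The read-process-write pattern described above can be sketched like this (a MemoryStream stands in for the connection; `HandleAsync` and the upper-casing "processing" step are illustrative, not from any real framework):

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

class HandlerSketch
{
    static async Task HandleAsync(Stream connection)
    {
        var buffer = new byte[1024];
        // IO: on a real socket, no thread is blocked while this is pending.
        int read = await connection.ReadAsync(buffer, 0, buffer.Length);

        // CPU-bound step (stands in for parsing, calculations, etc.) -
        // this is the only part that actually occupies a thread.
        string request = Encoding.UTF8.GetString(buffer, 0, read);
        string response = request.ToUpperInvariant();

        // IO again: write the result back to the connection.
        byte[] outBytes = Encoding.UTF8.GetBytes(response);
        await connection.WriteAsync(outBytes, 0, outBytes.Length);
    }

    static async Task Main()
    {
        var stream = new MemoryStream();
        stream.Write(Encoding.UTF8.GetBytes("ping"), 0, 4);
        stream.Position = 0;

        await HandleAsync(stream);
        Console.WriteLine(Encoding.UTF8.GetString(stream.ToArray())); // pingPING
    }
}
```

One thread can interleave many such handlers, parking at each await instead of blocking.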

It's the simplicity which makes async/await so appealing: writing synchronous-looking code which is actually asynchronous. If this is not "high-level programming", what is?

David Haim
  • Comments are not for extended discussion; this conversation has been [moved to chat](http://chat.stackoverflow.com/rooms/138536/discussion-on-answer-by-david-haim-why-is-async-considered-better-performing-tha). – Bhargav Rao Mar 20 '17 at 13:05