
I'm trying to understand when to use async/await and when it can improve performance.

I know it's most powerful in replacing otherwise-blocking I/O operations like file I/O, HTTP calls, database operations, etc., but I was wondering whether there's any point beyond that (at least as far as performance is concerned).

A specific example might be large memory copies (shifting a large vector, or otherwise copying lots of data). This is a synchronous operation that might take a while as far as the rest of the program is concerned. Can this example take advantage of async/await in order to do other things while we wait for the copy, or must the thread wait for the memory copy to finish before it continues to execute?

If not this exact example, is there maybe another synchronous operation that can be improved via async/await?

Jam

2 Answers


Async/await does not give you any parallelism, only concurrency. That means any synchronous operation that uses async/await to divide its work will just run one part after the other: no faster than without async/await, and in fact slower, because async/await adds overhead.
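
To make that concrete, here's a minimal sketch. It assumes the futures crate purely for illustration (the answer doesn't name any particular executor): two CPU-bound halves combined with join! still run one after the other, because neither half ever returns Pending.

```rust
use std::time::Instant;

// Pure CPU work with no await points, so the executor can never
// switch away from it mid-computation.
fn busy_work(n: u64) -> u64 {
    (0..n).fold(0, |acc, x| acc ^ x.wrapping_mul(31))
}

fn main() {
    let start = Instant::now();
    // `join!` polls both futures on a single thread; each half runs
    // to completion in turn, so total time ≈ the sum of both halves.
    let (a, b) = futures::executor::block_on(async {
        futures::join!(
            async { busy_work(50_000_000) },
            async { busy_work(50_000_000) },
        )
    });
    println!("{a} {b} in {:?}", start.elapsed());
}
```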

You can only get parallelism via threads, and the achievable speedup is limited by the number of cores in your system.

Some async runtimes do run code in parallel, using multiple threads, so you will get some parallelism (and therefore speed) if you use them correctly; but this just adds overhead compared to using threads directly, or to using a library such as rayon that avoids async/await and parallelizes the code with threads alone.
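
For comparison, here's a minimal sketch of the threads-only approach, using rayon's parallel slice chunks to spread a large copy across cores (the sizes are arbitrary, picked just for the example):

```rust
use rayon::prelude::*;

fn main() {
    let src: Vec<u64> = (0..10_000_000).collect();
    let mut dst = vec![0u64; src.len()];

    // Pair up matching chunks of source and destination and copy each
    // pair on a rayon worker thread: real parallelism, bounded by the
    // number of cores, with no async machinery at all.
    dst.par_chunks_mut(1_000_000)
        .zip(src.par_chunks(1_000_000))
        .for_each(|(d, s)| d.copy_from_slice(s));

    assert_eq!(src, dst);
}
```

(In fairness, a plain copy is usually limited by memory bandwidth rather than core count, so the gain for this particular workload may be modest; the structure is the same for any CPU-bound loop, though.)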

If you're not I/O bound, don't use async/await. You have nothing to gain.

Chayim Friedman

In theory, yes.

At CppCon 2015, Gor Nishanov presented his work on an experimental coroutine implementation for C++: C++ coroutines - a negative overhead abstraction. C++ coroutines are somewhat different (even implementation-wise) from Rust's async/await, but the difference doesn't matter here.

The key highlight of the talk was that certain algorithms could get a speedup from using coroutines by exploiting memory pre-fetching. While from the point of view of the code a memory access is typically synchronous, it is actually asynchronous at the CPU level. The trick, then, is to:

  • Trigger the pre-fetch.
  • Suspend execution -- and do something else.
  • Resume execution, at which point the pre-fetched memory should be in L1, immediately available (a rough Rust sketch follows).
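
Here's a rough Rust sketch of that pattern, using a binary-search probe as an example of a memory-latency-bound algorithm (the algorithm choice is mine, just for illustration). YieldNow is a hand-rolled suspend-once future, and the prefetch intrinsic is x86-64 specific; the code compiles as an ordinary async fn, but actually realizing the speedup requires an executor that interleaves many such lookups at once (e.g. a FuturesUnordered full of them).

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

/// A future that suspends exactly once, giving the executor a chance
/// to poll other lookups while our pre-fetched line travels to cache.
struct YieldNow(bool);

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Binary search that pre-fetches each probe, then suspends so other
/// work can run while the cache line is being fetched.
async fn lookup(haystack: &[u64], needle: u64) -> bool {
    let (mut lo, mut hi) = (0, haystack.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        // Step 1: trigger the pre-fetch.
        #[cfg(target_arch = "x86_64")]
        unsafe {
            use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};
            _mm_prefetch::<_MM_HINT_T0>(haystack.as_ptr().add(mid).cast());
        }
        // Step 2: suspend; the executor runs another lookup meanwhile.
        YieldNow(false).await;
        // Step 3: resume; the probe should now be in L1.
        match haystack[mid].cmp(&needle) {
            std::cmp::Ordering::Equal => return true,
            std::cmp::Ordering::Less => lo = mid + 1,
            std::cmp::Ordering::Greater => hi = mid,
        }
    }
    false
}
```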

Now, it is notable that a coroutine (or async/await) isn't strictly needed to achieve this effect; it is just an abstraction, after all. Still, the abstraction allows expressing the pattern in a much more succinct and expressive way.

So, in practice, a number of so-called "synchronous" algorithms could potentially be optimized with async/await by exploiting the asynchronous operations the CPU itself performs.

Matthieu M.
  • This is kind of what I had in my head that made me wonder. It feels like we wouldn’t necessarily need to wait for a slow memory allocation/fetch in order to execute some code that isn’t dependent on the information. Would be cool to see a practical implementation of it – Jam Jun 15 '23 at 11:55
  • @Jam: Well, I invite you to look at the talk :) (We can argue whether it's practical or not.) I do note there are _other_ asynchronous implementations, such as non-temporal stores, but there's no "completion indication" for those, so I am not sure it would fit async/await too well. – Matthieu M. Jun 15 '23 at 12:25