A stateful lambda or a custom functor is almost always the better choice imho. In fact you can get more efficient coroutines by just using lambdas. Compare this:
    #include <cstdio>
    #include <cstdint>

    int main() {
        enum class cont_point : uint8_t {
            init,
            first,
            second,
            third,
            end,
        };
        auto lambda = [cp = cont_point::init]() mutable -> void {
            switch (cp) {
            case cont_point::init:
                printf("init\n");
                cp = cont_point::first;
                break;
            case cont_point::first:
                printf("first\n");
                cp = cont_point::second;
                break;
            case cont_point::second:
                printf("second\n");
                cp = cont_point::third;
                break;
            case cont_point::third:
                printf("third\n");
                cp = cont_point::end;
                break;
            default:
                return;
            }
        };
        lambda();
        lambda();
        lambda();
        lambda();
    }
Yields:
init
first
second
third
If you check the assembly, you will see that the code is optimized to perfection, which gives you a hint of how good compilers are at optimizing lambdas. The same is not true for coroutines (not yet, at least).
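For the "custom functor" variant mentioned at the top, here is a minimal sketch of roughly what the lambda amounts to when written by hand. The names `step_printer` and `state()` are made up for illustration, and returning `bool` instead of `void` is my own addition so the caller can tell when the sequence has finished.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical hand-written equivalent of the mutable lambda above.
class step_printer {
public:
    enum class cont_point : std::uint8_t { init, first, second, third, end };

    // Runs one step and advances the stored continuation point, just like
    // the lambda's captured `cp`. Returns false once the end is reached.
    bool operator()() {
        switch (cp_) {
        case cont_point::init:
            std::puts("init");
            cp_ = cont_point::first;
            return true;
        case cont_point::first:
            std::puts("first");
            cp_ = cont_point::second;
            return true;
        case cont_point::second:
            std::puts("second");
            cp_ = cont_point::third;
            return true;
        case cont_point::third:
            std::puts("third");
            cp_ = cont_point::end;
            return true;
        default:
            // Only the end state should ever land here.
            assert(cp_ == cont_point::end);
            return false;
        }
    }

    cont_point state() const { return cp_; }

private:
    cont_point cp_ = cont_point::init;  // the "captured" state
};
```

Calling a `step_printer` object four times prints the same four lines as the lambda; a fifth call prints nothing and returns `false`.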
But
Coroutines fill one very interesting niche that no other language construct can: they solve the cactus stack problem. The cactus stack problem is that when code forks, the branches cannot all run on the same stack, so a separate stack must be created for each fork. If the thread executing on that new stack forks again, yet another stack is needed, and so on. What's even worse is that nobody knows in advance how big these stacks are going to be.
C++20 coroutines are stackless. That does not mean they use no stack at all; it means the stack does not hold their persistent state. Only data that does not live across a suspension point (a co_await or co_yield) is put on the executing task's stack, so it can safely be discarded during stack unwinding. All state that has to survive a suspension stays in something called a coroutine frame, which typically (and unfortunately even in simple-to-optimise cases) rests on the heap (allocated via operator new). The decision of what to put inside the coroutine frame and what to put on the call stack as execution goes on is made by the compiler in a process called coroutine transformation. It is this process that makes coroutines uniquely able to solve the cactus stack problem, as follows:
Every newly allocated coroutine instance reserves a fixed amount of space on the heap, comparable to an object with its data fields. When the coroutine is executed, additional data is put on the stack of whatever task is executing the continuation of the coroutine. This way, the stack can grow dynamically and we avoid the risk of overflowing many separately allocated stacks (as is the case for stackful coroutines); we only have to make sure that all threads have sufficient stack space available to them, as we usually do.