Is there any guidance on how to choose what each component of a coroutine should do?

Question

(... because there's so many degrees of freedom that I feel disoriented!)

For the purpose of understanding coroutines, I've implemented a generator which, given a unary function f of type T(T) and initial value x of type T, yields the infinite sequence of applications of f to x, i.e. the range [x, f(x), f(f(x)), f(f(f(x))), ...], via a range interface. (In fact, I was inspired by Haskell's iterate, hence the name I chose and the functional-programming tag.)

It wasn't that hard, as there's examples of generators everywhere, from cppreference to Josuttis' book.

However, then I started to play with it by moving things around, and I've realized that the degrees of freedom in implementing a coroutine are far more than an example like the one I mentioned needs. And since the functionality that the coroutine in my example provides is actually not a toy¹, I feel like even for production code it is a bit hard to decide how to "distribute work" across the various participants, namely the promise, the interface, the awaiters, and the body of the coroutine. Therefore I wanted to know if there are some guidelines, or if C++23 will bring in a bit of clarity to this topic.

To give another hint as to why the whole matter confuses me, it's not clear to me how the "persons" that would write each of those participants would relate to each other. Would they all be the same person?² For instance, I don't see how the coroutine's interface's writer can be other than the same writer of the coroutine's promise.

Going to the concrete example, here is the version I originally came up with, live on Compiler Explorer. The main bits are these:

The main shows how the coroutine is used,

```
int main() {

    constexpr auto f = [](int i){ return i*2; };
    constexpr auto x0 = 1;

    for (auto const& x : iterate(f, x0) | take(10)) {
        std::cout << x << std::endl;
    } // prints the first 10 powers of 2: [1,2,4,8,16,32,64,128,256,512]
}
```

the coroutine itself is fairly intuitive,
```
IterCoro iterate(auto f, auto x) {
    while (true) {
        co_yield x;
        x = f(x);
    }
}
```
it accepts the function f and the initial value x, and loops infinitely, co_yielding the value to the caller and updating it afterwards;

for making the updated value available to the caller, yield_value stores it in its promise and accepts the suspension:

```
struct promise_type {
    int val;
    constexpr auto yield_value(int i) noexcept {
        val = i;
        return std::suspend_always{};
    }
    // "ordinary" implementation for all the rest
};
```

as regards the range interface, I think it's relatively ordinary, but one point worth noticing is that begin resumes the coroutine via the next resumption function:

```
struct IterCoro {
    // ...
    struct iterator {
        //...
        constexpr void next() {
            if (hdl) {
                hdl.resume();
                if (hdl.done()) {
                    hdl = nullptr;
                }
            }
        }
        //...
    };
    // ...
    auto begin() const {
        if (!hdl || hdl.done()) {
            return iterator{nullptr};
        }
        iterator it{hdl};
        it.next(); // resume to make the first value available
        return it;
    }
    // ...
};
```

But then I thought: yield_value is a non-const member function of promise_type, and it stores stuff in it, but anything else having access to the handle can retrieve the promise and do the same; and one thing that has access to the handle is the await_suspend method of the Awaiter returned by yield_value, so an alternative approach is to construct an Awaiter with the value to be set, and let its await_suspend do the job. That's how I came up with the second solution, where

the coroutine is identical to before,
yield_value can be made const and simply forward the value to be stored in the promise to the Awaiter,
```
constexpr auto yield_value(int i) const noexcept {
    return Awaiter{i};
}
```

the Awaiter stores the value, accepts the suspension, and stores the value in the promise upon suspension,

```
struct Awaiter {
    constexpr auto await_ready() const noexcept { return false; }
    constexpr void await_suspend(auto h) const noexcept {
        h.promise().val = i;
    }
    constexpr auto await_resume() const noexcept {}
    constexpr Awaiter(int i)
        : i{i}
    {}
    int i;
};
```

the range interface is identical to before.

I see that the two solutions do look a lot alike, but I wouldn't be sure that they are indeed the same thing (the generated code is not identical down to the bit, after all).

But then I tried one more change, in line with the following reasoning. The body of the coroutine is just doing some work after every suspension; but what runs right after a suspension? await_suspend of the Awaiter! So why not move the work there? The Awaiter doesn't have access to the parameters of the coroutine, so how can it do the work of the coroutine without f and x? The solution is suggested by the standard: we can make the promise_type accept the same arguments as the coroutine and store them locally, so that the Awaiter can retrieve them. That's how I came up with the third solution, where

The interface of the coroutine is templated, in order to be able to make the types of the function and its arguments available to the nested promise_type, for it to store them, for the Awaiter to use them,

```
template<typename Fun, typename Arg>
struct IterCoro {
    // ...
    struct promise_type {
        Fun fun;
        Arg val;
        promise_type(Fun f, Arg a)
            : fun{f}
            , val{a}
        {}
    // yield_value is not even necessary (see below)
```

the Awaiter, which can now be default-constructed as it is stateless, can then retrieve what it needs from the promise_type and update the value,

```
struct Awaiter {
    constexpr auto await_ready() const noexcept { return false; }
    constexpr void await_suspend(auto h) const noexcept {
        h.promise().val = h.promise().fun(h.promise().val);
    }
    constexpr auto await_resume() const noexcept {}
};
```

the coroutine is templated via explicit template arguments rather than the auto placeholder, in order to pass those template arguments to the interface, but this time the body of the while loop just needs to suspend, because the update of the value is done right after the suspension by the Awaiter,
```
template<typename Fun, typename Arg>
IterCoro<Fun, Arg> iterate(Fun, Arg) {
    while (true) {
        co_await Awaiter{};
    }
}
```
the range interface is almost identical to before, but this time begin must not resume the coroutine, because the update of the value is done right after suspension:
```
auto begin() const {
    if (!hdl || hdl.done()) {
        return iterator{nullptr};
    }
    return iterator{hdl};
}
```

To my inexperienced eyes, this does look fairly different from where I started and even from where I got with the first re-elaboration (and the assembly is visibly, even if not substantially, shorter).

(¹) It might look like a toy example at first (and maybe it needs to be made more generic and robust, and maybe I should see what happens with a move-only argument; whatever), but it isn't. Think of how easy it makes it to traverse a tree while looking for something:

```
auto nameOfOldestForeFather = *(iterate(getFather, me) | take_while(alive) | transform(name)).end();
```

or more generally

```
for_each(iterate(getParent, leafNode) | take_while(someCond), someWork)
```

(²) I say "person", but I'm referring to any group of people which work together on the same thing, if needed. As in, if I write a function/class/whatever for doing xyz and somebody helps me to any extent, we are still the same "collective brain" that works on that thing.

This all sounds very much like "I want to make everything as complicated and unintuitive as possible." You are making a generator. `promise::yield_value` is the function that gets called when you `co_yield` a value. What more needs to be thought of beyond that? Why would you even consider having `yield_value` shove the value into some awaiter object when `yield_value` has *100%* of the capabilities it needs right there? — Nicol Bolas, Apr 01 '23 at 21:28
@NicolBolas, how isn't coroutines an "as complicated and unintuitive as possible" solution to the task I've described, to begin with? [Other solutions are much simpler to read](https://godbolt.org/z/4YMd88cxh), no? — Enlico, Apr 01 '23 at 21:58
You're talking about making a generator coroutine. The particular nature of what that coroutine is doing internally is irrelevant; the coroutine machinery is appropriate for any generator coroutine which produces single values of one specific type, exposed through a range interface. — Nicol Bolas, Apr 01 '23 at 22:12
I'm not sure what the question is here. It sounds like you have two questions: why does `yield_value` return an awaitable (and therefore why does `co_yield` translate into `co_await`), and what is the purpose of the coroutine machinery as distinct from a coroutine function (the last version of the two intermingled these two things)? Which question are you looking for the answer to? — Nicol Bolas, Apr 01 '23 at 22:20
"*I don't see how the coroutine's interface's writer can be other than the same writer of the coroutine's promise.*" [`std::generator` and its attendant features demonstrate otherwise.](https://timsong-cpp.github.io/cppwp/coro.generator) ASIO's coroutine support doesn't require you to write anything other than a coroutine function returning the right type. Indeed, pretty much any interface that works with coroutines will give you the coroutine machinery; you only provide the function. — Nicol Bolas, Apr 02 '23 at 17:11

Nicol Bolas · Answer 1 · 2023-04-02T17:13:21.267

All of this seems to be confusing two distinct concepts: the coroutine function and what I will call the "coroutine machinery". A coroutine function is just any function that uses one of the co_* keywords.

The coroutine machinery is the set of support objects, and their implementation, used to mediate between a coroutine function and the code trying to interact with that function. The coroutine machinery includes, but is not limited to, the promise type, the corresponding future type returned by a coroutine function, and any types specific to those types.

Note that I say "a coroutine function", not "the coroutine function". This is because coroutine machinery is not intended to be coupled with any specific coroutine function. Coroutine machinery does not define what a coroutine does; it defines how that coroutine talks to and interacts with the outside world.

A coroutine function defines which machinery it is associated with by its function signature. Your coroutine uses the IterCoro machinery because it uses that as its return type.

But this is not an accident; this is an intentional part of design. A function signature describes how to interact with a function, but not what it does. Knowing that a function takes an int and returns an int tells you how to talk to it (passing an int and receiving one), but it tells you nothing about what that function will do.

The same goes for coroutine machinery; it defines how you can interact with any coroutine which uses that machinery. But it does not say what that coroutine will do. Coroutine machinery is meant to apply to a family of coroutine functions that all share a similar coroutine interface.

IterCoro represents a particular interface, a specific way of talking to coroutine functions. But only for functions which conform to its expected requirements of the coroutine.

Specifically, it expects the coroutine function to:

Be a yielding generator;
Which only yields values of type int (or types convertible to that); and
Will never terminate by flowing off the end of its function.

A coroutine function which conforms to this interface may use the IterCoro machinery. Things which your IterCoro does not support include:

Using co_await within the coroutine. That is, pausing execution of the coroutine and scheduling its resumption with some external code.
Using co_return within the coroutine.
Supplying a value to the coroutine when it uses co_yield. Yes, that's a thing you can do.

And that's fine; no coroutine machinery should be expected to work with every possible coroutine. The machinery has an expected interface, and it is bad for a coroutine to not conform to that interface.

But the coroutine machinery should not define what that coroutine function is actually doing. An IterCoro is a coroutine that infinitely generates values; how it does so is not IterCoro's business. If you put the business logic of a coroutine function within the coroutine machinery it happens to use, you are using the feature wrong.
