Generators I: Toward a minimum viable product

We’re still not finished with the design of async/await, but it’s already become clear that it’s time to get the next phases of the feature into the pipeline. There are two extensions to the minimal async/await feature we’ve currently got that seem like clear high priorities:

  • Async methods: allowing async fn to be used in traits.
  • Generators: allowing imperative control flow to create Iterators and Streams the same way async fn allows imperative control flow to create a Future.

This post is the start of a series about that second use case. Generators already exist in some form in nightly, a very sketchy, experimental form that was initially designed as a compiler target for the futures-async-await library which experimented with async/await as procedural macros. The generator feature is very broad and has many potential use cases, because it can be used as a fundamental building block for various exotic forms of control flow. For that reason, the goal of this post is to narrow the scope of the feature to a “minimum viable product” (MVP) - something we could ship quickly that solves our immediate high priority problems and that can be easily extended to other use cases in the future.

Defining the boundaries of our use case

The many use cases of generators can be divided into two basic camps:

  • As a general purpose “coroutine” mechanism; usually, this means as the target for some kind of macro which defines a new, higher level control flow construct. Generators can be the building block for all kinds of exotic control flow (as they were for the async/await macro before that became a language feature in its own right). Most commonly, libraries generate the actual generator code, and end users don’t interact with generators directly at all.
  • As a means of representing some sort of iterable source using imperative code. The epitomizing example is using a generator to make an Iterator; another similar example is using a generator to make an async Stream. For this use case, the user interacts directly with the generator language feature, and ideally there is as little library code as possible between them and just using the feature.

It’s the latter case that we care about right now. The former case will certainly enable a lot of really interesting Rust projects, but directly using generators for iterable computation is extremely important, especially for the async case. When you want to write a function that returns an iterator, using the -> impl Iterator feature and some iterator adapters is usually adequate. But Stream runs into exactly the same problem as Future with that feature - you need to be able to borrow across “yield points” in your Stream, which library adapters are not able to handle well.
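
To make the synchronous half of that comparison concrete, here is a minimal sketch of the two styles available on stable Rust today: adapters behind -> impl Iterator, and a hand-written state machine. The names even_squares, FibBelow and fib_below are made up for illustration. The second half is the kind of bookkeeping a generator would take over, letting the same logic be written as an ordinary loop that yields each value.

```rust
/// Adapter style: the squares of the even numbers in `0..limit`.
/// This is the case where `-> impl Iterator` plus adapters is adequate.
fn even_squares(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0).map(|n| n * n)
}

/// Hand-written state machine: the Fibonacci numbers below `max`.
/// Every call to `next` has to restore and save the loop state by hand.
/// (Illustrative only; overflow for very large `max` is not handled.)
struct FibBelow {
    a: u64,
    b: u64,
    max: u64,
}

fn fib_below(max: u64) -> FibBelow {
    FibBelow { a: 0, b: 1, max }
}

impl Iterator for FibBelow {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        if self.a >= self.max {
            return None;
        }
        // Advance the pair (a, b) to (b, a + b) and hand out the old `a`.
        let current = self.a;
        self.a = self.b;
        self.b += current;
        Some(current)
    }
}
```

Calling even_squares(7) produces 0, 4, 16, 36, and fib_below(20) produces 0, 1, 1, 2, 3, 5, 8, 13. Neither version needs to hold a borrow across two calls to next, which is exactly the part that gets hard for Stream.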

And functions that return streams are extremely important for networking code. They are the obvious way to model many kinds of network interactions, from things important to the web like streaming HTTP responses, websockets and HTTP/2 push messages, to use cases involving entirely different or lower-level protocols that non-web networking services care about.

One of the most compelling reasons that async/await was made a first class feature was so that we could have this intersection of generators and async as independent features:

              │  SYNCHRONOUS            │  ASYNCHRONOUS
    ──────────┼─────────────────────────┼──────────────────────────────────
              │                         │
     FUNCTION │  returns T              │  returns Future of T
              │                         │
    GENERATOR │  returns Iterator of T  │  returns Stream of T
              │                         │
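
As a rough illustration, three of the four cells can already be spelled as signatures on stable Rust. The function names and the tiny block_on helper below are assumptions made for this sketch, not part of any proposal. The fourth cell is left as a comment because the standard library has no stable Stream trait to name.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Synchronous function: returns T.
fn answer() -> u32 {
    42
}

// Asynchronous function: returns a Future of T.
// `async fn answer_async() -> u32 { 42 }` has this same kind of return type.
fn answer_async() -> impl Future<Output = u32> {
    std::future::ready(42)
}

// Synchronous generator cell, written with today's tools: returns Iterator of T.
fn answers() -> impl Iterator<Item = u32> {
    40..=42
}

// Asynchronous generator cell: would return a Stream of T (not expressible
// with the standard library alone, so it is omitted here).

/// A waker that does nothing, just enough to poll a future by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor for the sketch. It is only adequate for
/// futures that complete without ever really waiting.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Here answer() gives 42 directly, block_on(answer_async()) gives 42 once the future is driven, and answers() gives 40, 41, 42 one item at a time.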

It’s also worth noting that the futures-async-await library mentioned previously implemented a version of “async generators” that compile to streams, and so we can take some lessons from the experience with that library.

Dividing up the design space

Having narrowed our target use case, we can now divide up all the unresolved questions about generators into two buckets:

  • The first bucket contains the questions we need to solve in order to stabilize enough of the generator features to solve our use case. I’ll call this the now bucket.
  • The second bucket contains the questions we don’t need to have fully resolved to solve for our use case. What we do need is to have a sketch of a path to someday implementing these extensions, or to decide concretely that we will not resolve them. This is the later bucket.

Here is how I have bucketed each major unresolved question that I could think of. Remember that even for things in the later bucket, we have to be forward compatible with extending to them (or explicitly decide we will never do that). So don’t fret if your pet feature of generators is being bucketed as later - we’re not going to accidentally forget about it.

  • The final syntax for generators. This is pretty necessary for stabilizing anything; you can’t have a language feature without knowing what its syntax is.

    Bucket: now.

  • The final signature of generator related traits and types. This covers the type signatures of everything generators involve (the function that returns a generator, the generator itself, its return type, and so on).

    Bucket: now.

  • Generators that return non-() types. For the most part, generators that return types other than () are not necessarily interesting for our use case of Iterators and Streams. There is one quibble - ? and Try types - which I will be looking at more closely in the next post.

    Bucket: later (except for figuring out how ? interacts with generators).

  • Generators that take resumption arguments. Iterators and Streams cannot take resumption arguments, so this is not necessary to solve yet. We just need to decide about forward compatibility.

    Bucket: later.

  • Bridging the generator type to Iterator and Stream. We have to figure out the best way to make people’s generators into iterators and streams, whether it’s a blanket impl or an adapter or something else.

    Bucket: now.

  • Self-referential and Unpin generators. Generators need to be able to be self-referential to support borrowing across yield, but the Iterator interface is incompatible with that. We need to figure out the best way to solve this problem.

    Bucket: now.
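
To give the last two questions some shape, here is a minimal sketch of the adapter option, assuming a made-up generator interface. The Generator trait, GeneratorState, GenIter and Countdown below are all hypothetical names written for this example, not the nightly API and not a proposed design. The sketch takes the generator by pinned reference so implementors could be self-referential, and then only bridges to Iterator when the generator is Unpin.

```rust
use std::pin::Pin;

/// Result of one `resume` call. (Hypothetical type for this sketch.)
enum GeneratorState<Y, R> {
    Yielded(Y),
    Complete(R),
}

/// Hypothetical generator interface with no resumption arguments.
/// `resume` takes `Pin<&mut Self>` so an implementor may borrow from itself.
trait Generator {
    type Yield;
    type Return;
    fn resume(self: Pin<&mut Self>) -> GeneratorState<Self::Yield, Self::Return>;
}

/// Adapter option: wrap a generator to get an Iterator.
struct GenIter<G> {
    inner: G,
    done: bool,
}

impl<G> GenIter<G> {
    fn new(inner: G) -> Self {
        GenIter { inner, done: false }
    }
}

/// Only generators that return `()` and are `Unpin` are bridged here,
/// because `Iterator::next` takes a plain `&mut self`.
impl<G> Iterator for GenIter<G>
where
    G: Generator<Return = ()> + Unpin,
{
    type Item = G::Yield;

    fn next(&mut self) -> Option<G::Yield> {
        if self.done {
            return None;
        }
        // `Pin::new` is only available because `G: Unpin`.
        match Pin::new(&mut self.inner).resume() {
            GeneratorState::Yielded(item) => Some(item),
            GeneratorState::Complete(()) => {
                self.done = true;
                None
            }
        }
    }
}

/// A hand-written stand-in for what the compiler would generate.
struct Countdown {
    remaining: u32,
}

impl Generator for Countdown {
    type Yield = u32;
    type Return = ();

    fn resume(mut self: Pin<&mut Self>) -> GeneratorState<u32, ()> {
        if self.remaining == 0 {
            GeneratorState::Complete(())
        } else {
            self.remaining -= 1;
            GeneratorState::Yielded(self.remaining + 1)
        }
    }
}
```

With this, GenIter::new(Countdown { remaining: 3 }) iterates over 3, 2, 1. The Unpin bound is where the tension shows up: a generator that borrows across a yield would not satisfy it, so it would have to be pinned somewhere first (for example on the heap) before it could be driven through an interface shaped like Iterator. Whether that is the right trade-off is one of the open questions, not something this sketch settles.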

Unfortunately, we didn’t manage to scrape much of the design space away with this exercise. Even the MVP of generators leaves us with a lot of questions to resolve, several of which have many solutions with nuanced trade-offs. That’s exactly why I’m exploring this design space now, before async/await has been stabilized, because it’s going to take a long time to get through.

In the next post in this series, I’m going to start by examining one important question: how do generators (which not only return, but also yield) interact with the ? feature?