A four year plan for async Rust

Four years ago today, the Rust async/await feature was released in version 1.39.0. The announcement post says that “this work has been a long time in development – the key ideas for zero-cost futures, for example, were first proposed by Aaron Turon and Alex Crichton in 2016”. It’s now been longer since the release of async/await than the time between the first design work on futures and the release of async/await syntax. Despite this, and despite the fact that async/await syntax was explicitly shipped as a “minimum viable product,” the Rust project has shipped almost no extensions to async/await in the four years since the MVP was released.

This fact has been noticed, and I contend it is the primary controllable reason that async Rust has developed a negative reputation (other reasons, like its essential complexity, are not in the project’s control). It’s encouraging to see project leaders like Niko Matsakis recognize the problem as well. I want to outline the features that I think async Rust needs to continue to improve its user experience. I’ve organized them into features that I think the project could ship in the short term (say, in the next 18 months), features that will take longer (up to three years), and finally a potential change to the language that I think would take years to plan and prepare for.

Near-term features

These are all features that I believe the Rust project would be able to ship within the next year or two. They require relatively small changes to the compiler, because they depend on abstractive capabilities that are already implemented, and relatively small additions to the surface syntax, largely syntax already implied by what exists. I think these are the things the project should focus its attention on, because they should be easier to ship and easier to build a consensus around.

AsyncIterator and async generators

I’ve harped on the importance of generators to Rust repeatedly in the past, so I won’t devote a lot of attention to it here. I’ve also highlighted before that the original plan for iterators included shipping generator syntax. Briefly, my opinion is that the absence of generators has left Rust in a confused state, in which the relationship between asynchrony and iteration is unclear (I elaborate more in my linked blog post). I want to focus specifically on async iterators and async generators, and the features that are needed to complete them.

An async generator follows naturally from a generator: just like functions, generators can be marked async, and then you can use the await operator inside them. Using my preferred syntax, this would look something like this:

async gen fn sum_pairs(mut rx: Receiver<i32>) yields i32 {
    loop {
        // Assuming `next()` resolves to `Option<i32>`, returning `None` once the channel closes.
        let Some(left) = rx.next().await else { return };
        let Some(right) = rx.next().await else { return };
        yield left + right;
    }
}

The composition of these features falls out naturally from their syntaxes. Whereas a generator compiles to an Iterator, an async generator compiles to an AsyncIterator.
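
For comparison, here is roughly the same logic written against today’s ecosystem: a sketch using the futures crate’s Stream trait (which plays the role AsyncIterator would) and stream::unfold in place of the proposed syntax.

use futures::channel::mpsc::Receiver;
use futures::stream::{self, Stream, StreamExt};

fn sum_pairs(rx: Receiver<i32>) -> impl Stream<Item = i32> {
    stream::unfold(rx, |mut rx| async move {
        // `?` ends the stream once the channel is closed.
        let left = rx.next().await?;
        let right = rx.next().await?;
        Some((left + right, rx))
    })
}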

There is one other piece of syntax that is needed: for await loops. These can be used within any async context; they consume items from an AsyncIterator, yielding control whenever the AsyncIterator returns Pending:

for await item in async_iter {
    println!("{}", item);
}
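
The exact desugaring is not settled, but the behavior is roughly what you would write by hand today with the futures crate’s next combinator (a sketch, with futures’ Stream standing in for AsyncIterator):

use futures::stream::{Stream, StreamExt};

// Roughly the behavior of the `for await` loop above, written by hand.
async fn print_all(mut async_iter: impl Stream<Item = i32> + Unpin) {
    while let Some(item) = async_iter.next().await {
        println!("{}", item);
    }
}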

When I was working on async Rust, this syntax was held up on two different design tangents. On the one hand, Taylor Cramer thought that the feature was a poor choice because users should instead be using for_each_concurrent, to get some concurrency. I do not agree with that: it’s not always the case that users want for_each_concurrent, adding more internal concurrency to your async function is a decision that needs to be considered with care, and there should be an obvious syntax for when you don’t want that, which for await is. On the other hand, there was some speculation about making “await patterns” that destructure futures and then somehow making that work here; I think this would be imprudent, and that leaving await as an expression, and for await as a special expression for handling AsyncIterators, is the most sensible choice.

Revisiting the table from my previous blog post, you could add this column for async iteration:

                           │   ASYNCHRONOUS ITERATION
      ─────────────────────┼───────────────────────────
                   CONTEXT │   async gen { }
        EFFECT (iteration) │   yield
      FORWARD (asynchrony) │   await
      COMPLETE (iteration) │   for await

The biggest thing blocking this is an issue on the library side: how should the AsyncIterator interface be expressed? I’ve already written about my preference for stabilizing AsyncIterator as-is, with the poll_next method. This remains a subject of some controversy, so I will return to it, but not in this post.
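
For reference, this is the shape of the unstable AsyncIterator trait in the standard library today (omitting the provided size_hint method), which is what stabilizing it “as-is” would mean:

use std::pin::Pin;
use std::task::{Context, Poll};

pub trait AsyncIterator {
    type Item;

    // Attempt to produce the next value, registering a wakeup via `cx` if it
    // isn't ready yet; `Poll::Ready(None)` means the iterator is exhausted.
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}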

For now I’ll just say that I think the failure to stabilize AsyncIterator over the past four years (which was absolutely not our intention when we planned the async MVP) has been harmful to async Rust: APIs based on async iteration have been relegated to unstable features and side libraries, leaving users confused and poorly supported when they need to deal with repeated asynchronous events, a very common pattern. The single best thing the Rust project could do for users is stabilize AsyncIterator so the ecosystem can build on it, and it could do that tomorrow.

The good news is that work is already underway on reserving the gen keyword in the next edition, so that generators can be implemented. This feature uses the same state machine transform that async functions already use, and by analogy should be feasible to implement without big changes to the compiler. The only big unresolved question with generators (which doesn’t apply to async generators, if AsyncIterator is stabilized as-is) is how to make them self-referential. I’ll return to that question later in this post.
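
To make the analogy concrete, here is a hand-written sketch of the kind of state machine that transform produces, for a trivial generator that yields 0 and then 1 (illustrative only, not actual compiler output):

// Conceptually, what the compiler might generate for a generator that yields 0, then 1.
enum ZeroOne {
    Start,
    YieldedZero,
    Done,
}

impl Iterator for ZeroOne {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        // Each call resumes from the suspension point recorded in the enum.
        match *self {
            ZeroOne::Start => {
                *self = ZeroOne::YieldedZero;
                Some(0)
            }
            ZeroOne::YieldedZero => {
                *self = ZeroOne::Done;
                Some(1)
            }
            ZeroOne::Done => None,
        }
    }
}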

Coroutine methods

Orthogonal to the introduction of these additional kinds of coroutines is their integration into the trait system. Right now, you cannot define an async trait method in stable Rust. The good news is that this is changing, and in a soon-to-be-released version of Rust, it will be possible to write an async trait method. Like async functions, generators and async generators should not require any special support to be used in traits beyond what has already been implemented for async functions. So when generators and async generators are implemented and stabilized, they should be supported as methods out of the box.

The only thing that remains to be implemented for coroutine methods is the concept of “Return Type Notation” (or RTN). The problem is that adding a coroutine method to a trait adds an anonymous associated type to that trait, which is the return type of that method. Sometimes (most importantly, when spawning the future returned by that method on a work-stealing executor or otherwise moving it to another thread) users need to add additional bounds to that anonymous associated type. So Rust needs some syntax for declaring that. This is RTN. For example:

trait Foo {
    async fn foo(&self);
}

// later:
where F: Foo + Send,
      F::foo(): Send

In my opinion, it is important to ship RTN because of a design principle I call the “Can you fix it?” principle. If an upstream dependency of yours has an async method, and you need to add a Send bound to the return type, can you fix it, or do you need to fork the library? Without the ability to add RTN bounds to where clauses, you cannot express the bounds that you require without changing the upstream code, even if your code is all perfectly valid (i.e. even if the async method you want to call is Send). It’s very frustrating for users to encounter a problem in which their code should compile fine, but the only way to satisfy the compiler is to fork a dependency.
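
As a concrete illustration, here is a minimal sketch of downstream code that needs such a bound, using a tokio-style spawn (which requires Send + 'static futures) and the not-yet-stable RTN syntax shown above:

// Sketch only: `F::foo(): Send` is the proposed (unstable) RTN bound syntax,
// and `tokio::spawn` stands in for any work-stealing executor.
fn spawn_foo<F>(f: F)
where
    F: Foo + Send + 'static,
    F::foo(): Send, // without RTN, this bound cannot be written downstream
{
    tokio::spawn(async move { f.foo().await });
}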

Fortunately, the project is already focusing on this feature, and I expect it to be shipped in the next year. There seems to be some discussion around the exact syntax for this feature: I would encourage contributors not to be too obstinate over syntax differences that don’t substantially change the feature.

Coroutine closures

Another aspect of Rust’s language design in which coroutines are currently not well-supported is closures. Niko Matsakis has explored this issue in two recent blog posts, focusing only on async closures and not on generative or asynchronously generative closures. In the first, he proposed treating async closures as a new hierarchy of function traits (i.e. adding AsyncFn, AsyncFnMut, and AsyncFnOnce). In the second, he instead explores the idea of modeling async closures as closures returning impl Future (e.g. F: Fn() -> impl Future).

I prefer the second approach, because it does not result in a proliferation of more traits. This becomes especially apparent when you consider generative closures and asynchronously generative closures: if the function trait for each of these things were distinct, instead of 3 function traits, Rust would have 12. In contrast, by modeling coroutine closures as closures returning an impl Trait, no new traits are needed. It has the additional benefit that it involves modeling them in the exact way that Rust already desugars normal async functions.
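
For context, here is a sketch of how an API taking an async closure has to be written on stable Rust today, with a separate generic parameter for the returned future; note that this also prevents the returned future from borrowing from the closure’s arguments (retry here is a hypothetical helper, just for illustration):

use std::future::Future;

// A hypothetical retry helper: the returned future must be named as its own type parameter.
async fn retry<F, Fut, T, E>(mut op: F, attempts: u32) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<T, E>>,
{
    let mut last_err = None;
    for _ in 0..attempts {
        match op().await {
            Ok(value) => return Ok(value),
            Err(err) => last_err = Some(err),
        }
    }
    // Assumes `attempts >= 1`; with zero attempts there is no error to return.
    Err(last_err.expect("attempts must be at least 1"))
}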

As Niko highlights in his blog post, this would require adapting the Fn traits to allow their return type to capture input lifetimes. There are a few things that Niko calls out in his post that require changing Rust’s syntax, possibly across an edition boundary:

  • Adding a lifetime to the Output parameter of the Fn traits
  • Desugaring -> impl Trait to a bound on the associated type projection instead of a new variable

Because these may require an edition change, the project should work through the specifics of these changes immediately. But they do not seem like extremely thorny problems to work out.

There is one other thing I would add to this feature, though. Once you have Fn() -> impl Future and so on, it would be natural to extend the syntax to have a kind of “async sugar” (and “gen sugar”) just like functions do. That is to say, special syntax sugar should be added to the Fn traits that makes it possible to write closure bounds like this:

where F: async FnOnce() -> T
// equivalent to:
where F: FnOnce() -> impl Future<Output = T>

where F: gen FnOnce() yields T
// equivalent to:
where F: FnOnce() -> impl Iterator<Item = T>

where F: async gen FnOnce() yields T
// equivalent to:
where F: FnOnce() -> impl AsyncIterator<Item = T>

What’s nice about this is that it isn’t some new general-purpose abstractive concept like “trait transformers” or “effect generics:” it’s just a little bit of sugar that is a natural extension of sugar that already exists from one place (function declarations) to another (function trait bounds). And these function traits already have special syntax, because they use parens and arrows for their parameters and return type. This wouldn’t require a lot of implementation work or consensus on a controversial new feature.

Medium-term features

The features in the previous section were all features that I believe could be shipped without a huge amount of implementation effort, and which don’t have many thorny open questions in their design. The features in this section, on the other hand, are more difficult. It’s good that people are already investigating them now, but they don’t seem very close to shipping and I wouldn’t expect them in the next year or two.

Object-safe coroutine methods

Though async trait methods will soon be a stable feature, they will not initially be object-safe. I think this was the right decision, but it would be ideal if someday they could be. The problem with object-safety is this: each coroutine method implies an anonymous associated type, which would have a different size and layout in each implementation. In order to erase the static type of the trait object, you also need to erase the type of that method’s anonymous return type: in other words, it also needs to somehow be a trait object.

For our examples, we’ll consider this trait:

trait Foo {
    async fn foo(&self);
}

If I want to make a trait object of Foo, I need to specify the return type of Foo::foo. Thankfully, RTN starts to unravel this problem by allowing us this syntax: Box<dyn Foo<foo() = Something>>. But what is Something? It can’t be a specific type, or else that limits the trait object to implementations that return that type: in practice, this means limiting it to a single implementation, and then it isn’t even a meaningful trait object at all. That’s why it needs to be a trait object itself.

For example, that might be Box<dyn Foo<foo() = Pin<Box<dyn Future<Output = ()>>>>>. Of course, that is incredibly verbose. There are basically two problems at play which shape the design space:

  • There needs to be some kind of transformer that takes your implementation of Foo, and includes the glue to allocate the future in the heap.
  • Some members of the project leadership have the very strongly held view that heap allocations should be “explicit,” where explicit means there should be more syntax required to do it.

As a result, the project has considered a new wrapper type that would be required, which would “explicitly” indicate (by virtue of being a different type) that the future type will be heap allocated. My understanding is that something like what I’ve written above would be Box<Boxed<dyn Foo>>, or maybe just Boxed<dyn Foo> (it’s not clear to me from the material I have available).

My own opinion is different. I think it’s reasonable for the default behavior of a heap-allocated trait object (i.e. Box<dyn Foo>, Rc<dyn Foo> and Arc<dyn Foo>) to be to allocate the state machine with the same allocator as that type. For non-owned trait objects, like &mut dyn Foo, I would also be fine with the default behavior being to allocate the future with the global allocator, though here I see the point more (especially because this wouldn’t be possible in no_std contexts).

Regardless, I agree it would be important to allow users to override this behavior with some alternative glue mechanism. This requires an interface for writing your own glue code, which might do something else (like use alloca to allocate a dynamically sized type on the stack). I just think that there should be a reasonable default behavior, which for heap allocated trait objects is probably heap allocating that state. In my opinion, this is not “implicit” any more than requiring all users to use an adapter is “implicit,” it just involves setting a reasonable default. Still, resolving this controversy to everyone’s satisfaction would be a blocker on this feature, as well as developing the interface for the glue code.
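
To make the “glue” concrete, here is a sketch of what it amounts to when written by hand today, in the style popularized by the async-trait crate (DynFoo is a hypothetical adapter trait, and this assumes async fn in traits as it is about to stabilize):

use std::future::Future;
use std::pin::Pin;

// The example trait from above.
trait Foo {
    async fn foo(&self);
}

// An object-safe mirror of `Foo` whose method returns a boxed, type-erased future.
trait DynFoo {
    fn foo<'a>(&'a self) -> Pin<Box<dyn Future<Output = ()> + 'a>>;
}

// The "glue": any `Foo` implementor can be adapted by heap-allocating its future.
impl<T: Foo> DynFoo for T {
    fn foo<'a>(&'a self) -> Pin<Box<dyn Future<Output = ()> + 'a>> {
        Box::pin(Foo::foo(self))
    }
}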

I want to make one other note in this section: previous discussions of this issue treat the unstable dyn* feature as a prerequisite for object-safe coroutine methods. I do not believe this is the case. What dyn* does is create an existential type that all of the different trait object pointer types would implement, by also virtualizing their destructor code; if you can accept that trait objects using different allocation strategies for their virtual coroutine methods are different types, there’s no dependence on dyn* at all. I personally think the dyn* feature is a questionable direction for the Rust project to pursue.

Async destructors

Another very thorny issue is the problem of async destructors. Sometimes, a destructor might need to perform some kind of IO operation or otherwise block the current thread; it is desirable to support non-blocking destructors which instead yield control, so that other tasks can run concurrently. Unfortunately, there are several problems with this.

The first problem is that running the async destructor is best effort, even more so than running any destructor. If you drop a type with an async destructor in a non-async context, there’s no possibility of running the async destructor, because there is no async context in which to await it. There have been a couple of different ideas about how to solve this, such as using let async bindings to indicate variables that can’t be moved into a non-async context, or just accepting it and treating the async destructor as only an optimization over the non-async destructor.

The second problem is actually very similar to the problem with trait objects: if the async destructor needs to use some sort of state, where do you store it? One option is to disallow async destructors from having state, using a poll method. This is simple, but it is problematic for things like data structures: a Vec, for example, has no way of storing which items it has already polled, and would have to keep polling their destructors in a loop. That would probably be unacceptable. But dealing with the state raises the same issues as trait objects.
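
For concreteness, the “poll method” option might look something like this (a hypothetical trait; nothing of this shape exists in the standard library today):

use std::pin::Pin;
use std::task::{Context, Poll};

// Hypothetical: called repeatedly in place of `Drop::drop` until it returns
// `Poll::Ready(())`. Because there is no separate state type, any progress
// must be recorded inside `self`, which is exactly the problem for
// collections like `Vec<T>` described above.
trait AsyncDrop {
    fn poll_drop(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()>;
}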

The third problem with async destructors is how to handle their interaction with unwinding. In particular, if you are unwinding through an async destructor, which returns Pending, what happens? There would need to be some kind of asynchronous version of catch_unwind that the pending calls can jump to, so that other tasks can run. This problem I think is easier to solve than the other two, but it needs to be specced out.

I go back and forth between thinking that the difficulty with async destructors is one of the worst things about async Rust and thinking that maybe async destructors aren’t that useful anyway. Regardless of where you land, there is a lot of design work needed for this feature to be shippable, and I don’t think it will come soon.

Long-term features

In contrast to the near-term and medium-term features, there are certain larger problems with the design of Rust that I think need to be considered so carefully that they could not be addressed in the next few years. Still, the work of considering them must begin at some point, so that they can eventually be resolved. I’m talking about “changing” the rules of Rust.

As of right now, there are a few valuable kinds of types that Rust cannot really support:

  • Immovable types: types which can’t be moved once their address has been witnessed.
  • Unforgettable types: types which can’t go out of scope without running their destructor or destructuring them.
  • Undroppable types: types which can’t be dropped or forgotten but must be destructured.

(The latter two are usually grouped together as “linear types” when people talk about them, but there are very important differences.)

I think evidence has shown that there is a strong motivation for at least the first two categories.

To support self-referential coroutines and intrusive data structures, Rust needs some support for types that are known never to move again. Because Rust doesn’t support immovable types, we added this functionality using the Pin API. But the Pin API has a few big flaws: one is that the API is clunky and difficult to work with. More important, though, is that it requires each interface to explicitly opt in to supporting immovable types; traits that existed before Pin can’t retroactively gain the ability to work with them.

There are two specific traits for which this is a big problem:

  • Iterator: because Iterator doesn’t support immovable types, the project is at an impasse about how to support immovable generators.
  • Drop: because Drop doesn’t support immovable types, an arcane implication is that you need crates like pin-project to access fields of pinned types (see the sketch after this list). This is all very baroque and confusing, and wouldn’t be necessary if Drop supported immovable types.
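
Here is a minimal sketch of the projection problem just mentioned: getting a pinned reference to a field of a pinned type requires unsafe code (or a crate like pin-project to generate and check it for you), precisely because nothing stops a Drop impl, which takes &mut self, from moving the field.

use std::pin::Pin;

struct Wrapper<F> {
    fut: F,
}

impl<F> Wrapper<F> {
    // "Projecting" the pin from the wrapper onto its field.
    fn fut(self: Pin<&mut Self>) -> Pin<&mut F> {
        // SAFETY: `fut` is structurally pinned: `Wrapper` must never move it
        // after pinning, including in any `Drop` impl it might add later.
        unsafe { self.map_unchecked_mut(|this| &mut this.fut) }
    }
}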

On the other hand, if Rust had the Move trait, these problems would go away. Self-referential generators would just not implement Move, and work naturally. The Pin type could be completely deprecated, and a reference to a type that doesn’t implement Move would have the same semantics as a pinned reference to a type that doesn’t implement Unpin. Of course, this would require pretty major edition-crossing changes.

The scoped task trilemma presents a strong argument for types which cannot be forgotten. Stackless coroutines cannot use the destructor-based concurrent borrow trick: the only way to make it work is to use a closure-passing “internal” style, which is what Rust opted against when it went for stackless coroutines. This incompatibility between two desirable aspects of Rust’s design makes a strong case that the decision not to support unforgettable types was wrong.

I titled this post “a four year plan” for a reason: if Rust were to adopt these fundamental changes, it would have to be done across an edition boundary, and I strongly doubt that it could be done as part of the 2024 edition. This leaves the 2027 edition, four years from now, as the target for such a change. But the project should commit to a decision about this change sometime soon, in the next two years, and that should include a temporary solution for generators, such as requiring them to be pinned before they can be used as iterators.

I’ve been exploring what would be required to make this change on my blog this year because I think it is something the Rust project should seriously consider. I intend to continue to focus on this issue next year, because I think the implications of all of the different options need to be fully understood. I’m trying to find ways to make this a collaborative process, but my options are limited. My goal isn’t really even to make a particular recommendation (though I will surely have opinions), but just to understand the full space of options for resolving these issues.

What are the exact trade offs between different options to handle the problem of self-referential generators? What different requirements would there be to support “unforgettable” types as opposed to “undroppable” types? If Move were to be added, how could Pin be removed across an edition boundary? These are the kinds of questions I want to answer.

However, I recognize that adding support for these kinds of types would be the biggest change to Rust since it was stabilized in 2015, and that making this change would bring with it enormous costs for both the project and the community. I also recognize that there are valid arguments why supporting these kinds of types isn’t really worth it (like the painful interaction with trait objects). For these reasons, the Rust project should build into its consideration of this idea the possibility that not doing anything may ultimately be the right outcome.

In general, my instinct is to doubt big changes to Rust at this point in its design process. What I think Rust needs is to finish integrating the features it has already committed to - features like external iterators, stackless coroutines, monomorphized generics, and unsized trait object types. I specifically feel changing the rules around moveability and linear types is justified because of the implications for the integration of these existing features.

Closing remarks

This post has once again gotten very long. I decided to focus this post on changes to the language; in another post to come I will focus my attention on the standard library and the async library ecosystem, as well as devote a specific post to the AsyncIterator interface. I want to make one other remark, which I tried to find a place for in this post and the previous one, but couldn’t. It concerns the controversy around the final syntax for the await operator which played out in 2019.

For those who don’t know, there was a big debate about whether the await operator in Rust should be a prefix operator (as it is in other languages) or a postfix operator (as it ultimately was). This attracted an inordinate amount of attention - over 1000 comments. The way it played out was that almost everyone on the language team had reached a consensus that the operator should be postfix, but I was the lone holdout. At that point, it was clear that no new argument was going to appear and that no one was going to change their mind. I allowed this state of affairs to linger for several months, a decision I regret. It was clear that there was no way to ship except for me to yield to the majority, and yet I didn’t for some time. In doing so, I allowed the situation to spiral, with more and more “community feedback” reiterating the same points that had already been made, burning everyone out, but especially me.

The lesson I learned from this experience is to distinguish between factors that are truly critical and factors that don’t matter. If you’re going to be obstinate about some issue, you’d better be able to articulate a deep reason why it is important, and it had better be something more pressing than the slight differences in affordances and aesthetics between syntax options. I’ve tried to take this to heart in how I engage in technical questions since then.

I worry that the Rust project took the wrong lesson from this experience. The project continues in its norm (as Graydon mentioned here) that with enough ideation and brainstorming, eventually a win-win solution to every controversy can be discovered. Rather than accepting that sometimes a hard decision has to be made, the project’s solution to the burnout that comes from allowing these controversies to hang open indefinitely has been to turn inward. Design decisions are now documented primarily in unindexed formats like Zulip threads and HackMD documents. To the extent that there is a public expression of the design, it is spread across a half dozen different blogs belonging to different contributors. As an outsider, it is nearly impossible to understand what the project considers a priority, and what the current state of any of these efforts is.

I’ve never seen the project’s relationship with its community in a worse state. But that community contains invaluable expertise; closing yourselves off is not the solution. I want to see the relationships of mutual trust and respect rebuilt between project members and community members, instead of the present situation of hostility and dissatisfaction. To that end, I want to thank those from the project who have reached out and engaged with me on design issues over the last few months.