A four year plan for async Rust
Four years ago today, the Rust async/await feature was released in version 1.39.0. The announcement post says that “this work has been a long time in development – the key ideas for zero-cost futures, for example, were first proposed by Aaron Turon and Alex Crichton in 2016”. It’s now been longer since the release of async/await than the time between the first design work on futures and the release of async/await syntax. Despite this, and despite the fact that async/await syntax was explicitly shipped as a “minimum viable product,” the Rust project has shipped almost no extensions to async/await in the four years since the MVP was released.
This fact has been noticed, and I contend it is the primary controllable reason that async Rust has developed a negative reputation (other reasons, like its essential complexity, are not in the project’s control). It’s encouraging to see project leaders like Niko Matsakis recognize the problem as well. I want to outline the features that I think async Rust needs to continue to improve its user experience. I’ve organized them into features that I think the project could ship in the short term (say, in the next 18 months), features that will take longer (up to three years), and finally a section on a potential change to the language that I think would take years to plan and prepare for.
Near-term features
These features are all features that I believe the Rust project would be able to ship within the next year or two. They all require relatively small changes to the compiler, because they depend on abstractive capabilities that are already implemented, and they involve relatively small changes to the surface syntax, largely new syntax implied already by the existing syntax. I think these are the things the project should focus its attention on, because they should be easier to ship and easier to build a consensus around.
AsyncIterator and async generators
I’ve harped on the importance of generators to Rust repeatedly in the past, so I won’t devote a lot of attention to it here. I’ve also highlighted before that the original plan for iterators included shipping generator syntax. Briefly, my opinion is that the absence of generators has left Rust in a confused state, in which the relationship between asynchrony and iteration is unclear (I elaborate more in my linked blog post). I want to focus specifically on async iterators and async generators, and the features that are needed to complete these.
An async generator is a natural extension of a generator: just like functions, generators can be marked async, and then you can use the await operator inside of them. Using my preferred syntax, this would look something like this:
async gen fn sum_pairs(rx: Receiver<i32>) yields i32 {
    loop {
        let left = rx.next().await;
        let right = rx.next().await;
        yield left + right;
    }
}
The composition of these features falls out naturally from these syntaxes. Unlike a generator, an async generator compiles to an AsyncIterator.
There is one other piece of syntax that is needed: for await loops. These can be used within any async context, and consume items from the AsyncIterator, yielding control when the AsyncIterator yields pending:
for await item in async_iter {
    println!("{}", item);
}
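For intuition, here is roughly what such a loop could desugar to. This is a sketch of mine, not a proposed desugaring: it assumes the nightly, poll_next-based AsyncIterator interface discussed later in this section is in scope, and uses the stable pin! and poll_fn helpers inside an async context:

// Sketch only: drive the async iterator by hand, yielding to the executor
// whenever poll_next returns Poll::Pending.
let mut iter = std::pin::pin!(async_iter);
while let Some(item) = std::future::poll_fn(|cx| iter.as_mut().poll_next(cx)).await {
    println!("{}", item);
}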
When I was working on async Rust, this syntax was held up on two different design tangents. On the one hand, Taylor Cramer thought that the feature was a poor choice because users should instead be using for_each_concurrent, to get some concurrency. I do not agree with that: it’s not always the case that users want to use for_each_concurrent; adding more internal concurrency to your async function is a decision that needs to be considered with care, and there should be an obvious syntax for when you don’t want that, which for await is. On the other hand, there was some speculation about making “await patterns” that destructure futures and then somehow making that work here; I think this would be imprudent, and that leaving await as an expression, and for await as a special expression for handling AsyncIterator, is the most sensible choice.
Revisiting the table from my previous blog post, you could add this column for async iteration:
                     │ ASYNCHRONOUS ITERATION
─────────────────────┼───────────────────────
CONTEXT              │ async gen { }
EFFECT (iteration)   │ yield
FORWARD (asynchrony) │ await
COMPLETE (iteration) │ for await
The biggest thing blocking this is an issue on the library side: how the AsyncIterator interface should be expressed. I’ve already written about my preference for stabilizing AsyncIterator as-is, with the poll_next method. This remains a subject of some controversy, so I will return to it, but not in this post.
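For reference, this is essentially the interface as it exists on nightly today (with the provided size_hint method omitted):

use std::pin::Pin;
use std::task::{Context, Poll};

pub trait AsyncIterator {
    type Item;

    // Attempt to pull the next value out of the iterator, registering the
    // current task for wakeup if a value is not ready yet. Ready(None)
    // means the iterator is exhausted.
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}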
For now I’ll just say that I think the failure to stabilize AsyncIterator over the past 4 years (which was absolutely not our intention when we planned the async MVP) has been harmful to async Rust, because APIs based on async iteration have been relegated to unstable features and side libraries, leaving users confused and poorly supported when they need to deal with repetitious asynchronous events, a very common pattern. The single best thing the Rust project could do for users is stabilize AsyncIterator so the ecosystem can build on it, and it could do that tomorrow.
The good news is that work is already underway on reserving the gen keyword in the next edition, so that generators can be implemented. Generators use the same state machine transform that async functions already use, and by analogy should be feasible to implement without big changes to the compiler. The only big unresolved question with generators (one which doesn’t apply to async generators, if AsyncIterator is stabilized as-is) is how to make them self-referential. I’ll return to that question later in this post.
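To see where self-reference comes from, consider a generator that borrows from data it owns across a yield point. This uses the same hypothetical gen syntax as the async gen example above:

gen fn word_lengths(text: String) yields usize {
    // `split_whitespace` borrows `text`, which lives in the generator's own
    // state, and that borrow is held across each `yield`. The resulting
    // state machine therefore points into itself and must never move.
    for word in text.split_whitespace() {
        yield word.len();
    }
}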
Coroutine methods
Orthogonal to the introduction of these additional kinds of coroutines is their integration into the trait system. Right now, you cannot define an async trait method in stable Rust. The good news is that this is changing: in a soon-to-be-released version of Rust, it will be possible to write an async trait method. Since generators and async generators are coroutines just like async functions, they should not require any special support in traits beyond what has already been implemented for async functions. So when generators and async generators are implemented and stabilized, they should be supported as methods out of the box.
The only thing that remains to be implemented for coroutine methods is the concept of “Return Type Notation” (or RTN). The problem is that adding a coroutine method to a trait adds an anonymous associated type to that trait, which is the return type of that method. Sometimes (most importantly: when spawning that method in a task on a work-stealing executor or otherwise moving it to another thread) users need to add additional bounds to that anonymous associated type. So Rust needs some syntax for declaring that. This is RTN. For example:
trait Foo {
    async fn foo(&self);
}

// later:
where F: Foo + Send,
      F::foo(): Send
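To make the motivation concrete, here is a sketch of the work-stealing case using that (unstable) notation. The names here are mine, and tokio::spawn stands in for any executor whose spawn function requires futures to be Send:

fn spawn_foo<F>(f: F)
where
    F: Foo + Send + 'static,
    F::foo(): Send, // "the future returned by calling F::foo is Send"
{
    // The spawned future owns `f` and awaits its async method; the async
    // block and the future returned by `foo` must both be Send.
    tokio::spawn(async move { f.foo().await });
}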
In my opinion, it is important to ship RTN because of a design principle I call the “Can you fix it?” principle. If an upstream dependency of yours has an async method, and you need to add a Send bound to the return type, can you fix it, or do you need to fork the library? Without the ability to add RTN bounds to where clauses, you cannot express the bounds that you require without changing the upstream code, even if your code is all perfectly valid (i.e. even if the async method you want to call is Send). It’s very frustrating for users to encounter a problem in which their code should compile fine, but the only way to satisfy the compiler is to fork a dependency.
Fortunately, the project is already focusing on this feature, and I expect it to be shipped in the next year. There seems to be some discussion around the exact syntax for this feature: I would encourage contributors not to be too obstinate over syntax differences that don’t substantially change the feature.
Coroutine closures
Another aspect of Rust’s language design in which coroutines are currently not well-supported is closures. Niko Matsakis has explored this issue in two recent blog posts, focusing only on async closures and not on generative or asynchronously generative closures. In the first, he proposed treating async closures as a new hierarchy of function traits (i.e. adding AsyncFn, AsyncFnMut, and AsyncFnOnce). In the second, he instead explores the idea of modeling async closures as closures returning impl Future (e.g. F: Fn() -> impl Future).
I prefer the second approach, because it does not result in a proliferation of more traits. This becomes especially apparent when you consider generative closures and asynchronously generative closures: if the function trait for each of these things were distinct, instead of 3 function traits, Rust would have 12. In contrast, by modeling coroutine closures as closures returning an impl Trait, no new traits are needed. It has the additional benefit of modeling them in the exact way that Rust already desugars normal async functions.
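This modeling also matches the workaround people write today on stable Rust, where the returned future is a separate generic parameter. A minimal sketch (the function and its names are mine):

use std::future::Future;

// Accept an "async closure" as an ordinary closure plus a type parameter
// for the future it returns. The limitation the proposed change would lift
// is that `Fut` cannot borrow from the closure's arguments.
async fn retry_once<F, Fut, T, E>(mut op: F) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<T, E>>,
{
    match op().await {
        Ok(value) => Ok(value),
        Err(_) => op().await,
    }
}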
As Niko highlights in his blog post, this would require adapting the Fn traits to allow their return type to capture input lifetimes. There are a few things that Niko calls out in his post that require changing Rust’s syntax, possibly across an edition boundary:
- Adding a lifetime to the Output parameter of the Fn traits
- Desugaring -> impl Trait to a bound on the associated type projection instead of a new variable
Because these may require an edition change, the project should work through the specifics of these changes immediately. But they do not seem like extremely thorny problems to work out.
There is one other thing I would add to this feature, though. Once you have Fn() -> impl Future and so on, it would be natural to extend the syntax to have a kind of “async sugar” (and “gen sugar”) just like functions do. That is to say, special syntax sugar should be added to the Fn traits that makes it possible to write closure bounds like this:
where F: async FnOnce() -> T
// equivalent to:
where F: FnOnce() -> impl Future<Output = T>

where F: gen FnOnce() yields T
// equivalent to:
where F: FnOnce() -> impl Iterator<Item = T>

where F: async gen FnOnce() yields T
// equivalent to:
where F: FnOnce() -> impl AsyncIterator<Item = T>
What’s nice about this is that it isn’t some new general-purpose abstractive concept like “trait transformers” or “effect generics:” it’s just a little bit of sugar that is a natural extension of sugar that already exists from one place (function declarations) to another (function trait bounds). And these function traits already have special syntax, because they use parens and arrows for their parameters and return type. This wouldn’t require a lot of implementation work or consensus on a controversial new feature.
Medium-term features
The features in the previous section were all features that I believe could be shipped without a huge amount of implementation effort, and which don’t have many thorny open questions in their design. The features in this section, on the other hand, are more difficult. It’s good that people are already investigating them now, but they don’t seem very close to shipping and I wouldn’t expect them in the next year or two.
Object-safe coroutine methods
Though async trait methods will soon be a stable feature, they will not initially be object-safe. I think this was the right decision, but it would be ideal if someday they could be. The problem with object-safety is this: each coroutine method implies an anonymous associated type, which would have a different size and layout in each implementation. In order to erase the static type of the trait object, you also need to erase the type of that method’s anonymous return type: in other words, it also needs to somehow be a trait object.
For our examples, we’ll consider this trait:
trait Foo {
    async fn foo(&self);
}
If I want to make a trait object of Foo, I need to specify the return type of Foo::foo. Thankfully, RTN starts to unravel this problem by allowing us this syntax: Box<dyn Foo<foo() = Something>>. But what is Something? It can’t be a specific type, or else that limits the trait object to implementations that return that type: in practice, this means limiting it to a single specific type, and now it isn’t even a meaningful trait object at all. That’s why it needs to be a trait object itself.
For example, that might be Box<dyn Foo<foo() = Pin<Box<dyn Future<Output = ()>>>>>. Of course, that is incredibly verbose. There are basically two problems at play which shape the design space:
- There needs to be some kind of transformer that takes your implementation of Foo and includes the glue to allocate the future in the heap.
- Some members of the project leadership have the very strongly held view that heap allocations should be “explicit,” where explicit means there should be more syntax required to do it.
As a result, the project has considered a new wrapper type that would be required, which would “explicitly” indicate (by virtue of being a different type) that the future type will be heap allocated. My understanding is that something like what I’ve written above would be Box<Boxed<dyn Foo>>, or maybe just Boxed<dyn Foo> (it’s not clear to me from the material I have available).
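For comparison, the workaround the ecosystem uses today bakes the allocation into the trait itself; this is roughly (and much simplified) the shape that the async-trait macro expands an async method into:

use std::future::Future;
use std::pin::Pin;

// Simplified sketch: the method returns a boxed, type-erased future, so
// `dyn Foo` just works, but every implementation pays for the Box even
// when it is never used through a trait object.
trait Foo {
    fn foo<'a>(&'a self) -> Pin<Box<dyn Future<Output = ()> + Send + 'a>>;
}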
My own opinion is different. I think it’s reasonable for a heap allocated trait object (i.e. Box<dyn Foo>, Rc<dyn Foo>, and Arc<dyn Foo>) to default to allocating the state machine with the same allocator as that type. For non-owned trait objects, like &mut dyn Foo, I would also be fine with the default behavior being to allocate them with the global allocator, though here I see the point more (especially because this wouldn’t be possible in no_std contexts).
Regardless, I agree it would be important to allow users to override this behavior with some alternative glue mechanism. This requires an interface for writing your own glue code, which might do something else (like use alloca to allocate a dynamically sized type on the stack). I just think that there should be a reasonable default behavior, which for heap allocated trait objects is probably heap allocating that state. In my opinion, this is not “implicit” any more than requiring all users to use an adapter is “implicit”; it just involves setting a reasonable default. Still, resolving this controversy to everyone’s satisfaction would be a blocker on this feature, as would developing the interface for the glue code.
I want to make one other note in this section: previous discussions of this issue treat the unstable dyn* feature as a prerequisite for object-safe coroutine methods. I do not believe this is the case. What dyn* does is create an existential type that all of the different trait object pointer types would implement, by also virtualizing their destructor code; if you can accept that trait objects using different allocation strategies for their virtual coroutine methods are different types, there’s no dependence on dyn* at all. I personally think the dyn* feature is a questionable direction for the Rust project to pursue.
Async destructors
Another very thorny issue is the problem of async destructors. Sometimes, a destructor might need to perform some kind of IO operation or otherwise block the current thread; it is desirable to support non-blocking destructors which instead yield control, so that other tasks can run concurrently. Unfortunately, there are several problems with this.
The first problem is that running the async destructor is best effort, even more so than running any destructor. This is because if you drop a type with an async destructor in a non-async context, there’s no possibility of running the async destructor at all. There have been a couple of different ideas about how to solve this, such as using let async bindings to indicate variables that can’t be moved into a non-async context, or just accepting it and treating the async destructor as only an optimization over the non-async destructor.
The second problem is actually very similar to the problem with trait objects: if the async destructor needs to use some sort of state, where do you store it? One option is to disallow async destructors from having state, using a poll method. This is simple, but it is problematic for things like data structures: a Vec, for example, has no way of storing which items it has polled already, and has to keep polling their destructors in a loop. This would probably be pretty unacceptable. But then dealing with the state raises the same issues as trait objects.
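To make the stateless option concrete, the interface would look something like this (hypothetical; no such trait exists in the standard library today):

use std::pin::Pin;
use std::task::{Context, Poll};

// Hypothetical stateless async-destructor interface: cleanup is a poll
// method on the value itself, so there is no extra state to store, but a
// collection like Vec has nowhere to record which elements are finished
// and must re-poll all of them until every one returns Ready.
trait AsyncDrop {
    fn poll_drop(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()>;
}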
The third problem with async destructors is how to handle their interaction with unwinding. In particular, if you are unwinding through an async destructor, which returns Pending, what happens? There would need to be some kind of asynchronous version of catch_unwind that the pending calls can jump to, so that other tasks can run. This problem I think is easier to solve than the other two, but it needs to be specced out.
I go back and forth between thinking that the difficulty with async destructors is one of the worst things about async Rust and thinking that maybe async destructors aren’t that useful anyway. Regardless of where you land, there is a lot of design work needed for this feature to be shippable, and I don’t think it will come soon.
Long-term features
In contrast to the near-term and medium-term features, there are certain larger problems with the design of Rust that I think need to be considered so carefully that they could not be addressed in the next few years. Still, the work of considering them must begin at some point, so that they can eventually be resolved. I’m talking about “changing” the rules of Rust.
As of right now, there are a few valuable kinds of types that Rust cannot really support:
- Immovable types: types which can’t be moved once their address has been witnessed.
- Unforgettable types: types which can’t go out of scope without running their destructor or being destructured.
- Undroppable types: types which can’t be dropped or forgotten, but must be destructured.
(The latter two are usually grouped together as “linear types” when people talk about them, but there are very important differences.)
I think evidence has shown that there is a strong motivation for at least the first two categories.
To support self-referential coroutines and intrusive data structures, Rust needs some support for types that are known never to move again. Because Rust doesn’t support immovable types, we added this functionality using the Pin API. But the Pin API has a few big flaws: one is that the API is clunky and difficult to work with. More important, though, is that it requires an interface to explicitly opt in to supporting immovable types; traits that existed before Pin can’t gain the ability to work with immovable types.
There are two specific traits for which this is a big problem:
- Iterator: because Iterator doesn’t support immovable types, the project is at an impasse about how to support immovable generators.
- Drop: because Drop doesn’t support immovable types, an arcane implication is that you need crates like pin-project to access fields of pinned types (a sketch of this follows below). This is all very baroque and confusing, and wouldn’t be necessary if Drop supported immovable types.
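Here is a sketch of the Drop problem in practice, using placeholder types of my own invention: because Drop::drop only ever receives &mut self, touching a field that was pinned means unsafely re-asserting the pin yourself, which is the discipline pin-project’s PinnedDrop encapsulates.

use std::pin::Pin;

struct ImmovableState; // placeholder for some address-sensitive state

struct MyFuture {
    inner: ImmovableState,
}

impl Drop for MyFuture {
    fn drop(&mut self) {
        // SAFETY (in this sketch): we rely on the value having been pinned
        // and never being moved again once drop begins; this is exactly the
        // subtle reasoning that pin-project encodes for you.
        let inner: Pin<&mut ImmovableState> =
            unsafe { Pin::new_unchecked(&mut self.inner) };
        // ... cleanup that requires the field to remain pinned ...
        let _ = inner;
    }
}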
On the other hand, if Rust had the Move trait, these problems would go away. Self-referential generators would just not implement Move, and work naturally. The Pin type could be completely deprecated, and a reference to a type that doesn’t implement Move would have the same semantics as a pinned reference to a type that doesn’t implement Unpin. Of course, this would require pretty major edition-crossing changes.
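To give a flavor of what that could look like (entirely hypothetical: there is no Move trait, ?Move bounds are my own notation by analogy with ?Sized, and Future::poll would have to take &mut self under this design):

use std::future::Future;
use std::task::{Context, Poll};

// Hypothetical: `Move` is an auto trait that nearly every type implements;
// a self-referential generator simply wouldn't. An ordinary `&mut T` to a
// `?Move` type would carry the guarantee that Pin<&mut T> carries today.
fn drive<F>(fut: &mut F, cx: &mut Context<'_>) -> Poll<F::Output>
where
    F: Future + ?Move, // hypothetical relaxed bound
{
    fut.poll(cx) // under this design, no Pin would be required here
}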
The scoped task trilemma presents a strong argument for types which cannot be forgotten. Stackless coroutines cannot use the destructor-based concurrent borrow trick: the only way to make it work is to use a closure-passing “internal” style, which is what Rust opted against when it went for stackless coroutines. This incompatibility between these two desirable aspects of Rust’s design makes a strong case that the decision not to support unforgettable types was the wrong decision.
I titled this post “a four year plan” for a reason: if Rust were to adopt these fundamental changes, it would have to be done across an edition boundary, and I strongly doubt that it could be done as part of the 2024 edition. This leaves the 2027 edition, four years from now, as the target for such a change. But the project should commit to a decision about this change sometime soon, in the next two years, and that should include a temporary solution for generators, such as requiring them to be pinned before they can be used as iterators.
I’ve been exploring on my blog this year what would be required to make this change, because I think it is something the Rust project should seriously consider. I intend to continue to focus on this issue next year, because I think the implications of all of the different options need to be fully understood. I’m trying to find ways to make this a collaborative process, but my options are limited. My goal isn’t really even to make a particular recommendation (though I will surely have opinions), but just to understand the full space of options for resolving these issues.
What are the exact trade-offs between different options to handle the problem of self-referential generators? What different requirements would there be to support “unforgettable” types as opposed to “undroppable” types? If Move were to be added, how could Pin be removed across an edition boundary? These are the kinds of questions I want to answer.
However, I recognize that adding support for these kinds of types would be the biggest change to Rust since it was stabilized in 2015, and that making this change would bring with it enormous costs for both the project and the community. I also recognize that there are valid arguments why supporting these kinds of types isn’t really worth it (like the painful interaction with trait objects). For these reasons, the Rust project should build into its consideration of this idea the possibility that not doing anything may ultimately be the right outcome.
In general, my instinct is to doubt big changes to Rust at this point in its design process. What I think Rust needs is to finish integrating the features it has already committed to - features like external iterators, stackless coroutines, monomorphized generics, and unsized trait object types. I specifically feel changing the rules around moveability and linear types is justified because of the implications for the integration of these existing features.
Closing remarks
This post has once again gotten very long. I decided to focus this post on changes to the language; in another post to come I will focus my attention on the standard library and the async library ecosystem, as well as devote a specific post to the AsyncIterator interface. I want to make one other remark, which I tried to find a place for in this post and the previous one, but couldn’t. It concerns the controversy around the final syntax for the await operator, which played out in 2019.
For those who don’t know, there was a big debate whether the await operator in Rust should be a prefix operator (as it is in other languages) or a postfix operator (as it ultimately was). This attracted an inordinate amount of attention - over 1000 comments. The way it played out was that almost everyone on the language team had reached a consensus that the operator should be postfix, but I was the lone hold out. At this point, it was clear that no new argument was going to appear, and no one was going to change their mind. I allowed this state of affairs to linger for several months. I regret this decision of mine. It was clear that there was no way to ship except for me to yield to the majority, and yet I didn’t for some time. In doing so, I allowed the situation to spiral with more and more “community feedback” reiterating the same points that had already been made, burning everyone out but especially me.
The lesson I learned from this experience is to distinguish between factors that are truly critical and factors that don’t matter. If you’re going to be obstinate about some issue, you’d better be able to articulate a deep reason why it is important, and it had better be something more pressing than the slight differences in affordances and aesthetics between syntax options. I’ve tried to take this to heart in how I engage in technical questions since then.
I worry that the Rust project took the wrong lesson from this experience. The project continues in its norm (as Graydon mentioned here) that with enough ideation and brainstorming, eventually a win-win solution to every controversy can be discovered. Rather than accepting that sometimes a hard decision has to be made, the project’s solution to the burnout that comes from allowing these controversies to hang open indefinitely has been to turn inward. Design decisions are now documented primarily in unindexed formats like Zulip threads and HackMD documents. To the extent that there is a public expression of the design, it is spread across a half dozen different blogs belonging to different contributors. As an outsider, it is nearly impossible to understand what the project considers a priority, or what the current state of any of these efforts is.
I’ve never seen the project’s relationship with its community be in a worse state. But that community contains invaluable expertise; closing yourselves off is not the solution. I want to see the relationships of mutual trust and respect rebuilt between project members and community members, instead of the present situation of hostility and dissatisfaction. On that note, I want to thank those from the project who have reached out and engaged with me on design issues over the last few months.