The AsyncIterator interface

In a previous post, I established the notion of “registers” - code in Rust can be written in different registers, and it’s important to adequately support all registers. I specifically discussed the low-level interface of the AsyncIterator trait, about which there is currently a debate. The interface it currently has is a method called poll_next, which is a “poll” method like Future::poll. Poll methods are very “low-level” and are harder to write correctly than async functions. Some people would like to see AsyncIterator shifted to have an async next method, simply the “asyncified” Iterator trait.

I objected to this on largely philosophical grounds: the interface of AsyncIterator is the “low-level” interface, and so it should prioritize giving users total control over being easy to use. However, I did not go into any particular details about this interface and the differences between poll_next and “async next.” After all, it’s not like the interface of AsyncIterator should be harder to implement by hand, just for the sake of it, even if there are easier ways to create AsyncIterators than implementing the trait by hand.

In this post, I will explore the differences between poll_next and “async next” more thoroughly. There are two key differences between these interfaces:

  1. In poll_next, there is only one place to store state, whereas in “async next” there are two - the longer-lasting state of the iterator, and the shorter-lasting state of the future that references the state of the iterator.
  2. In poll_next, that single state storage facility is pinned in place, whereas in “async next” only the shorter-term future state is pinned in place, and the longer term storage facility is unpinned.

These are not trivial differences: they have a fundamental impact on the way async iterators can be written and used. When writing APIs for the low-level register, what is most important is to give users the most advantageous representation, not to make code as easy as possible. Obviously, if it made no difference, the “easiest” API should be preferred. But when alternatives in a “high-level” register exist, the sacrifices of low-level control should not be made in the low-level register.

These differences affect both sides of the interface: consuming it by calling next, and implementing it by defining an async iterator.
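To make the two shapes concrete, here is a minimal sketch of both interfaces side by side. The trait names PollNext and AsyncNext, the Countdown type, and the drain_both helper are all made up for illustration; they are not the real std definitions. The point is only where the state lives and what gets pinned.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for the poll-based interface: one pinned place for state.
trait PollNext {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

// Hypothetical stand-in for "async next": `self` is borrowed unpinned, and every
// call creates a second, short-lived future, which is the thing that gets pinned.
trait AsyncNext {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

// Counts down from its starting value: 3, 2, 1, then None.
struct Countdown(u32);

impl Countdown {
    fn step(&mut self) -> Option<u32> {
        let current = self.0;
        self.0 = current.checked_sub(1)?;
        Some(current)
    }
}

impl PollNext for Countdown {
    type Item = u32;
    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        Poll::Ready(self.step())
    }
}

impl AsyncNext for Countdown {
    type Item = u32;
    async fn next(&mut self) -> Option<u32> {
        self.step()
    }
}

// A waker that does nothing; enough to poll by hand in this sketch.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn drain_both() -> (Vec<u32>, Vec<u32>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // poll_next: pin the iterator once, then poll it directly.
    let mut via_poll = Vec::new();
    let mut iter = pin!(Countdown(3));
    while let Poll::Ready(Some(item)) = iter.as_mut().poll_next(&mut cx) {
        via_poll.push(item);
    }

    // async next: the iterator stays unpinned; each call's future is pinned instead.
    let mut via_async = Vec::new();
    let mut iter = Countdown(3);
    loop {
        let mut fut = pin!(iter.next());
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(Some(item)) => via_async.push(item),
            _ => break,
        }
    }
    (via_poll, via_async)
}
```

For a trivial iterator like this one the two produce the same items; the differences only show up once the iterator has to hold state across a suspension point.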

Calling next

The main differences when calling next arise from the fact that the “async next” introduces a second state machine, and the caller has to manage that second state machine separately from the original state machine of the iterator.

One major problem that arises from this has been framed as a problem of “cancellation safety.” Suppose you are selecting from two async iterators in a loop. If, when one completes, you drop the other’s next future, any progress made in that state machine will be cancelled. When you go around the loop again and return to waiting for that next, you will instead start a new next future over from the beginning state. To implement this behavior correctly, you would be responsible for storing the next futures outside of the loop and replacing them whenever they finish. This is a major footgun that would be especially problematic for AsyncIterator (though it can show up in other places already) because AsyncIterator is almost always polled repeatedly in a loop like this.

On the other hand, with the poll_next version, all of the state is stored in the async iterator. Therefore, whenever you drop the next future, this has no impact on in-flight state transitions: the next time you construct a next future, it will poll the async iterator from the same state you left it in. This is a much more usable API.
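Here is a small sketch of that difference, under some loud assumptions: YieldOnce, AsyncCounter, PollCounter and the two driver functions are hypothetical names invented for this example, and the "select loop" is simulated by polling once per round and then dropping whatever future was created. Each counter needs two suspensions before it can produce an item.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A leaf future that is Pending on its first poll and Ready on its second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

// "async next" style: progress toward the next item lives in the next future.
struct AsyncCounter {
    produced: u32,
}

impl AsyncCounter {
    async fn next(&mut self) -> Option<u32> {
        for _ in 0..2 {
            // Dropping the future while suspended here forgets the loop position.
            YieldOnce(false).await;
        }
        self.produced += 1;
        Some(self.produced)
    }
}

// poll_next style: the same progress is a field of the iterator itself.
struct PollCounter {
    steps_done: u32,
    produced: u32,
}

impl PollCounter {
    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<u32>> {
        if self.steps_done < 2 {
            self.steps_done += 1;
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        self.steps_done = 0;
        self.produced += 1;
        Poll::Ready(Some(self.produced))
    }
}

// A waker that does nothing; enough to poll by hand in this sketch.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Poll a fresh next future once per round and then drop it, as a select loop
// that lost the race would. Returns the round that produced an item, if any.
fn async_next_dropped_each_round(max_rounds: u32) -> Option<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut iter = AsyncCounter { produced: 0 };
    for round in 1..=max_rounds {
        let mut fut = pin!(iter.next());
        if fut.as_mut().poll(&mut cx).is_ready() {
            return Some(round);
        }
        // `fut` is dropped here, and its progress with it.
    }
    None
}

// The same loop against poll_next: there is no separate future to lose.
fn poll_next_each_round(max_rounds: u32) -> Option<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut iter = pin!(PollCounter { steps_done: 0, produced: 0 });
    for round in 1..=max_rounds {
        if iter.as_mut().poll_next(&mut cx).is_ready() {
            return Some(round);
        }
    }
    None
}
```

In this sketch the poll-based counter produces its item on the third round, while the "async next" counter never gets past its first suspension no matter how many rounds it is given, because every round starts a new future from the beginning.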

Similarly, the current poll_next definition of Stream in futures is trivially object safe, and is often used that way. Because there’s no second state machine, you only need an object which can store the state of the stream, not an additional object which stores the state of the next future. There’s been discussion of someday making async methods object safe, but with poll_next, async iterators can be object safe now, and without any additional allocations or codegen that could impact performance.
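As an illustration, a poll-based trait can be used as a trait object with nothing more than a pinned box. This is a sketch with a hypothetical PollNext trait standing in for Stream, and two made-up implementations (Single and Repeat); it is not the futures crate's actual code.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for a poll-based stream trait.
trait PollNext {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

// Yields one value, then ends.
struct Single(Option<u32>);

impl PollNext for Single {
    type Item = u32;
    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        Poll::Ready(self.0.take())
    }
}

// Yields the same value a fixed number of times, then ends.
struct Repeat {
    value: u32,
    remaining: usize,
}

impl PollNext for Repeat {
    type Item = u32;
    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        if self.remaining == 0 {
            return Poll::Ready(None);
        }
        self.remaining -= 1;
        Poll::Ready(Some(self.value))
    }
}

// A waker that does nothing; enough to poll by hand in this sketch.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn drain_trait_objects() -> Vec<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // Different concrete types behind one `dyn` type: one allocation each for
    // the stream state, and no second object for a per-call future.
    let mut streams: Vec<Pin<Box<dyn PollNext<Item = u32>>>> = Vec::new();
    streams.push(Box::pin(Single(Some(7))));
    streams.push(Box::pin(Repeat { value: 1, remaining: 2 }));

    let mut out = Vec::new();
    for stream in streams.iter_mut() {
        while let Poll::Ready(Some(item)) = stream.as_mut().poll_next(&mut cx) {
            out.push(item);
        }
    }
    out
}
```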

More vaguely, introducing a second state machine will almost certainly weaken optimizations, because those optimizations will need to “eat through” the indirection from the second state machine to the first state machine in order to coalesce states and optimize their layout. Our state machine optimization for stackless coroutines is already not as strong as it could be (and I’ve seen people reject async Rust for precisely this reason). This will make optimizing async iterators harder.

These are all negatives of using an “async next”; in contrast, the one advantage of using “async next” is that you don’t have to pin the AsyncIterator in place before calling next on it. This is especially salient for users because there is no “for loop” for async iterators: instead users are currently expected to call the next method repeatedly with a while loop. The solution to this problem, though, would be to provide some sort of for await loop, which operates on async iterators and pins them in place for you.
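For reference, this is roughly what the caller has to write today without a for await loop. It is a sketch under assumptions: PollNext, Countdown, the free function next, and the busy-polling block_on are hypothetical helpers written for this example, not std or futures APIs. Only pin! and poll_fn come from std.

```rust
use std::future::{poll_fn, Future};
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for the poll-based interface.
trait PollNext {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

// Counts down from its starting value: 3, 2, 1, then None.
struct Countdown(u32);

impl PollNext for Countdown {
    type Item = u32;
    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        let current = self.0;
        if current == 0 {
            return Poll::Ready(None);
        }
        self.0 = current - 1;
        Poll::Ready(Some(current))
    }
}

// Hypothetical `next` adapter: it borrows an already-pinned stream and keeps
// no state of its own between polls.
async fn next<S: PollNext + ?Sized>(stream: &mut Pin<&mut S>) -> Option<S::Item> {
    poll_fn(|cx| stream.as_mut().poll_next(cx)).await
}

async fn sum_all() -> u32 {
    // The caller has to pin before looping; a `for await` loop would do this.
    let mut stream = pin!(Countdown(3));
    let mut total = 0;
    while let Some(item) = next(&mut stream).await {
        total += item;
    }
    total
}

// A waker that does nothing, plus a busy-polling executor: only good enough
// for this sketch.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

The extra line of pinning is the whole cost on the caller's side, which is why a for await loop that inserts it automatically would remove the one advantage of “async next.”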

Defining an async iterator

The advantage of not having to pin an AsyncIterator to call next arises from the fact that it does not pin the iterator state, but this is also the biggest disadvantage when implementing the AsyncIterator, because it means the implementer cannot take advantage of the iterator state being pinned. What this amounts to is that async iterators will not be pinned across yield points, but only across await points.

This limits the kind of async iterators that can be defined. Some synchronization primitives (like those in tokio) use the pinning guarantee to store the states in intrusive data structures to avoid additional heap allocations. These may be limited by the fact that they cannot store the iterator state in an intrusive collection long term, only the short-term future state.

More concerning, though, is the impact on async generators. With the interface as defined today, async generators can be self-referential across all points, making the problem of using references in async generators completely disappear. Without that pinning guarantee, async generators will only be able to be self-referential across await points, but not across yield points. This would probably be very confusing and irregular for users, and will require additional compiler work to implement.

poll_next is more fundamental

One incorrect claim I’ve seen made is that the “async next” version is more fundamental than the poll_next version, because the poll_next version could be defined in terms of the “async next” version. This claim is actually completely wrong, and the truth is the opposite.

The reason this claim is made is that the existence of future::poll_fn makes it seem like the user could implement an “async next” method by way of poll_fn to get the equivalent of a poll_next method. However, poll_fn does not get the user the same interface as poll_next, because poll_fn cannot capture the iterator state as pinned, for exactly the reason discussed above. And even though implementers can get part of the pattern of writing a poll method (but without the pinning guarantee), callers cannot get the advantages of a poll method that I discussed previously, because they will not be guaranteed that implementers did not use a stateful future.
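A sketch of what the poll_fn route looks like in practice, again with hypothetical PollNext and AsyncNext traits and a toy block_on of my own. With only &mut self in hand, the only safe way to call a pinned poll method is Pin::new, which demands an Unpin bound, and that bound is exactly the lost pinning guarantee.

```rust
use std::future::{poll_fn, Future};
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-ins for the two interfaces.
trait PollNext {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

trait AsyncNext {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

// A poll-style body inside "async next" via poll_fn. `Pin::new` only accepts
// `Unpin` targets, so the iterator state can never rely on being pinned.
impl<S: PollNext + Unpin> AsyncNext for S {
    type Item = <S as PollNext>::Item;
    async fn next(&mut self) -> Option<<S as PollNext>::Item> {
        poll_fn(|cx| Pin::new(&mut *self).poll_next(cx)).await
    }
}

// Counts down from its starting value: 3, 2, 1, then None.
struct Countdown(u32);

impl PollNext for Countdown {
    type Item = u32;
    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        let current = self.0;
        if current == 0 {
            return Poll::Ready(None);
        }
        self.0 = current - 1;
        Poll::Ready(Some(current))
    }
}

async fn collect_all() -> Vec<u32> {
    let mut iter = Countdown(3); // no pinning needed, and none available
    let mut out = Vec::new();
    while let Some(item) = iter.next().await {
        out.push(item);
    }
    out
}

// A waker that does nothing, plus a busy-polling executor: only good enough
// for this sketch.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

This works for Countdown only because it is Unpin. An iterator that actually needs to be pinned, such as one holding self-references or sitting in an intrusive list, cannot go through this adapter without unsafe code.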

On the other hand, you can trivially transform an “async next” trait into the poll_next version, because the poll_next version pins everything, and thus can contain self-references from the future state of the next method into the longer-lived iterator state. Others have already mentioned that you can implement this transform by hand with unsafe code and creating a library-defined safe abstraction, but you can actually implement it completely safely using async generators as well. Here is an example:

trait MyAsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

async gen fn streamify<T>(mut iter: impl MyAsyncIterator<Item = T>) -> T {
    while let Some(item) = iter.next().await {
        yield item;
    }
}

An even simpler interface could exist in std - an asyncified version of iter::from_fn, which is functionally the same (the iterator struct being captured in the closure state):

async gen fn from_fn<T, F, Fut>(mut next: F) -> T
where
    // FnMut rather than FnOnce: the closure is called once per item.
    F: FnMut() -> Fut,
    Fut: Future<Output = Option<T>>,
{
    while let Some(item) = next().await {
        yield item;
    }
}

Thus, with only a few lines of code, anything using “async next” can be transformed into the poll_next implementation if that is the implementation chosen for the “real” AsyncIterator. I hope this also gives a taste of how much easier generators will be than next methods.

So, in effect you can use an “async next” method to implement an async generator, getting all the usability advantages of “async next” via the generator feature. This leaves the only advantage of “async next” being the fact that you can call next without pinning, and the disadvantages that it is not (yet) object safe, it is not (ever) cancellation safe, it will be harder to optimize, and it means async generators cannot borrow across yield points. This is the real trade-off, and to me the advantage is pretty clearly on the side of poll_next.

Shipping

I have one final point: the poll_next interface already exists. The infrastructure for generators (including async generators) already exists, and just requires a bit of integration to put it over the finish line. It seems completely possible to me that generators, including async generators, could be stabilized on a timeline of about one year if it were prioritized by the project. The real blocker is that you probably need an edition boundary to reserve whatever keyword is used for generators (in my opinion, this should have been figured out in 2021 and generators shipped already).

Async methods might ship this year, but if they ship they won’t be object safe. Using the poll_next interface, async iterators would be object safe now, and async generators would be able to borrow across all points now. And while async methods might ship soon, it seems like the keyword generics group would rather see an “async effect modifier” and not have an AsyncIterator trait at all. Setting aside whether or not this is a good idea (I’ve already elaborated on why I think it is poorly motivated), when would this ship? I would strongly encourage the Rust project not to block this highly demanded feature that’s been in development since 2016 on a grand new abstraction that is barely off the ground.

Toward the end of advancing my claim that generators could ship on a timeline of about a year, in my next post I will write up the remaining design questions for generators and my opinions about them.