poll_next
In my previous post, I said that the single best thing the Rust project could do for
users is stabilize AsyncIterator. I specifically meant the interface that already exists in
the standard library, which uses a method called poll_next. Ideally this would have happened years
ago, but the second best time would be tomorrow.
The main thing holding up the AsyncIterator stabilization is a commitment by some influential
contributors of the project to pursue an alternative design. This design, which I’ll call the
“async next” design, proposes to use an async method for the interface instead of the poll method of
the “poll next” design implemented today. In my opinion, continuing to pursue this design is a
mistake. I’ve written about this before, but I don’t have the sense my post was
fully received by the Rust project.
Yosh Wuyts, a leading contributor to the async working group, has written his own post about
why the async next design is preferable to poll next. A lot of this is structured as an attempted
refutation of points made by me and others about problems with the async next design. I do not find
the argument in this post compelling, and my position about what the project should do is unchanged.
I’ve written this to attempt to express again, in more detail and more definitively, why I believe
the project should accept the poll next design and stabilize AsyncIterator now.
Are two state machines better than one?
The fundamental difference between the poll next design and the async next design is a difference in the representation of the state machine for asynchronous iteration. In the poll next design, there is a single state machine: the asynchronous iterator. In the async next design, there are two: the future returned by the next method on each iteration, and the longer-lived iterator which that future references.
Let’s look at the two definitions from a “type system” perspective. I’m going to desugar the async
trait method into its ultimate form, so that the difference in the type signature of these traits
can be more clearly perceived. I’ll also desugar a for await loop in both designs, to help
understand how each design would operate:
// POLL NEXT DESIGN
trait AsyncIterator {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>)
        -> Poll<Option<Self::Item>>;
}

// ASYNC NEXT DESIGN
trait AsyncIterator {
    type Item;
    type Future<'a>: Future<Output = Option<Self::Item>> where Self: 'a;
    fn next<'a>(&'a mut self) -> Self::Future<'a>;
}
// POLL NEXT DESIGN
let mut iter = pin!(iter);
'outer: loop {
    let next = 'inner: loop {
        match iter.as_mut().poll_next(cx) {
            Poll::Ready(Some(item)) => break 'inner item,
            Poll::Ready(None) => break 'outer,
            Poll::Pending => yield Poll::Pending,
        }
    };
    // the loop body runs here with `next`
}

// ASYNC NEXT DESIGN
'outer: loop {
    // a fresh future is created and pinned on every iteration
    let mut future = pin!(iter.next());
    let next = 'inner: loop {
        match future.as_mut().poll(cx) {
            Poll::Ready(Some(item)) => break 'inner item,
            Poll::Ready(None) => break 'outer,
            Poll::Pending => yield Poll::Pending,
        }
    };
    // the loop body runs here with `next`
}
The salient differences between these two designs are:
- In poll_next, there is a single state machine (the AsyncIterator) which is pinned and alive for the entire iteration of the loop.
- In async next, there are two state machines (the Future returned by next, and the underlying Iterator). The Future is pinned and alive for only a single iteration of the loop, whereas the Iterator is unpinned and alive for the entire loop.
I often find that describing things like this in text is not very clear, so I’ve created a visual diagram to drive the point home. In this diagram, the different blocks represent stateful objects, and the arrows are references to them.
╔═══════════════╗ ╔═══════════════╗
║ ║░░ ║ ║░░
║ POLL_NEXT ║░░ ║ ASYNC NEXT ║░░
║ ║░░ ║ ║░░
╚═══════════════╝░░ ╚═══════════════╝░░
░░░░░░│░░░░░░░░░░ ░░░░░░│░░░░░░░░░░
│ │
│ pin
─────────────────────── │ ───────────────────────────────── │ ────────────
│ │
│ ▼
│ ╔═══════════════╗
ALIVE FOR │ ║ ║░░
A SINGLE pin ║ FUTURE ║░░
ITERATION │ ║ ║░░
│ ╚═══════════════╝░░
│ ░░░░░░│░░░░░░░░░░
│ │
│ mut
─────────────────────── │ ───────────────────────────────── │ ────────────
│ │
▼ ▼
╔═══════════════╗ ╔═══════════════╗
ALIVE FOR ║ ║░░ ║ ║░░
THE ENTIRE ║ ASYNCITERATOR ║░░ ║ ITERATOR ║░░
LOOP ║ ║░░ ║ ║░░
╚═══════════════╝░░ ╚═══════════════╝░░
░░░░░░░░░░░░░░░░░ ░░░░░░░░░░░░░░░░░
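The two shapes in the diagram can also be sketched on stable Rust. This is a minimal sketch, not the standard library's API: PollNext, AsyncNext, Countdown, Next and the two drain functions are hypothetical names made up for illustration, and the toy iterator is always ready, so Poll::Pending never actually occurs.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for the poll next design (std's trait is unstable).
trait PollNext {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

// Hypothetical stand-in for the async next design, desugared with a GAT.
trait AsyncNext {
    type Item;
    type Future<'a>: Future<Output = Option<Self::Item>>
    where
        Self: 'a;
    fn next<'a>(&'a mut self) -> Self::Future<'a>;
}

// The long-lived state: counts down and yields each remaining value.
struct Countdown(u32);

impl Countdown {
    fn step(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            self.0 -= 1;
            Some(self.0)
        }
    }
}

impl PollNext for Countdown {
    type Item = u32;
    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        // Countdown is Unpin, so the pinned reference can be used mutably.
        Poll::Ready(self.step())
    }
}

// The second, short-lived state machine: it only references the iterator.
struct Next<'a>(&'a mut Countdown);

impl Future for Next<'_> {
    type Output = Option<u32>;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        Poll::Ready(self.0.step())
    }
}

impl AsyncNext for Countdown {
    type Item = u32;
    type Future<'a> = Next<'a> where Self: 'a;
    fn next<'a>(&'a mut self) -> Self::Future<'a> {
        Next(self)
    }
}

// A waker that does nothing, enough to drive these always-ready examples.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn drain_poll_next() -> Vec<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // One state machine, pinned once for the whole loop.
    let mut iter = pin!(Countdown(3));
    let mut out = Vec::new();
    while let Poll::Ready(Some(item)) = iter.as_mut().poll_next(&mut cx) {
        out.push(item);
    }
    out
}

fn drain_async_next() -> Vec<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut iter = Countdown(3); // unpinned, alive for the whole loop
    let mut out = Vec::new();
    loop {
        // A new future is created and pinned for this iteration only.
        let mut future = pin!(iter.next());
        let polled = future.as_mut().poll(&mut cx);
        match polled {
            Poll::Ready(Some(item)) => out.push(item),
            Poll::Ready(None) | Poll::Pending => break,
        }
    }
    out
}
```

Both functions produce the same items; the difference is only in which object is pinned and how long each one lives.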
These differences in the representation of asynchronous iterations have far reaching implications regarding the affordances that each interface provides.
First, I want to talk about the performance implications of these two interfaces. Yosh spends some
time in his blog post showing that a simple example (an async iterator that immediately returns
once) is optimized into the same code by LLVM. In the async version of this example, the Future
state machine has no state, whereas the Iterator has the same state as the AsyncIterator in the
poll version: a single option.
It’s no shock that LLVM can eliminate unnecessary indirections, especially in simple cases like this. But it certainly can’t be assumed that LLVM will always eliminate this indirection: it depends on a lot of optimization heuristics that can’t be guaranteed to trigger when both state machines become a lot more complex. Asserting that the Rust project would treat a failure to inline as a bug doesn’t make this a zero cost abstraction. In the async next design, you are introducing an indirection by having a second, short-lived state machine that references the original longer-lived state machine, and eliminating that indirection isn’t guaranteed.
This is especially problematic for dynamic dispatch, which eliminates the possibility of inlining,
because next or poll_next become virtual calls. Yosh does not address this at all in his post
when he talks about object safety. He says:
Needing to replace Box::new with Boxed::new is not a major difference.
Here, he is only talking about affordances without talking about representation. What the Boxed
adapter does is make the async method allocate its state machine on the heap; that’s the reason the
API is different! With async next, if your iterator is a dynamically dispatched object, you’re going
to need to dynamically allocate the Future state machine in every iteration of the asynchronous loop.
For for await loops, it would be theoretically possible (though how the compiler would determine
this I am not certain) to use an alternative dynamic allocation method, such as alloca, or
re-using the same heap allocation. But now we’re piling on additional hypothetical compiler features
(beyond even the basic notion of dynamically dispatched async methods) to avoid repeatedly
allocating in a loop. With the poll next design, dynamic dispatch works for free, immediately,
because there is no additional state machine to be dynamically allocated as part of the method call.
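A rough sketch of the difference under dynamic dispatch, on stable Rust. The names here are hypothetical: PollNext mirrors the poll next design, and BoxedAsyncNext is a hand-rolled object-safe form of async next that returns a boxed future per call. It is not the Boxed adapter from Yosh's post, only an illustration of where the allocation lands. The futures_boxed counter is added purely to make the per-iteration allocations visible.

```rust
use std::future::{ready, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for the poll next design; usable as `dyn` directly.
trait PollNext {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

// Hypothetical object-safe form of async next: every call returns a boxed future.
trait BoxedAsyncNext {
    type Item;
    fn next<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + 'a>>;
}

struct Countdown {
    left: u32,
    futures_boxed: u32, // instrumentation only: counts heap-allocated futures
}

impl Countdown {
    fn new(left: u32) -> Self {
        Countdown { left, futures_boxed: 0 }
    }
    fn step(&mut self) -> Option<u32> {
        if self.left == 0 {
            None
        } else {
            self.left -= 1;
            Some(self.left)
        }
    }
}

impl PollNext for Countdown {
    type Item = u32;
    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        Poll::Ready(self.step())
    }
}

impl BoxedAsyncNext for Countdown {
    type Item = u32;
    fn next<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = Option<u32>> + 'a>> {
        self.futures_boxed += 1; // one heap allocation per call to next
        Box::pin(ready(self.step()))
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn drain_dyn_poll_next(mut iter: Pin<Box<dyn PollNext<Item = u32>>>) -> Vec<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut out = Vec::new();
    // One virtual call per poll; nothing is allocated inside the loop.
    while let Poll::Ready(Some(item)) = iter.as_mut().poll_next(&mut cx) {
        out.push(item);
    }
    out
}

fn drain_dyn_async_next(iter: &mut dyn BoxedAsyncNext<Item = u32>) -> Vec<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut out = Vec::new();
    loop {
        // A virtual call that allocates a fresh future on every iteration.
        let mut future = iter.next();
        let polled = future.as_mut().poll(&mut cx);
        match polled {
            Poll::Ready(Some(item)) => out.push(item),
            _ => break,
        }
    }
    out
}

fn futures_boxed_for_three_items() -> u32 {
    let mut iter = Countdown::new(3);
    let items = drain_dyn_async_next(&mut iter);
    assert_eq!(items, vec![2, 1, 0]);
    iter.futures_boxed
}
```

In this sketch, draining three items through the boxed async next form creates four boxed futures (three items plus the final None), while the poll next form allocates only the iterator itself.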
Having to accommodate two different state machines to perform asynchronous iteration will never be a more optimal representation, even if LLVM can eliminate the indirection in some static cases. Rust’s future model was specifically designed, at great cost, to avoid such indirections and dynamic allocations. Asynchronous iteration, as a fundamental abstraction for constructing asynchronous state machines, should similarly avoid unnecessary indirections. Doing otherwise would not be a zero cost abstraction and would not be in line with Rust’s prior commitments.
Pinning
Separate from the fact that there are two state machines, the other issue is that only the
shorter-lived state machine would be pinned in place in the async next design. This means that
only the shorter-lived state machine can take advantage of immoveability during a single iteration
of the loop; the longer-lived state machine would need to assume it could be moved between iterations.
This is not a theoretical problem. Already, concurrency primitives in tokio, smol and async-std all use an intrusive linked list to implement synchronization: whenever an event occurs, the other handles that have been waiting for that event are notified using a queue which is implemented as an intrusive linked list, with the nodes stored in the state machines of those concurrency primitives. This requires that those state machines be pinned in place.
Not supporting pinned long-lived state machines means that only the single-iteration state machine can be held in the queue. For example, consider any sort of multi-consumer channel. Every receiver in that channel will be in the notification queue, waiting to be awoken when a message is sent to the channel. If only the shorter-lived state machine can be in the queue, then when that shorter-lived future is dropped (e.g., because it is being raced with some other future), it will have to lose its place in the notification queue. This could result in certain receivers being starved as they lose their place (of messages, in anycast channels, or of CPU time in all cases).
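The effect on queue order can be shown with a toy model. This is only a sketch under loud assumptions: it uses a plain VecDeque of names instead of an intrusive linked list, it involves no real futures or wakers, and WaitQueue and Waiter are hypothetical types, not anything from tokio, smol or async-std. A Waiter stands for whichever state machine owns the queue entry: it enqueues itself when created and removes itself when dropped.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;

// Toy notification queue: front of the deque is the next waiter to be woken.
#[derive(Default)]
struct WaitQueue {
    waiters: RefCell<VecDeque<&'static str>>,
}

impl WaitQueue {
    fn enqueue(&self, name: &'static str) {
        self.waiters.borrow_mut().push_back(name);
    }
    fn remove(&self, name: &'static str) {
        self.waiters.borrow_mut().retain(|n| *n != name);
    }
    fn order(&self) -> Vec<&'static str> {
        self.waiters.borrow().iter().copied().collect()
    }
}

// Whatever state machine owns the queue entry: registered on creation,
// removed from the queue when it is dropped.
struct Waiter<'q> {
    queue: &'q WaitQueue,
    name: &'static str,
}

impl<'q> Waiter<'q> {
    fn register(queue: &'q WaitQueue, name: &'static str) -> Self {
        queue.enqueue(name);
        Waiter { queue, name }
    }
}

impl Drop for Waiter<'_> {
    fn drop(&mut self) {
        self.queue.remove(self.name);
    }
}

// The entry belongs to the per-call future: cancelling it gives up the place.
fn order_when_queue_entry_is_per_call() -> Vec<&'static str> {
    let queue = WaitQueue::default();
    let a = Waiter::register(&queue, "a");
    let _b = Waiter::register(&queue, "b");
    drop(a); // a's future lost a race and was cancelled
    let _a = Waiter::register(&queue, "a"); // next loop iteration re-registers
    queue.order()
}

// The entry belongs to the long-lived receiver: a cancelled iteration
// does not touch it, so the order is unchanged.
fn order_when_queue_entry_is_long_lived() -> Vec<&'static str> {
    let queue = WaitQueue::default();
    let _a = Waiter::register(&queue, "a");
    let _b = Waiter::register(&queue, "b");
    queue.order()
}
```

In the per-call case the receiver that registered first ends up behind the one that registered second; in the long-lived case it keeps its place.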
Because tokio doesn’t directly depend on the unstable stream API, tokio’s broadcast
channel only provides a recv method, which has these semantics (i.e. when the Recv future is
dropped, it loses its place in the queue). On the other hand, smol’s anycast channel provides both
a recv method with this behavior and also implements Stream, which keeps its place in line.
These differences matter. Rust’s core abstractions should be defined so that they can support all
of these use cases. The only way with async next to pin the long-lived state machine in place is to
implement AsyncIterator for a pinned reference to your state machine, instead of for your state
machine. Here’s what that diagram would look like in that case:
╔═══════════════╗ ╔═══════════════╗
║ ║░░ ║ ║░░
║ POLL_NEXT ║░░ ║ ASYNC NEXT ║░░
║ ║░░ ║ ║░░
╚═══════════════╝░░ ╚═══════════════╝░░
░░░░░░│░░░░░░░░░░ ░░░░░░│░░░░░░░░░░
│ │
│ pin
─────────────────────── │ ───────────────────────────────── │ ────────────
│ │
│ ▼
│ ╔═══════════════╗
ALIVE FOR │ ║ ║░░
A SINGLE │ ║ FUTURE ║░░
ITERATION │ ║ ║░░
pin ╚═══════════════╝░░
│ ░░░░░░│░░░░░░░░░░
│ │
│ mut
─────────────────────── │ ───────────────────────────────── │ ────────────
│ │
│ ▼
│ ╔═══════════════╗
│ ║ ║░░
│ ║ PIN ║░░
│ ║ ║░░
▼ ╚═══════════════╝░░
╔═══════════════╗ ░░░░░░│░░░░░░░░░░
ALIVE FOR ║ ║░░ │
THE ENTIRE ║ ASYNCITERATOR ║░░ pin
LOOP ║ ║░░ │
╚═══════════════╝░░ │
░░░░░░░░░░░░░░░░░ ▼
╔═══════════════╗
║ ║░░
║ ITERATOR ║░░
║ ║░░
╚═══════════════╝░░
░░░░░░░░░░░░░░░░░
Yosh proposes that instead there will be a “pinned iterator” and “pinned async iterator” trait, which are similar to the normal traits except that they take self by a pinned reference instead of a mutable reference. This is a lead-in to a longer term vision of introducing a “pinned effect.”
I think a better way to think about this than an “effect” (which collapses every axis of abstraction into a single, vaguely elaborated concept) is basically “mutability polymorphism,” the idea that we should be able to abstract over the different borrowing & ownership variants (shared references, mutable references, pinned references, owning references, etc). This idea has been tossed around occasionally, but the design has never really gotten anywhere. I have a lot of doubts about the viability and prudence of pursuing this idea, like I do about adding any new axis of abstraction to Rust.
There are other ways to solve the problem of immoveable iterators. One would be to change the
Iterator definition across an edition boundary. Another would be to add a Move trait and get rid
of Pin entirely. Each of these would leave the two designs in the same space: in the first, the
underlying Iterator trait would be required to be pinned, making both designs feature pinning; in
the second, Pin would disappear and this would no longer be such a difference.
But for AsyncIterator, the problem can be avoided by using the poll next design. Any change to
make iterators support immoveability will necessarily be slow and disruptive. Instead, I propose the
project enable AsyncIterator with support for immoveability now, by using poll next as the API.
Cancellation
The introduction of a second state machine doesn’t just have performance implications, it also has logical implications that arise from its interaction with cancellation. To analyze this issue will require a somewhat digressive discussion of the concept of “cancellation safety.”
The issue is that every time you drop the next future, that future is cancelled, and the next time
you start polling next, a new future has to be prepared from the state of the underlying iterator.
The problem I described above of synchronization primitives “losing their spot in the queue” is
actually a special case of this scenario. If you cancel the next future in the poll_next case,
nothing happens: you’ll be in the same state the next time you call next.
A common problem with async Rust is that users will cancel futures without realizing that that’s what’s occurring or what the implications of cancelling a future are. In many other languages, futures continue to run after users stop waiting for them; Rust’s “drop as cancellation” design is more optimized but often confuses users. A concept that has emerged is the concept of “cancellation safety:” cancelling a future that is “cancellation safe” will have no visible effect.
In his post, Yosh attempts to formalize this notion of “cancellation safety” using the Rust trait
system. Yosh’s definition of “cancellation safe” as a future which has no local state is correct,
though his elaboration of this as “has only one await point” doesn’t adequately capture what it
means to have no local state (he’s aware of this and alludes to locking a Mutex as an example
which his definition has failed to capture).
It’s true that an async function with two await points necessarily has local state, but you also
need to look at the “low-level” poll-based futures to see whether or not they have local state. In
the case of Mutex::lock, that local state is its position in the notification queue, the primitive
I previously discussed in reference to pinning. Ultimately, a future is “cancellation safe” if its
state consists of nothing but references to other stateful objects. Because of this, cancelling such
a future and then constructing a new future from the same arguments will produce a state machine in
the exact same state as the cancelled one, having no visible effect.
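To see what local state means for cancellation, here is a small sketch with a hand-written future. TakeTwo is a hypothetical type made up for this illustration: it needs two bytes from a shared queue and keeps the first one inside itself while waiting for the second. Its state is therefore more than references to other objects, and cancelling it midway is visible.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Resolves once it has taken two bytes from `source`.
struct TakeTwo<'a> {
    source: &'a mut VecDeque<u8>,
    first: Option<u8>, // local state: lost if the future is dropped midway
}

impl Future for TakeTwo<'_> {
    type Output = (u8, u8);
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<(u8, u8)> {
        let this = &mut *self;
        if this.first.is_none() {
            this.first = this.source.pop_front();
        }
        match this.first {
            None => Poll::Pending,
            Some(first) => match this.source.pop_front() {
                Some(second) => Poll::Ready((first, second)),
                None => Poll::Pending,
            },
        }
    }
}

fn cancel_take_two_midway() -> (u8, u8) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut source = VecDeque::from([1u8]);
    {
        // The first future takes byte 1, then has to wait for a second byte.
        let mut future = pin!(TakeTwo { source: &mut source, first: None });
        assert!(future.as_mut().poll(&mut cx).is_pending());
    } // the future is cancelled here, and byte 1 goes with it
    source.extend([2, 3]);
    // A new future built from the same arguments is NOT in the same state.
    let mut future = pin!(TakeTwo { source: &mut source, first: None });
    let polled = future.as_mut().poll(&mut cx);
    match polled {
        Poll::Ready(pair) => pair,
        Poll::Pending => unreachable!("two bytes are available"),
    }
}
```

The second future yields (2, 3): the byte held in the cancelled future's local state never reaches the caller. A future holding only the reference to source would have left the queue untouched when dropped.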
However, the problem with trying to introduce cancellation safety into the type system is that it is not the case that cancelling a future that isn’t “cancellation safe” is always a bug. The concept of cancellation safety in tokio is to alert you to the fact that if you cancel certain futures, it is meaningful behavior. But this can be the behavior you want! This is very different from something like a data race, which is never correct behavior.
Yosh actually gives a good example of this when he suggests that “read” is a cancellation safe operation. This depends. For a readiness-based reactor like epoll, it is true that read is “cancellation safe,” in that subsequent reads will read the data that would have been read. But this is emphatically not true for completion-based reactors like io-uring: if you cancel a read that you’ve already issued, it’s possible that that read will complete, but you will not see the result, and therefore a subsequent read would lose that data.
Is that correct behavior? It depends! If you have nothing to do with the data that would be in the next read call, cancelling it is correct behavior. On the other hand, if you want to keep reading from that object and don’t want to miss any data, even if you wanted to cancel this particular read operation, that behavior would be a bug.
Because of the fact that meaningful cancellation can be a desirable behavior, I’m dubious of the
framing of cancellation in terms of “cancellation safety” in general. But I think there’s actually a
different problem here which the concept of “cancellation safety” misdirects from: the problem is
that some APIs cause futures to be cancelled in a way that isn’t obvious to users. These users don’t
just miss that cancellation is meaningful, they miss that cancellation is happening at all. The
biggest culprit here is select! in a loop.
It’s very common in my experience to see tasks that have a fixed set of internally concurrent operations, which are performed repeatedly. Select in a loop is a very nice pattern for this, because it lets the task use shared state when responding to events from different sources. For example:
loop {
    select! {
        msg1 = rx1.recv() => {
            // handle msg1
        }
        msg2 = rx2.recv() => {
            // handle msg2
        }
        _ = async_function() => {
            // handle async_function finishing
        }
    }
}
This pattern suffers from a serious problem: in every iteration of the loop, all the futures that didn’t complete first will be cancelled and then constructed again on the next iteration. If a future had a meaningful cancellation, this behavior is visible. But users often tend to think of each select branch as being polled repeatedly in a loop, without really realizing that they are also constructing and cancelling futures each iteration as well.
The way to avoid cancelling any future in the loop right now is to “hoist” it out of the loop, but that is not obvious and requires pinning and fusing the future (so that it can be polled by reference and so that polling it after it’s finished doesn’t panic):
let mut future = pin!(async_function().fuse());
loop {
    select! {
        msg1 = rx1.recv() => {
            // handle msg1
        }
        msg2 = rx2.recv() => {
            // handle msg2
        }
        _ = &mut future => {
            // handle async_function finishing
        }
    }
}
The async next design presents the same footgun: if next can have a meaningful cancellation -
which is inherent in having next be an async function - users who cancel a call to next might
walk into the same trap. Given that the iterator is also a state machine, it seems likely that users
will be especially unlikely to realize that the next future itself could contain meaningful state,
and especially likely to accidentally cancel it.
Sidebar: merge!
In my view, the biggest problem here is that select in a loop is an API which is too easy to misuse. The solution (which Yosh has also blogged about) is to instead use streams and a merge operation here. Merge behaves like select, but instead of operating on futures, it operates on streams. One could imagine a very similar macro to the widely used select, but which operates on streams repeatedly instead of futures once:
merge! {
    msg1 = rx1 => {
        // handle msg1
    }
    msg2 = rx2 => {
        // handle msg2
    }
    _ = once(async_function()) => {
        // handle async_function finishing
    }
}
By merging streams, instead of repeatedly selecting futures, the whole thing not only becomes
simpler, but in the case that you want one branch to use a future instead of a stream, you have to
explicitly convert it to a stream using a constructor that specifies its semantics. For example,
instead of using once in that code block, if I wanted it to call that function again after it
finished, I could have used something like repeat_with.
There’s actually a table of straightforward concurrency operators for both futures and async iterators. In the first column, there are the operators that operate on only a single item. In the second column, there are those which are “sum” operators - they yield as soon as one subtask is ready. In the third column, there are those which are “product” operators - they yield only once all of them are ready. Like this:
│ SINGLE │ SUM │ PRODUCT
───────────────┼─────────────┼───────────┼──────────
│ │ │
FUTURE │ await │ select! │ join!
│ │ │
ASYNCITERATOR │ for await │ merge! │ zip!
│ │ │
Unfortunately, the AsyncIterator-based concurrency combinators are not as explored in the
ecosystem because the underlying trait has not been stabilized. For that reason, though a pairwise
merge combinator exists as a method on Stream, a macro-based approach similar to select! is
not available in the ecosystem as far as I know. This is a great example of how the failure to ship
in the core language has held back the ecosystem at large.
This is all a bit of digression: the merge operator could be implemented with either the poll next
design or the async next design. However, I’ll note that with the async design, the merge operator
would need to separately store the next futures, making it more complicated to implement and more
plausible that someone (perhaps implementing it in a bespoke way locally) would get it wrong.
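For a sense of how little state a poll next merge needs, here is a minimal pairwise sketch on stable Rust. Everything in it is an assumption for illustration: PollNext stands in for the unstable trait, Queued is a toy stream that is always ready, and Merge is not the combinator from any existing crate. To keep pin projection out of the picture, both inputs are required to be Unpin.

```rust
use std::collections::VecDeque;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for the poll next design.
trait PollNext {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

// A toy stream that is always ready with its next queued item.
struct Queued(VecDeque<u32>);

impl PollNext for Queued {
    type Item = u32;
    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        Poll::Ready(self.0.pop_front())
    }
}

// Pairwise merge: all state lives in this one struct, and no per-item
// futures have to be stored alongside the inputs.
struct Merge<A, B> {
    a: Option<A>, // None once `a` is exhausted
    b: Option<B>, // None once `b` is exhausted
    a_first: bool, // alternates which side is polled first, for fairness
}

// Polls one side, clearing the slot when that side is finished.
fn poll_side<S: PollNext + Unpin>(slot: &mut Option<S>, cx: &mut Context<'_>) -> Option<S::Item> {
    let stream = slot.as_mut()?;
    match Pin::new(stream).poll_next(cx) {
        Poll::Ready(Some(item)) => Some(item),
        Poll::Ready(None) => {
            *slot = None;
            None
        }
        Poll::Pending => None,
    }
}

impl<A, B> PollNext for Merge<A, B>
where
    A: PollNext + Unpin,
    B: PollNext<Item = A::Item> + Unpin,
{
    type Item = A::Item;
    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<A::Item>> {
        let this = &mut *self; // allowed because Merge<A, B> is Unpin here
        let a_first = this.a_first;
        this.a_first = !a_first;
        for use_a in [a_first, !a_first] {
            let item = if use_a {
                poll_side(&mut this.a, cx)
            } else {
                poll_side(&mut this.b, cx)
            };
            if item.is_some() {
                return Poll::Ready(item);
            }
        }
        if this.a.is_none() && this.b.is_none() {
            Poll::Ready(None) // both inputs are exhausted
        } else {
            Poll::Pending
        }
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn merged_items() -> Vec<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut merged = Merge {
        a: Some(Queued(VecDeque::from([1, 3]))),
        b: Some(Queued(VecDeque::from([2, 4]))),
        a_first: true,
    };
    let mut out = Vec::new();
    while let Poll::Ready(Some(item)) = Pin::new(&mut merged).poll_next(&mut cx) {
        out.push(item);
    }
    out
}
```

An async next version of the same combinator would additionally have to hold the in-flight next future of each input between polls, each of which borrows its input.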
A tradeoff of affordances
Ultimately, there’s a trade off here. On the one hand, the poll next design simplifies the
representation so that there is a single, pinned state machine that lives for the entire iteration.
This enables a few behaviors that aren’t possible, aren’t easy or aren’t zero cost in the async next
design. But it does eliminate one big affordance from the AsyncIterator API: you can’t define an
async iterator with an async next method. Is that affordance worth the negative impact elsewhere?
Yosh’s commentary on why this affordance is worth supporting is interesting:
I expect pretty much everyone will agree that on a first look the async fn next-based trait seems easier to use. Rather than needing to think about what Pin is, or how Poll works, we can just write our async functions the way we usually do, and it will just work. Pretty neat!
It’s clear that what Yosh sees as the ease enabled by this affordance is to not deal with pinning or the task APIs. The advantage is that users can write an async iterator without worrying about those arcane APIs from the “low-level” register. But I think there’s more to the story.
I agree that being able to define an AsyncIterator without using these APIs is critically
important to async Rust’s usability. As I’ve written about before, the way that I want Rust to enable
users to do that is by providing an asynchronous generator syntax. Yosh used “once” as an example of
how the async next design makes things easier to implement than the poll next design. Here’s how
once looks with async generators:
// As an async generator function
async gen once<T>(value: T) yields T {
    yield value;
}

// As an async generator block
async gen {
    yield value;
}
But I think there’s actually a deeper difference between async generators and async next that is
really important to emphasize. Though Yosh centers the difficult APIs that poll_next deals with as
“the problem” with poll_next, from my perspective the much more challenging aspect is hand-writing
the state machine. When you deal with more complex examples than once, which have multiple states
they transition through, this quickly becomes more apparent.
I want to draw readers’ attention to the recent curl CVE which resulted from an error in implementing an asynchronous state machine. A variable was being stored on the stack in the function polling the state machine, instead of in the state machine itself, so it was re-set every time the state machine was polled. This is the kind of bug that can occur when writing a state machine by hand, and in this case it was devastating. Coroutines like asynchronous functions and generators prevent users from making this mistake, because the compiler generates the state machine for you, storing any state that is needed to ensure the coroutine is resumed from the same place that it yielded.
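The class of bug is easy to reproduce in miniature. The sketch below is not the curl code, and the type names are invented for illustration: both futures are meant to complete on their third poll, but the buggy one keeps its counter in a local variable of poll, so the counter is re-initialised every time the state machine is polled.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Hand-written state machine with the state in the wrong place.
struct BuggyThirdPoll;

impl Future for BuggyThirdPoll {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        let mut polls = 0; // BUG: lives on the stack, reset on every poll
        polls += 1;
        if polls >= 3 { Poll::Ready(()) } else { Poll::Pending }
    }
}

// The same logic with the counter stored in the state machine itself.
struct ThirdPoll {
    polls: u32,
}

impl Future for ThirdPoll {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        self.polls += 1; // persists between polls
        if self.polls >= 3 { Poll::Ready(()) } else { Poll::Pending }
    }
}

// Polls `future` up to `limit` times; returns the poll on which it completed.
fn polls_until_ready<F: Future<Output = ()>>(future: F, limit: u32) -> Option<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    for n in 1..=limit {
        if future.as_mut().poll(&mut cx).is_ready() {
            return Some(n);
        }
    }
    None
}
```

With a limit of ten polls, ThirdPoll completes on the third and BuggyThirdPoll never completes. An async fn or generator cannot express the buggy version, because the compiler decides where each variable is stored.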
The real problem with implementing an AsyncIterator “by hand” is that you are responsible for
correctly implementing the state machine, and these kinds of mistakes are easy to make. In the case
of the async next design, there are two state machines, and one of them is generated by the compiler but
the other is written by hand. This might seem like it eliminates at least some of the complexity
(part of the state machine is generated), but it really doesn’t. You still have two places you could
store variables (now the choice is between the two state machines, rather than between the stack
and the state machine), and if you need state to persist between iterations, it is your
responsibility to realize this and move that variable into the hand-written iterator state machine.
I actually think this kind of “mixed register” API is particularly dangerous, because users can be lulled into a false sense of security by the fact that they can use high-level async/await syntax, while still being responsible for figuring out which state needs to be persisted between iterations. Instead, users wishing to remain in a “high-level” register should be directed to either asynchronous generators (which let them use an imperative coding style) or combinators (which let them use a functional coding style). Both of these avoid the pitfalls of hand-written state machines.
The poll_next interface is there for users who want to use a “low-level” register, because they
want to precisely control the layout and behavior of their asynchronous iterator. For these users,
the async next design is strictly worse than the poll next design. They would need to use a combinator like
poll_fn to get access to the Context; they would need to manage the existence of two different
state machines; they would need an additional indirection if they need the long-lived state machine
to be pinned.
And finally, as I’ve written in a previous post, with asynchronous generators the poll next design does
enable users who really want to have an async fn next. All that’s needed to convert an object with
an asynchronous next method to an async generator is this little snippet of code:
// This would convert an async next AsyncIterator to a poll_next AsyncIterator
async gen {
    while let Some(item) = iter.next().await {
        yield item;
    }
}
AsyncIterator’s relation to Future and Iterator
There’s one other angle I want to examine the two designs from, which is more theoretical than practical. Yosh makes these remarks:
When people say that “async iterator is not the async version of iterator” they are correct. … Instead it’s better to ask whether async iterator should be the “async version of iterator” - and I certainly believe it should be.
Incidentally that has also been the framing of the trait WG-async has been communicating to T-lang and T-libs, who have signed off on it. I’m not suggesting that this decision should bind us (I don’t like to rules lawyer). What I’m instead trying to show with this is that this has been an accepted framing of what the design should achieve for years now, and we’ve already rejected the framing that “async iterator” (or “stream”) should be its own special thing. That certainly can be changed again, but it is not a novel insight by any stretch.
This framing seems to be based on a misunderstanding of a comment of mine from an earlier blog post:
I don’t object to the name change [from Stream to AsyncIterator], but I do think it has been bundled up with an ideological commitment I do object to - namely the consideration of AsyncIterator as “just” the async version of Iterator. We shouldn’t forget that it’s also the iterative version of Future.
People within the project have made similar remarks about not wanting to “duplicate” traits,
believing that having both AsyncIterator and Iterator is worse than “just” having Iterator -
after all, one trait is better than two, no? But my objection has been misunderstood: I’m not saying
that AsyncIterator is not the async version of iterator (obviously it is!) but that it is both
that and also the iterative version of future. It is both of these things, because it represents a
coroutine that is simultaneously asynchronous and iterative.
I mean this literally. There is another representation of AsyncIterator that would have just as
much validity as the async next version, which I would call an “iterative poll” version of the
interface. This version would take a Future and make its poll method into an iterator, in the same
way that the async next design takes an Iterator and makes its next method into a future. Let’s
instead call this interface IteratorFuture:
// With generator methods:
trait IteratorFuture {
    type Item;
    gen fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>)
        yields Poll<Self::Item>;
}

// Desugared:
trait IteratorFuture {
    type Item;
    type Iter<'a>: Iterator<Item = Poll<Self::Item>> where Self: 'a;
    fn poll<'a>(self: Pin<&'a mut Self>, cx: &'a mut Context<'_>)
        -> Self::Iter<'a>;
}

// A for await loop looks like:
let mut iter_future = pin!(iter_future);
let mut iter = iter_future.poll(cx);
'outer: loop {
    let next = 'inner: loop {
        match iter.next() {
            Some(Poll::Ready(item)) => break 'inner item,
            Some(Poll::Pending) => yield Poll::Pending,
            None => break 'outer,
        }
    };
}
In this design, the relationship between Future and Iterator is inverted from the async next design:
╔═══════════════╗
║ ║░░
║ ITERATOR POLL ║░░
║ ║░░
╚═══════════════╝░░
░░░░░░│░░░░░░░░░░
│
mut
────────────────────── │ ────────────────────────
│
▼
╔═══════════════╗
║ ║░░
║ ITERATOR ║░░
║ ║░░
╚═══════════════╝░░
░░░░░░│░░░░░░░░░░
ALIVE FOR │
THE ENTIRE pin
LOOP │
│
▼
╔═══════════════╗
║ ║░░
║ FUTURE ║░░
║ ║░░
╚═══════════════╝░░
░░░░░░░░░░░░░░░░░
This design has several advantages over the async next design:
- Both state machines exist for the entire loop, so there’s no risk of reallocating the inner state machine repeatedly in the loop.
- The longer lived state machine is pinned, so all of the state can be treated as pinned in place.
- There’s no possibility of a meaningful cancellation for the next future.
- The affordances it provides match “low-level” interfaces better, which frequently iterate over an underlying asynchronous IO or concurrent object, rather than the reverse.
Do I think this design would be the correct approach? Absolutely not! This would be contorting
the API to provide a strange affordance: writing all asynchronous iterators as generator methods
on futures. My point instead is that without the specific path dependence of Rust’s history, it
doesn’t particularly make sense to frame the separation of AsyncIterator from Iterator as
“splitting a trait in two” any more than it makes sense to frame the separation of AsyncIterator
from Future as “splitting a trait in two.”
In reality, AsyncIterator is the product of Iterator and Future; it draws equally from each
interface. By having Iterator and Future, Rust has already committed to having multiple traits
for specific coroutine patterns: iterative and asynchronous operation. To combine these two
patterns, the solution is a third trait that has similarities with each of them. It would be
arbitrary and wrong to modify the iterative coroutine to yield up asynchronous coroutines, just as
it would be arbitrary and wrong to modify the asynchronous coroutine to return an iterative
coroutine.
This becomes clear as well if you examine a table of the types involved in the three coroutine traits,
imagining them as specializations of a general Coroutine interface. AsyncIterator shares the type in
one column with each of Iterator and Future, and in one column the types for each are different:
│ YIELDS │ RETURNS │ RESUMES
──────────────┼─────────────────────┼─────────────────┼─────────────────
│ │ │
FUTURE │ () │ Self::Output │ &mut Context
│ │ │
ITERATOR │ Self::Item │ () │ ()
│ │ │
ASYNCITERATOR │ Poll<Self::Item> │ () │ &mut Context
│ │ │
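The rows of this table can be written down as code. What follows is a sketch under stated assumptions: Coroutine and Step are hypothetical definitions written for this example (the standard library's own Coroutine trait is unstable and shaped differently), PollNext stands in for the poll next AsyncIterator, and the three wrapper types are invented names. Each wrapper requires its inner value to be Unpin to keep the code short.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical general interface: what a coroutine yields, returns, resumes with.
#[derive(Debug, PartialEq)]
enum Step<Y, R> {
    Yielded(Y),
    Complete(R),
}

trait Coroutine<Resume> {
    type Yield;
    type Return;
    fn resume(self: Pin<&mut Self>, arg: Resume) -> Step<Self::Yield, Self::Return>;
}

// ITERATOR row: yields Item, returns (), resumes with ().
struct IterCo<I>(I);
impl<I: Iterator + Unpin> Coroutine<()> for IterCo<I> {
    type Yield = I::Item;
    type Return = ();
    fn resume(mut self: Pin<&mut Self>, _arg: ()) -> Step<I::Item, ()> {
        match self.0.next() {
            Some(item) => Step::Yielded(item),
            None => Step::Complete(()),
        }
    }
}

// FUTURE row: yields (), returns Output, resumes with &mut Context.
struct FutCo<F>(F);
impl<'a, 'b, F: Future + Unpin> Coroutine<&'a mut Context<'b>> for FutCo<F> {
    type Yield = ();
    type Return = F::Output;
    fn resume(mut self: Pin<&mut Self>, cx: &'a mut Context<'b>) -> Step<(), F::Output> {
        match Pin::new(&mut self.0).poll(cx) {
            Poll::Pending => Step::Yielded(()),
            Poll::Ready(output) => Step::Complete(output),
        }
    }
}

// Hypothetical stand-in for the poll next AsyncIterator.
trait PollNext {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

// ASYNCITERATOR row: yields Poll<Item>, returns (), resumes with &mut Context.
struct AsyncIterCo<S>(S);
impl<'a, 'b, S: PollNext + Unpin> Coroutine<&'a mut Context<'b>> for AsyncIterCo<S> {
    type Yield = Poll<S::Item>;
    type Return = ();
    fn resume(mut self: Pin<&mut Self>, cx: &'a mut Context<'b>) -> Step<Poll<S::Item>, ()> {
        match Pin::new(&mut self.0).poll_next(cx) {
            Poll::Ready(Some(item)) => Step::Yielded(Poll::Ready(item)),
            Poll::Pending => Step::Yielded(Poll::Pending),
            Poll::Ready(None) => Step::Complete(()),
        }
    }
}

// A toy stream that yields one value and then finishes.
struct Once(Option<u32>);
impl PollNext for Once {
    type Item = u32;
    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        Poll::Ready(self.0.take())
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Resumes each kind of coroutine once and reports what came back.
fn first_steps() -> (Step<u32, ()>, Step<(), u32>, Step<Poll<u32>, ()>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut iterator = IterCo(vec![7].into_iter());
    let mut future = FutCo(std::future::ready(8));
    let mut stream = AsyncIterCo(Once(Some(9)));
    (
        Pin::new(&mut iterator).resume(()),
        Pin::new(&mut future).resume(&mut cx),
        Pin::new(&mut stream).resume(&mut cx),
    )
}
```

The AsyncIterCo impl takes its Return type from the iterator row and its Resume type from the future row, while its Yield type belongs to neither.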
Conclusion
Often, I find there is a complex trade off between two possible designs, and while I may prefer one approach, I can see the merit in the other. This is not one of those cases. I think the case for the poll next design is ironclad. It is a simpler representation, it guarantees better runtime representation, supports better dynamic dispatch, has a better interaction with pinning, presumes a better set of default affordances, is more theoretically correct, and is compatible with users who really want to define an iterator with an async next method in the first place.
Setting aside its preferability in the abstract, we also need to consider the real world context of Rust as-it-is. Pursuing the poll next design would require implementing and shipping generators to provide the high-level register in the imperative style, the recommended way of implementing an async iterator. I believe this should be a priority for the project regardless, but the need to prioritize generators is an implication of shipping the poll next design.
What are the implications of the async next design? One fact I’ve avoided dwelling on is that
contributors who prefer it to the poll next design don’t just want an AsyncIterator trait with an
async method, what they want is to define Iterator so that its method can maybe be async. In
other words, their design implicates an entire new axis of abstraction for Rust: that traits should
be abstract over whether or not their methods are asynchronous. This concept was previously called
“keyword generics” and has been rebranded to “effects.” Such proposals have been very controversial
in the community. I’ll also note that it’s been 16 months since the “Keyword Generics Initiative”
was launched, and as of yet I have seen no concrete design proposal for how such
a system would really work, only vague code samples to demonstrate syntax.
Beyond the dependence on shipping an “effects system,” to achieve the same affordances as the poll next design it would also require a piling-on of additional features. For object safety, async methods need to be made object safe, and to avoid allocating in the loop, the compiler needs to desugar async iterator trait objects in a special way. For pinning, Yosh proposes a new pinned trait really as a lead in to a new “effect” that would someday need to be added. When will all of these features ship that are needed to reach parity with the API that already exists? How many more years will users be forced to wait to get this highly desirable API, on the basis of slowly evolving proposals for more and more language features?
I’ve written publicly that I have concerns about the idea of “keyword generics” as a sensible way to extend Rust, but I am not saying here that such systems should not ever be pursued. It’s within the realm of possibility that a design that resolves users’ concerns could be developed, and such a system could prove to be net positive and someday be implemented. If that’s what people want to spend their time working toward, that’s their choice.
I am just questioning whether such an abstraction is a sensible way of handling the intersection of iteration and asynchrony, and whether it is a good trade off to block shipping real value to users on a seemingly indefinite design ideation process. I implore the Rust project to consider these questions seriously, and pursue a path of shipping incremental value now.