Without boats, dreams dry up
In July, I described a way to make pinning more ergonomic by integrating it more
fully into the language. Last week, I developed that idea further with the notion of UnpinCell: a wrapper type that lets a user take an &pin mut UnpinCell<T> and produce an &mut T, similar to how other cells let a user take a shared reference to the cell and produce a mutable
reference to its contents. I believe that this notion can also solve the biggest outstanding issues
facing generators: the fact that the Iterator interface does not permit self-referential values.
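A minimal sketch of the UnpinCell idea in today's Rust, assuming that Pin<&mut UnpinCell<T>> stands in for the proposed &pin mut UnpinCell<T> syntax (which does not exist in the language yet). The method name get_mut_pinned, the Tracked type, and the bump_through_cell helper are hypothetical names chosen for illustration, not part of the proposal.

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

/// Wrapper that is always `Unpin`, even if `T` is not.
pub struct UnpinCell<T> {
    inner: T,
}

// Implementing `Unpin` is safe: the cell never treats its contents as pinned.
impl<T> Unpin for UnpinCell<T> {}

impl<T> UnpinCell<T> {
    pub fn new(inner: T) -> Self {
        UnpinCell { inner }
    }

    /// Hypothetical accessor: pinned reference to the cell in,
    /// ordinary mutable reference to the contents out.
    pub fn get_mut_pinned<'a>(self: Pin<&'a mut Self>) -> &'a mut T {
        &mut self.get_mut().inner
    }
}

/// A type that opts out of `Unpin`, used only to show the cell still works.
pub struct Tracked {
    pub value: i32,
    _pin: PhantomPinned,
}

pub fn bump_through_cell() -> i32 {
    let mut cell = pin!(UnpinCell::new(Tracked { value: 1, _pin: PhantomPinned }));
    let inner: &mut Tracked = cell.as_mut().get_mut_pinned();
    inner.value += 41;
    cell.as_mut().get_mut_pinned().value
}
```

The contents are never pinned by the cell, so handing out &mut T cannot break any pinning guarantee; that is why the accessor needs no unsafe code.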
As I wrote in my explanation of Pin’s design, the biggest advantage that Pin had over other
design ideas was that it was a trivially backward compatible way of introducing a contract that an
object will never be moved. But this meant that a trait could only opt into that contract using the
new interface; traits that existed before Pin and don’t opt into that contract cannot be implemented
by types that have self-referential values. The most problematic trait here is Iterator, because
generators (functions that evaluate to iterators in the same way async functions evaluate to
futures) would ideally support self-referential values just like async functions do. So long as the
interface for Iterator takes a mutable reference and not a pinned mutable reference, implementers
must assume the iterator can be moved around and therefore can’t be self-referential.
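To make the difference in receivers concrete, here is a small sketch. Iterator::next is the real standard library method; the PinnedIterator trait and its pinned_next method are hypothetical, written only to show what an interface taking a pinned mutable reference would look like. Countdown and the two helper functions are likewise illustrative names.

```rust
use std::pin::{pin, Pin};

/// Hypothetical pinned counterpart to `Iterator`; not a std API.
pub trait PinnedIterator {
    type Item;
    fn pinned_next(self: Pin<&mut Self>) -> Option<Self::Item>;
}

pub struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;
    // `&mut self`: nothing stops the caller from moving the value between calls.
    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        Some(self.remaining)
    }
}

impl PinnedIterator for Countdown {
    type Item = u32;
    // `Pin<&mut Self>`: the caller promises the value stays in place.
    fn pinned_next(self: Pin<&mut Self>) -> Option<u32> {
        // `Countdown` is `Unpin`, so reaching `&mut Self` is safe here.
        Iterator::next(self.get_mut())
    }
}

/// Legal with `Iterator`: advance once, move to the heap, keep iterating.
pub fn moved_between_calls() -> Vec<u32> {
    let mut it = Countdown { remaining: 3 };
    let mut out = vec![it.next().unwrap()];
    let mut boxed = Box::new(it); // the iterator now lives at a new address
    while let Some(x) = boxed.next() {
        out.push(x);
    }
    out
}

/// With the pinned interface, the value is pinned once and polled in place.
pub fn pinned_in_place() -> Vec<u32> {
    let mut it = pin!(Countdown { remaining: 3 });
    let mut out = Vec::new();
    while let Some(x) = it.as_mut().pinned_next() {
        out.push(x);
    }
    out
}
```

Because moved_between_calls is permitted for every Iterator, an implementation that stored a pointer into its own state after the first call to next would be left with a stale pointer after the move.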
A variation on my previous design for pinned places has occurred to me that would be more consistent with Rust’s existing feature set.
The most outlandish aspect of the previous design was the notion of “pinned fields,” which support pinned projection. This is quite different from how field projection normally works in Rust: if you have a mutable reference to a struct, you can get a mutable reference to its field, period. (I know Niko Matsakis has recently explored ideas that would change this; this post won’t go into any deep consideration of that proposal.) I’ve come up with a design which would have similar properties, instead of introducing a kind of field marker.
…In the previous post, I described the goal of Rust’s Pin type and the history of how it
came to exist. When we were initially developing this API in 2018, one of our explicit goals was to
limit the number of changes we would make to Rust, because we wanted to ship a “minimum viable
product” of async/await syntax as soon as possible. This meant that Pin is a type defined in the
standard library, without any syntactic or language support except for the ability to use it as a
method receiver. As I wrote in my previous post, in my opinion this is the source of a “complexity
cliff” when users have to interact with Pin.
We knew when we made this choice that pinned references would be harder to use and more confusing
than ordinary references, though I think we did underestimate just how much more challenging they
would be for most users. Our initial hope was that with async/await, pinning would disappear into
the background, because the await operator and the runtime’s spawn function would pin your
futures for you and you wouldn’t have to encounter it directly. As things played out, there are
still some cases where users must interact with pinned references, even when using async/await. And
sometimes users do need to “drop down” into a lower-level register to implement Future
themselves; this is when they truly encounter a huge complexity cliff: both the essential
complexity of implementing a state machine “by hand” and the additional complexity of understanding
the APIs to do with Pin.
My contention in my previous post was that the difficulties involved in this have very little to do
with the complexity inherent in the pinned typestate as a concept, or in pinned references as a way
of representing it, but instead arise from the fact that Pin is a pure library type without
support from the language. Users who deal with Pin are almost always doing something that is
totally memory safe; the problem is just that the idioms to do so with Pin are different from and
less clear than the idioms for doing so with ordinary references.
In this post, I want to propose a set of language changes - completely backward compatible with the
language as it exists and the async ecosystem built on Pin - which will make interacting with
pinned references much more similar to interacting with ordinary references.
The Pin type (and the concept of pinning in general) is a foundational building block on which
the rest of the Rust async ecosystem stands. Unfortunately, it has also been one of the least
accessible and most misunderstood elements of async Rust. This post is meant to explain what Pin
achieves, how it came to be, and what the current problem with Pin is.
There was an interesting post a few months ago on the blog of the company Modular, which is
developing a new language called Mojo. In a brief section discussing Pin in Rust, I found that it
very succinctly captured the zeitgeist of the public discussion of the subject:
In Rust, there is no concept of value identity. For a self-referential struct pointing to its own member, that data can become invalid if the object moves, as it’ll be pointing to the old location in memory. This creates a complexity spike, particularly in parts of async Rust where futures need to be self-referential and store state, so you must wrap Self with Pin to guarantee it’s not going to move. In Mojo, objects have an identity so referring to self.foo will always return the correct location in memory, without any additional complexity required for the programmer.
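The hazard the quoted passage describes can be sketched in safe Rust by comparing addresses without ever dereferencing the stored pointer. SelfRef and pointer_survives_move are hypothetical names for illustration; the sketch assumes only that a value moved into a Box ends up at a different address than the local it came from.

```rust
/// A struct that records the address of its own `data` field.
pub struct SelfRef {
    data: [u8; 16],
    /// Address of `data` at the time `init` was called.
    ptr: *const u8,
}

impl SelfRef {
    pub fn new() -> Self {
        SelfRef { data: [0; 16], ptr: std::ptr::null() }
    }

    /// Record where `data` currently lives.
    pub fn init(&mut self) {
        self.ptr = self.data.as_ptr();
    }

    /// True if the recorded address still matches the field's real address.
    pub fn is_consistent(&self) -> bool {
        self.ptr == self.data.as_ptr()
    }
}

/// Returns (consistent before the move, consistent after the move).
pub fn pointer_survives_move() -> (bool, bool) {
    let mut on_stack = SelfRef::new();
    on_stack.init();
    let before = on_stack.is_consistent();
    // Moving into a Box copies the bytes to the heap; `ptr` is copied as-is
    // and keeps pointing at the old stack location.
    let on_heap = Box::new(on_stack);
    let after = on_heap.is_consistent();
    (before, after)
}
```

The pointer is only compared, never read through, so the sketch needs no unsafe code; dereferencing it after the move is the operation that pinning exists to rule out.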
Some aspects of these remarks confuse me. The term “value identity” is not defined anywhere in this
post, nor can I find it elsewhere in Mojo’s documentation, so I’m not clear on how Modular claims
that Mojo solves the problem that Pin is meant to solve. Despite this, I do think the criticism of
Pin’s usability is well stated: there is indeed a “complexity spike” when a user is forced to
interact with it. The phrase I would use is actually a “complexity cliff,” as in the user suddenly
finds themself thrown off a cliff into a sea of complex, unidiomatic APIs they don’t understand.
This is a problem and it would be very valuable to Rust users if the problem were solved.
As it happens, this little corner of Rust is my mess; adding Pin to Rust to support
self-referential types was my idea. I have ideas of how this complexity spike could be resolved,
which I will elaborate in a subsequent post. Before I can get there though I need to first try to
explain, as efficiently as I know how, what Pin accomplishes, how it came to exist, and why it is
currently difficult to use.
When we developed the Pin API, our vision was that “ordinary users” - that is, users using the “high-level” registers of Rust, would never have to interact with it. We intended that only users implementing Futures by hand, in the “low-level” register, would have to deal with that additional complexity. And the benefit that would accrue to all users is that futures, being immovable while polling, could store self-references in their state.
Things haven’t gone perfectly according to plan. The benefits of Pin have certainly been accrued -
everyone is writing self-referential async functions all the time, and low-level concurrency
primitives in all the major runtimes take advantage of Pin to implement intrusive linked lists
internally. But Pin still sometimes rears its ugly head into “high-level” code, and users are
unsurprisingly frustrated and confused when that happens.
In my experience, there are three main ways that this happens. Two of them can be solved by better
affordances for AsyncIterator (a part of why I have been pushing stabilizing this so hard!). The
third is ultimately because of a mistake that we made when we designed Pin, and without a breaking
change there’s nothing we could do about it. They are:
Future in a loop.
Stream::next.
Future behind a pointer (e.g. a boxed future).
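As a rough illustration of the boxed-future case, the sketch below polls a future stored behind a Box by hand. It assumes the common workaround of pinning the box with Box::pin; the NoopWake type and the poll_once and boxed_answer helpers are hypothetical names, and a real program would hand the future to a runtime instead of polling it once.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough to poll a future that is already ready.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll a pinned, boxed future exactly once.
pub fn poll_once<T>(mut fut: Pin<Box<dyn Future<Output = T>>>) -> Poll<T> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // `as_mut` turns `Pin<Box<dyn Future>>` into the `Pin<&mut dyn Future>`
    // that `Future::poll` requires.
    fut.as_mut().poll(&mut cx)
}

/// `Box::pin` rather than `Box::new`: the pointer itself carries the pin.
pub fn boxed_answer() -> Pin<Box<dyn Future<Output = u32>>> {
    Box::pin(async { 42u32 })
}
```

The choice between Box::new and Box::pin is exactly the kind of detail that surfaces pinning in otherwise high-level code.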