July 19, 2024

The Pin type (and the concept of pinning in general) is a foundational building block on which the rest of the Rust async ecosystem stands. Unfortunately, it has also been one of the least accessible and most misunderstood elements of async Rust. This post is meant to explain what Pin achieves, how it came to be, and what the current problem with Pin is.

There was an interesting post a few months ago on the blog of the company Modular, which is developing a new language called Mojo. I found that a brief section discussing Pin in Rust very succinctly captured the zeitgeist of the public discussion of the subject:

In Rust, there is no concept of value identity. For a self-referential struct pointing to its own member, that data can become invalid if the object moves, as it’ll be pointing to the old location in memory. This creates a complexity spike, particularly in parts of async Rust where futures need to be self-referential and store state, so you must wrap Self with Pin to guarantee it’s not going to move. In Mojo, objects have an identity so referring to self.foo will always return the correct location in memory, without any additional complexity required for the programmer.

Some aspects of these remarks confuse me. The term “value identity” is not defined anywhere in this post, nor can I find it elsewhere in Mojo’s documentation, so I’m not clear on how Modular claims that Mojo solves the problem that Pin is meant to solve. Despite this, I do think the criticism of Pin’s usability is well stated: there is indeed a “complexity spike” when a user is forced to interact with it. The phrase I would use is actually a “complexity cliff,” as in the user suddenly finds themself thrown off a cliff into a sea of complex, unidiomatic APIs they don’t understand. This is a problem and it would be very valuable to Rust users if the problem were solved.

As it happens, this little corner of Rust is my mess; adding Pin to Rust to support self-referential types was my idea. I have ideas of how this complexity spike could be resolved, which I will elaborate in a subsequent post. Before I can get there though I need to first try to explain, as efficiently as I know how, what Pin accomplishes, how it came to exist, and why it is currently difficult to use.

Requirements

To explain why Pin exists, we need to step back to the original development of async/await. In order to support references in async functions, we needed to be able to store those references inside of a Future. The problem was that those references might be self-references, meaning they point to other fields of the same object.

Consider this toy example:

async fn foo<'a>(z: &'a mut i32) { ... }

async fn bar(x: i32, y: i32) -> i32 {
    let mut z = x + y;
    foo(&mut z).await;
    z
}

Both of these functions evaluate to an anonymous future type; the future type that an async function evaluates to has a state for each possible step at which it could pause: when it starts, every await point, and when it finishes.

For the purposes of our example, we will call the anonymous future that foo evaluates to Foo<'a> (the 'a being the lifetime of the z argument) and the anonymous future that bar evaluates to Bar. Let’s ask ourselves, what would the internal states of Bar be? Something like this:

enum Bar {
    // When it starts, it contains only its arguments
    Start { x: i32, y: i32 },

    // At the first await, it must contain `z` and the `Foo` future
    // that references `z`
    FirstAwait { z: i32, foo: Foo<'?> },

    // When it's finished it needs no data
    Complete,
}

Note the '? in place of the lifetime of the Foo<'?> future: what lifetime could that be? It isn’t a lifetime that outlives Bar; Bar has no lifetimes. The Foo object instead borrows the z field of Bar, which is stored alongside it in the same object. This is why these future types are said to be “self-referential”: they contain fields which reference other fields in themselves.

Here we must make a clarifying distinction: the goal of Pin is not to allow users to define their own self-referential type in safe Rust. Today, if you tried to define Bar by hand, there is really no safe way to construct its FirstAwait variant. Making this possible would be a worthy objective, but it is orthogonal to the goal of Pin. The goal of Pin is to make it safe to manipulate self-referential types generated by the compiler from an async function or implemented with unsafe code in a runtime like tokio.

However a self-referential type has been defined, once it exists it presents a problem. Imagine that Bar has been put into the FirstAwait state, so it contains references to its own z field. If you were to move Bar, those references would now dangle and point to dead memory, which may be re-used for a different value. Therefore, it is essential that once Bar could be put into the FirstAwait state, it is not moved again. Prior to the development of Pin, any object in Rust could be moved if you had ownership of it, or even if you had a mutable reference to it. So this was the problem that we needed to solve: we needed to express the requirement that from a certain point an object cannot be moved.
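The effect of a move on a self-reference can be observed without dereferencing anything. The following is only a sketch: SelfRef is a hypothetical hand-written stand-in (the compiler-generated Bar cannot be named), and it uses a raw pointer where a real future would hold a reference. It compares addresses before and after moving the value from the stack into a Box.

```rust
/// Hypothetical stand-in for a self-referential object:
/// `ptr` is meant to point at the `value` field of the same struct.
struct SelfRef {
    value: i32,
    ptr: *const i32,
}

/// Returns whether `ptr` points at `value` before and after a move.
fn self_pointer_before_and_after_move() -> (bool, bool) {
    let mut s = SelfRef { value: 7, ptr: std::ptr::null() };
    s.ptr = &s.value;
    let before = std::ptr::eq(s.ptr, &s.value);

    // Moving into a Box copies the bytes into a new heap allocation.
    // The stored pointer is copied verbatim, so it still holds the old
    // stack address rather than the address of the relocated field.
    let moved = Box::new(s);
    let after = std::ptr::eq(moved.ptr, &moved.value);

    (before, after)
}
```

The pointer is never dereferenced after the move, so the sketch stays in safe Rust; dereferencing it would be exactly the dangling access described above.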

Non-solutions: move constructors and offset pointers

Before we continue, I want to spend a moment to discuss two solutions to the problem which are often proposed but don’t work (at least in Rust). These both take a rather different approach from the approach taken by Pin: instead of saying the value cannot be moved again, they try to make it so that self-referential values can be moved after all.

The first of these is the move constructor. The idea is that you would run some code whenever a value is moved, similar to the destructor that is run when the value is dropped. This code could then “fix-up” any self-referential pointers so that they now point to the new location. I’ve discussed this in the past in my post about the history of async Rust, but this is not a viable solution because in Rust, those pointers could exist anywhere, not just “inside” the value being moved. For example, you could instead have a vector of pointers into your own state, and so the move constructor would need to be able to trace into that vector. It ultimately requires the same kind of runtime memory management as garbage collection, which wasn’t viable for Rust.

The other reason move constructors don’t work is that Rust very early on affirmed that it would never have move constructors, and a lot of unsafe code exists which assumes it is possible to move values by just copying their memory. Adding move constructors would be a breaking change for Rust.

The other non-solution that is sometimes proposed is the offset pointer. The idea in this case is that rather than compile self-references to normal references, they are compiled to offsets relative to the address of the self-referential object that contains them. This does not work because it is not possible to determine at compile time if a reference will be a self-reference or not: it’s possible for the same value to be both in different branches. For example, here’s a modified version of bar from before:

async fn bar(x: i32, y: i32, mut z: &mut i32) {
    let mut z2 = x + y;
    if random() {
        z = &mut z2;
    }
    foo(z).await;
}

By the time you call foo, z may be a pointer into the same object or it may be a pointer elsewhere. It’s not possible to determine which at compile time. You would need to compile references to some sort of enum of offset and reference; this was deemed unrealistic when we were working on async/await.

The “pinned typestate”

Having eliminated any option to make these objects movable, we therefore have a requirement that the object be immovable. But we need to clarify exactly what the requirements are, because people often make the wrong assumption about what is required.

Most importantly, these objects are not meant to be always immovable. Instead, they are meant to be freely moved for a certain period of their lifecycle, and at a certain point they should stop being moved from then on. That way, you can move a self-referential future around as you compose it with other futures until eventually you put it into the place it will live for as long as you poll it. So we needed a way to express that an object is no longer allowed to be moved; in other words, that it is “pinned in place.”
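As a rough illustration of that lifecycle using today’s standard library, the sketch below moves a future around freely and only pins it at the point where it starts being polled. The names NoopWake, add, and move_then_pin_then_poll are hypothetical, and the no-op waker exists only so that poll can be called without an async runtime.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical waker that does nothing; enough to call `poll` by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

async fn add(x: i32, y: i32) -> i32 {
    x + y
}

fn move_then_pin_then_poll() -> Poll<i32> {
    // Before pinning, the future is an ordinary owned value.
    let fut = add(1, 2);
    let moved = fut; // moving it is fine
    let boxed = Box::new(moved); // so is moving it to the heap

    // From here on the future is pinned: it stays in this heap
    // allocation for as long as it is polled.
    let mut pinned = Box::into_pin(boxed);

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    pinned.as_mut().poll(&mut cx)
}
```

Because add contains no await points, the first poll completes immediately with the sum.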

While we were experimenting with APIs for expressing this requirement, Ralf Jung was kind enough to formalize the idea. In Ralf’s model, even before the work on async/await, objects could be in one of two “typestates”: they are “owned,” in which state they can be moved freely, or they are “shared,” in which state they cannot be moved for some lifetime (because they have references pointing to them). To support self-referential future types, Ralf’s model gained a third typestate, which is called “pinned.”

Once an object enters the pinned typestate, it can never be moved again. More specifically, its memory cannot be invalidated without first executing its destructor. This definition also includes some other edge cases, like freeing memory without running the destructor, but the main way you invalidate an object’s memory without the destructor running is by moving the object to a new location. The easiest way to understand the pinned typestate is to think of it as requiring that the object never is moved again.

Another fact about the pinned typestate is that for most types it is completely irrelevant. If a value of a type can never contain any self-references, pinning it is useless. So for most types of objects, one would want types to be able to opt out of ever entering the pinned typestate so that you can move them again if you want.

There is a more detailed description of the pinned typestate in the formal model of Rust on Ralf’s blog for people who are interested. But having understood the requirements of pinning (first informally, and then formally by Ralf), we had the problem of finding the best way to represent an object entering the pinned typestate in the surface language of Rust. Ralf’s model describes the semantics of the language, but doesn’t specify a user-facing API or syntax. The solution that we ended up with was the Pin type, but it wasn’t the first solution we tried.

`?Move`

Before we tried Pin, we tried a solution based on a new trait which we called Move. The idea was that most types would implement Move, and nothing about them would change, but any type that could contain self-references would not implement Move. For these types which don’t implement Move, whenever you take a reference to a value of that type, that value enters the pinned typestate and no longer can be moved.

This definition is at the same time somewhat complex - people often assume Move controls moving at all, which wasn’t the original proposal - and also in other ways somewhat intuitive - you can’t possibly store a self-reference in a value without taking a reference to that value to store, so tying the transition to the pinned typestate to the taking of a reference provides a straightforward guarantee of safety. And the check for this could be implemented automatically in the compiler: for types that don’t implement Move, disallow moving values of those types after they’ve been referenced in the same way you disallow moving values of non-Copy types after they’ve been moved. This behavior was even implemented in a branch.

The design has one fundamental limitation, which is that sometimes you want to take a reference to a value that will later be self-referential without pinning it in place. For example, maybe you want to store it briefly in an Option and then use Option::take to take it away. This would probably be the most significant problem with this original Move trait, but we didn’t even get to the point of really identifying that problem, because we discovered early on that adding Move would not be a backward compatible change.

I’ve written about this before, but let me reiterate. There are two kinds of automatically implemented “marker traits” in Rust:

Auto traits: these are automatically implemented for types if all their fields implement them. Major examples include Send and Sync.
?Traits: these are automatically implemented for types if all their fields implement them, and generic parameters are assumed to implement them as well unless they explicitly opt out. The only example of this is ?Sized.

We knew all along that we couldn’t make Move an auto trait, because there are stable APIs that depend on the fact that you can always move out of a mutable reference. The classic example of this is mem::swap, which swaps the location of two values of the same type. You can’t allow swapping types that don’t implement Move, but there’s no Move bound on that API, and adding a new bound to it would be a breaking change.
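To make the constraint concrete, here is a small sketch of the kind of stable, safe code that already exists. The function swap_any is a hypothetical wrapper; the point is that its signature carries no bound beyond the implicit Sized, yet it relocates both values through nothing more than mutable references.

```rust
use std::mem;

/// Hypothetical generic helper: no bound on `T` other than the
/// implicit `Sized`, and certainly no `Move` bound.
fn swap_any<T>(a: &mut T, b: &mut T) {
    // Each value ends up at the other's address.
    mem::swap(a, b);
}
```

For example, calling swap_any on two String variables exchanges their contents in place. Any design that forbids moving certain types has to account for every generic function shaped like this one.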

Our assumption, therefore, was that we would need to add Move as a ?Trait: ?Move. By default, all generics would be assumed to be Move, but if an API doesn’t require the ability to move the parameter it can add a T: ?Move bound to the API. This was already not very appealing: a lot of APIs don’t need the value to be movable and would presumably gain a ?Move bound, making Rust documentation harder to understand across the board. But the whole plan went down with the fact that adding Move as a ?Trait was also not backward compatible.

The problem is with associated types: the place to add ?Trait bounds to associated types is at the trait’s definition site. If the associated type of a trait does not have a ?Trait bound, all code which uses that trait is allowed to assume that the associated type implements that trait. Moreover, relaxing that bound on an existing trait would be a breaking change, because code is allowed to exist which relies on that bound.

Here is an example using IntoFuture, which assumes that the associated future type has the behavior of a type that implements Move:

fn swap_into_future<T: IntoFuture>(into_f1: T, into_f2: T) {
    let mut f1 = into_f1.into_future();
    let mut f2 = into_f2.into_future();
    // This would become an error if you add
    // `type IntoFuture: ?Move` to the trait:
    mem::swap(&mut f1, &mut f2);
}

This problem is widespread, because many fundamental operators involve associated types. For example, you could not even have a mutable reference to a ?Move type implement DerefMut, because the target of a pointer is an associated type:

fn swap_derefs<T: DerefMut<Target: Sized>>(mut r1: T, mut r2: T) {
    // This would become an error if you add
    // `type Target: ?Move` to the trait:
    mem::swap(&mut *r1, &mut *r2);
}

The same is true of the return type of a function, the item of an iterator, the value returned by the index operators, by the arithmetic operators, and so on. It is simply not backward compatible to add another new ?Trait, and an edition cannot be easily used to solve the problem, because it is necessary that the interfaces of traits remain the same so that two crates in different editions can be composed together.

`Pin`

Given that limitation, we set out to solve the problem in a completely different direction. Instead of making the pinned typestate a property of the object’s type which it enters whenever it is referenced, we designed a new category of reference which puts the object into the pinned typestate when the reference is created. This is represented with the Pin type.

Pin is a wrapper type that can wrap any kind of pointer (both the built-in reference types and library-defined “smart pointers” like Box). It means that the pointer puts its target into the pinned typestate, so the target must never be moved again. To keep the necessary changes as minimal as possible, we implemented this as a library API, rather than having immovability enforced by the compiler. This means that when code actually needs to mutate the object that is pinned, it must use an unsafe API to get access to it and guarantee that the object is not moved through that ordinary mutable reference.

Since most types don’t have a meaningful difference between the pinned typestate and normal, the Unpin auto trait was added. This allows getting a mutable reference from a pinned pointer without unsafe code if the type can’t be self-referential. It’s perfectly safe to move the object out of a Pin if it implements Unpin. This is a lot like Move, but by tying this behavior only to pinned pointers, we avoid the issue of backward compatibility as well as the original problem that you couldn’t reference a ?Move object without pinning it in place. Because pinning only applies to pinned pointers, ordinary unpinned references still work just fine with types that are not Unpin.
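A short sketch of the difference Unpin makes, using only standard library APIs. Immovable is a hypothetical type that opts out of Unpin through the PhantomPinned marker; bump and read_pinned are likewise illustrative names.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical type that is not `Unpin`, as a compiler-generated
/// future would be.
struct Immovable {
    value: i32,
    _pin: PhantomPinned,
}

/// `i32` is `Unpin`, so a pinned pointer to it behaves like `&mut i32`.
fn bump(mut n: Pin<&mut i32>) -> i32 {
    *n += 1; // `DerefMut` is available because the target is `Unpin`
    let plain: &mut i32 = n.get_mut(); // safe for the same reason
    *plain += 1;
    *plain
}

/// For a type that is not `Unpin`, only shared access is safe.
fn read_pinned(p: Pin<&mut Immovable>) -> i32 {
    // `p.get_mut()` would not compile here; the unsafe
    // `get_unchecked_mut` is the only route to `&mut Immovable`.
    p.value
}
```

Pin::new has the same restriction as get_mut: it only accepts pointers to Unpin targets, which is why values like Immovable are pinned with Box::pin or the pin! macro instead.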

There are more details in the documentation for the Pin type and for the pin module, which over the years have grown into a comprehensive and clear explanation of pinning as it exists in Rust today.

Of course, the biggest advantage of the Pin interface was that it was backward compatible to add. Because all of the APIs that can move referenced data like swap require a mutable reference, once you pin an object with Pin you can’t call them on that object anymore. But because the new pinned typestate only applies to special pinned references, it doesn’t require breaking changes to the rest of the Rust language. This is why we went forward with this design: it was possible to add without breaking any existing code or violating Rust’s backward compatibility guarantees.

The problems with `Pin`

Despite meeting our requirements in a backward compatible way, Pin has proven to have several problems in terms of usability. It is indeed a “complexity spike” when users have to deal with Pin. But what is the cause of this complexity?

One theory would be that the problem is that whereas the Move trait was to be enforced by the compiler, the Pin type requires unsafe code to mutate objects while they’re pinned. With the Move trait, this was automatically enabled by marking mutating APIs which don’t move the object ?Move. To some extent this is true, but we should be careful not to exaggerate it. For example, you can already assign to a pinned object using Pin::set, which is totally safe. And code which actually needs to mutate pinned objects is rare: in general, that’s code generated by the compiler when it lowers your async function to a future, not code you write yourself.

Another theory (advanced by Yosh Wuyts here) is that the reason Pin is difficult to use is that it’s “conditional.” This also does not strike me as the problem. Plenty of things in Rust and programming are “conditional” but are lauded as making programmers’ lives easier. For example, non-lexical lifetimes were all about making lifetimes end at different points in different branches of a conditional, and everyone sees this as making Rust easier to understand. Maybe there was some naming issue that has made it hard to understand the relationship between Pin (a type) and Unpin (a trait), but I don’t think this is the heart of the problem.

In my opinion, the problem with Pin is that it was implemented as a pure library type, whereas the ordinary reference types have plenty of syntactic sugar and support as built-in types that are part of the language. A lot of nice features that references have disappear when you are dealing with pinned references. This makes the experience much worse, and more importantly breaks many users' mental models, because they’ve built up an understanding of how references behave based on what the compiler accepts, and similar code stops being accepted once you are dealing with pinned references.

A very prominent example is the notion of “reborrowing,” which normal mutable references have but pinned references do not. Consider this: &mut T does not implement Copy, and yet it’s perfectly permissible to pass it as an argument multiple times in a row, like so:

fn incr(x: &mut i32) {
    *x += 1;
}

fn incr_twice(x: &mut i32) {
    incr(x);
    incr(x);
}

It never occurs to most users to ask why this is allowed, but in fact it violates a basic rule of Rust: types which don’t implement Copy can’t be moved more than once. The reason is that there is an implicit coercion in the compiler called “reborrowing,” in which when mutable references are used, it will functionally insert a “reborrow” (as if you wrote &mut *x instead of x), to borrow the reference again, instead of moving it.

Pin does not have this affordance, because it is a normal library type that doesn’t implement Copy. This means that when you use a Pin<&mut T> more than once, you get an error about using the value after a move or sometimes an even more inscrutable error about lifetimes. Instead, you have to explicitly reborrow Pin using the Pin::as_mut function. This difference is the cause of a lot of confusion when users try to use Pin.
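For comparison, here is a sketch of the same pair of functions written against pinned references. Using i32 is purely illustrative (it is Unpin, so pinning it has no effect), but the ownership rules on Pin<&mut T> are the same for any T.

```rust
use std::pin::Pin;

fn incr(mut x: Pin<&mut i32>) {
    *x += 1;
}

fn incr_twice(mut x: Pin<&mut i32>) {
    // Writing `incr(x); incr(x);` fails with a use-after-move error,
    // because `Pin<&mut i32>` is not `Copy` and is not reborrowed
    // implicitly. `as_mut` performs the reborrow by hand.
    incr(x.as_mut());
    incr(x.as_mut());
}
```

Note that the parameter also has to be declared mut, because as_mut takes the Pin itself by mutable reference; that is another small difference from the &mut i32 version.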

I could go on and on. Consider the example of set mentioned above: it’s safe to assign to a pinned reference, but you need to use the set method. With a mutable reference you can just assign using a dereference and the assignment operator. This is not true of Pin; you need to learn the special API. Many such special cases exist, and they exist because Pin is a library type with no support in the language’s syntax.
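A minimal sketch of that contrast, with hypothetical helper names. Both functions overwrite the target in place and drop the old value; only the spelling differs.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical type that is not `Unpin`.
struct Marked {
    value: i32,
    _pin: PhantomPinned,
}

/// With an ordinary mutable reference, assignment is an operator.
fn overwrite_plain<T>(slot: &mut T, value: T) {
    *slot = value;
}

/// With a pinned reference, `*slot = value` is rejected unless
/// `T: Unpin`, so the assignment goes through `Pin::set` instead.
fn overwrite_pinned<T>(mut slot: Pin<&mut T>, value: T) {
    slot.set(value);
}
```

Pin::set is safe even for types that are not Unpin because the old value is dropped where it sits and the new one is written into the same memory; nothing that was pinned is relocated.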

Undoubtedly the worst problem in this category is the problem of “pinned projections.” A “projection” is the fancy programming language term for a field access: “projecting” from an object to a field of that object (I guess in the same sense that an awning might project out of a wall). The problem of pinned projections is that it is challenging to take a pinned reference to an object and get a pinned reference to a field of that object. There are third party crates like pin-project-lite to solve this problem, but they require learning a complex new API involving macros, emphasizing the way that pinned references are much harder to use than ordinary references because they are a library type.
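Without such a crate, a projection has to be written by hand with unsafe code. The sketch below uses a hypothetical Wrapper type and the standard library’s Pin::map_unchecked_mut; it is roughly the kind of accessor a projection macro would otherwise write for you, not a reproduction of what any particular crate emits.

```rust
use std::pin::Pin;

/// Hypothetical wrapper around some inner value, e.g. a future.
struct Wrapper<F> {
    inner: F,
    polls: u32,
}

impl<F> Wrapper<F> {
    /// Projects a pinned reference to the wrapper into a pinned
    /// reference to its `inner` field.
    fn project_inner(self: Pin<&mut Self>) -> Pin<&mut F> {
        // SAFETY: this relies on promises the compiler cannot check:
        // `inner` is never moved out of a pinned `Wrapper`, `Wrapper`
        // has no `Drop` impl that moves it, and the struct is not
        // `repr(packed)`.
        unsafe { self.map_unchecked_mut(|w| &mut w.inner) }
    }
}
```

With an ordinary reference the equivalent is just &mut wrapper.inner, with no method and no safety argument at all.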

The worst part of this is a very unfortunate interaction between pinned projections and the Drop trait. The problem arises because Drop::drop takes a normal mutable reference. Consider this possibility: you have a type which has a field that is a self-referential field. You pin project to that field and poll it. Then, in the destructor, you move out of that field (because you now have an unpinned mutable reference), pin that future to the stack, and poll it there. You’ve just violated the pinning guarantees.

The solution that crates like pin-project-lite use is to restrict your ability to define a destructor. This works alright in practice, but this very fact is additional complexity that has to be documented when explaining what precisely the pinning guarantees are. Unfortunately, Drop was stable before Pin, so we had to work around it.

In my next post…

Despite these problems of usability, Pin is my proudest accomplishment of my work on Rust. We enabled users to compile async functions which contain arbitrary references safely into self-referential objects; without this async/await would not have been nearly as usable a feature as it has been, because references are a fundamental part of how users write Rust. And we did so in a manner which was entirely backward compatible with the language that already existed. Pin is now a fundamental component of a thriving ecosystem for high-performance network services and other use cases of asynchronous programming.

But I do agree with the critics: Pin represents a complexity cliff and working with a pinned reference is much harder than working with an ordinary reference. That’s why in a few days I will turn to the subject of how Pin could be improved. The key concept is the notion of pinned places.
