Pinned places


In the previous post, I described the goal of Rust’s Pin type and the history of how it came to exist. When we were initially developing this API in 2018, one of our explicit goals was to limit the number of changes we would make to Rust, because we wanted to ship a “minimum viable product” of async/await syntax as soon as possible. This meant that Pin is a type defined in the standard library, without any syntactic or language support except for the ability to use it as a method receiver. As I wrote in my previous post, in my opinion this is the source of a “complexity cliff” when users have to interact with Pin.

We knew when we made this choice that pinned references would be harder to use and more confusing than ordinary references, though I think we did underestimate just how much more challenging they would be for most users. Our initial hope was that with async/await, pinning would disappear into the background, because the await operator and the runtime’s spawn function would pin your futures for you and you wouldn’t have to encounter it directly. As things played out, there are still some cases where users must interact with pinned references, even when using async/await. And sometimes users do need to “drop down” into a lower-level register to implement Future themselves; this is when they truly encounter a huge complexity cliff: both the essential complexity of implementing a state machine “by hand” and the additional complexity of understanding the APIs to do with Pin.
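
To make that cliff concrete, here is a minimal hand-written future; Ready is a made-up stand-in, and it is deliberately trivial rather than a real state machine. The unfamiliar part is the self: Pin<&mut Self> receiver where an ordinary method would take &mut self:

    use std::future::Future;
    use std::pin::Pin;
    use std::task::{Context, Poll};

    // A deliberately trivial hand-written future (real ones are state machines).
    struct Ready(Option<u32>);

    impl Future for Ready {
        type Output = u32;

        // The receiver is Pin<&mut Self> rather than &mut self.
        fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
            // Ready is Unpin, so get_mut lets us reach the underlying &mut Self.
            Poll::Ready(self.get_mut().0.take().expect("polled after completion"))
        }
    }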

My contention in my previous post was that the difficulties involved in this have very little to do with the complexity inherent in the pinned typestate as a concept, or in pinned references as a way of representing it, but instead arise from the fact that Pin is a pure library type without support from the language. Users who deal with Pin are almost always doing something that is totally memory safe; the problem is just that the idioms for doing so with Pin are different from and less clear than the idioms for doing so with ordinary references.

In this post, I want to propose a set of language changes - completely backward compatible with the language as it exists and the async ecosystem built on Pin - which will make interacting with pinned references much more similar to interacting with ordinary references.

Pin


The Pin type (and the concept of pinning in general) is a foundational building block on which the rest of the Rust async ecosystem stands. Unfortunately, it has also been one of the least accessible and most misunderstood elements of async Rust. This post is meant to explain what Pin achieves, how it came to be, and what the current problem with Pin is.

There was an interesting post a few months ago on the blog of the company Modular, which is developing a new language called Mojo. In a brief section discussing Pin in Rust, I found that it very succinctly captured the zeitgeist of the public discussion of the subject:

In Rust, there is no concept of value identity. For a self-referential struct pointing to its own member, that data can become invalid if the object moves, as it’ll be pointing to the old location in memory. This creates a complexity spike, particularly in parts of async Rust where futures need to be self-referential and store state, so you must wrap Self with Pin to guarantee it’s not going to move. In Mojo, objects have an identity so referring to self.foo will always return the correct location in memory, without any additional complexity required for the programmer.

Some aspects of these remarks confuse me. The term “value identity” is not defined anywhere in this post, nor can I find it elsewhere in Mojo’s documentation, so I’m not clear on how Modular claims that Mojo solves the problem that Pin is meant to solve. Despite this, I do think the criticism of Pin’s usability is well stated: there is indeed a “complexity spike” when a user is forced to interact with it. The phrase I would use is actually a “complexity cliff,” as in the user suddenly finds themself thrown off a cliff into a sea of complex, unidiomatic APIs they don’t understand. This is a problem and it would be very valuable to Rust users if the problem were solved.

As it happens, this little corner of Rust is my mess; adding Pin to Rust to support self-referential types was my idea. I have ideas of how this complexity spike could be resolved, which I will elaborate in a subsequent post. Before I can get there though I need to first try to explain, as efficiently as I know how, what Pin accomplishes, how it came to exist, and why it is currently difficult to use.

Ownership


This post is meant as an explainer about how substructural type theory can be applied in programming language design. Terms like “substructural type theory” tend to scare and confuse programmers who don’t write Haskell on the weekends, so one thing programming language designers should do when thinking about how they will present their language is invent metaphors, even slightly misleading ones, to help more ordinary programmers understand how their language works. One such term is “ownership.”

Not infrequently, objects in a program come to represent something outside of themselves; they are not “pure data” but some kind of resource with identity. A classic example of this might be a sort of handle granting exclusive access to a resource. For this to really work well you want to know that to get that object you had to execute certain code (a “constructor”), that when the object is no longer used some other code will be executed (a “destructor”), and that while the object is in scope, no concurrently executing code also has an object representing the same exclusive resource (that it is not “aliased”). This is what ownership (as presented in Rust, at least) is all about.
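
As a rough sketch of how Rust expresses this, here is a made-up ExclusiveHandle type: the only way to obtain one is through its constructor, a destructor runs when ownership ends, and because the type is neither Copy nor Clone there is never more than one live handle for the same resource:

    // ExclusiveHandle is an illustrative stand-in, not a real API.
    struct ExclusiveHandle {
        id: u32,
    }

    impl ExclusiveHandle {
        // The only way to obtain a handle: the "certain code" that must run first.
        fn acquire(id: u32) -> ExclusiveHandle {
            println!("acquired {id}");
            ExclusiveHandle { id }
        }
    }

    impl Drop for ExclusiveHandle {
        // Runs exactly once, when the owning value goes out of scope.
        fn drop(&mut self) {
            println!("released {}", self.id);
        }
    }

    fn main() {
        let h = ExclusiveHandle::acquire(7);
        let h2 = h;                // ownership moves; `h` can no longer be used
        // println!("{}", h.id);   // error: use of moved value
        drop(h2);                  // destructor runs here (or at end of scope)
    }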

I want to demystify ownership and substructural types in the hopes that this will become more common knowledge. Nothing in this post is really groundbreaking - if you’re already “in the know,” it will contain no new information - but it does contain some notes on aspects of Rust’s implementation that I think are incorrect (one of which would even be easy to change).

References are like jumps


In a high-level language, the programmer is deprived of the dangerous power to update his own program while it is running. Even more valuable, he has the power to split his machine into a number of separate variables, arrays, files, etc.; when he wishes to update any of these he must quote its name explicitly on the left of the assignment, so that the identity of the part of the machine subject to change is immediately apparent; and, finally, a high-level language can guarantee that all variables are disjoint, and that updating any one of them cannot possibly have any effect on any other.

Unfortunately, many of these advantages are not maintained in the design of procedures and parameters in ALGOL 60 and other languages. But instead of mending these minor faults, many language designers have preferred to extend them throughout the whole language by introducing the concept of reference, pointer, or indirect address into the language as an assignable item of data. This immediately gives rise in a high-level language to one of the most notorious confusions of machine code, namely that between an address and its contents. Some languages attempt to solve this by even more confusing automatic coercion rules. Worse still, an indirect assignment through a pointer, just as in machine code, can update any store location whatsoever, and the damage is no longer confined to the variable explicitly named as the target of assignment…

Unlike all other values (integers, strings, arrays, files, etc.) references have no meaning independent of a particular run of a program. They cannot be input as data, and they cannot be output as results. If either data or references to data have to be stored on files or backing stores, the problems are immense. And on many machines they have a surprising overhead on performance, for example they will clog up instruction pipelines, data lookahead, slave stores, and even paging systems. References are like jumps, leading wildly from one part of a data structure to another. Their introduction into high-level languages has been a step backward from which we may never recover.

— C.A.R. Hoare, Hints on Programming Language Design, 1974

How embarrassing it is for the practice of software development that, when it comes to the subject of references, we have spent half a century constructing an enormous object lesson proving that Sir Tony was correct. Null pointers may have been his billion dollar mistake, but the decision to ignore his remarks about the problem of pointers in general was a trillion dollar mistake that everyone else made.

What Tony Hoare was writing about when he said that references are like jumps was the problem of mutable, aliased state. If you have in a language the ability to alias two variables so that they refer to the same location in memory, and also the ability to assign values to variables as execution progresses, your ability to locally reason about the behavior of a component of your system becomes badly inhibited. Depriving the user of the ability to mutate aliased state by accident is critical to enabling the user to easily create correctly functioning systems.
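
Rust’s answer is its borrowing rules: a location can be aliased or mutated, but not both at once. A minimal illustration (the vector and variable names are just for the example):

    fn main() {
        let mut v = vec![1, 2, 3];
        let first = &v[0]; // `v` is now aliased by a shared reference...
        // v.push(4);      // ...so this mutation is rejected at compile time:
                           // it could reallocate and leave `first` dangling.
        println!("{first}");
        v.push(4);         // fine once the alias is no longer live
    }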

Coroutines and effects


For the past few months I’ve been mulling over some things that Russell Johnston made me realize about the relationship between effect systems and coroutines. You can read more of his thoughts on this subject here, but in short, he helped me see that effect systems (like that found in Koka) and coroutines (like Rust’s async functions or generators) are in some ways isomorphic to one another. I’ve been pondering the differences between them, trying to figure out the advantages and disadvantages of each.

A few weeks ago, Will Crichton posted something on Twitter that helped bring the contrast into sharper focus for me:

The entire field of PL right now: what if it was dynamically scoped…. but statically typed…………..? (effects, capabilities, contexts, metavariables…)

I’m just a humble language designer (and not a theorist of anything, especially not PL), so my focus is the difference in user experience and affordance. But this seems like a cutting insight, and this property of effect handlers - static typing but dynamic scoping - seems to me to be a good jumping-off point for understanding the difference between effect handlers and coroutines from a user perspective.

Iterators and traversables


This is a brief note about the definition of iterator.

Asynchronous clean-up


One problem with the design of async Rust is what to do about async clean-up code. Consider a type representing some object or operation (like an async IO handle) that runs clean-up code when you are done using it, but that clean-up code is itself non-blocking and could yield control. Async Rust has no good way to handle this pattern today.
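
The workaround people reach for today is an explicit async close method that the caller must remember to await; Connection and its methods here are stand-ins, not a real library API:

    struct Connection;

    impl Connection {
        // Explicit, awaitable clean-up: flush buffers, send a goodbye frame, etc.
        async fn close(self) {
            // ... clean-up that may yield control ...
        }
    }

    async fn use_conn(conn: Connection) {
        // ... use the connection ...
        conn.close().await; // easy to forget, and skipped on early return or panic
    }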

The nicest solution seems to be to just use the mechanism that already exists: destructors. If only you could await inside a destructor, everything would seem to be solved. Alas, this would present several problems, and I personally do not believe it is realistic to imagine Rust gaining async destructors that work the same way ordinary destructors do.

The first problem is this: what happens if you drop the value in a non-async scope? It’s not possible to await there! There are two options: either the async destructor doesn’t run (considered too easy a mistake to make), or there is a type-checking rule that prevents users from dropping values with async destructors in non-async scopes. The second solution reduces to undroppable types, which I will discuss later in this post: this rule is just undroppable types with an exception to allow them to be dropped in an async scope. What I can say with certainty is that undroppable types, even with an exception, would be very difficult to add to Rust.
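
A small sketch of the first problem, with Connection again standing in for a type that would hypothetically have an async destructor:

    struct Connection; // stand-in type, as above

    // Hypothetical: suppose Connection had an async destructor.
    fn sync_context(_conn: Connection) {
        // _conn is dropped at the end of this non-async function, but there is
        // nowhere to await here: either the async destructor silently doesn't
        // run, or this function has to be rejected by the type checker.
    }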

The second problem is the way that the state of the async destructor would impact the state of any future containing it. This is actually a re-emergence of the problems with async methods, but now applied to every generic type (because you don’t know whether a generic type T has an async destructor). First, if you have a trait object, what happens when it drops and it has an async destructor? This introduces the same object safety issues as async methods: you have nowhere to store the future returned by the async destructor of a trait object. Second, if you want to send a value to a different thread, the state of its async destructor also needs to be Send. This is the same problem that motivated RTN, except that now it’s a problem for every generic type being moved to another thread, not only types on which you explicitly call an async method. I wrote about this problem years and years ago, but it seems to have been misunderstood and ignored since then.
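
A minimal sketch of the Send problem, under the assumption that a type parameter T could carry an async destructor; spawn here is a stand-in for any work-stealing executor’s spawn function:

    use std::future::Future;

    // Stand-in for a work-stealing executor's spawn, which requires Send futures.
    fn spawn<F: Future + Send + 'static>(_fut: F) { /* hand off to an executor */ }

    async fn use_and_drop<T: Send + 'static>(value: T) {
        // `value` is dropped at the end of this future. If T had an async
        // destructor, that destructor's state would become part of this future,
        // so the future is only Send if that state is Send too -- and the bound
        // T: Send says nothing about it.
        drop(value);
    }

    fn run<T: Send + 'static>(value: T) {
        spawn(use_and_drop(value));
    }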

The third problem is that users are concerned about having implicit await points added to their futures without realizing it. Therefore there would need to be some restriction that not only prevents these types from being dropped in a non-async scope, but also ensures they are destructed at an already-explicit await point. This would make the rules around when their async destructors run very different from other destructors, if it’s even possible to make them coherent.

The fourth problem, which I believe may never have been raised before, is that running async destructors strictly sequentially is not always the ideal code generation. For example, if I have two values that I am asynchronously dropping, I might want to join their destructors so they run concurrently. But doing this implicitly would be very risky, because I might actually be relying on one carefully running before the other.

All of these problems hint at a different way to frame the problem of asynchronous clean-up: the problem is not that there is no async drop, but that destructors really only work when you can write a destructor function that returns (). Async clean-up is just a special case of clean-up which does not return (). In this case it returns a future, but there are also scenarios in which the issue is a lack of destructors that can return Result, for example.
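
To put that framing in terms of signatures: today’s Drop requires a clean-up function returning (), while async clean-up wants a clean-up function whose result is a future. The trait below is only a sketch of that shape; nothing like it exists in the standard library:

    // std::ops::Drop today: clean-up must complete synchronously and return ().
    //
    //     pub trait Drop {
    //         fn drop(&mut self);
    //     }
    //
    // A hypothetical async variant would return a future instead -- clean-up
    // whose "return type" is not ().
    trait AsyncDropSketch {
        async fn drop(&mut self);
        // equivalently: fn drop(&mut self) -> impl Future<Output = ()>
    }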

I want to explore the design space for asynchronous clean-up and clean-up code that returns values in general, without a focus on destructors specifically. The proposal I’ve fleshed out here, based heavily on the work of others (especially Eric Holk and Tyler Mandry), combines two distinct features - async future cancellation and a dofinal construct - to enable users to write asynchronous clean-up code that is consistently called. I will also show how these constructs are required for any sort of “linear type” mechanism in Rust, so rather than seeing them as an alternative to type-based async clean-up code, they should be seen as prerequisites that can be implemented in the nearer term.

FuturesUnordered and the order of futures


In my previous post, I wrote about the distinction between “multi-task” and “intra-task” concurrency in async Rust. I want to open this post by considering a common pattern that users encounter, and how they might implement a solution using each technique.

Let’s call this “sub-tasking.” You have a unit of work that you need to perform, and you want to divide that unit into many smaller units of work, each of which can be run concurrently. This is intentionally extremely abstract: basically every program of any significance contains an instance of this pattern at least once (often many times), and the best solution will depend on the kind of work being done, how much work there is, the arity of concurrency, and so on.

  • Using multi-task concurrency, each smaller unit of work would be its own task. The user would spawn each of these tasks onto an executor. The results of the tasks would be collected with a synchronization primitive like a channel, or the tasks would be awaited together with a JoinSet.

  • Using intra-task concurrency, each smaller unit would be a future run concurrently within the same task. The user would construct all of the futures and then use a concurrency primitive like join! or select! to combine them into a single future, depending on the exact access pattern. (A sketch of both approaches follows this list.)
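
Here is a rough sketch of both approaches, assuming the tokio and futures crates; work is a stand-in for a smaller unit of work:

    use futures::join;
    use tokio::task::JoinSet;

    async fn work(n: u64) -> u64 { n * 2 } // stand-in for a smaller unit of work

    // Multi-task: each unit is spawned as its own ('static) task.
    async fn multi_task() -> Vec<u64> {
        let mut set = JoinSet::new();
        for n in 0..10u64 {
            set.spawn(work(n));
        }
        let mut results = Vec::new();
        while let Some(res) = set.join_next().await {
            results.push(res.expect("task panicked"));
        }
        results
    }

    // Intra-task: a static number of futures combined into one future.
    async fn intra_task() -> (u64, u64, u64) {
        join!(work(1), work(2), work(3))
    }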

Each of these approaches has its advantages and disadvantages. Spawning multiple tasks requires that each task be 'static, which means they cannot borrow data from their parent task. This is often a very annoying limitation, not only because it might be costly to use shared ownership (meaning Arc and possibly Mutex), but also because even if shared ownership isn’t going to be problematic in this context, restructuring your code to use it is an annoying chore. (I’d love to see this change! Cheap shared ownership constructs like Arc and Rc should have non-affine semantics so you don’t have to call clone on them.)

When you join multiple futures, they can borrow from state outside of them within the same task, but as I wrote in the previous post, you can only join a static number of futures. Users that don’t want to deal with shared ownership but have a dynamic number of sub-tasks they need to execute are left searching for another solution. Enter FuturesUnordered.

FuturesUnordered is an odd duck of an abstraction from the futures library, which represents a collection of futures as a Stream (in std parlance, an AsyncIterator). This makes it a lot like tokio’s JoinSet in surface appearance, but unlike JoinSet the futures you push into it are not spawned separately onto the executor; they are polled as the FuturesUnordered is polled. Much like spawning a task, every future pushed into FuturesUnordered is separately allocated, so representationally it’s very similar to multi-task concurrency. But because the FuturesUnordered is what polls each of these futures, they don’t execute independently and they don’t need to be 'static. They can borrow surrounding state as long as the FuturesUnordered doesn’t outlive that state.
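
A small sketch of that usage, assuming the futures crate; fetch is a stand-in for real asynchronous work:

    use futures::stream::{FuturesUnordered, StreamExt};

    async fn fetch(name: &str) -> usize { name.len() } // stand-in for real async work

    async fn run(names: &[String]) -> usize {
        // A dynamic number of futures, each borrowing from `names` -- allowed
        // because FuturesUnordered polls them itself instead of requiring them
        // to be spawned as independent 'static tasks.
        let mut futures = FuturesUnordered::new();
        for name in names {
            futures.push(fetch(name));
        }
        let mut total = 0;
        // Results arrive in completion order, not insertion order.
        while let Some(len) = futures.next().await {
            total += len;
        }
        total
    }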

In a sense, FuturesUnordered is a sort of hybrid between intra-task concurrency and multi-task concurrency: you can borrow state from the same task like intra-task, but you can execute arbitrarily many concurrent futures like multi-task. So it seems like a natural fit for the use case I was just describing when the user wants that exact combination of features. But FuturesUnordered has also been a culprit in some of the more frustrating bugs that users encounter when writing async Rust. In the rest of this post, I want to investigate the reasons why that is.

Let futures be futures


In the early-to-mid 2010s, there was a renaissance in languages exploring new ways of doing concurrency. In the midst of this renaissance, one abstraction that was developed for concurrent operations was the “future” or “promise,” which represented a unit of work that will maybe eventually complete, allowing the programmer to use it to manipulate control flow in their program. Building on this, syntactic sugar called “async/await” was introduced to take futures and shape them into the ordinary, linear control flow that is most common. This approach has been adopted in many mainstream languages, a series of developments that has been controversial among practitioners.

There are two excellent posts from that period which do a very good job of making the case for the two sides of the argument. I couldn’t more strongly recommend reading each of these posts in full:

The thesis of Eriksen’s post is that futures provide a fundamentally different model of concurrency from threads. Threads provide a model in which all operations occur “synchronously,” because the execution of the program is modeled as a stack of function calls, which block when they need to wait for concurrently executing operations to complete. In contrast, by representing concurrent operations as asynchronously completing “futures,” the futures model enabled several advantages cited by Eriksen. These are the ones I find particularly compelling:

  1. A function performing asynchronous operations has a different type from a “pure” function, because it must return a future instead of just a value. This distinction is useful because it lets you know if a function is performing IO or just pure computation, with far-reaching implications.
  2. Because they create a direct representation of the unit of work to be performed, futures can be composed in multiple ways, both sequentially and concurrently. Blocking function calls can only be composed sequentially without starting a new thread.
  3. Because futures can be composed concurrently, concurrent code can be written which more directly expresses the logic of what is occurring. Abstractions can be written which represent particular patterns of concurrency, allowing business logic to be lifted from the machinery of scheduling work across threads. Eriksen gives examples like a flatMap operator to chain many concurrent network requests after one initial network request. (A sketch of this kind of composition in Rust follows this list.)
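
As a rough Rust analogue of Eriksen’s example (the functions are made-up stand-ins, using join_all from the futures crate): the dependent lookup is composed sequentially with await, while the independent per-friend lookups are composed concurrently:

    use futures::future::join_all;

    async fn fetch_friend_ids(_user: u64) -> Vec<u64> { vec![1, 2, 3] }     // stand-in
    async fn fetch_friend_name(id: u64) -> String { format!("user {id}") } // stand-in

    async fn friend_names(user: u64) -> Vec<String> {
        // Sequential composition: the second step depends on the first.
        let friend_ids = fetch_friend_ids(user).await;
        // Concurrent composition: the per-friend lookups are independent,
        // so combine them into one future and await the whole batch.
        join_all(friend_ids.into_iter().map(fetch_friend_name)).await
    }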

Nystrom takes the counter-position. He starts by imagining a language in which all functions are “colored,” either BLUE or RED. In his imaginary language, the important difference between the two colors of function is that RED functions can only be called from other RED functions. He posits this distinction as a great frustration for users of the language, because having to track two different kinds of functions is annoying and, in his language, RED functions must be called using a painfully baroque syntax. Of course, what he’s referring to is the difference between synchronous functions and asynchronous functions. Exactly what Eriksen cites as an advantage of futures - that functions returning futures are different from functions that don’t return futures - is for Nystrom its greatest weakness.

Some of the remarks Nystrom makes are not relevant to async Rust. For example, he says that if you call a function of one color as if it were a function of the other, dreadful things could happen:

When calling a function, you need to use the call that corresponds to its color. If you get it wrong … it does something bad. Dredge up some long-forgotten nightmare from your childhood like a clown with snakes for arms hiding under your bed. That jumps out of your monitor and sucks out your vitreous humour.

This is plausibly true of JavaScript, an untyped language with famously ridiculous semantics, but in a statically typed language like Rust, you’ll get a compiler error which you can fix and move on.

One of his main points is also that calling a RED function is much more “painful” than calling a BLUE function. As Nystrom later elaborates in his post, he is referring to the callback-based API commonly used in JavaScript in 2015, and he says that async/await syntax resolves this problem:

[Async/await] lets you make asynchronous calls just as easily as you can synchronous ones, with the tiny addition of a cute little keyword. You can nest await calls in expressions, use them in exception handling code, stuff them inside control flow.

Of course, he also says this, which is the crux of the argument about the “function coloring problem”:

But… you still have divided the world in two. Those async functions are easier to write, but they’re still async functions.

You’ve still got two colors. Async-await solves annoying rule #4: they make red functions not much worse to call than blue ones. But all of the other rules are still there.

Futures represent asynchronous operations differently from synchronous operations. For Eriksen, this provides additional affordances which are the key advantage of futures. For Nystrom, this is just another hurdle to calling functions which return futures instead of blocking.

As you might expect if you’re familiar with this blog, I fall pretty firmly on the side of Eriksen. So it has not been easy on me to find that Nystrom’s views have been much more popular with the sort of people who comment on Hacker News or write angry, over-confident rants on the internet. A few months ago I wrote a post exploring the history of how Rust came to have the futures abstraction and async/await syntax on top of that, as well as a follow-up post describing the features I would like to see added to async Rust to make it easier to use.

Now I would like to take a step back and re-examine the design of async Rust in the context of this question about the utility of the futures model of concurrency. What has the use of futures actually gotten us in async Rust? I would like us to imagine a world in which the difficulties of using futures have been mitigated or resolved, and the additional affordances they provide make async Rust not only just as easy to use as non-async Rust, but actually a better experience overall.

poll_progress


Last week, Tyler Mandry published an interesting post about a problem that the Rust project calls “Barbara battles buffered streams.” Tyler does a good job explaining the issue, but briefly the problem is that the buffering adapters from the futures library (Buffered and BufferUnordered) do not interact well with for await if the processing in the body is asynchronous (i.e. if it contains any await expressions).
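
In code, the problematic pattern looks roughly like this (using the futures crate; fetch and do_something_async are stand-ins for real asynchronous work):

    use futures::stream::{self, StreamExt};

    async fn fetch(id: u32) -> u32 { id }        // stand-in for an async request
    async fn do_something_async(_value: u32) {}  // stand-in for async processing

    async fn process(ids: Vec<u32>) {
        // Run up to 5 of the requests concurrently, yielding results in order.
        let mut results = stream::iter(ids).map(fetch).buffered(5);

        // But while this body is awaiting, the Buffered adapter is not being
        // polled, so the requests buffered inside it make no progress.
        while let Some(result) = results.next().await {
            do_something_async(result).await;
        }
    }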

I think we can better understand the problem if we examine it visually. First, let’s consider the control flow that occurs when a user processes a normal, non-asynchronous Iterator using a for loop:

                ┌── SOME ────────────────┐ 
        ╔═══════════════╗        ╔═══════▼═══════╗ 
        ║               ║▐▌      ║               ║▐▌
  ──────▶      NEXT     ║▐▌      ║   LOOP BODY   ║▐▌
        ║               ║▐▌      ║               ║▐▌
        ╚════════════▲══╝▐▌      ╚═══════════════╝▐▌
         ▀▀│▀▀▀▀▀▀▀▀▀│▀▀▀▀▘       ▀▀▀▀▀▀▀│▀▀▀▀▀▀▀▀▀▘
           │         └───────────────────┘
           └── NONE ──────────────────────────────▶

The for loop first calls the iterator’s next method, and then passes the resulting item (if there is one) to the loop body. When there are no more items, it exits the loop.
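
That control flow is roughly what the for loop desugars to (simplified from the real desugaring):

    // Roughly: `for item in collection { /* body */ }`
    fn sum(collection: Vec<u32>) -> u32 {
        let mut total = 0;
        let mut iter = IntoIterator::into_iter(collection);
        loop {
            match iter.next() {
                // the loop body runs with each yielded item...
                Some(item) => total += item,
                // ...and when next returns None, control exits the loop
                None => break,
            }
        }
        total
    }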