Async/Await IV: An Even Better Proposal

I did not plan to write this blog post. I thought that the fourth post in my series would explain how we could go from the generator API in my previous post to a futures API in which you don’t have to heap allocate every async call. But eddyb surprised me, and now I have to revisit the API in the previous post, because we can implement everything we need from immovability with a safe interface afterall.

Making Anchors do more

In my previous post, I introduced the concept of an Anchor: the idea of an anchor was that once something was inside an anchor, there is no way to get it out again. You can’t move out of an anchor, and it is unsafe to get a mutable reference to the thing in it.

The problem with the Anchor API was with its constructor: it’s possible to construct an Anchor for which that guarantee just doesn’t buy you much. Specifically, you could construct an Anchor<&mut T>, but once the anchor gets dropped, so does that mutable reference. After that, you can move T as much as you want, violating our guarantees.

If we could guarantee that once something is in an Anchor, it will truly never move again, the API of Generator could become safe:

trait Generator {
    type Yield;
    type Return;
    // Because `Anchor` guarantees that self will never move again, even if
    // this generator contains self-references, this is safe to call.
    fn resume(self: Anchor<&mut Self>) -> CoResult<Self::Yield, Self::Return>;
}

So, if we could provide this guarantee, the only language-level extension we need to is to let an Anchor<&mut Self> be a valid method receiver. This is a miniscule change in comparison to adding immovable types at the language level.

Enter StableDeref

What we need is a way to constrain Anchor::new, so that the type you pass to it provides these guarantees:

  • It is a dereferenceable type which owns its data, rather than borrowing it.
  • Between dereferences, the data will never move around, even if you move this around.

We have a mechanism already for allowing types to say they uphold certain invariants: unsafe traits. We can add these two traits to the standard library:

// This is a smart pointer, and the item it points to doesn't move around
// between immutable dereferences.
unsafe trait StableDeref: Deref { }

// This is a smart pointer, and the item it points to doesn't move around
// between any dereferences, mutable or immutable.
unsafe trait StableDerefMut: StableDeref + DerefMut { }

// Here are a few of the impls of these traits that could exist in std
unsafe impl<T> StableDeref for Box<T> { }
unsafe impl<T> StableDeref for Rc<T> { }
unsafe impl<T> StableDeref for Arc<T> { }
unsafe impl<T> StableDeref for Vec<T> { }

unsafe impl<T> StableDerefMut for Box<T> { }
unsafe impl<T> StableDerefMut for Vec<T> { }

Having expressed that, we can now constrain the Anchor::new constructor appropriately:

pub struct Anchor<Ptr> {
    ptr: Ptr,
}

impl<Ptr: StableDeref> Anchor<Ptr> {
    pub fn new(ptr: ptr) -> Anchor<Ptr> {
        Anchor { ptr }
    }
}

impl<Ptr: Deref> Anchor<Ptr> {
    pub fn as_ref(this: &Anchor<Ptr>) -> Anchor<&Ptr::Target> {
        Anchor { ptr: &*this.ptr }
    }
}

impl<Ptr: DerefMut> Anchor<Ptr> {
    pub fn as_mut(this: &mut Anchor<Ptr>) -> Anchor<&mut Ptr::Target> {
        Anchor { ptr: &mut *this.ptr }
    }
}

This is great! We can now construct a heap-allocated anchor and call resume on it safely like this:

let anchor: Anchor<Box<G>> = Anchor::new(Box::new(generator()));

anchor.as_mut().resume();
anchor.as_mut().resume();

Pinning to the stack

Of course it wasn’t enough for eddyb to show how we could make self-referential resume safe - he also figured out how we could make it safe, even if you want to anchor the item to a location in the stack. To do so, the API is a bit more of a complex API.

First, a function called pin:

fn pin<'a, T>(data: T) -> Pin<'a, T> {
    Pin { data: T, _marker: PhantomData }
}

struct Pin<'a, T> {
    data: T,
    _marker: PhantomData<&'a mut &'a ()>,
}

The pin function binds the data forever (since you cannot move the data out of the Pin wrapper) with an invariant lifetime. If you don’t know what an invariant lifetime is, don’t worry too much - the important thing is that you cannot use subtyping to coerce it to a different lifetime.

If we can tie this lifetime to the lifetime of a reference, we can have a reference that borrows the data “forever” - the reference will borrow the data until the data falls out of scope, and there’s no way for that to change. If we combine that with an Anchor, you can have a mutable reference which you cannot move out of and which is guaranteed to last until the data falls out of scope. That is, even though the data is on the stack, you’ve guaranteed that it can never be moved again.

The trick is to add a second constructor to Anchor:

impl<'a, T: ?Sized> Anchor<&'a mut T> {
    pub fn pinned(pin: &'a mut Pin<'a, T>) -> Anchor<&'a mut T> {
        Anchor { ptr: &mut pin.data, }
    }
}

Because 'a is the same lifetime in the reference you pass to pinned and the lifetime of the Pin type, we’ve provided our guarantee - the data held behind this anchor will never move again.

Looking at this, there are two ways now to anchor data:

// Anchor it in the heap. This costs a heap allocation, but you can return the
// anchor out of this function if you want.
let heap_anchored = Anchor::new(Box::new(generator()));

// Anchor it in the stack. This does not heap allocate, but the anchor can't
// outlive the current stack frame. You have to use a special constructor.
let stack_anchored = Anchor::pinned(&mut pin(generator()));

Conclusion

This is very impressive! With only a couple dozen lines of library code - almost all of it safe - we’re able to make the interface of self-referential generators safe. In fact, we’ve implemented the core guarantee of ?Move - once you put something in an anchor, it will never move again. I imagine that standardizing on these interfaces could be very useful for other experiments in self-referential structs that don’t have to do with generators as well. I’ve implemented the entire API in this demo crate.

Alright, having followed this surprising digression, next time we will finish this series with an explanation of how to make async functions work without anchoring them every time you call one.