Generators with UnpinCell

In July, I described a way to make pinning more ergonomic by integrating it more fully into the language. Last week, I developed that idea further with the notion of UnpinCell: a wrapper type that lets a user take an &pin mut UnpinCell<T> and produce an &mut T, similar to how other cells let a user take a shared reference to the cell and produce a mutable reference to its contents. I believe that this notion can also solve the biggest outstanding issue facing generators: the fact that the Iterator interface does not permit self-referential values.

As I wrote in my explanation of Pin’s design, the biggest advantage that Pin had over other design ideas was that it was a trivially backward compatible way of introducing a contract that an object will never be moved. But this meant that a trait could only opt into that contract using the new interface; traits that existed before Pin and don’t opt into that contract cannot be implemented by types that have self-referential values. The most problematic trait here is Iterator, because generators (functions that evaluate to iterators in the same way async functions evaluate to futures) would ideally support self-referential values just like async functions do. So long as the interface for Iterator takes a mutable reference and not a pinned mutable reference, implementers must assume the iterator can be moved around and therefore can’t be self-referential.

One potential solution would be to add a new trait which is pinned. I’ll call this Generator, but there are other naming options I won’t dwell on here. Like Iterator, it would be paired with a similar IntoGenerator trait, and all generators implement IntoGenerator just like all iterators implement IntoIterator:

trait Generator {
    type Item;

    fn next(&pin mut self) -> Option<Self::Item>;
}

trait IntoGenerator {
    type Item;
    type IntoGen: Generator<Item = Self::Item>;

    fn into_gen(self) -> Self::IntoGen;
}

impl<T: Generator> IntoGenerator for T {
    type Item = T::Item;
    type IntoGen = T;

    fn into_gen(self) -> T {
        self
    }
}
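None of the &pin mut syntax exists today, but the shape of the trait can be approximated on stable Rust by writing the receiver as Pin<&mut Self>. What follows is only a minimal sketch under that assumption; Countdown and collect_all are hypothetical names invented for illustration and are not part of the proposal.

```rust
use std::pin::{pin, Pin};

// Stable-Rust approximation of the proposed trait: `Pin<&mut Self>`
// stands in for the hypothetical `&pin mut self` receiver.
trait Generator {
    type Item;

    fn next(self: Pin<&mut Self>) -> Option<Self::Item>;
}

// Hypothetical hand-written generator that counts down to 1.
// It holds no self-references, so it is `Unpin` automatically.
struct Countdown {
    remaining: u32,
}

impl Generator for Countdown {
    type Item = u32;

    fn next(self: Pin<&mut Self>) -> Option<u32> {
        // `Countdown: Unpin`, so `Pin::get_mut` is available in safe code.
        let this = self.get_mut();
        if this.remaining == 0 {
            return None;
        }
        let current = this.remaining;
        this.remaining -= 1;
        Some(current)
    }
}

// Hypothetical helper: pins any generator to the stack, then drains it.
fn collect_all<G: Generator>(generator: G) -> Vec<G::Item> {
    let mut pinned = pin!(generator);
    let mut items = Vec::new();
    // `as_mut` reborrows the pinned reference for each call.
    while let Some(item) = Generator::next(pinned.as_mut()) {
        items.push(item);
    }
    items
}
```

Because the caller pins the generator before the first call to next, an implementation is free to rely on never being moved afterwards, which is exactly the guarantee Iterator cannot give.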

The problem, though, is how to bridge from iterators to generators. For combinators, the solution is simple: Generator will gain a host of provided methods just like Iterator has. This requires duplicating some code in the standard library, but that doesn’t significantly impact users, just the maintainers of the standard library.

The bigger problem is what to do about for loops. Currently, a for loop can loop over any type that implements IntoIterator; how could we change it to support any type that implements IntoGenerator instead? A solution could be this implementation, which uses UnpinCell:

impl<T: IntoIterator> IntoGenerator for T {
    type Item = T::Item;
    type IntoGen = UnpinCell<T::IntoIter>;

    fn into_gen(self) -> UnpinCell<T::IntoIter> {
        UnpinCell::new(self.into_iter())
    }
}

impl<T: Iterator> Generator for UnpinCell<T> {
    type Item = T::Item;
    
    fn next(&pin mut self) -> Option<T::Item> {
        self.inner.next()
    }
}

By wrapping the iterator in an UnpinCell, users can call its next method, which expects an unpinned mutable reference, through the pinned reference provided to Generator, which makes it easy to bridge IntoIterator to IntoGenerator. The desugaring of a for loop would be changed from:

let mut iter = IntoIterator::into_iter($collection);
while let Some($elem) = Iterator::next(&mut iter) {
    $body
}

to:

let pin mut gen = IntoGenerator::into_gen($collection);
while let Some($elem) = Generator::next(&pin mut gen) {
    $body
}

This would be backward compatible, because all IntoIterator types are automatically IntoGenerator types thanks to this bridge implementation. But now, for loops can process self-referential generators.
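To make the bridge concrete, here is a rough stable-Rust sketch of it. It assumes UnpinCell is a plain wrapper with an unconditional Unpin impl (my guess at its minimal shape, not a settled design), uses Pin<&mut Self> in place of &pin mut self, and writes out the new for loop desugaring by hand in a hypothetical function called sum_with_generator_loop.

```rust
use std::pin::{pin, Pin};

// Stable-Rust stand-in for the proposed trait.
trait Generator {
    type Item;

    fn next(self: Pin<&mut Self>) -> Option<Self::Item>;
}

// Assumed shape of `UnpinCell`: a wrapper that is `Unpin` no matter
// what it contains.
struct UnpinCell<T> {
    inner: T,
}

impl<T> UnpinCell<T> {
    fn new(inner: T) -> Self {
        UnpinCell { inner }
    }
}

// The unconditional `Unpin` impl is what lets a pinned reference to the
// cell be turned back into an ordinary mutable reference.
impl<T> Unpin for UnpinCell<T> {}

impl<T: Iterator> Generator for UnpinCell<T> {
    type Item = T::Item;

    fn next(self: Pin<&mut Self>) -> Option<T::Item> {
        // Allowed in safe code because `UnpinCell<T>: Unpin`.
        self.get_mut().inner.next()
    }
}

// Hand-written version of the proposed desugaring, applied to an
// ordinary `IntoIterator` collection.
fn sum_with_generator_loop<C>(collection: C) -> u64
where
    C: IntoIterator<Item = u64>,
{
    let mut generator = pin!(UnpinCell::new(collection.into_iter()));
    let mut total = 0;
    while let Some(elem) = Generator::next(generator.as_mut()) {
        total += elem;
    }
    total
}
```

The loop body never sees the pin: an existing iterator is wrapped, pinned, and driven through Generator::next without any change to the iterator itself.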

Problem 1: Coherence

There’s one problem: the set of impls I just provided are not coherent. Specifically, these two impls are overlapping:

impl<T: Generator> IntoGenerator for T {
    type Item = T::Item;
    type IntoGen = T;

    fn into_gen(self) -> T {
        self
    }
}

impl<T: IntoIterator> IntoGenerator for T {
    type Item = T::Item;
    type IntoGen = UnpinCell<T::IntoIter>;

    fn into_gen(self) -> UnpinCell<T::IntoIter> {
        UnpinCell::new(self.into_iter())
    }
}

What about a type that implements both Generator and IntoIterator? Both impls would apply to it, so the compiler rejects the pair as overlapping. Some way of resolving this is needed.

One solution would be to expand the unstable negative impl feature (or use a special case in the compiler) to make it illegal to implement both IntoIterator and Generator for the same type:

impl<T: Generator> !IntoIterator for T { }
impl<T: IntoIterator> !Generator for T { }

This is probably the easiest and most realistic way to solve the problem. Allowing both impls on the same type and directing the IntoGenerator impl to use the right one seems like it would run into the same lifetime non-parametricity issues that have blocked specialization.

Problem 2: Reverse bridging

However, this leaves the problem of reverse bridging: how do you pass a generator to an interface that expects an Iterator or IntoIterator? There are plenty of such interfaces in the ecosystem.

In general, it is not sound to treat a generator as an iterator, because an iterator can be moved between iterations, and generators can be self-referential. Therefore, the generator would need to be pinned in place before treating it that way, either to the stack or to the heap:

impl<T: Generator + ?Sized> Iterator for &pin mut T {
    type Item = T::Item;

    fn next(&mut self) -> Option<T::Item> {
        (*self).next()
    }
}

impl<T: Generator + ?Sized> Iterator for Pin<Box<T>> {
    type Item = T::Item;

    fn next(&mut self) -> Option<T::Item> {
        (*self).next()
    }
}

But if a generator implements Unpin, it actually would be fine for it to implement Iterator without pinning it. You might want to add, for example:

impl<T: Generator + Unpin + ?Sized> Iterator for T {
    type Item = T::Item;

    fn next(&mut self) -> Option<T::Item> {
        self.next()
    }
}

But this would cause T: Generator + Unpin to implement IntoIterator (via the blanket impl for iterators), which would contradict the solution for coherence discussed above: that generators can never implement IntoIterator.

The only solution that comes to mind (other than somehow enabling enough specialization that all of this can just be made coherent without user-facing difficulty) would be a wrapper type that would have to be specifically called by users in the case that they want to pass a generator to an interface expecting an iterator. This isn’t great, but the user-facing impact is now restricted to users trying to pass new generators to old interfaces using iterators.

When possible, these old interfaces can be backward compatibly upgraded to take IntoGenerator instead of IntoIterator (as long as they don’t move the iterator around while iterating through it, which they usually don’t). If a user is using a library which hasn’t upgraded, they can either use an indirection (if their generator is not Unpin) or a wrapper type (possibly provided by std).
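As a rough illustration of those two escape hatches, here is a stable-Rust sketch. PinnedIter and UnpinIter are hypothetical wrapper names I am making up here, Evens is a toy generator, and legacy_sum stands in for an old iterator-based interface; Pin<&mut Self> again replaces &pin mut self.

```rust
use std::pin::Pin;

// Stable-Rust stand-in for the proposed trait.
trait Generator {
    type Item;

    fn next(self: Pin<&mut Self>) -> Option<Self::Item>;
}

// Hypothetical wrapper using indirection: the generator is pinned on the
// heap, so moving the wrapper never moves the generator itself.
struct PinnedIter<G>(Pin<Box<G>>);

impl<G: Generator> PinnedIter<G> {
    fn new(generator: G) -> Self {
        PinnedIter(Box::pin(generator))
    }
}

impl<G: Generator> Iterator for PinnedIter<G> {
    type Item = G::Item;

    fn next(&mut self) -> Option<G::Item> {
        Generator::next(self.0.as_mut())
    }
}

// Hypothetical wrapper for generators that are `Unpin`: no allocation,
// because such a generator can be re-pinned on every call.
struct UnpinIter<G>(G);

impl<G: Generator + Unpin> Iterator for UnpinIter<G> {
    type Item = G::Item;

    fn next(&mut self) -> Option<G::Item> {
        Generator::next(Pin::new(&mut self.0))
    }
}

// Toy generator yielding even numbers below `limit`. It happens to be
// `Unpin`, so it works with either wrapper.
struct Evens {
    next_value: u32,
    limit: u32,
}

impl Generator for Evens {
    type Item = u32;

    fn next(self: Pin<&mut Self>) -> Option<u32> {
        let this = self.get_mut();
        if this.next_value >= this.limit {
            return None;
        }
        let value = this.next_value;
        this.next_value += 2;
        Some(value)
    }
}

// Stand-in for an existing interface that only knows about `Iterator`.
fn legacy_sum(iter: impl Iterator<Item = u32>) -> u32 {
    iter.sum()
}
```

Calling legacy_sum(PinnedIter::new(generator)) works for any generator at the cost of a heap allocation, while UnpinIter(generator) avoids the allocation but only compiles when the generator is Unpin.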

In my view, the negative user impact (transitioning users from the unpinned interface to the pinned interface, plus a bit of difficulty passing new-style generators to outdated libraries) seems tolerably minor, especially compared to contrasting proposals that involve a dramatic shift in how Rust handles mutability, ownership, async, iteration, etc. As a user, I would like to see the Rust project focus its attention on shipping incremental improvements that round out the user experience of the language as it exists.