Shifgrethor III: Rooting

After the digression in the previous post, it’s time to get back to what I promised in the first post: a look at how shifgrethor handles rooting. Shifgrethor’s solution is somewhat novel and takes advantage of some of Rust’s specific features, so I want to start by looking briefly at some of the other options.

How to root a GC’d object

There are two broad categories of rooting strategies that are common among precise, tracing garbage collectors:

  • Stack Maps: In this implementation, the compiler generates metadata about each stack frame to let the collector know where the Gc’d pointers in that stack frame are. The collector walks the stack using that metadata (the stack map) to identify all the active roots.
  • Runtime collections: In this implementation, the runtime maintains a global collection of active roots. For historical reasons, most Rust implementations following this pattern have used an intrusive doubly linked list. The collector uses this collection to find all of the active roots.

Stack maps are usually considered the better and more first class approach, but Rust runs into some problems here. First of all, a library can’t implement stack maps (so that would require language support). Second of all, stack maps can only map the stack, not the heap (the layout of the heap is not known at compile time). Most languages that use stack maps just don’t have an unmanaged heap, so this isn’t a problem.

The use of a runtime collection also presents challenges. Essentially, this collection want to “borrow” the Gc pointer for the entire time the Gc pointer is alive, which would mean you cannot move the pointer around. If it doesn’t, how do you make sure that the collection’s tracking of the root gets updated when you move it?

Felix went into some detail about this problem in his blog series several years ago if you’re interested in learning more. There’s a particular quote which I found quite intriguing when I read it again recently:

However, Rust today does not have good support for intrusive data structures (RFC Issue 417). The already-mentioned capability to move values freely, as well as the capability to swap a pre-existing T with the value held beind a &mut T reference, are among the reasons that intrusive structures are difficult today, since it changes the addresses associated with objects, and thus requires updating of the interior links.

Reading this paragraph in 2018, I had an important realization: Rust does have good support for intrusive data structures in 2018 thanks to the Pin API, as Ralf discussed in his blog post earlier this year. So when I started working on shifgrethor, this was the first thing I did: I implemented rooting as an intrusive, doubly linked list of pinned pointers into the GC heap. (That isn’t the implementation in shifgrethor now, but we’ll get to that later).

Designing a Gc API with Pin’d roots

The most obvious way to me to design the API for constructing a root was to copy the stack pinning API from the pin-utils crate. In order to create a pin pointing to the stack, you need to use a macro which generates the unsafe code correctly (using a tricky shadowing technique). When I applied this to my situation, I ended up with this macro for creating roots:

// root has the type `Root<'root>`

The internal type of the Root struct is Pin<&'root RootInner>. In the initial implementation, RootInner was a node in the intrusive linked list, so it had these fields:

  • A previous and next pointer for the root list.
  • A pointer (initially null) to the GC’d object this root is supposed to be rooting.

Because you can only ever get a Pin’d reference to the root object, you can’t move it around and invalidate the root list. Now all we needed was an API for setting the third pointer into the GC so that this root can actually root something. This is why the Root type has a gc method, which enroots the object you pass to it in the GC heap:

// The `Foo` object will be put into the managed
// GC heap and rooted by this root:

The power of the 'root lifetime

This representation isn’t ideal, though: if you just access the Gc object through the Root<'root> type, you’ve got this double indirection: first, there’s the pinned reference to the root object on the stack, and then from there the reference into the heap. It would be better if we could just pass around the reference directly into the heap.

In fact, we can! As long as that pointer can’t outlive the root, we can pass around a pointer into the Gc heap separate from the one stored in the root. After all, it won’t be collected for as long as it’s rooted. So we made a small API change so that root.gc returns a new type, Gc<'root, T>, which is just a simple Copy pointer into the Gc’d heap:

let gc: Gc<'root, Foo> = root.gc(Foo::new());

This ended up being really useful for transitive traced objects in the Gc (that is, objects that aren’t themselves rooted, but are owned by another rooted object). We’ll look at those later in this series.

What we’ve done here is very interesting, though: using the 'root lifetime, we have associated references to the object (the Gc type) with the root type without actually tracking the association at runtime, leveraging Rust’s compile time lifetime checking.

(This has its limitations: since we’re no longer tracking every conceptual root, copying collectors which rewrite references when they move things around face additional challenges. Hopefully eventually we’ll get to the point in the series where we write about copying collectors specifically.)

Returning Gc’d objects and the limits of 'root

One problem emerges with this API. Consider a constructor like this:

fn new<'root>() -> Gc<'root, Self> {
    root.gc(Self {
        // ...

This introduces a very clear borrowing problem: the 'root lifetime is tied to a root you’ve declared inside of the constructor, but you’re trying to return a GC’d object, so your object lives longer than the root. Unfortunately, to return a Gc’d object, you have to pass the root into the function:

fn new<'root>(root: Root<'root>) -> Gc<'root, Self> {
    root.gc(Self {
        // ...

This is less convenient than types like Box and Rc, which have straightforward constructors. This is why garbage collecting things just for the convenience seems pretty unlikely. What garbage collection allows you to do, even with this inconvenient API, is cheaply and dynamically decide to extend the lifetime of objects at runtime. All you have to do is reroot them into an outer stack frame, which might be inconvenient but isn’t very expensive.

Going even futher with 'root

Looking at the API again, I noticed a few things:

  • First, we’re no longer rooting every single pointer from the unmanaged memory into the GC’d memory; all we’re doing is making sure there is one rooted pointer for as long as there are any pointers at all.
  • Second, these rooted pointers actually end up forming a perfect stack.

Because of the API, we can guarantee that the roots drop in exactly a lexical, stack-like ordering: the inverse of their construction ordering. This is because they are created on the stack and then never moved:

// No matter what you do, at the end of this scope, `root2` will be dropped and
// then root1 be dropped. And nested scopes work the same way.

Looking at this, it occurred to me: what if we just stopped tracking roots in the stack at all? Instead of this doubly linked list of roots, we could actually use a proper global stack. Every time you create a root, you push a slot onto that stack, and every time the root gets dropped, the top of the stack gets popped. I call this second stack the “root stack.”

I think this is at least a marginal performance improvement (I haven’t actually measured) because the collector doesn’t have to bounce around your entire stack frame looking for Gc’d objects and growing the root stack is free memory-wise as long as it has, at some point, been at least this tall before. In any event its novel and interesting and worth trying.

To be clear, this is the behavior of the three phases of a root’s lifecycle:

  1. When you create a root, it pushes a null root onto the root stack. The index of that root is tracked in the object you created on the real stack.
  2. When you set what the root tracks with the gc method, we use the index we’re tracking to set what object is rooted in the rootstack.
  3. When the root you’ve created is dropped, it pops the top of the root stack, which is guaranteed to be the same root you created when the made the original root.


That’s a good overview of how shifgrethor’s rooting system works right now. In the future, I’m interested in thinking about other ways we could use the borrow checker to support rooting. For example, an alternative to stack maps which would better support the unmanaged heap would be to somehow generate information at compile-time about which objects are currently being borrowed. This requires language support, though, so its not about to be implemented in shifgrethor any time soon. I might write more about this idea later on.

In the next post, though, I’m going to turn to the second half of the problem of the mark phase: tracing. Now that we’ve rooted some objects in the stack, how do we trace from those to other objects in the GC heap; how do we make the “rootedness” property transitive?