Shifgrethor IV: Tracing

The post before this one covered how shifgrethor handles rooting: how we track for the garbage collector that this object is alive. That isn’t sufficient for implementing a tracing garbage collector though: the idea of a tracing garbage collector is that we can trace from rooted objects through all of the objects they reference. That way, instead of having to root everything you use, you can only root a few objects from which all of the live objects can be traced.

Tracing itself is not solved in a particular groundbreaking way in shifgrethor, but the problems that tracing introduces, and how shifgrethor solves them, are worth going into some detail about. But we’ll start with tracing itself and then turn our attention to the problems.

The Trace trait

The tracing algorithm works like this: starting at a root, visit every Gc object it references and mark that object so that it will not be collected. This algorithm is an instance of the visitor pattern, which Rust has a well-known manner of encoding: a trait and a derive for that trait which calls the trait’s method recursively on all fields. serde’s traits, for example, are an instance of this pattern.

For that reason, shifgrethor introduces the Trace trait:

pub unsafe trait Trace {
    fn mark(&self);
    // ...
}

In order to put something into a Gc, you must derive Trace for it, and all of its fields must also implement Trace. Types which don’t contain any GC pointers have a very trivial trace implementation: a noop. The compiler, after aggressively inlining all of these function calls, should be able to reduce the Trace::mark function to directly visiting all of the GC objects without unnecessary overhead.

You may notice that in the code sample I posted above, the Trace trait is an unsafe trait, making it unsafe to implement it by hand. This is because it is essential that your implementation of Trace function correctly. This is because implementing Trace is not the tricky part of the problem: validating that a GC object is properly traced is the tricky part.

Proving rootedness

We can refer to this validation of tracing as “proving rootedness” - e.g. proving that this GC object is rooted, or transitively rooted through a proper trace from a root. It’s essential that any time you dereference a Gc pointer, we have proof that the pointer is rooted.

We can’t just let you construct unrooted Gc pointers to wrap around your fields, because then we won’t know that you’ve properly rooted those pointers. You could go to dereference them and, because they weren’t traced properly, you’ll start reading from freed memory. We need to prove that whenever you have a Gc<'root, T>, T is being rooted or traced for at least the lifetime 'root.

The solution to this in shifgrethor is to introduce a new type, GcStore. A GcStore is a kind of gc pointer that does not implement Deref, so you can’t actually see into it. Instead, you have to prove that you have rooted any GcStore type by either:

  • Rooting it directly, using the Root API
  • Having a Gc<'root, (GcStore<'_, T>, ..)>, which proves that it is rooted transitively through the tracing implementation.

What this means, practically speaking, is that if you have a struct with a proper trace implementation, projecting from a Gc of that struct to a Gc of a field in a GcStore is safe. To provide this, the derive in shifgrethor provides an attribute which generators an accessor method.

(The derive is called GC, instead of Trace, because it also generates another trait as well as these accessors).

#[derive(GC)]
struct Foo<'root> {
    #[gc] bar: GcStore<'root, Bar>,
}

This generations a method called bar with the signature Gc<'root, Foo<'_>> -> Gc<'root, Bar>.

This is one of the core aspects of how shifgrethor works: in order to get a reference to something in the GC, you must always prove that you have rooted it for the lifetime of that reference. In order to achieve this, we thread this 'root lifetime parameter through all our code using GC’d objects.

Conclusion

This handles tracing for the normal part of an object’s lifecycle: while its being managed by the GC and used. However, the story becomes a bit more complicated at two points in the lifecycle of a GC object: the beginning and the end. Next, we’ll look at construction & destruction: how they can be tricky, and what shifgrethor does about that.