Shifgrethor IV: Tracing
The post before this one covered how shifgrethor handles rooting: how we track for the garbage collector that this object is alive. That isn’t sufficient for implementing a tracing garbage collector though: the idea of a tracing garbage collector is that we can trace from rooted objects through all of the objects they reference. That way, instead of having to root everything you use, you can only root a few objects from which all of the live objects can be traced.
Tracing itself is not solved in a particular groundbreaking way in shifgrethor, but the problems that tracing introduces, and how shifgrethor solves them, are worth going into some detail about. But we’ll start with tracing itself and then turn our attention to the problems.
The Trace
trait
The tracing algorithm works like this: starting at a root, visit every Gc object it references and mark that object so that it will not be collected. This algorithm is an instance of the visitor pattern, which Rust has a well-known manner of encoding: a trait and a derive for that trait which calls the trait’s method recursively on all fields. serde’s traits, for example, are an instance of this pattern.
For that reason, shifgrethor introduces the Trace
trait:
pub unsafe trait Trace {
fn mark(&self);
// ...
}
In order to put something into a Gc, you must derive Trace
for it, and all of
its fields must also implement Trace
. Types which don’t contain any GC
pointers have a very trivial trace implementation: a noop. The compiler, after
aggressively inlining all of these function calls, should be able to reduce the
Trace::mark
function to directly visiting all of the GC objects without
unnecessary overhead.
You may notice that in the code sample I posted above, the Trace
trait is an
unsafe trait, making it unsafe to implement it by hand. This is because it is
essential that your implementation of Trace
function correctly. This is
because implementing Trace
is not the tricky part of the problem: validating
that a GC object is properly traced is the tricky part.
Proving rootedness
We can refer to this validation of tracing as “proving rootedness” - e.g.
proving that this GC object is rooted, or transitively rooted through a proper
trace from a root. It’s essential that any time you dereference a Gc
pointer,
we have proof that the pointer is rooted.
We can’t just let you construct unrooted Gc
pointers to wrap around your
fields, because then we won’t know that you’ve properly rooted those pointers.
You could go to dereference them and, because they weren’t traced properly,
you’ll start reading from freed memory. We need to prove that whenever you have
a Gc<'root, T>
, T
is being rooted or traced for at least the lifetime
'root
.
The solution to this in shifgrethor is to introduce a new type, GcStore
. A
GcStore
is a kind of gc pointer that does not implement Deref
, so you can’t
actually see into it. Instead, you have to prove that you have rooted any
GcStore
type by either:
- Rooting it directly, using the Root API
- Having a
Gc<'root, (GcStore<'_, T>, ..)>
, which proves that it is rooted transitively through the tracing implementation.
What this means, practically speaking, is that if you have a struct with a
proper trace implementation, projecting from a Gc
of that struct to a Gc
of
a field in a GcStore
is safe. To provide this, the derive in shifgrethor
provides an attribute which generators an accessor method.
(The derive is called GC
, instead of Trace
, because it also generates
another trait as well as these accessors).
#[derive(GC)]
struct Foo<'root> {
#[gc] bar: GcStore<'root, Bar>,
}
This generations a method called bar
with the signature Gc<'root, Foo<'_>> -> Gc<'root, Bar>
.
This is one of the core aspects of how shifgrethor works: in order to get a
reference to something in the GC, you must always prove that you have rooted it
for the lifetime of that reference. In order to achieve this, we thread this
'root
lifetime parameter through all our code using GC’d objects.
Conclusion
This handles tracing for the normal part of an object’s lifecycle: while its being managed by the GC and used. However, the story becomes a bit more complicated at two points in the lifecycle of a GC object: the beginning and the end. Next, we’ll look at construction & destruction: how they can be tricky, and what shifgrethor does about that.