Shifgrethor II: Notes on tracing garbage collectors

In the previous post I said that in the second post in the series we’d talk about how rooting works. However, as I sat down to write that post, I realized that it would be a good idea to back up and give an initial overview of how a tracing garbage collector works - and in particular, how the underlying garbage collector in shifgrethor is implemented.

In the abstract, we can think of the memory of a Rust program with garbage collection as being divided into three sections: the stack, the “unmanaged” heap, and the “managed” heap. The unmanaged heap is the part of the program memory a Rust programmer might normally think of as just the heap: its memory given to you by the allocator which expects you to call free, and in Rust you get access to it through safe APIs like Box and Vec that allocate when you create them and free when you drop them.

In contrast, the “managed” heap presents a similar allocation API, but it does not present an API to free memory: instead, the garbage collector is responsible for freeing the memory you are no longer using. This is why it’s called the “managed” heap, because that memory is managed for you.

You might think of the managed heap as a completely separate memory section from the unmanaged heap - a whole second heap, just like the stack and the unmanaged heap are very separated. But that’s not actually necessary: a library can implement a garbage collector by allocating into the unmanaged heap and then handling collection for you in an automated way. And this is exactly what shifgrethor does.

All of the data in the “managed heap” is actually stored in a linked list in the unmanaged heap. This is basically a bunch of Boxes each of which contains an object you’ve asked to be garbage collected as well as some header data. This header data includes (among other things), a pointer to the next node of the linked list, and a “mark” boolean.

During the garbage collection phase the program first marks every object that is alive (the “mark” phase). Then it visits every object that’s not been freed yet, and if they weren’t marked, it frees it (the “sweep” phase). This sweep phase is straightforward to implement: we walk the linked list of heap allocations, deleting nodes as we visit them if their “mark” boolean is set to false. It’s the “mark” phase that’s a bit trickier, because we need to look at the state of the program and figure out which objects are still alive.

The marking phase has two important components, each of which we will cover in turn in this series:

  • Rooting: The references into the managed heap from the unmanaged portion of your program (that is, the stack and the rest of the heap) must be “rooted,” so that we know where to start looking for living objects. There are a lot of different rooting strategies; the one used in shifgrethor is somewhat unique.
  • Tracing: Once you know where all the roots are, you also want to know where all the objects that are reachable from the rooted objects are. That is, you want to follow and mark all the references from gc’d object to gc’d object, starting at the roots. You can think of this as a kind of graph tracing algorithm.

Finally, you may have noticed some potential efficiency problems with the GC implementation I described above. In particular, a linked list of objects has some problems with - for example - memory locality: the objects in your GC won’t necessarily be close together, making the memory accesses between them poorly cached and more expensive. This is problem can be solved by “compacting” garbage collectors, which move live objects around so that they will be closer together. Compacting garbage collectors are one kind of “copying” garbage collector - a garbage collector which copies data around to improve its performance. Copying collectors introduce unique problems for Rust which I’ll address later in this series.

Alright, now that we’ve laid out this technical background, in the next post in the series we’ll look at rooting and how shifgrethor handles it.