Shifgrethor I: Garbage collection as a Rust library
I’m really excited to share with you an experiment that I’ve been working on for the past 5 or 6 weeks. It’s a Rust library called shifgrethor. shifgrethor implements a garbage collector in Rust with an API I believe to be properly memory safe.
I’ll be going through all of the technical details in future blog posts, so I want to kick this series off with a high level overview of the project’s purpose and design decisions.
Shifgrethor is a research project
The first thing I want to be clear about is that shifgrethor is not intended for practical use right now. The APIs have not been verified for safety enough to recommend using them yet, and the garbage collector implemented in shifgrethor is not optimized at all: it is a very simple mark-and-sweep collector. I do not recommend that you use it right now.
However, assuming shifgrethor’s API is safe - and I believe it is - it represents an exciting step forward in understanding how Rust’s language faculties can be used to encode complex memory management requirements in the safe API. That is to say, you can implement garbage collection as a library in Rust! That’s amazing! How far can we push that?
In the long term, shifgrethor may lead to a usable, performant garbage collector in Rust. However, the API I am presenting here is not simpler or more convenient to use than Rust’s normal APIs, for a variety of reasons. Its unlikely, then, that this project will result in garbage collection becoming a commonly used feature by Rust projects. That is, to be emphatic, I do not expect we will ever “add GC to Rust” as it exists in other languages, or that we will recommend people use it as an “escape hatch” around the borrow checker.
Rather, I expect that if shifgrethor finds a use case, it will be as a garbage collector that users can reach for when their data is actually better suited to garbage collection than automatic memory management: when they have complex graphs of data, possibly cyclic, definitely involving shared ownership, which they have to manage in a dynamic manner. That is, we can reach a state in which Rust can provide garbage collection when you really need it.
Shifgrethor is a precise, tracing garbage collector
There are many different kinds of garbage collectors, shifgrethor is designed around some specific constraints:
- It is a tracing garbage collector which allows cycles. That is, the collection algorithm traces what data is alive from the “roots” (unmanaged references to the managed data) and then cleans everything that isn’t alive. Unlike other strategies like reference counting, cyclically owned data will not be leaked by this kind of garbage collector.
- It is a precise garbage collector. That means that the garbage collector knows exactly what data is rooted, setting it apart from “conservative” collectors, which don’t know where their roots are and have to pessimistically assume that anything that looks like a root is one. Boehme is an example of a conservative collector. Shifgrethor is not like this: when the collector runs it knows exactly what is alive and what isn’t.
Shifgrethor allows for full freedom of references
Another thing that separates shifgrethor from other previous iterations on garbage collection libraries is that it is designed to allow arbitrary, direct references between the managed memory and unmanaged memory. That is to say that all of these kinds of configurations are safely enabled by shifgrethor’s API:
- You can dereference from a Gc pointer to a field of the object in the GC heap: you get a normal reference back.
- An object in the GC heap can hold borrowed references to objects not in the GC heap, whether they’re on the stack or in the unmanaged heap.
- An object in the GC heap can own unmanaged heap objects, like a
Vecfor example. The garbage collector properly traces through those types, so that - for example - you can have a GC’d object containing a vector of other GC’d objects.
- You can move
Gcpointers from the stack into the heap.
Gc type itself is a copyable pointer, just like a regular reference.
Its important to note that, because garbage collection is a kind of shared
ownership, we only provide immutable access to garbage collected objects (just
Rc type), so you need to use interior mutability to mutate them.
Having access direct references like this can be problematic for some garbage collector algorithms, especially so-called “copying” or “moving” collectors. These kinds of garbage collectors move objects around to optimize performance in one or more dimension: if you have a direct reference into the heap, it is unsafe to move that object while that reference is live. As a result, in order to comply with shifgrethor’s API, such a collector would need to implement pinning - the ability to dynamically mark an object as immovable for the time being.
Shifgrethor’s API is designed so that this would work successfully, but the current garbage collector backing shifgrethor does not ever move managed objects (it is a very basic mark and sweep implementation). The design is intended to be extensible to both copying collectors and concurrent collectors, and we’ll hopefully touch on the pitfalls this can introduce as the series goes on.
Hopefully this has given you a good idea of what shifgrethor is for and what it is capable of doing. All of the code is available on GitHub, as well as some more discussion of the API in the README. This series is going to continue with a deep dive into how the API of shifgrethor works & achieves safety, as well as some outstanding problems and shortcomings. I hope to conclude the series - eventually - with some of the broader lessons I’ve learned from this experience about the design of Rust and the design of APIs in Rust.
Before I go any further, I also want to thank everyone who has given me feedback on shifgrethor so far, and also to acknowledge the impact of prior work in this area. In particular, Nika Layzell, Aaron Turon and Nick Fitzgerald have all provided very helpful feedback at different points in the design. Prior work on GC in Rust that influenced shifgrethor include:
- this blog series by Felix Kloch
- rust-gc by Manish Goregaokar & Nika Layzell
- josephine by Alan Jeffrey
- cell-gc by Jason Orendorff
In the second post in this series, we’re going to take a deep dive into the problem of rooting a tracing garbage collector in Rust.