Unsafety in Rust is often discussed in terms of the primitive operations that can only be performed inside of unsafe blocks (such as dereferencing raw pointers and accessing mutable statics). I want to look at it from a different angle: instead of these primitive operations, I want to focus on the capability to produce unsafe abstractions.
The general concept of unsafe abstractions
An unsafe abstraction is a new abstraction which requires the unsafe keyword to apply to some context (this is an intentionally “abstract” definition, because as we will see there are several highly divergent forms of unsafe abstraction supported in Rust). The unsafe keyword is required to apply the abstraction because the abstraction introduces some invariant which cannot be type checked, and which the rest of the program is allowed to assume is maintained in order to preserve type safety.
To give a single concrete example, the slice::from_raw_parts function is an unsafe abstraction which allows users to create a slice from a raw pointer and a length. This function has several untyped invariants which must be maintained:
- The pointer must refer to an array of type T of at least the length of the second argument.
- The array must be valid to dereference into for the entire lifetime of the returned slice.
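As a minimal sketch of what upholding those invariants looks like at a call site, here is a safe wrapper around slice::from_raw_parts. The prefix function is a hypothetical helper of my own, not something from the standard library; it checks the length up front and borrows from an existing slice, so both invariants hold by construction.

```rust
use std::slice;

/// Returns the first `len` elements of `data` by round-tripping through a
/// raw pointer. `prefix` is a hypothetical helper used only for illustration.
///
/// Panics if `len > data.len()`.
fn prefix(data: &[u32], len: usize) -> &[u32] {
    assert!(len <= data.len(), "len must not exceed the slice length");
    let ptr = data.as_ptr();
    // SAFETY: `ptr` points to at least `len` initialized `u32` values
    // (checked above), and the returned slice borrows from `data`, so the
    // memory stays valid for the whole lifetime of the result.
    unsafe { slice::from_raw_parts(ptr, len) }
}
```

Without the assert, a caller could pass a length that is too large, and the unsafe block would no longer be upholding the first invariant.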
In any unsafe abstraction, there are always three components, which I am going to assign (possibly arbitrary) names to:
- The abstractive: A component which introduces a new untyped invariant.
- The applicative: A component which upholds that invariant.
- The assumptive: A component which is correct so long as that invariant has been correctly upheld.
It’s worth noting that the assumptive component can only rely on the applicative component upholding the particular invariant introduced by the abstractive component. You cannot just assume additional invariants will be upheld. If additional invariants are necessary, they can be introduced, but doing so is a breaking change to the API.
Finally, it is very important that the applicative component - in which the user asserts that they are upholding the invariant - involve the unsafe keyword. The assumptive component, in contrast, does not necessarily need to involve unsafe: it just assumes that an invariant is upheld, it does not claim to uphold any invariant.
Now I want to go through some of the kinds of unsafe abstractions that can be introduced in Rust.
Functions and inherent methods can be marked unsafe in Rust. The
slice::from_raw_parts function mentioned earlier is one
example, but there are many.
An unsafe function is broken up like this:
- The abstractive: The function signature, in using the unsafe keyword, introduces a new untyped invariant, which should be documented.
- The applicative: Any caller of the function applies that abstraction and guarantees that it upholds the invariant the function requires.
- The assumptive: The body of the function, and any other code that relies on state controlled by that function (e.g. something that uses its return value) assumes that the invariant is upheld by the function’s caller.
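To make the three roles concrete, here is a small sketch with each component labeled in comments. Both get_unchecked_demo and first_or_zero are hypothetical names invented for this illustration.

```rust
/// Returns the element at `index` without a bounds check.
///
/// # Safety
///
/// `index` must be less than `values.len()`.
// The abstractive: this signature and its documentation introduce the invariant.
unsafe fn get_unchecked_demo(values: &[i32], index: usize) -> i32 {
    // The assumptive: the body relies on the caller's promise.
    // SAFETY: the caller guarantees `index < values.len()`.
    unsafe { *values.as_ptr().add(index) }
}

/// Returns the first element, or 0 for an empty slice.
fn first_or_zero(values: &[i32]) -> i32 {
    if values.is_empty() {
        return 0;
    }
    // The applicative: this caller asserts, with `unsafe`, that it upholds
    // the invariant.
    // SAFETY: the slice is non-empty, so index 0 is in bounds.
    unsafe { get_unchecked_demo(values, 0) }
}
```

Note that first_or_zero is itself a safe function: it discharges the invariant locally, so its own callers have nothing to uphold.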
Another kind of abstraction which is quite different from function abstraction is the unsafe trait. Send, for example, is an unsafe trait.
Here the breakdown is quite different:
- The abstractive: The trait definition introduces an invariant which must be true of any implementation of this trait (for example, Send’s invariant is that the type it is implemented for can be passed between threads).
- The applicative: Any implementation of this trait must uphold the invariant introduced by that trait.
- The assumptive: Any time the bound T: Send is used, the assumption is made that the invariant - that T can be passed between threads - is upheld by every implementation of that trait.
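Send itself is hard to show in a few lines, so here is a sketch of the same breakdown using a hypothetical Zeroable trait (a name I am assuming for illustration; it is not part of the standard library). Its invariant is that the all-zero bit pattern is a valid value of the implementing type.

```rust
use std::mem;

/// Marker for types whose all-zero bit pattern is a valid value.
///
/// # Safety
///
/// Implementers must guarantee that a value of `Self` made entirely of zero
/// bytes is valid.
// The abstractive: the trait definition introduces the invariant.
unsafe trait Zeroable: Sized {}

// The applicative: each `unsafe impl` asserts the invariant for one type.
unsafe impl Zeroable for u32 {}
unsafe impl Zeroable for [u8; 4] {}

// The assumptive: generic code behind the `T: Zeroable` bound relies on it.
fn zeroed<T: Zeroable>() -> T {
    // SAFETY: `T: Zeroable` promises that all-zero bytes form a valid `T`.
    unsafe { mem::zeroed() }
}
```

Here the unsafe keyword sits on the implementations, and the generic function is safe to call from anywhere.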
Unsafe associated items (e.g. unsafe trait methods)
The most subtle form of unsafe abstraction is probably unsafe associated items.
This refers to methods and associated functions primarily. Within a trait, it
is possible to tag a particular function declaration as
unsafe. The breakdown
of the components of this unsafe abstraction is similar to unsafe functions:
- The abstractive: The function signature in the trait introduces a new untyped invariant.
- The applicative: Any caller of any instance of that trait function must uphold the invariant introduced by the trait definition.
- The assumptive: All implementers of that trait can assume that the invariant is upheld, but only the invariant introduced by the trait declaration. Implementations must not introduce new invariants.
The most surprising aspect of unsafe trait methods is the distance between introducing the invariant and relying on it. It might seem natural, when implementing an unsafe method, to think that you can introduce invariants of your own. But if you’re in a trait, this is not correct - in order for generic method calls to work, every implementation must rely on the same invariants, not new ones of their own.
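A sketch of this arrangement, using a hypothetical ByteSource trait that I am making up for illustration: the trait declaration fixes the invariant, the implementation relies on exactly that invariant, and a generic caller upholds it without knowing which implementation it is talking to.

```rust
trait ByteSource {
    /// Returns the byte at `index` without a bounds check.
    ///
    /// # Safety
    ///
    /// `index` must be less than `self.len()`.
    // The abstractive: the declaration in the trait introduces the invariant.
    unsafe fn byte_unchecked(&self, index: usize) -> u8;

    fn len(&self) -> usize;
}

impl ByteSource for Vec<u8> {
    // The assumptive: the implementation may rely on the trait's invariant,
    // and on nothing beyond it.
    unsafe fn byte_unchecked(&self, index: usize) -> u8 {
        // SAFETY: the caller guarantees `index < self.len()`.
        unsafe { *self.as_ptr().add(index) }
    }

    fn len(&self) -> usize {
        Vec::len(self)
    }
}

// The applicative: a generic caller upholds the invariant for whatever
// implementation `S` turns out to be.
fn sum_bytes<S: ByteSource>(source: &S) -> u32 {
    let mut total = 0u32;
    for i in 0..source.len() {
        // SAFETY: `i < source.len()` by construction of the range.
        total += u32::from(unsafe { source.byte_unchecked(i) });
    }
    total
}
```

If the Vec implementation quietly required something extra, say an even index, sum_bytes would have no way of knowing, which is why implementations must not add invariants of their own.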
Safe implementations of unsafe methods
What becomes even more frustrating, though, is when your particular
implementation doesn’t actually rely on the invariant that the trait has
introduced. What you’d like to be able to do here is drop the unsafe keyword,
asserting that your particular implementation is safe. Then, in a concrete
context, others can call this method from safe code without upholding the
invariant or using an
unsafe block. You’ll also get all the checking a safe
function would have in your implementation, helping you assure that your
implementation actually is safe.
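Here is a sketch of that frustration. ByteSource and Repeat are hypothetical names invented for this example. The implementation never relies on the trait's invariant, yet it is written with the same unsafe signature as the trait, and a caller that knows the concrete type still has to write an unsafe block.

```rust
trait ByteSource {
    /// # Safety
    ///
    /// `index` must be less than `self.len()`.
    unsafe fn byte_unchecked(&self, index: usize) -> u8;

    fn len(&self) -> usize;
}

/// A source that yields the same byte at every index.
struct Repeat {
    byte: u8,
    len: usize,
}

impl ByteSource for Repeat {
    // The body never uses `index`, so it does not rely on the invariant,
    // but the signature mirrors the trait's `unsafe`.
    unsafe fn byte_unchecked(&self, _index: usize) -> u8 {
        self.byte
    }

    fn len(&self) -> usize {
        self.len
    }
}

fn first_byte(source: &Repeat) -> u8 {
    // The concrete type is known here, but the call still needs `unsafe`.
    // SAFETY: `Repeat::byte_unchecked` ignores `index`, so any index is fine.
    unsafe { source.byte_unchecked(0) }
}
```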
This does not seem like a particularly difficult feature to add to the language - just allow safe implementations of unsafe methods - but it runs into the problem that some code in the wild is currently relying (incorrectly, in my opinion) on the requirement that every implementation be marked unsafe.
A particular example of what I mean is tokio’s AsyncRead trait. This trait has one unsafe method, but as it is clearly documented, that method is actually safe:
“This function isn’t actually unsafe to call but unsafe to implement. The implementer must ensure that either the whole buf has been zeroed or read_buf() overwrites the buffer without reading it and returns the correct value.”
Here, the division of the unsafe abstraction is somewhat different from what I outlined above:
- The abstractive: The method signature introduces an invariant.
- The applicative: Every implementation upholds that invariant.
- The assumptive: Every caller can assume that invariant is upheld.
Because the applicative is the implementation, it is not acceptable to ever allow safe implementations of this method.
Given the set of tools we have available, this actually most clearly follows
the pattern of an unsafe trait, not an unsafe method, and probably the unsafe
keyword “should” be moved to
unsafe trait AsyncRead instead of on this
particular method. However, this also works today (as long as we don’t
allow safe implementations of unsafe methods), so there currently isn’t an
impetus to change it.
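To sketch what the unsafe trait shape looks like for an “unsafe to implement, safe to call” contract, here is a simplified example. FillBuf, Ones, and read_four are hypothetical names of my own; this is not tokio’s API, only an illustration of where the keyword would sit.

```rust
use std::mem::MaybeUninit;

/// # Safety
///
/// `fill` must return some `n <= buf.len()`, must have initialized the first
/// `n` elements of `buf`, and must never read from `buf`.
// The abstractive: the trait definition introduces the invariant.
unsafe trait FillBuf {
    // A safe method: callers have nothing to uphold.
    fn fill(&mut self, buf: &mut [MaybeUninit<u8>]) -> usize;
}

struct Ones;

// The applicative: the implementer asserts, with `unsafe impl`, that the
// contract holds.
unsafe impl FillBuf for Ones {
    fn fill(&mut self, buf: &mut [MaybeUninit<u8>]) -> usize {
        for slot in buf.iter_mut() {
            slot.write(1);
        }
        buf.len()
    }
}

// The assumptive: the caller relies on the contract to read the buffer.
fn read_four<R: FillBuf>(reader: &mut R) -> Vec<u8> {
    let mut buf = [MaybeUninit::<u8>::uninit(); 4];
    let n = reader.fill(&mut buf);
    // SAFETY: `FillBuf` guarantees the first `n` elements are initialized.
    buf[..n].iter().map(|slot| unsafe { slot.assume_init() }).collect()
}
```

With this shape, calling fill never requires an unsafe block, and the obligation lands on whoever writes unsafe impl, which matches the division described above.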
It also is possible that there is a way to be more expressive in declaring the
components of this unsafe abstraction, while keeping the relatively simple
mental model that the current
unsafe keyword has (e.g. some way to say
clearly that this method is unsafe to implement, not unsafe to call).