Notes on a smaller Rust

July 17, 2019

Many people who use Rust for a bit - especially those who like the language but do not fall in love with it - feel a sense that there must be a smaller, simpler variation on the same theme which would maybe be a little less powerful, but would also be much easier to use. I agree with them, but I think they are almost always wrong about what would need to change. Here are some notes on where I would start to create that smaller Rust.

What makes Rust work

People almost always start in precisely the wrong place when they say how they would change Rust, because they almost always start by saying they would add garbage collection. This can only come from a place of naive confusion about what makes Rust work.

Rust works because it enables users to write in an imperative programming style, which is the mainstream style of programming that most users are familiar with, while avoiding to an impressive degree the kinds of bugs that imperative programming is notorious for. As I said once, pure functional programming is an ingenious trick to show you can code without mutation, but Rust is an even cleverer trick to show you can just have mutation.

Here are the necessary components of Rust to make imperative programming work as a paradigm. Shockingly few other production-ready imperative languages have the first of these, and none of them have the others at all (at least, none have them implemented correctly; C++ has unsafe analogs). Unsurprisingly, the common names for these concepts are all opaque nonsense:

“Algebraic data types”: Having both “product types” (in Rust structs) and “sum types” (in Rust enums) is crucial. The language must not have null, it must instead use an Option wrapper. It must have strong pattern matching and destructuring facilities, and never insert implicit crashing branches.
Resource acquisition is initialization: Objects should manage conceptual resources like file descriptors and sockets, and have destructors which clean up resource state when the object goes out of scope. It should be trivial to be confident the destructor will run when the object goes out of scope. This necessitates most of ownership, moving, and borrowing.
Aliasable XOR mutable: The default should be that values can be mutated only if they are not aliased, and there should be no way to introduce unsynchronized aliased mutation. However, the language should support mutating values. The only way to get this is the rest of ownership and borrowing, the distinction between borrows and mutable borrows and the aliasing rules between them.
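To make those three components concrete, here is a minimal sketch in today's Rust. All of the names (Shape, Connection, first_even, and so on) are made up for illustration, and the "connection" is just a counter standing in for a real resource like a socket.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Sum type: a Shape is exactly one of these variants.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

// Exhaustive match: adding a variant without handling it here is a compile error.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
    }
}

// No null: "nothing found" is an explicit Option the caller must handle.
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

// RAII: a hypothetical connection that registers itself while it is alive.
struct Connection<'a> {
    open_count: &'a AtomicUsize,
}

impl<'a> Connection<'a> {
    fn open(open_count: &'a AtomicUsize) -> Self {
        open_count.fetch_add(1, Ordering::SeqCst);
        Connection { open_count }
    }
}

impl Drop for Connection<'_> {
    // Runs when the value goes out of scope; there is no close() to forget.
    fn drop(&mut self) {
        self.open_count.fetch_sub(1, Ordering::SeqCst);
    }
}

// Returns the number of open connections inside and after the inner scope.
fn open_during_and_after() -> (usize, usize) {
    let open_count = AtomicUsize::new(0);
    let during = {
        let _conn = Connection::open(&open_count);
        open_count.load(Ordering::SeqCst)
    }; // _conn is dropped here
    (during, open_count.load(Ordering::SeqCst))
}

// Aliasable XOR mutable: many shared borrows, or one mutable borrow, never both.
fn aliasing_demo() -> Vec<i32> {
    let mut values = vec![1, 2, 3];
    let total = {
        let a = &values; // two shared borrows may coexist
        let b = &values;
        a.iter().sum::<i32>() + b.len() as i32
    };
    // The shared borrows have ended, so mutation is allowed again.
    for v in values.iter_mut() {
        *v *= 2;
    }
    values.push(total);
    // The compiler would reject the following, because `first` aliases
    // `values` while `push` mutates it:
    //     let first = &values[0];
    //     values.push(4);
    //     println!("{first}");
    values
}

fn main() {
    println!("circle area: {:.2}", area(&Shape::Circle { radius: 1.0 }));
    println!("first even: {:?}", first_even(&[1, 3, 4]));
    println!("open (during, after): {:?}", open_during_and_after());
    println!("values: {:?}", aliasing_demo());
}
```

None of this depends on where values are allocated, which is the point: these guarantees come from the type system, not from the systems-level control.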

In other words, the core, commonly identified “hard part” of Rust - ownership and borrowing - is essential to any attempt to make checking the correctness of an imperative program tractable. So trying to get rid of it would be missing the real insight of Rust, and not building on the foundations Rust has laid out.

However, once you get away from that, there is much about Rust that is circumstantial complexity, present only because of Rust’s extensive efforts to support a low-overhead, high-control systems programming use case.

How Rust could be simpler if it weren’t a “systems” language

The first thing I would do is abandon any guarantee about whether variables are allocated on the stack or the heap. Rust has to provide users with this sort of low level control to be viable in all the places that it is, but this is the singular source of complexity that doesn’t contribute to the benefits of Rust I described in the previous section. Like Go, the compiler will decide for you if your variables go on the stack or the heap.

The main way this simplifies things is it allows ownership to be detached from representation. An owned, borrowed, and mutably borrowed T would all share the same representation; they’re just type-level modifiers used to determine what kind of control you have (an owned T is just a mutably borrowed T that will drop at the end of this scope). Probably I would also add a fourth modifier which is shared ownership, probably implemented via garbage collection. These would have the same semantics as Arc, except that it would be possible to implement them so that they don’t leak on cycles.
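For anyone who hasn't hit the cycle problem, this is roughly what it looks like with Arc in today's Rust. The Node type and the drop counter are invented for the example; the sketch only shows the leak that a garbage-collected shared-ownership modifier could avoid, not how that modifier would be implemented.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

// A hypothetical node that can point at another node and counts its own drop.
struct Node {
    next: Mutex<Option<Arc<Node>>>,
    drops: Arc<AtomicUsize>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.fetch_add(1, Ordering::SeqCst);
    }
}

fn new_node(drops: &Arc<AtomicUsize>) -> Arc<Node> {
    Arc::new(Node {
        next: Mutex::new(None),
        drops: Arc::clone(drops),
    })
}

// Returns how many nodes were destroyed once both handles left scope.
fn drops_after_scope(make_cycle: bool) -> usize {
    let drops = Arc::new(AtomicUsize::new(0));
    {
        let a = new_node(&drops);
        let b = new_node(&drops);
        *a.next.lock().unwrap() = Some(Arc::clone(&b)); // a -> b
        if make_cycle {
            *b.next.lock().unwrap() = Some(Arc::clone(&a)); // b -> a
        }
    } // both handles go away here
    drops.load(Ordering::SeqCst)
}

fn main() {
    // Without a cycle, reference counting frees both nodes.
    println!("chain:  {} nodes dropped", drops_after_scope(false));
    // With a cycle, each node keeps the other alive and neither is freed.
    println!("cycle:  {} nodes dropped", drops_after_scope(true));
}
```

In Rust the usual fix is to break the cycle by hand with Weak; a tracing collector would make that unnecessary.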

This would not simplify ownership and borrowing in any fundamental way, but I believe it would eliminate a lot of the confusions people run into as they’re getting the hang of ownership and borrowing. For example, there could be only one string type.
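As a reminder of what that looks like today, here is a small sketch with hypothetical helper functions. The same text shows up under three different type spellings depending on who owns it and where it lives.

```rust
// Borrowing is enough for read-only work, so this takes &str.
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

// Growing the text needs ownership of a heap buffer, so this takes String.
fn with_suffix(mut text: String, suffix: &str) -> String {
    text.push_str(suffix);
    text
}

fn main() {
    let literal: &'static str = "a smaller rust"; // baked into the binary
    let owned: String = literal.to_string();      // heap-allocated, growable
    let borrowed: &str = &owned;                  // a view into the String

    println!("{} words", count_words(borrowed));
    println!("{}", with_suffix(owned, ", maybe"));
}
```

If representation were not tied to ownership, all three could be the same type with different modifiers.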

The next change would be to make trait objects the primary form of polymorphism. Every trait would be object safe, and casting a type into a trait it implements would be lightweight and easy because it can be heap-allocated by the compiler behind the scenes. No more monomorphization (at least, except as an optimization). Generics would exist only for creating polymorphic container types, not as the main way of making polymorphic functions.
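Here is the distinction in today's Rust, as a sketch with a made-up Greet trait. The generic function is monomorphized per type; the dyn versions go through a vtable, and the heterogeneous list only works in that style. In the smaller language the Box would presumably be implicit.

```rust
trait Greet {
    fn greet(&self) -> String;
}

struct English;
struct Spanish;

impl Greet for English {
    fn greet(&self) -> String {
        "hello".to_string()
    }
}

impl Greet for Spanish {
    fn greet(&self) -> String {
        "hola".to_string()
    }
}

// Static dispatch: the compiler generates one copy of this per concrete T.
fn greet_static<T: Greet>(greeter: &T) -> String {
    greeter.greet()
}

// Dynamic dispatch: one copy, with the method looked up through a vtable.
fn greet_dynamic(greeter: &dyn Greet) -> String {
    greeter.greet()
}

// A mixed collection is only expressible with trait objects.
fn greet_all(greeters: &[Box<dyn Greet>]) -> Vec<String> {
    greeters.iter().map(|g| g.greet()).collect()
}

fn main() {
    println!("{}", greet_static(&English));
    println!("{}", greet_dynamic(&Spanish));
    let everyone: Vec<Box<dyn Greet>> = vec![Box::new(English), Box::new(Spanish)];
    println!("{:?}", greet_all(&everyone));
}
```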

I would also take Rust’s commitment to concurrency-first and make all the available primitives threadsafe. No Rc, no Cell and RefCell. Interior mutability is only allowed through a mutex type, and everything can be moved across threads. Send and Sync would only exist as some built-in checks on ownership-related types.
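In today's Rust, the threadsafe-only subset already looks something like the sketch below, where the function name and parameters are invented for the example. The difference in the smaller language is that this would be the only way to share mutable state, with no single-threaded Rc or RefCell variants to choose between.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawns `threads` workers that each bump a shared counter `increments` times.
fn parallel_count(threads: usize, increments: usize) -> usize {
    // Shared ownership (Arc) plus interior mutability (Mutex), both threadsafe.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    // The lock is the only way to reach the value mutably.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total: {}", parallel_count(4, 1_000));
}
```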

I think closures and arrays/vecs/slices could also be simplified a great deal by not guaranteeing anything about where they’re allocated, though I haven’t worked out all the details.

Such a language would almost certainly also use green threads, and have a simple CSP/actor model like Go does. There’d be no reason to have “zero cost” futures and async/await like we do in Rust. This would impose a runtime, but that’s fine - we’re not creating a systems language. I already said earlier that we’d probably have garbage collection, after all.
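The programming model I have in mind is close to what you can already write with channels in Rust's standard library. This sketch uses OS threads rather than green threads, and the function is hypothetical, but the shape of the code (a worker that owns its input and communicates only by sending messages) is the same.

```rust
use std::sync::mpsc;
use std::thread;

// Sends each input to a worker over a channel and collects the squared results.
fn squares_via_channel(inputs: Vec<u64>) -> Vec<u64> {
    let (tx, rx) = mpsc::channel();

    // The worker takes ownership of both the inputs and the sending half.
    let worker = thread::spawn(move || {
        for n in inputs {
            tx.send(n * n).unwrap();
        }
        // tx is dropped here, which closes the channel.
    });

    // Iteration ends once the channel is closed and drained.
    let results: Vec<u64> = rx.iter().collect();
    worker.join().unwrap();
    results
}

fn main() {
    println!("{:?}", squares_via_channel(vec![1, 2, 3]));
}
```

Ownership is what makes this safe: once the inputs are moved into the worker, the caller can no longer touch them.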

And there are also many things I’m not sure about. I’m not sure that basing the interface polymorphism on Haskell type classes would be the best decision for this language (though I love it for Rust). I’m not sure what kind of metaprogramming facilities the language should have (I would not include macro_rules-style macros without a long iterative design process that ours did not get). I’m not sure what I would change about cargo - probably nothing fundamental, cargo is great.

But the biggest thing I’m unsure about - and the most controversial - is error handling. I would probably experiment with exceptions if I were making this language. I believe that all of the big problems with exceptions are solved by those core benefits of Rust I described initially, which provide strong guarantees that program state will be correctly cleaned up as a program crashes, and correctly restored as the crash is halted. I think I would at least make panicking the normal response to most IO failures. (Even with exceptions users could still use Result for the kinds of errors exceptions don’t make sense for).
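Rust's panics already behave this way, which is why I think exceptions could work. The sketch below uses std::panic::catch_unwind as a stand-in for a catch block; it is not meant as a general try/catch in real Rust, it assumes the default unwinding panic strategy, and the TempResource type is invented for the example. The point is that the destructor runs during unwinding, so state is cleaned up by the time the panic is caught.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// A hypothetical resource that records when it has been released.
struct TempResource<'a> {
    released: &'a AtomicBool,
}

impl Drop for TempResource<'_> {
    fn drop(&mut self) {
        self.released.store(true, Ordering::SeqCst);
    }
}

// Acquires the resource, then either panics or returns normally.
fn risky(released: &AtomicBool, fail: bool) -> u32 {
    let _resource = TempResource { released };
    if fail {
        panic!("simulated IO failure");
    }
    42
}

// Returns the result (None if it panicked) and whether cleanup happened.
fn run(fail: bool) -> (Option<u32>, bool) {
    let released = AtomicBool::new(false);
    let outcome = panic::catch_unwind(|| risky(&released, fail)).ok();
    (outcome, released.load(Ordering::SeqCst))
}

fn main() {
    println!("success path: {:?}", run(false));
    println!("failure path: {:?}", run(true));
}
```

Either way the resource is released; the only difference the caller sees is whether a value came back.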

I would expect this language to have performance characteristics similar to Java or Go, and to be viable for all the use cases they are (plus on the front end!) It would introduce some additional complexity in the form of ownership and borrowing, but would not be nearly the slog for new users that Rust currently is. And in exchange you would get a language which prevents a huge class of errors that no other imperative language except Rust helps you with today.

Finally, if I were creating a non-systems language today, even if I weren’t taking this advice, I would do this one important additional thing. I would be careful to design and implement the compiler so that it could be embedded in different runtimes, and I would have two primary targets: an LLVM-based one creating a standalone binary on the mainstream UNIX and Windows OSes, and a WASM target that would be intended to use the host VM’s runtime for threads, garbage collection, and so on. This is the best decision anyone creating a new production-ready language in 2019 could make if they want to see widespread adoption.

Oh and of course, I would implement this language and its runtime in Rust!