Lately, I have decided to jump in and learn Zig. Learn the whole thing. After a weekend of reading the whole reference, following some guides, doing [ziglings], and even starting to migrate some — WIP — Rust projects of mine just to learn properly, I think I’m now ready to start talking about Zig. More specifically, I want to talk about one particular aspect of Zig’s safety.
Disclaimer: this article expects some fluency with Zig to read smoothly. However, the examples and snippets will be explained for the uninitiated.
Problem
Zig has many interesting features and safety mechanisms. I won’t cover them here; it’s not the aim of this article, and I’m sure you’ll find that information on your own — there are plenty of articles talking about how well designed the language is.
No, instead, I want to talk about memory safety and static checks. And actually, it’s even more than memory safety. It’s more about correctness. Instead of taking the — traditional and boring — example of using an allocator (people using Rust, Python, C++, Java, etc. wouldn’t get the problem, because they do not have direct access to allocators), I want to show that the problem exists not only for memory allocations, but for a much more general category: resources. So let’s take an example:
const Storage = struct {
    // … hidden

    const Self = @This();

    pub fn create_resource(self: *Self) StorageError!Handle {
        // … hidden
    }

    pub fn destroy_resource(self: *Self, handle: Handle) StorageError!void {
        // … hidden
    }
};
Okay, so that is basically a Storage type with create_resource and destroy_resource methods. The former creates a new resource in the Storage and returns a Handle to it. The latter destroys a resource in the Storage, given its Handle.
You would use it like so:
// …
const h = try storage.create_resource();
// do something with the storage …
// … then later, at some point:
try storage.destroy_resource(h);
It’s not important whether you use defer here; the problem would be the same.
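For reference, the defer variant would look like the sketch below. Since destroy_resource returns an error union, the error has to be handled inline (defer cannot use try), and forgetting to write that single line still compiles just fine:

const h = try storage.create_resource();
// Runs at scope exit; the error is swallowed here for brevity.
defer storage.destroy_resource(h) catch {};
// do something with the storage …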
Now, what happens if you forget to call destroy_resource? Well, your code still compiles, even though it’s not correct. What happens if you call destroy_resource twice with the same handle? Probably something bad, but it still compiles.
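To make the second case concrete, here is a minimal sketch using the same Storage API; nothing below is rejected by the compiler:

const h = try storage.create_resource();
try storage.destroy_resource(h);
// Oops: h is now stale, yet this second call compiles anyway.
try storage.destroy_resource(h);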
The rest of the article assumes that we want to solve this problem at compile time, because yes, Zig will only catch the problem at runtime, probably (eventually?) in the form of a resource leak.
Some ideas
To solve this problem, two main approaches have historically been used:
- Introduce a garbage-collected form of resource management. Every function that may create a resource — allocate memory, open a file, etc. — wraps the result in a special pointer that the GC tracks. When no live reference to it exists anymore, the resource is eventually destroyed.
- Introduce destructors. This is the direction Rust took, for instance. The destructor allows you to run arbitrary code when a value is dropped.
The latter option is interesting, because it’s actually composed of two different features:
- The drop part, that is automatically inserted by the compiler when something goes out of scope.
- Linearity — the ability to recognize that a value should be used only once. Indeed, without it, you cannot know where and when to place the drop call.
If we take a step back, the drop part is actually not that good, because it hides implicitly called code from the programmer. It makes it harder to reason about when some code will be called — is this value of a type that implements a drop interface? If so, where does it go out of scope? And so on.
Moreover, the real problem is not really to automatically call the cleanup code. The real problem is to be sure that the cleanup code is called. Eh, the nuance is subtle, but I think it makes all the difference.
Ensuring instead of enforcing
What we really want is to be sure that the user calls destroy_resource. We should not need destructors for that. Let’s continue with our example:
const handle = try storage.create_resource();
The only way we can emit a compiler error here — because handle is not destroyed! — is to recognize that handle has some special — compilation-only — properties that need to be passed to destroy_resource. ATS has an answer to this — careful if you click the link; it’s a pretty complex language! That answer is called Programming with Theorem-Proving (PwTP). My idea here is not to take all of ATS but just the part that is interesting for our problem: proofs.
A proof is a compile-time (comptime) value that is returned by a function, typically during resource creation (allocation, opening files, etc.), and that is linear. A linear value is a value that must be consumed at least once and at most once — it must be consumed exactly once! Consider this:
fn do_something(x: *Foo) void {
    // …
}

const x = try alloc.create(Foo);
do_something(x);
If this program states that x is linear, after we pass the value to do_something, x is not available anymore. If you have done some Rust, you should be familiar with the concept: in Rust, we say that x was moved into the function.
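To make the linearity rule concrete, here is a hypothetical sketch of the error we are after (current Zig happily compiles this; the error only exists in the proposal):

const x = try alloc.create(Foo);
do_something(x); // x is consumed (moved) here.
do_something(x); // hypothetical compile error: x has already been consumed.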
Proof marking
Back to the proof thing. We can strip ATS proofs down to a much simpler proof system by doing the following:
- Define a proof as a set of tying variables, and a tag.
- Allow a function to return a value whose type can be marked with a proof.
- A value whose type is proof-marked becomes linear.
- Only a function taking as argument the same type marked with the exact same proof can consume such a value.
I know, that’s a lot of new ideas. I’ll introduce the syntax I have in mind to explain a bit more what I’m thinking. Let’s start with a regular alloc/destroy wrapper and then enhance it:
const std = @import("std");
const Allocator = std.mem.Allocator;

fn create(alloc: Allocator, comptime T: type) Allocator.Error!*T {
    const ptr = try alloc.create(T);
    return ptr;
}

fn destroy(alloc: Allocator, ptr: anytype) void {
    alloc.destroy(ptr);
}
Nothing fancy with this base code. Now, let’s introduce proof marking. The idea is that we don’t want it to get in the way too much. The syntax of a proof mark goes as follows:
- Full syntax: | [a, b, …, e]#tag.
- Inferred syntax: | #tag.
The | … syntax marks a type with a proof. [a, b, …, e] is a list of tying arguments, which ties the proof to those values. If it is omitted (hence using the second form), all of the function arguments are used to tie the proof. The #tag syntax declares a tag symbol in the current Zig file. That tag, along with the tying variables, uniquely identifies the proof.
Proofs can only be created in return position of a function. Let’s mark our create function with a proof:
fn create(alloc: Allocator, comptime T: type) Allocator.Error!*T | #create {
    const ptr = try alloc.create(T);
    return ptr;
}
As you can see, proofs only live at compile time, and the implementation has not changed. | #create marks the return value of create with a proof which is basically [alloc, comptime T]#create. Now, let’s create something:
test "linear allocations" {
const alloc = std.testing.allocator;
const ptr = try create(alloc, u32);
}
The type of ptr is *u32 | [alloc, u32]#create. Here, alloc is our testing allocator, and thus we tie the ptr variable to the alloc variable and to u32.
What we have here is what I would call a dangling proof. The type of ptr is still marked by a proof, and thus the compiler should emit an error stating that a dangling proof is left to be consumed. To solve the problem, we must consume the proof. Let’s modify destroy:
fn destroy(alloc: Allocator, ptr: anytype | [alloc, @typeInfo(@TypeOf(ptr)).Pointer.child]#create) void {
    alloc.destroy(ptr);
}
Okay, that’s a lot. Let’s scan through the signature. We can see that ptr has a proof-marked type. The proof ties alloc and, basically, the type of the pointee — because Zig’s comptime is so awesome, we can actually do all of that at compile time!
If you think this is too verbose, you can simply pass the type yourself:
fn destroy2(alloc: Allocator, comptime T: type, ptr: *T | [alloc, comptime T]#create) void {
    // …
}
We can then properly consume our linear ptr:
test "linear allocations" {
const alloc = std.testing.allocator;
const ptr = try create(alloc, u32);
// 1. either
destroy(alloc, ptr);
// 2. or this
destroy2(alloc, u32, ptr);
}
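And since a proof must be consumed exactly once, consuming it a second time should be rejected as well. A hypothetical sketch under the same rules:

destroy(alloc, ptr);
// Hypothetical compile error: the proof [alloc, u32]#create was already consumed above.
destroy2(alloc, u32, ptr);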
Some thoughts about marking
As you may have noticed, marking is done on the error union:
fn create(alloc: Allocator, comptime T: type) Allocator.Error!*T | #create {
And the question is: what happens if we return an error? I think that an error would invalidate the proof. However, if you don’t call try, the type is:
test "linear allocations" {
const alloc = std.testing.allocator;
const ptr: Allocator.Error!*T | [alloc, u32]#create = create(alloc, u32);
}
If ptr is actually an Allocator.Error, the proof is lost and considered consumed.
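In practice, this could mean that handling the error explicitly is enough to discharge the proof on the failure path. A hypothetical sketch under that rule (again, none of this is real Zig today):

const ptr = create(alloc, u32) catch |err| {
    // The allocation failed, so there is nothing to destroy:
    // the proof is considered consumed on this path.
    return err;
};
// On the success path, the proof is still dangling and must be consumed.
destroy(alloc, ptr);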
Conclusion
Such a change in the language would have tons of impact, and because of that, I think that nothing in the standard library should use the feature, to prevent breaking pretty much everyone — the Allocator interface is at the heart of the language and its ecosystem. However, I think that having the feature available would allow people to use it in their own wrappers, and, as mentioned before, not only for memory allocation. You can easily see how such a feature would be useful for files, database connections, and more.
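As a quick sketch of that generality, here is what a hypothetical file wrapper could look like with the syntax proposed above (the open and close wrappers are made up for illustration; only dir.openFile and file.close are real std.fs calls):

// Hypothetical wrapper: the returned File is proof-marked, so forgetting to
// call close() on it would be a compile-time "dangling proof" error.
fn open(dir: std.fs.Dir, sub_path: []const u8) std.fs.File.OpenError!std.fs.File | #open {
    return dir.openFile(sub_path, .{});
}

// Consumes the proof; the tying-argument list is left inferred for brevity.
fn close(file: std.fs.File | #open) void {
    file.close();
}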
I know that the Zig language will probably never accept a proposal like this, but given that comptime is already there, I think it’s super close to actually being able to provide PwTP.
Anyway, those were my ideas, and I will probably write more about Zig in the future.
Cheers!