Lately, I have decided to jump in and learn Zig. Learn the whole thing. After a weekend of reading the whole reference, following some guides, doing [ziglings], and even starting to migrate some — WIP — Rust projects of mine just to learn properly, I think I’m now ready to start talking about Zig. More specifically, I want to talk about one particular aspect of Zig’s safety.
Disclaimer: this article expects some fluency with Zig to read smoothly. However, the examples and snippets will be explained for the uninitiated.
Problem
Zig has many interesting features and safety mechanisms. I won’t cover them here; it’s not the aim of this article, and I’m sure you’ll find that information on your own — there are plenty of articles talking about how well designed the language is.
No, instead, I want to talk about memory safety and static checks. And actually, it’s even more than memory safety. It’s more about correctness. Instead of taking the — traditional and boring — example of using an allocator (people using Rust, Python, C++, Java, etc. wouldn’t get the problem, because they do not have direct access to allocators), I want to show that the problem exists not only for memory allocations, but for a much more general category: resources. So let’s take an example:
const Storage = struct {
    // … hidden

    const Self = @This();

    pub fn create_resource(self: *Self) StorageError!Handle {
        // … hidden
    }

    pub fn destroy_resource(self: *Self, handle: Handle) StorageError!void {
        // … hidden
    }
};
Okay, so that is basically a Storage type with create_resource and destroy_resource methods. The former creates a new resource in the Storage and returns a Handle to it. The latter destroys a resource in the Storage, given its Handle.
You would use it like so:
// …
const h = try storage.create_resource();
// do something with the storage …
// … then later, at some point:
try storage.destroy_resource(h);
It’s not important whether you use defer here; the problem would be the same.
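For reference, the defer variant would look like the sketch below. Since destroy_resource returns an error union, the error has to be handled inline (defer cannot use try), and forgetting to write that single line still compiles just fine:

const h = try storage.create_resource();
// Runs at scope exit; the error is swallowed here for brevity.
defer storage.destroy_resource(h) catch {};
// do something with the storage …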
Now, what happens if you forget to call destroy_resource? Well, your code still compiles, even though it’s not correct. What happens if you call destroy_resource twice with the same handle? Probably something bad, but it still compiles.
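To make the second case concrete, here is a minimal sketch using the same Storage API; nothing below is rejected by the compiler:

const h = try storage.create_resource();
try storage.destroy_resource(h);
// Oops: h is now stale, yet this second call compiles anyway.
try storage.destroy_resource(h);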
The rest of the article assumes that we want to solve this problem at compile time, because yes, Zig will only catch the problem at runtime, probably (eventually?) in the form of a resource leak.
Some ideas
To solve this problem, two main approaches have historically been used:
- Introduce a garbage-collected form of resource management. Every function that may create a resource — allocate memory, open a file, etc. — wraps the result in a special pointer that the GC tracks. When no live reference to it exists anymore, the resource is eventually destroyed.
- Introduce destructors. This is the direction Rust took, for instance. The destructor allows you to run arbitrary code when a value is dropped.
The latter option is interesting, because it’s actually composed of two different features:
- The drop part, that is automatically inserted by the compiler when something goes out of scope.
- Linearity — the ability to recognize that a value should be used only once. Indeed, without it, you cannot know where and when to place the drop call.
If we take a step back, the drop part is actually not that good, because it hides implicitly called code from the programmer. It makes it harder to reason about when some code will be called — is this value of a type that implements a drop interface? If so, where does it go out of scope? And so on.
Moreover, the real problem is not really to automatically call the cleanup code. The real problem is to be sure that the cleanup code is called. Eh, the nuance is subtle, but I think it makes all the difference.
Ensuring instead of enforcing
What we really want is to be sure that the user calls destroy_resource. We should not need destructors for that. Let’s continue with our example:
const handle = try storage.create_resource();
The only way we can emit a compiler error here — because handle is not destroyed! — is to recognize that handle has some special — compilation-only — properties that need to be passed to destroy_resource. ATS has an answer to this — careful if you click the link; it’s a pretty complex language! That answer is called Programming with Theorem-Proving (PwTP). My idea here is not to take all of ATS but just the part that is interesting for our problem: proofs.
A proof is a compile-time (comptime) value that is returned by a function, typically during resource creation (allocation, opening files, etc.), and that is linear. A linear value is a value that must be consumed at least once and at most once — it must be consumed exactly once! Consider this:
fn do_something(x: *Foo) void {
    // …
}

const x = try alloc.create(Foo);
do_something(x);
If this program states that x is linear, after we pass the value to do_something, x is not available anymore. If you have done some Rust, you should be familiar with the concept: in Rust, we say that x was moved into the function.
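To make the linearity rule concrete, here is a hypothetical sketch of the error we are after (current Zig happily compiles this; the error only exists in the proposal):

const x = try alloc.create(Foo);
do_something(x); // x is consumed (moved) here.
do_something(x); // hypothetical compile error: x has already been consumed.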
Proof marking
Back to the proof thing. We can strip ATS proofs down to a much simpler proof system by doing the following:
- Define a proof as a set of tying variables, and a tag.
- Allow a function to return a value whose type can be marked with a proof.
- A value whose type is proof-marked becomes linear.
- Only a function taking as argument the same type marked with the exact same proof can consume such a value.
I know, that’s a lot of new ideas. I’ll introduce the syntax I have in mind to explain a bit more what I’m thinking. Let’s start with a regular alloc/destroy wrapper and then enhance it:
const std = @import("std");
const Allocator = std.mem.Allocator;

fn create(alloc: Allocator, comptime T: type) Allocator.Error!*T {
    const ptr = try alloc.create(T);
    return ptr;
}

fn destroy(alloc: Allocator, ptr: anytype) void {
    alloc.destroy(ptr);
}
Nothing fancy with this base code. Now, let’s introduce proof marking. The idea is that we don’t want it to get in the way too much. The syntax of a proof mark goes as follows:
- Full syntax: | [a, b, …, e]#tag.
- Inferred syntax: | #tag.
The | … syntax marks a type with a proof. [a, b, …, e] is a list of tying arguments, which ties the proof to those values. If it is omitted (hence using the second form), all of the function arguments are used to tie the proof. The #tag syntax declares a tag symbol in the current Zig file. That tag, along with the tying variables, uniquely identifies the proof.
Proofs can only be created in return position of a function. Let’s mark our create function with a proof:
fn create(alloc: Allocator, comptime T: type) Allocator.Error!*T | #create {
    const ptr = try alloc.create(T);
    return ptr;
}
As you can see, proofs only live at compile time, and the implementation has not changed. | #create marks the return value of create with a proof which is basically [alloc, comptime T]#create. Now, let’s create something:
test "linear allocations" {
const alloc = std.testing.allocator;
const ptr = try create(alloc, u32);
}
The type of ptr is *u32 | [alloc, u32]#create. Here, alloc is our testing allocator, and thus we tie the ptr variable to the alloc variable and to u32.
What we have here is what I would call a dangling proof. The type of ptr is still marked by a proof, and thus the compiler should emit an error stating that a dangling proof is left to be consumed. To solve the problem, we must consume the proof. Let’s modify destroy:
fn destroy(alloc: Allocator, ptr: anytype | [alloc, @typeInfo(@TypeOf(ptr)).Pointer.child]#create) void {
    alloc.destroy(ptr);
}
Okay, that’s a lot. Let’s scan through the signature. We can see that ptr has a proof-marked type. The proof ties alloc and, basically, the type of the pointee — because Zig’s comptime is so awesome, we can actually do all of that at compile time!
If you think this is too verbose, you can simply pass the type yourself:
fn destroy2(alloc: Allocator, comptime T: type, ptr: *T | [alloc, comptime T]#create) void {
    // …
}
We can then properly consume our linear ptr:
test "linear allocations" {
const alloc = std.testing.allocator;
const ptr = try create(alloc, u32);
// 1. either
destroy(alloc, ptr);
// 2. or this
destroy2(alloc, u32, ptr);
}
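And since a proof must be consumed exactly once, consuming it a second time should be rejected as well. A hypothetical sketch under the same rules:

destroy(alloc, ptr);
// Hypothetical compile error: the proof [alloc, u32]#create was already consumed above.
destroy2(alloc, u32, ptr);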
Some thoughts about marking
As you may have noticed, marking is done on the error union:
fn create(alloc: Allocator, comptime T: type) Allocator.Error!*T | #create {
And the question is: what happens if we return an error? I think that an error would invalidate the proof. However, if you don’t call try, the type is:
test "linear allocations" {
const alloc = std.testing.allocator;
const ptr: Allocator.Error!*T | [alloc, u32]#create = create(alloc, u32);
}
If ptr is actually an Allocator.Error, the proof is lost and considered consumed.
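In practice, this could mean that handling the error explicitly is enough to discharge the proof on the failure path. A hypothetical sketch under that rule (again, none of this is real Zig today):

const ptr = create(alloc, u32) catch |err| {
    // The allocation failed, so there is nothing to destroy:
    // the proof is considered consumed on this path.
    return err;
};
// On the success path, the proof is still dangling and must be consumed.
destroy(alloc, ptr);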
Conclusion
Such a change in the language would have tons of impact, and because of that, I think that nothing in the standard library should use the feature, to prevent breaking pretty much everyone — the Allocator interface is at the heart of the language and its ecosystem. However, I think that having the feature available would allow people to use it in their own wrappers, and, as mentioned before, not only for memory allocation. You can easily see how such a feature would be useful for files, database connections, and more.
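As a quick sketch of that generality, here is what a hypothetical file wrapper could look like with the syntax proposed above (the open and close wrappers are made up for illustration; only dir.openFile and file.close are real std.fs calls):

// Hypothetical wrapper: the returned File is proof-marked, so forgetting to
// call close() on it would be a compile-time "dangling proof" error.
fn open(dir: std.fs.Dir, sub_path: []const u8) std.fs.File.OpenError!std.fs.File | #open {
    return dir.openFile(sub_path, .{});
}

// Consumes the proof; the tying-argument list is left inferred for brevity.
fn close(file: std.fs.File | #open) void {
    file.close();
}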
I know that the Zig language will probably never accept a proposal like this, but given that comptime is already there, I think it’s super close to actually being able to provide PwTP.
Anyway, those were my ideas, and I will probably write more about Zig in the future.
Cheers!