This article is a sequel to Let’s talk about C++ constructors. Even though not required, reading that article first might give you more information on the motivation and my state of mind regarding C++ and software development.
This time, I want to talk about C++ exceptions. But talking about exceptions for the sake of talking about them has very little value. Let’s instead talk about error handling.
Infallible pure functions
Imagine a function that takes some input, like a String, and outputs something else, like the length of the string. Such a function is pure: its output depends only on its arguments. It cannot have any side effect nor return something else. It is also infallible, as it can return a valid length for every possible value it can be called with. Imagine the following C++ implementation:
size_t get_len(std::string const & s) {
    return s.length();
}
Even though C++ doesn’t have the concept of purity, let’s assume we can express it, just because it’s a powerful concept that we can apply to any language. The compiler will not know about it, but the way we design things can reflect it.
That function is pure and infallible. When you look at the signature, you see that it takes a read-only reference to a std::string and returns a size_t. And yes, I know about noexcept. However, this specifier is not checked at compile time, so it cannot be trusted. Consider:
#include <stdexcept> // std::runtime_error
#include <string>
void foo() {
    throw std::runtime_error("That should be seen by the compiler, right?");
}
// The compiler accepts noexcept here even though foo() may throw.
size_t get_len(std::string const & s) noexcept {
    foo();
    return s.length();
}
int main() {
    std::string foo = "foo";
    get_len(foo);
    return 0;
}
Compile with (I’m running Arch Linux):
g++ -std=c++2a -Wall -pedantic a.cpp
No compiler warning. Run it:
terminate called after throwing an instance of 'std::runtime_error'
what(): That should be seen by the compiler, right?
zsh: abort (core dumped) ./a.out
What that little snippet tells us is that even though get_len is annotated with noexcept, an exception can still be thrown inside it. Not directly in its body, but by the functions it calls. The exception never reaches the caller: the runtime calls std::terminate instead, which is the abort you see above. When I was introduced to that keyword, years ago, I was suspicious. Since C++ will throw exceptions for (erm) exceptional errors, such as out-of-memory errors, even a noexcept function can still have an exception thrown inside it. Because of that, noexcept cannot propagate downwards in your call stack. If A is noexcept and B may throw, then calling B from A is valid C++.
noexcept is mostly a documentation keyword and an opportunity for your compiler to optimize your code; the only enforcement is the call to std::terminate at runtime. Your compiler can only emit warnings if you use the throw keyword directly in the body of the function that is marked noexcept. Also, it’s important to notice that, depending on the team you work in or the project you work on, you might not see noexcept used at all. It’s all about convention; your compiler will not enforce it, which is a problem to me. It’s a problem because more freedom means more opportunities for people to make mistakes, such as forgetting to annotate a function with noexcept. Or, worse, it gives opportunities to people who just don’t care and want to rush, making reviewing their code more challenging than needed.
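One partial mitigation, sketched here as an illustration rather than a recommendation: the noexcept operator (distinct from the noexcept specifier) reports at compile time whether an expression is declared non-throwing, so a static_assert can at least refuse to compile when a noexcept function calls something that is not marked noexcept. The helpers may_throw and never_throws are hypothetical names introduced for the example. The check only looks at declarations, so it relies on every callee being annotated honestly.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers, declared only: the noexcept operator never evaluates
// its operand, it just inspects how the called function is declared.
void may_throw();
void never_throws() noexcept;

std::size_t get_len(std::string const & s) noexcept {
    // Compiles: std::string::length() is declared noexcept.
    static_assert(noexcept(s.length()), "length() must not throw");

    // Uncommenting the next line is a compile error, because may_throw()
    // is not declared noexcept:
    // static_assert(noexcept(may_throw()), "may_throw() is not noexcept");

    return s.length();
}

// The operator yields an ordinary compile-time boolean.
constexpr bool helper_is_safe = noexcept(never_throws());  // true
constexpr bool other_is_safe  = noexcept(may_throw());     // false
```

This has to be written by hand at every call site, which is exactly the kind of convention the compiler will not enforce for you.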
Fallible functions
In my previous article, I was criticized for not explaining well enough what I mean by exceptional errors, and for making a highly subjective point. I’ll try to explain more in this section.
Imagine that you want to implement a function that performs a lookup. If it finds the key you give as argument, it returns the associated object. However, what should it do when the key is not there? If you have read my previous article, you know that I would use a sum type to encode the error in the type system. But let’s do it the C++ way. Let’s use exceptions.
Object lookup(Key const & key);
If you look at that signature, you’ll notice an important point. There is no error handling annotation. Most of the time, people will follow some guidelines to put that information in the documentation directly. However, several points:
- What happens if a teammate of yours or even yourself – after some weeks / months – forgets about that documentation line?
- Because you might call that function from anywhere, even in noexcept code, as demonstrated in the previous section, how do you know, when reading the code, that a call to this function can throw?
- The last point is especially true when refactoring. Imagine that the call to this function sits inside a try catch block. Do you treat the whole block as atomic? If so, do you look up the documentation of all the functions called in that block? What happens if you move that call out of the block?
Now, that just assumes a flat block. But it’s easy to guess that you will have to do that for the whole stack of functions being called. And finding a noexcept function in that stack doesn’t let you stop: you have to go on, since a throw might be hidden in a function deeper in the stack.
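To make the refactoring hazard concrete, here is a minimal sketch. Key, Object, the table and both callers are hypothetical stand-ins; lookup is assumed to throw std::out_of_range on a missing key, which is what std::map::at does. The two callers read the same at the call site, and nothing in any signature says that one of them handles the failure while the other lets it escape.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical types standing in for the article's Key and Object.
using Key = std::string;
struct Object { int value; };

// Same shape as the signature above: nothing says "this can fail".
Object lookup(Key const & key) {
    static std::map<Key, Object> const table = {{"answer", Object{42}}};
    return table.at(key);  // std::map::at throws std::out_of_range on a missing key
}

// The call sits inside a try block: a missing key is handled here.
int value_or_zero(Key const & key) {
    try {
        return lookup(key).value;
    } catch (std::out_of_range const &) {
        return 0;
    }
}

// After a refactoring that moved the call out of the try block, the call
// itself reads exactly the same, but the exception now escapes to the caller.
int value_unchecked(Key const & key) {
    return lookup(key).value;
}
```

Both callers compile without a warning; the difference only shows up at runtime, on the first missing key.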
Most of the time, the replies I get on that topic are one of the following:
- “Just read the documentation”. That argument completely ignores the last part of my point above: do you really read the documentation of all the functions being directly or transitively called in a function? That seems insane. Also, you might argue that the documentation can say, at a function level N, that there is a throw at level N - k. However, that seems like an impossible task to maintain. You might forget to update the documentation if you stop throwing that exception or throw another object with a different type, etc.
- “Use better names. If you use a function that, in its name, expresses the idea that it might fail, it’s easier to refactor / use the function correctly”. That is true, even though you will always find someone abusing it and using it without a try catch block.
- “We don’t care about handling errors: we will just handle them at the top-level of the program / function call stack with a single try catch block”. That argument doesn’t survive five seconds as soon as you talk about, for instance, serialization or map lookups.
About the documentation and naming… it adds another problem: humans. We are fallible. You might work on a project that doesn’t document correctly. Or that doesn’t even have a proper convention. Or that has several. When considering exceptions for error handling, I think it’s important to imagine what will happen in X months, after the codebase has become complex and large, with a lot of edge cases and possible errors. Maintainability should be a goal. No one enjoys having to read through the bodies of twenty functions to understand why their program crashed or why the GUI displays a pop-up with the content of an exception.
The sooner you can see an error, the better. If that sooner can be “compile-time”… why would you want to still push the error to the runtime? There are things I will never understand.
On the other side, consider sum types:
std::optional<Object> lookup(Key const & key);
Even though it’s still pretty bad to me because of how std::optional is made (have a look at my previous article), it has the advantage of being typed. No documentation is needed to know that the function can fail (but please do document what can fail), and your compiler can prevent you from doing some bad things, such as using the result as if it were a plain Object. Of course, this is limited by how you use the std::optional, as C++ doesn’t have pattern-matching.
But I would like to reply to an argument I hear every now and then: a better solution should not be discarded in favor of the legacy one just because it is not perfect. Imagine that you have a tool X with several issues, {a, b, c, d}. Now imagine we suggest switching to a new tool, Y, with issues {a, d} only. Yes, you still have two issues… but you have fewer. In the case of exceptions vs. the type system in C++, yes, you can still call .value() on an empty std::optional and crash your program (it throws std::bad_optional_access, which terminates the program if nobody catches it). But you don’t have the problem of hidden and untracked error handling. You can simply use exceptions for exceptional cases, that is, cases that are neither function boundaries nor edge cases. And yes, I do think that most of the standard C++ exceptions, such as std::invalid_argument, are to be completely avoided.
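A minimal sketch of the std::optional version follows. Key, Object, the table and the two callers are hypothetical stand-ins chosen for the example. The possibility of failure is now part of the return type, so the caller has to unwrap the value somehow; the sketch shows an explicit check and value_or.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical types standing in for the article's Key and Object.
using Key = std::string;
struct Object { int value; };

// A missing key is an ordinary return value, visible in the signature.
std::optional<Object> lookup(Key const & key) {
    static std::map<Key, Object> const table = {{"answer", Object{42}}};
    auto const it = table.find(key);
    if (it == table.end()) {
        return std::nullopt;
    }
    return it->second;
}

// Explicit check: the optional converts to true only when it holds a value.
int value_or_zero(Key const & key) {
    if (auto const found = lookup(key)) {
        return found->value;
    }
    return 0;
}

// value_or supplies a fallback Object when the lookup came back empty.
int value_or_default(Key const & key, int fallback) {
    return lookup(key).value_or(Object{fallback}).value;
}
```

Writing lookup(key).value directly does not compile, since std::optional<Object> has no such member. That is the small amount of help the type system gives here.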
But here, exceptions have an advantage if we stop there: they provide an error description.
Fully-typed failures
Fixing that problem is pretty trivial with a strong type-system and algebraic data types. We want to use those to create a result type, that can either be something, or a typed error, that would contain exactly the same information you have in a regular exception.
C++ doesn’t really have that out of the box but it could be made, in the same way std::variant exists. Imagine a hypothetical std::either type and let’s implement a function that parses something:
std::either<Something, ParseError> parse(std::string const & input);
That function returns either a successful object (Something) or an error (ParseError).
With that signature, it’s clear that the function can fail with ParseError. The point is that the caller must do something about it. If they don’t know what to do with it (imagine a parser combinator or some code that doesn’t know how to handle a parse error and requires its caller to handle the error), then the function needs to abort and propagate the error upwards. That looks a bit like the interface you have with exceptions… but here, the interface is visible at the type-level.
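Since std::either does not exist, here is a minimal sketch of the same idea built on std::variant (C++17). Something, ParseError, their fields, the ParseResult alias and the toy parsing rule (a non-empty string of decimal digits) are all assumptions made for the example. As an aside, C++23 standardized a type in this spirit, std::expected<T, E>; it is not used here so that the sketch stays within C++17.

```cpp
#include <cstddef>
#include <string>
#include <variant>

// Hypothetical success and error types.
struct Something { int number; };
struct ParseError {
    std::size_t position;  // index of the offending character
    std::string message;   // description, playing the role of what() on an exception
};

// Stand-in for the hypothetical std::either<Something, ParseError>.
using ParseResult = std::variant<Something, ParseError>;

// Toy parser: accepts a non-empty string of decimal digits.
// Integer overflow is ignored to keep the sketch short.
ParseResult parse(std::string const & input) {
    if (input.empty()) {
        return ParseError{0, "empty input"};
    }
    int number = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        char const c = input[i];
        if (c < '0' || c > '9') {
            return ParseError{i, "unexpected character"};
        }
        number = number * 10 + (c - '0');
    }
    return Something{number};
}

// The caller has to look inside the variant before it can use the value.
std::string describe(std::string const & input) {
    ParseResult const result = parse(input);
    if (auto const * error = std::get_if<ParseError>(&result)) {
        return "error at " + std::to_string(error->position) + ": " + error->message;
    }
    return "parsed " + std::to_string(std::get<Something>(result).number);
}
```

The error carries the same kind of description an exception would, but the fact that parse can fail is visible in its return type.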
Obviously, you cannot use throw to propagate upwards. You need to use return from your function. With either a macro or some language construct, C++ could make that propagation smoother, but currently, it doesn’t have a proper way to do it. So we’d be left with macros only, or manual propagation. Since C++ doesn’t have pattern-matching nor exhaustive patterns, it would be pretty hard to implement that mechanism in a completely sound way. As with std::optional, it’s not perfect, but it would be slightly better than using opaque exceptions.
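Manual propagation could look like the sketch below, again using std::variant as a stand-in for the hypothetical std::either. The Result alias, ParseError, Pair, and the two toy parsers are assumptions introduced for the example. The function parse_pair does not know how to recover from a bad digit, so it returns the error to its own caller instead of handling it.

```cpp
#include <string>
#include <variant>

// Hypothetical error type.
struct ParseError { std::string message; };

// Stand-in for the hypothetical std::either<T, ParseError>.
template <typename T>
using Result = std::variant<T, ParseError>;

// Toy leaf parser: a single decimal digit.
Result<int> parse_digit(char c) {
    if (c < '0' || c > '9') {
        return ParseError{std::string("not a digit: ") + c};
    }
    return c - '0';
}

struct Pair { int first; int second; };

// Parses exactly two digits, e.g. "42".
Result<Pair> parse_pair(std::string const & input) {
    if (input.size() != 2) {
        return ParseError{"expected exactly two characters"};
    }

    Result<int> const first = parse_digit(input[0]);
    if (auto const * error = std::get_if<ParseError>(&first)) {
        return *error;  // manual propagation: the role throw plays with exceptions
    }

    Result<int> const second = parse_digit(input[1]);
    if (auto const * error = std::get_if<ParseError>(&second)) {
        return *error;
    }

    return Pair{std::get<int>(first), std::get<int>(second)};
}
```

The check-and-return pair repeated after each call is the boilerplate a macro or a language construct would hide. Nothing forces the programmer to write it, which is the soundness gap mentioned above.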
One final point. Sometimes, I wonder what it would be like to just give up on my ideas of using a strong type system in C++. The language is using exceptions and people are used to it. That’s the C++ way. So… why not change the way exceptions work so that they’re not opaque, and propagate upwards? I remember the throw specifier, used to state what a function might throw. But again, it was not enforced by the compiler: a violation was only detected at runtime. Worse, it was deprecated in C++11 and removed in C++17, and its last remnant, throw(), was removed in C++20.
I deliberately omitted any reference to Haskell, Rust or any language like that so that people don’t think I’m trying to compare C++ to another language. I’m looking at C++ after almost two decades of using it, in light of what else I’ve learned. The situation is, to me, frustrating. Because yes, whatever the good arguments against exceptions, people still use them and error handling in C++ is still about exceptions. So you still can have constructors that fail. You still depend a lot on documentation and your compiler cannot tell you when something is not okay. We are all fallible, way more than we think.