Is possible to write too many asserts?
Well, of course it is. [Imagine obnoxious example here.] However, applying the guidelines detailed in the following, you shouldn't have troubles pushing that limit in practice. I'm a big fan of assertions, too, and I use them according to these principles. Much of this advice is not special to assertions but only general good engineering practice applied to them.
Keep the run-time and binary footprint overhead in mind
Assertions are great, but if they make your program unacceptably slow, it will be either very annoying or you will turn them off sooner or later.
I like to gauge the cost of an assertion relative to the cost of the function it is contained in. Consider the following two examples.
// Precondition: queue is not empty
// Invariant: queue is sorted
template <typename T>
const T&
sorted_queue<T>::max() const noexcept
{
assert(!this->data_.empty());
assert(std::is_sorted(std::cbegin(this->data_), std::cend(this->data_)));
return this->data_.back();
}
The function itself is an O(1) operation but the assertions account for O(n) overhead. I don't think that you would like such checks to be active unless in very special circumstances.
Here is another function with similar assertions.
// Requirement: op : T -> T is monotonic [ie x <= y implies op(x) <= op(y)]
// Invariant: queue is sorted
// Postcondition: each item x in the queue is replaced by op(x)
template <typename T>
template <typename FuncT>
void
sorted_queue<T>::apply_monotonic_function(FuncT&& op)
{
assert(std::is_sorted(std::cbegin(this->data_), std::cend(this->data_)));
std::transform(std::cbegin(this->data_), std::cend(this->data_),
std::begin(this->data_), std::forward<FuncT>(op));
assert(std::is_sorted(std::cbegin(this->data_), std::cend(this->data_)));
}
The function itself is an O(n) operation so it hurts much less to add an additional O(n) overhead for the assertion. Slowing down a function by a small (in this case, probably less than 3) constant factor is something we can usually afford in a debug build but maybe not in a release build.
Now consider this example.
// Precondition: queue is not empty
// Invariant: queue is sorted
// Postcondition: last element is removed from queue
template <typename T>
void
sorted_queue<T>::pop_back() noexcept
{
assert(!this->data_.empty());
return this->data_.pop_back();
}
While many people will probably be much more comfortable with this O(1) assertion than with the two O(n) assertions in the previous example, they are morally equivalent in my view. Each adds overhead on the order of the complexity of the function itself.
Finally, there are the “really cheap” assertions that are dominated by the complexity of the function they're contained in.
// Requirement: cmp : T x T -> bool is a strict weak ordering
// Precondition: queue is not empty
// Postcondition: if x is returned, then there is no y in the queue
// such that cmp(x, y)
template <typename T>
template <typename CmpT>
const T&
sorted_queue<T>::max(CmpT&& cmp) const
{
assert(!this->data_.empty());
const auto pos = std::max_element(std::cbegin(this->data_),
std::cend(this->data_),
std::forward<CmpT>(cmp));
assert(pos != std::cend(this->data_));
return *pos;
}
Here, we have two O(1) assertions in an O(n) function. It probably won't be a problem to keep this overhead even in release builds.
Do keep in mind, however, that asymptotic complexities are not always giving an adequate estimate because in practice, we are always dealing with input sizes bounded by some finite constant and constant factors hidden by “Big-O” might very well be not negligible.
So now we have identified different scenarios, what can we do about them? An (probably too) easy approach would be to follow a rule such as “Don't use assertions that dominate the function they are contained in.” While it might work for some projects, others might need a more differentiated approach. This could be done by using different assertion macros for the different cases.
#define MY_ASSERT_IMPL(COST, CONDITION) \
( \
( ((COST) <= (MY_ASSERT_COST_LIMIT)) && !(CONDITION) ) \
? ::my::assertion_failed(__FILE__, __LINE__, __FUNCTION__, # CONDITION) \
: (void) 0 \
)
#define MY_ASSERT_LOW(CONDITION) \
MY_ASSERT_IMPL(MY_ASSERT_COST_LOW, CONDITION)
#define MY_ASSERT_MEDIUM(CONDITION) \
MY_ASSERT_IMPL(MY_ASSERT_COST_MEDIUM, CONDITION)
#define MY_ASSERT_HIGH(CONDITION) \
MY_ASSERT_IMPL(MY_ASSERT_COST_HIGH, CONDITION)
#define MY_ASSERT_COST_NONE 0
#define MY_ASSERT_COST_LOW 1
#define MY_ASSERT_COST_MEDIUM 2
#define MY_ASSERT_COST_HIGH 3
#define MY_ASSERT_COST_ALL 10
#ifndef MY_ASSERT_COST_LIMIT
# define MY_ASSERT_COST_LIMIT MY_ASSERT_COST_MEDIUM
#endif
namespace my
{
[[noreturn]] extern void
assertion_failed(const char * filename, int line, const char * function,
const char * message) noexcept;
}
You can now use the three macros MY_ASSERT_LOW
, MY_ASSERT_MEDIUM
and MY_ASSERT_HIGH
instead of the standard library's “one size fits all” assert
macro for assertions that are dominated by, neither dominated by nor dominating and dominating the complexity of their containing function respectively. When you build the software, you can pre-define the pre-processor symbol MY_ASSERT_COST_LIMIT
to select what kind of assertions should make it into the executable. The constants MY_ASSERT_COST_NONE
and MY_ASSERT_COST_ALL
don't correspond to any assert macros and are meant to be used as values for MY_ASSERT_COST_LIMIT
in order to turn all assertions off or on respectively.
We're relying on the assumption here that a good compiler will not generate any code for
if (false_constant_expression && run_time_expression) { /* ... */ }
and transform
if (true_constant_expression && run_time_expression) { /* ... */ }
into
if (run_time_expression) { /* ... */ }
which I believe is a safe assumption nowadays.
If you're about to tweak the above code, consider compiler-specific annotations like __attribute__ ((cold))
on my::assertion_failed
or __builtin_expect(…, false)
on !(CONDITION)
to reduce the overhead of passed assertions. In release builds, you may also consider replacing the function call to my::assertion_failed
with something like __builtin_trap
to reduce the foot-print at the inconvenience of losing a diagnostic message.
These kinds of optimizations are really only relevant in extremely cheap assertions (like comparing two integers that are already given as arguments) in a function that itself is very compact, not considering the additional size of the binary accumulated by incorporating all the message strings.
Compare how this code
int
positive_difference_1st(const int a, const int b) noexcept
{
if (!(a > b))
my::assertion_failed(__FILE__, __LINE__, __FUNCTION__, "!(a > b)");
return a - b;
}
is compiled into the following assembly
_ZN4test23positive_difference_1stEii:
.LFB0:
.cfi_startproc
cmpl %esi, %edi
jle .L5
movl %edi, %eax
subl %esi, %eax
ret
.L5:
subq $8, %rsp
.cfi_def_cfa_offset 16
movl $.LC0, %ecx
movl $_ZZN4test23positive_difference_1stEiiE12__FUNCTION__, %edx
movl $50, %esi
movl $.LC1, %edi
call _ZN2my16assertion_failedEPKciS1_S1_
.cfi_endproc
.LFE0:
while the following code
int
positive_difference_2nd(const int a, const int b) noexcept
{
if (__builtin_expect(!(a > b), false))
__builtin_trap();
return a - b;
}
gives this assembly
_ZN4test23positive_difference_2ndEii:
.LFB1:
.cfi_startproc
cmpl %esi, %edi
jle .L8
movl %edi, %eax
subl %esi, %eax
ret
.p2align 4,,7
.p2align 3
.L8:
ud2
.cfi_endproc
.LFE1:
which I feel much more comfortable with. (Examples were tested with GCC 5.3.0 using the -std=c++14
, -O3
and -march=native
flags on 4.3.3-2-ARCH x86_64 GNU/Linux. Not shown in the above snippets are the declarations of test::positive_difference_1st
and test::positive_difference_2nd
which I added the __attribute__ ((hot))
to. my::assertion_failed
was declared with __attribute__ ((cold))
.)
Assert preconditions in the function that depends on them
Suppose you have the following function with the specified contract.
/**
* @brief
* Counts the frequency of a letter in a string.
*
* The frequency count is case-insensitive.
*
* If `text` does not point to a NUL terminated character array or `letter`
* is not in the character range `[A-Za-z]`, the behavior is undefined.
*
* @param text
* text to count the letters in
*
* @param letter
* letter to count
*
* @returns
* occurences of `letter` in `text`
*
*/
std::size_t
count_letters(const char * text, int letter) noexcept;
Instead of writing
assert(text != nullptr);
assert((letter >= 'A' && letter <= 'Z') || (letter >= 'a' && letter <= 'z'));
const auto frequency = count_letters(text, letter);
at each call-site, put that logic once into the definition of count_letters
std::size_t
count_letters(const char *const text, const int letter) noexcept
{
assert(text != nullptr);
assert((letter >= 'A' && letter <= 'Z') || (letter >= 'a' && letter <= 'z'));
auto frequency = std::size_t {};
// TODO: Figure this out...
return frequency;
}
and call it without further ado.
const auto frequency = count_letters(text, letter);
This has the following advantages.
- You only need to write the assertion code once. Since the very purpose of functions is that they be called – often more than once – this should reduce the overall number of
assert
statements in your code.
- It keeps the logic that checks the preconditions close to the logic that depends on them. I think this is the most important aspect. If your clients misuse your interface, they cannot be assumed to apply the assertions correctly either so it is better the function tells them.
The obvious disadvantage is that you won't get the source location of the call-site into the diagnostic message. I believe that this is a minor issue. A good debugger should be able to let you trace back the origin of the contract violation conveniently.
The same thinking applies to “special” functions like overloaded operators. When I'm writing iterators, I usually – if the nature of the iterator allows it – give them a member function
bool
good() const noexcept;
that allows to ask whether it is safe to dereference the iterator. (Of course, in practice, it is almost always only possible to guarantee that it won't be safe to dereference the iterator. But I believe you can still catch a lot of bugs with such a function.) Instead of littering all my code that uses the iterator with assert(iter.good())
statements, I'd rather put a single assert(this->good())
as the first line of the operator*
in the iterator's implementation.
If you're using the standard library, instead of asserting manually on its preconditions in your source code, turn on their checks in debug builds. They can do even more sophisticated checks like testing whether the container an iterator refers to still exists. (See the documentation for libstdc++ and libc++ (work in progress) for more information.)
Factor common conditions out
Suppose you're writing a linear algebra package. Many functions will have complicated preconditions and violating them will often cause wrong results that are not immediately recognizable as such. It would be very good if these functions asserted their preconditions. If you define a bunch of predicates that tell you certain properties about a structure, those assertions become much more readable.
template <typename MatrixT>
auto
cholesky_decompose(MatrixT&& m)
{
assert(is_square(m) && is_symmetric(m));
// TODO: Somehow decompose that thing...
}
It will also give more useful error messages.
cholesky.hxx:357: cholesky_decompose: assertion failed: is_symmetric(m)
helps a lot more than, say
detail/basic_ops.hxx:1289: fast_compare: assertion failed: m(i, j) == m(j, i)
where you'd first have to go look at the source code in the context to figure out what was actually tested.
If you have a class
with non-trivial invariants, it is probably a good idea to assert on them from time to time when you have messed with the internal state and want to ensure that you're leaving the object in a valid state on return.
For this purpose, I found it useful to define a private
member function that I conventionally call class_invaraiants_hold_
. Suppose you were re-implementing std::vector
(Because we all know it ain't good enough.), it might have a function like this.
template <typename T>
bool
vector<T>::class_invariants_hold_() const noexcept
{
if (this->size_ > this->capacity_)
return false;
if ((this->size_ > 0) && (this->data_ == nullptr))
return false;
if ((this->capacity_ == 0) != (this->data_ == nullptr))
return false;
return true;
}
Notice a few things about this.
- The predicate function itself is
const
and noexcept
, in accordance with the guideline that assertions shall not have side effects. If it makes sense, also declare it constexpr
.
- The predicate doesn't assert anything itself. It is meant to be called inside assertions, such as
assert(this->class_invariants_hold_())
. This way, if assertions are compiled-out, we can be sure that no run-time overhead is incurred.
- The control flow inside the function is broken apart into multiple
if
statements with early return
s rather than a large expression. This makes it easy to step through the function in a debugger and find out what part of the invariant was broken if the assertion fires.
Don't assert on silly things
Some things just don't make sense to assert on.
auto numbers = std::vector<int> {};
numbers.push_back(14);
numbers.push_back(92);
assert(numbers.size() == 2); // silly
assert(!numbers.empty()); // silly and redundant
These assertions don't make the code even a tiny bit more readable or easier to reason about. Every C++ programmer should be confident enough how std::vector
works to be certain that the above code is correct simply by looking at it. I'm not saying that you should never assert on a container's size. If you have added or removed elements using some non-trivial control flow, such an assertion can be useful. But if it merely repeats what was written in non-assertion code just above, there is no value gained.
Also don't assert that library functions work correctly.
auto w = widget {};
w.enable_quantum_mode();
assert(w.quantum_mode_enabled()); // probably silly
If you trust the library that little, better consider using another library instead.
On the other hand, if the documentation of the library is not 100 % clear and you gain confidence about its contracts by reading the source code, it makes a lot of sense to assert on that “inferred contract”. If it is broken in a future version of the library, you'll notice quickly.
auto w = widget {};
// After reading the source code, I have concluded that quantum mode is
// always off by default but this isn't documented anywhere.
assert(!w.quantum_mode_enabled());
This is better than the following solution which will not tell you whether your assumptions were correct.
auto w = widget {};
if (w.quantum_mode_enabled())
{
// I don't think that quantum mode is ever enabled by default but
// I'm not sure.
w.disable_quantum_mode();
}
Don't abuse assertions to implement program logic
Assertions should only ever be used to uncover bugs that are worthy of immediately killing your application. They should not be used to verify any other condition even if the appropriate reaction to that condition would also be to quit immediately.
Therefore, write this…
if (!server_reachable())
{
log_message("server not reachable");
shutdown();
}
…instead of that.
assert(server_reachable());
Also never use assertions to validate untrusted input or check that std::malloc
did not return
you the nullptr
. Even if you know that you'll never turn assertions off, even in release builds, an assertion communicates to the reader that it checks something that is always true given that the program is bug-free and has no visible side-effects. If this is not the kind of message you want to communicate, use an alternative error handling mechanism such as throw
ing an exception. If you find it convenient to have a macro wrapper for your non-assertion checks, go ahead writing one. Just don't call it “assert”, “assume”, “require”, “ensure” or something like that. Its internal logic could be the same as for assert
, except that it is never compiled out, of course.
More information
I found John Lakos' talk Defensive Programming Done Right, given at CppCon'14 (1st part, 2nd part) very enlightening. He takes the idea of customizing what assertions are enabled and how to react on failed exceptions even further than I did in this answer.