
There's already a question about "How to write good unit tests".

Based on the answers provided there, the key properties of a good unit test are:

  • Short
  • Readable and conveys intent instantly
  • Independent
  • Fast
  • Repeatable
  • Tests a single piece of behaviour
  • Has a good name

Keeping those properties in mind, how would one go about automating checks to ensure that only "good" unit tests are merged back into the main codebase?

I am absolutely of the opinion that automating these checks is the way to go, if it can be reasonably achieved. There are so many things a reviewer needs to watch out for when accepting a merge request - clean code, design, architecture, clean tests, etc. - so reducing the burden by automating checks that used to be manual is always welcome.

thegreendroid
  • wonder how one would _automatically_ check for [SRP](https://softwareengineering.stackexchange.com/a/154731/31260 "'it is very subjective, and it is the subject of many heated, red-faced debates...'") or a good name – gnat May 25 '17 at 00:08
  • You could count asserts for SRP. As for the name, you could at least ensure it follows a convention (Given/When/Then or some such). – thegreendroid May 25 '17 at 01:36
  • _Testing one thing_ does not mean _have one assert_. It's perfectly valid to have multiple asserts in a single test as long as you test a single behaviour. – Vincent Savard May 25 '17 at 11:20
  • One solution is to implement hard AI. – Weyland Yutani May 25 '17 at 15:13
  • I read that answer and the term SRP does not appear, and "test only one thing" appears only twice, and **not** in the checklists from the two books cited. I strongly disagree that SRP is a "key property" of a good unit test. I also agree with Vincent that multiple asserts are fine. – user949300 Jul 05 '17 at 06:08
  • SRP, or testing one behaviour at a time, does not mean a single assert; I agree with you guys completely. But if you do have multiple asserts in a test, perhaps it means that you are missing a 'richer' function that you can make a single assert on (a tiny sketch follows below). That's just my opinion anyway. – thegreendroid Jul 05 '17 at 06:55
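
A tiny sketch of that "richer value" idea from the last comment, for concreteness (the `response` object here is a hypothetical stand-in, not part of any real framework):

```python
from types import SimpleNamespace

# Hypothetical stand-in for a real response object.
response = SimpleNamespace(status=200, body="ok")

# Two separate asserts checking one behaviour...
assert response.status == 200
assert response.body == "ok"

# ...versus a single assert on a richer value:
assert (response.status, response.body) == (200, "ok")
```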

3 Answers


Your characteristics of unit tests are missing some important features, in my opinion:

  1. Reflects, and is traceable to, the requirements
  2. Tests all of the requirements for the unit under test
  3. Covers all corner cases
  4. Tests every line of the code and, ideally, every decision path (see the coverage sketch below)

The main point of a good test is that it fails when something is wrong, does not fail when nothing is wrong, and lets you find out what was wrong. So look for:

  1. Comprehensive
  2. Accurate
  3. Complete
  4. Clear failure messages that say what failed and how.
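
A minimal sketch of how point 4 of the first list might be gated in CI, assuming a Python codebase with pytest and coverage.py installed ("mypkg", the "tests/" path, and the 90% threshold are placeholder assumptions, not recommendations):

```python
# gate_coverage.py - fail the merge check unless the unit tests pass
# and line + branch coverage stays above a chosen threshold.
import sys

import coverage
import pytest

cov = coverage.Coverage(branch=True, source=["mypkg"])  # "mypkg" is a placeholder
cov.start()
exit_code = pytest.main(["tests/"])  # run the unit test suite in-process
cov.stop()
cov.save()

percent = cov.report()  # prints a summary and returns the total percentage
if exit_code != 0 or percent < 90.0:
    sys.exit(1)  # block the merge
```
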
Steve Barnes
  • unit tests *cannot* test all of the requirements. 1) Mean time between failure of the system is 480 hours. 2) System is available 99.4% of working hours. – BobDalgleish May 25 '17 at 12:56
  • @BobDalgleish - we are talking about unit tests, not system tests, so unit-level requirements apply, and those rarely include things like MTBF or availability. I will update the answer. – Steve Barnes May 25 '17 at 14:48
  • Big +1 for "it fails when something is wrong". Everything else is gravy. :-) – user949300 Jul 05 '17 at 06:09

Let's sort your properties by ease of automated checking:

  • Fast - Most IDEs already tell you this

  • Short - Your line count tells you this (so long as you don't abuse white space)

  • Repeatable - Rerunning already tells you this

  • Independent - This could be done by listing what files the test requires, and what they require ...

  • Tests one thing (SRP) - Count your asserts. Is there more than one? (See the sketch after this list.)

  • Readable - Simple, write an AI that writes code and invite it to the code review. Ask what it thinks.

  • Has a good name - Hope the AI is smarter than humans because even we suck at this.
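
Several of the mechanical checks above can be scripted. Here is a minimal linter sketch using Python's `ast` module; the `test_` prefix, the thresholds, and counting only bare `assert` statements (pytest-style) are all assumptions:

```python
# check_tests.py - flag test functions that are too long or assert too much.
import ast
import sys

MAX_LINES = 15    # assumed threshold for "short"
MAX_ASSERTS = 1   # "one assert" is debatable; see the comments below

def check_test_file(path: str) -> list[str]:
    tree = ast.parse(open(path).read(), filename=path)
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name.startswith("test_"):
            length = node.end_lineno - node.lineno + 1
            asserts = sum(isinstance(n, ast.Assert) for n in ast.walk(node))
            if length > MAX_LINES:
                problems.append(f"{node.name}: {length} lines (max {MAX_LINES})")
            if asserts > MAX_ASSERTS:
                problems.append(f"{node.name}: {asserts} asserts (max {MAX_ASSERTS})")
    return problems

if __name__ == "__main__":
    issues = [p for f in sys.argv[1:] for p in check_test_file(f)]
    print("\n".join(issues))
    sys.exit(1 if issues else 0)
```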

candied_orange
  • Obligatory xkcd: https://xkcd.com/1425/ – RubberDuck May 25 '17 at 00:41
  • "Simple, write an AI that writes code and invite it to the code review. Ask what it thinks". Haha, gold. I agree with your points though. – thegreendroid May 25 '17 at 01:41
  • I cannot agree with your explanation of SRP. Multiple asserts sometimes have their reason. For example, if a return value is supposed to be between 0 and 1000, you can check both limits independently, and thus you have checked a single thing using multiple assert statements. Forbidding multiple asserts leads to boilerplate unit tests that do the same thing over and over again and always check not one thing but a fragment of one thing. – BlueWizard May 25 '17 at 08:41
  • `assertTrue( " Expected BlueWizard.sillyness to be ( exclusively ) between %s and %s. Instead it was: %s".format(BozoTheClown, DonaldTrump, BlueWizard.sillyness), BozoTheClown < BlueWizard.sillyness && BlueWizard.sillyness < DonaldTrump);` – candied_orange May 25 '17 at 12:16
  • Well, that's just silly. It doesn't have to be one long assert either. It can be many simple asserts. Sometimes it makes more sense to do them in one test than it does to set up and tear down the entire test several times, just to have one assert in each test. – Robert Harvey May 25 '17 at 15:11
  • I agree with @RobertHarvey. Long assertions may violate SRP =) – scriptin May 25 '17 at 15:13
  • @scriptin All silliness aside, SRP doesn't care if it's a short one-liner, a long one-liner, or multiple lines. It's one clear idea that would only have one reason to change. But if you can pull it off with one short assert you're likely OK. – candied_orange May 25 '17 at 15:20
  • For what it's worth, I think you're mis-appropriating SRP here. The Single Responsibility Principle says "A [production] class should only have one reason to change." It doesn't govern your number of asserts (as a class can actually *do* more than one thing), nor your number of test methods. The principles that govern the number and character of your test asserts and test methods don't really have anything to do with SRP; they're merely a form of coding style. – Robert Harvey May 25 '17 at 15:28
  • @RobertHarvey You're looking at my thinking from the wrong side. If you're going to create an automatic evaluation for SRP of unit tests, you will either create some truly brilliant AI or you'll end up driving coders to conform to an artificial style that's easy for the AI to analyze - one that no one can really prove to be wrong, but that can certainly end up painful. I've labored under such tools. Once they're embedded in the culture they are hard to remove. – candied_orange May 25 '17 at 15:50
  • To check for simplicity, one can use cyclomatic complexity measurements. Another good measure of a test's usefulness may be the amount of mocking involved. – 9000 May 25 '17 at 16:06
  • And now I just wanna cry. – candied_orange May 25 '17 at 16:07
  • @9000, cyclomatic complexity of a test?! A test is meant to have zero conditionals, just a simple set of sequential statements: the given, when and then. Agree with your point about mocking though; if a test has too many collaborators, it probably is a design smell. – thegreendroid May 25 '17 at 22:17
  • Ah, now I feel better. :) – candied_orange May 25 '17 at 22:24
  • @thegreendroid: Exactly: if a lot is going on in a test, if it has a loop or conditionals, it's a problem, and there's an automated way to check for that. – 9000 May 25 '17 at 23:23
  • @9000 Not a bad idea! Tests should have a cyclomatic complexity of one (i.e. no more than one path through the test). It's a pretty neat way of ensuring tests are simple. – thegreendroid May 25 '17 at 23:29
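
Following up on the last few comments, a minimal sketch of the "cyclomatic complexity of one" check: flag any test function that is not straight-line code. The list of branching nodes and the `test_` prefix are assumptions, not a standard:

```python
import ast

# Constructs that introduce extra paths through a test.
BRANCHES = (ast.If, ast.IfExp, ast.For, ast.While, ast.Try)

def branchy_tests(source: str) -> list[str]:
    """Return the names of test functions that contain a branch or loop."""
    return [
        node.name
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.FunctionDef)
        and node.name.startswith("test_")
        and any(isinstance(n, BRANCHES) for n in ast.walk(node))
    ]

sample = '''
def test_straight_line():
    assert 1 + 2 == 3

def test_with_loop():
    for x in range(3):
        assert x + 0 == x
'''
print(branchy_tests(sample))  # ['test_with_loop']
```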

As already mentioned, a good test fails when the system under test experiences "breaking" changes.

To automatically evaluate new unit tests against this criterion, you could try to implement mutation testing:

  • Determine what parts of the project are covered by the new test.
  • Generate some mutants by applying (one or more) small modifications (switching operators and such) in those parts.
  • Run the new test on each mutant. If the test fails, that's good (the test could be too strict, but that's far less of an issue than a test that's too weak). If the test doesn't fail, the corresponding mutant probably needs human review; it could be an indication that the test is too weak or doesn't cover all cases.

You'll probably get lots of false negatives at first; careful selection of mutation operations that actually lead to failures should improve this. For example, switching adjacent declarations of local variables is rather unlikely to yield significant errors.
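
A toy sketch of the mutate-and-rerun loop described above, for illustration only (real mutation-testing tools such as PIT or mutmut do this at scale; the code under test and the single `+` → `-` operator here are simplified assumptions):

```python
import ast

# The code under test, kept as source so we can mutate its AST.
SOURCE = """
def price_with_tax(price, rate):
    return price + price * rate
"""

def new_test(ns):
    # The "new test" being evaluated.
    assert ns["price_with_tax"](100, 0.1) == 110

class SwapAddSub(ast.NodeTransformer):
    """One tiny mutation operator: turn every `+` into `-`."""
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node

def test_passes(tree) -> bool:
    ns = {}
    exec(compile(tree, "<module>", "exec"), ns)
    try:
        new_test(ns)
        return True
    except AssertionError:
        return False

assert test_passes(ast.parse(SOURCE))  # sanity check: passes on the original

mutant = ast.fix_missing_locations(SwapAddSub().visit(ast.parse(SOURCE)))
if test_passes(mutant):
    print("mutant survived - the test may be too weak")
else:
    print("mutant killed - the test caught the change")  # expected here
```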

Daniel Jour
  • This is a great approach, exemplified by QuickCheck and Hypothesis on one hand, and fuzzing tests on the other hand. – 9000 May 25 '17 at 16:04