Is it acceptable to test based on test output data rather than input data in unit tests?

Question

I'm used to writing unit tests with assertions which are based on the input, e.g. (hopefully self-explanatory, and let's assume that using random test data is fine)

int a = random();
int b = random();
Adder instance = new Adder();
int expResult = a+b;
int result = instance.add(a, b);
assertEquals(expResult, result);

Let's assume that Adder.add has a lot of side effects which I cannot imagine, so that this test would make sense.

Now, I encountered a situation where it would make sense to create assertions based on the output, e.g.

int a = random();
int b = random();
Multiplier instance = new Multiplier();
int result = instance.multiply(a, b);
if(isPrimeNumber(result)) {
    assertTrue(a == 1 || b == 1);
}else {
    //some other assertions...
}

Yes, this is a nonsense test and it tests the properties of integers more than anything else, but it illustrates the difference between basing assertions exclusively on the input and letting the output/test result influence the assertions.

I'm assuming that I cover all possible distinct output states of the test - just like I'd assume that I'm covering all possible input states.

You can always write a separate test (test case) for every assertion. Even better, every test will then have its own input data, which, with good naming, tells what is tested and what the expected result is. — Fabio, Mar 03 '18 at 09:42
`if .. else` is one of the reasons why we have tests. Continuously using conditions in tests will lead to tests which fail for the wrong reason or, even worse, pass when they should fail. — Fabio, Mar 03 '18 at 09:46
@Fabio Agreed, but that would change the code quality and the code flow. It doesn't provide an argument for the question, or am I misreading that? — Kalle Richter, Mar 03 '18 at 09:46
In tests you should have control over all input data and any global state. Then, before running the function under test, you set up all input and global state values for which the function should produce the expected result. In your case it looks more like guessing. — Fabio, Mar 03 '18 at 09:55

Doc Brown · Answer 1 · 2018-03-03T09:58:11.700

In this example, there is a certain risk that instance.multiply(a, b) produces a prime number where it should not (maybe for the pair 1,10 it delivers 11), so that the "other assertions" section is skipped.
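To make that risk concrete, here is a minimal Python sketch. The names `is_prime`, `buggy_multiply` and `output_conditioned_check` are hypothetical, the defect is contrived to reproduce the 1,10 case, and the equality check in the second branch is only an assumed stand-in for the unspecified "other assertions".

```python
def is_prime(n: int) -> bool:
    """Trial-division primality check, adequate for small test values."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def buggy_multiply(a: int, b: int) -> int:
    # Hypothetical defect: adds instead of multiplying when a factor is 1.
    if a == 1 or b == 1:
        return a + b
    return a * b


def output_conditioned_check(a: int, b: int) -> str:
    """Mirror the question's test: the branch is chosen by the output."""
    result = buggy_multiply(a, b)
    if is_prime(result):
        assert a == 1 or b == 1
        return "prime branch"
    # Assumed stand-in for the "other assertions" of the original test.
    assert result == a * b
    return "other branch"


# 1 * 10 should be 10, but the defect yields 11, which is prime. The check
# takes the prime branch, its assertion holds, and the defect goes unnoticed.
print(output_conditioned_check(1, 10))
```

Because the faulty output itself selects which assertions run, the one assertion that would have caught the defect is never reached.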

More generally, the fact that you need a condition on the output data is a sign that your test data generation does not behave deterministically, so you cannot easily choose the right assertions beforehand. That is the main problem I see here, since non-deterministic tests have the very nasty property of not producing reproducible outcomes.

Let us take your second example: if there were a fixed set of deterministic test data (even if it was produced with a random generator once), then for each pair (a,b) of numbers it could be determined beforehand whether their product should be a prime number or not. A better test could then look like this:

[TestCase(1,41,true)]
[TestCase(2,3,false)]
[TestCase(10,1,false)]
void TestMultiplier(int a, int b, bool prodIsPrime)
{
    Multiplier instance = new Multiplier();
    int result = instance.multiply(a, b);
    assertEqual(prodIsPrime, isPrimeNumber(result));
    if(!prodIsPrime)
    {
        // some other tests
    }
}

Of course, you might then refactor this further and split this into two tests, one for prime products and one for non-primes, as @Fabio wrote.
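As a rough illustration of that split, here is a sketch using Python's standard `unittest` module instead of the attribute-based style above. `multiply` and `is_prime` are hypothetical stand-ins for the Multiplier and isPrimeNumber of the question, and the equality check is an assumed example of the "other tests".

```python
import unittest


def is_prime(n: int) -> bool:
    """Trial-division primality check, adequate for small test values."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def multiply(a: int, b: int) -> int:
    # Hypothetical stand-in for the Multiplier under test.
    return a * b


class MultiplierTest(unittest.TestCase):
    def test_prime_products(self):
        # Fixed pairs whose product is known beforehand to be prime.
        for a, b in [(1, 41)]:
            with self.subTest(a=a, b=b):
                self.assertTrue(is_prime(multiply(a, b)))

    def test_non_prime_products(self):
        # Fixed pairs whose product is known beforehand not to be prime.
        for a, b in [(2, 3), (10, 1)]:
            with self.subTest(a=a, b=b):
                result = multiply(a, b)
                self.assertFalse(is_prime(result))
                # The "other" assertions now run unconditionally.
                self.assertEqual(result, a * b)
```

Running `python -m unittest` on a file containing this class executes both tests. Neither test contains a branch, so a wrong product cannot silently redirect the test to a different set of assertions.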

You should use the same "best practices" for tests as you use for application code. Usually, having a `bool` as a method argument is a sign that the method can be split in two. In this case you can introduce two test methods with proper names, `TestForPrimaryNumbers` and `TestForOtherNumbers`, which will be easier to follow. — Fabio, Mar 03 '18 at 09:49
@Fabio: what you wrote is fine, but it distracts somewhat from my core point; I tried not to change the original code more than necessary. So for the sake of demonstration, let's imagine the real test data is read from some file. — Doc Brown, Mar 03 '18 at 09:51

score 4 · Accepted Answer · answered Mar 11 '18 at 13:43

Asserting certain relations between the input and output data without specifying concrete test values is a valid testing strategy, also known as property testing. This has the advantage that a large input space can be covered easily with very little code. Property testing moves away from testing some specific hand-chosen examples, towards testing laws that must hold for the whole input space.

But this only works under the following conditions:

You have a sensible mechanism to generate input values. Choosing values at random can be part of that strategy, but is insufficient on its own. A mixture of random values and special interesting values is best (e.g. boundaries of your input domain). When using random values, you must record the seed in order to make your tests reproducible.
The system under test is pure, so that it can be run repeatedly, in arbitrary order, and quickly.
The system under test runs reasonably quickly, so that many instances can be exercised.
The properties must be very cheap to check. E.g. a primality test would be inconveniently expensive.
The properties must not simply rephrase the implementation being tested, or they are worthless. E.g. in your example, a property that checks result == a + b would fail to catch problems relating to numeric overflow. A more interesting property would be if (a > 0 && b > 0) assert(result > a && result > b).
Properties don't necessarily have to restrict themselves to a single input. E.g. we might also assert the commutativity property add(a, b) == add(b, a). It is still sensible to write separate tests for properties around specific input values, e.g. add(a, 0) == a.
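The point about properties that merely rephrase the implementation can be sketched in Python. Since Python integers do not overflow, the hypothetical `add32` below simulates 32-bit wraparound as an assumption about the system under test, and `restating_property` models a `result == a + b` check carried out in the same wrapping arithmetic.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def add32(a: int, b: int) -> int:
    """Add with simulated 32-bit two's-complement wraparound."""
    return (a + b - INT_MIN) % 2**32 + INT_MIN


def restating_property(a: int, b: int) -> bool:
    # Rephrases the implementation: both sides wrap in the same way,
    # so the comparison holds even when the sum has overflowed.
    expected = (a + b - INT_MIN) % 2**32 + INT_MIN
    return add32(a, b) == expected


def sign_property(a: int, b: int) -> bool:
    # Independent law: the sum of two positives exceeds both operands.
    if a > 0 and b > 0:
        result = add32(a, b)
        return result > a and result > b
    return True


print(restating_property(INT_MAX, 1))  # True: the overflow goes unnoticed
print(sign_property(INT_MAX, 1))       # False: the overflow is detected
```

The second property knows nothing about how the addition is carried out, which is exactly why it can disagree with a faulty result.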

Property-based testing was largely popularized by the Haskell QuickCheck library. Now, comparable frameworks exist in a large variety of languages. Their main value is that they assist in generating interesting input values. They may also assist in exercising only a subset of possible input combinations, in order to avoid exponential explosion of test cases.

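So using a hand-rolled property testing approach, we might write the tests like this in Python (add is the function under test; the 32-bit INT_MIN and INT_MAX bounds are an assumption about its input domain):

import random
from typing import Iterator

# Assumed bounds of the input domain (32-bit signed integers here).
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def generate_integers(rng: random.Random) -> Iterator[int]:
    # interesting values around zero and 1
    yield from [-1, 0, 1, 2, 10, 11]
    # interesting boundary values
    yield from [INT_MIN, INT_MIN + 1, INT_MAX - 1, INT_MAX]
    # extra random values
    for _ in range(5):
        yield rng.randint(INT_MIN, INT_MAX)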

def test_commutativity(rng):
    a_values = list(generate_integers(rng))
    b_values = list(generate_integers(rng))
    for a in a_values:
        for b in b_values:
            assert add(a, b) == add(b, a)

def test_identity(rng):
    for a in generate_integers(rng):
        assert add(a, 0) == a

def test_inverse(rng):
    a_values = list(generate_integers(rng))
    b_values = list(generate_integers(rng))
    for a in a_values:
        for b in b_values:
            result = add(a, b)
            assert result - a == b
            assert result - b == a
