Should unit-tests be entirely self-contained?

Question

As the title suggests my question is whether or not unit-tests should be entirely self-contained or can one rely on the results yielded by previous tests?

What I mean, in case it isn't entirely clear, is this: if one's initial test sufficiently asserts that a certain module A works in a certain way, can, or more appropriately should, one write subsequent tests with the assumption that the aforementioned module A has been tested beforehand? This would imply that the order in which the unit-tests are executed matters.

Or should each individual test be self-reliant, to the point that if one needs to know whether a module B works, which can only be validated if module A works, one should test module A within the same test as module B? That would imply that the separate unit-tests can be executed in any order.

To give a concrete example, consider the "stack" datatype, which we won't delve too deeply into, and specifically two fundamental operations we'd need in order to reason about the datatype in any meaningful sense: isEmpty(stack) and Empty(). If one wishes to test the validity of isEmpty, which takes a stack and returns True or False depending on whether the stack it receives is empty, one first needs to create an empty stack using Empty().

Then consider evaluating isEmpty(Empty()) and checking the result. Either we get True, in which case Empty() could still have returned a non-empty stack that isEmpty wrongly treats as empty. Or we get False, and we still don't know how the two interact, nor how either works on its own (assuming we can't view the source). (There is a third option, that we receive something that's neither True nor False, but that is beyond the scope of this discussion; this is solely a remark.)

Finally, to tie this back to my question: if we create a test from which we can be reasonably certain that isEmpty works in a satisfying manner, can all tests executed after it trust that it does indeed work? Or should they all try to incorporate this ambiguity into their own test logic (for instance, by including an else branch for the case where neither True nor False was returned)?
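
One way to picture the stack example is the Python sketch below. The `Stack` class and the `empty`, `push` and `is_empty` functions are hypothetical stand-ins for Empty() and isEmpty() (`push` is an extra assumption, not part of the question). Each test builds the stack it needs, so neither depends on the other having run first.

```python
class Stack:
    """Hypothetical immutable stack used only for illustration."""

    def __init__(self, items=()):
        self._items = tuple(items)


def empty():
    """Stand-in for Empty(): returns a stack with no elements."""
    return Stack()


def push(stack, item):
    """Assumed helper: returns a new stack with item on top."""
    return Stack(stack._items + (item,))


def is_empty(stack):
    """Stand-in for isEmpty(stack)."""
    return len(stack._items) == 0


def test_is_empty_on_empty_stack():
    assert is_empty(empty()) is True


def test_is_empty_on_non_empty_stack():
    # Builds its own stack instead of reusing one from another test.
    assert is_empty(push(empty(), 1)) is False
```

Taken together, the two tests narrow the ambiguity described above: an `is_empty` that always returned True would pass the first test but fail the second.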

@MichaelT To my mind the two questions are similar indeed, and thanks for pointing that out. However I do not find mine to be a duplicate. The question you have linked pertains to sharing data between two points. My question is supposed to pertain to whether or not one can reasonably assume that after a module has been deemed to work in the way that is expected two other tests can be constructed based on that information. If that assumption can be made subsequent tests will be concise and to the point. Otherwise they will consist of the previous test(s) and some further testing. — , Feb 01 '14 at 14:12
They should be self-contained, for the simple reason that the tests may not be executed in sequence, or together. — CMR, Feb 01 '14 at 14:47
Even without the formatting-fail, that link seems pretty irrelevant to the question. — Aaronaught, Feb 02 '14 at 01:34
If you read the article (Mocks Aren't Stubs) @Aaronaught you will see that Martin Fowler discusses Classic TDD vs Pure Mockist TDD, which basically treats problems like "Should I always mock dependency A or should I use a concrete dependency A when testing?" — Chedy2149, Feb 02 '14 at 09:38
I would recommend this article [Mocks Aren't Stubs](http://martinfowler.com/articles/mocksArentStubs.html) which despite the title does not only discuss the difference between mocks and stubs. — Chedy2149, Feb 02 '14 at 09:40
This is emphatically not a duplicate of the closed question. — Winston Ewert, Feb 02 '14 at 17:11
If you believe your question isn't a duplicate, rephrase it in such a way that the difference also becomes clear to enough others who can re-open the question. — Bart van Ingen Schenau, Feb 02 '14 at 20:42
your ["edit 2"](http://programmers.stackexchange.com/posts/226357/revisions "made in revision 4") looks nice, did you consider posting it as a new, separate question? that way, you wouldn't need to bother about whether it will invalidate answers here or not — gnat, Feb 04 '14 at 05:56
@gnat I'd consider my second edit to be a duplicate of the original post. — , Feb 04 '14 at 13:16
all right, let's see how it will work without separation (I just voted reopen) — gnat, Feb 04 '14 at 13:45

l0b0 · Accepted Answer · 2014-02-02T09:46:11.103

Each unit test should be testing one thing, so yes, they should assume that all other parts of the system are working. The way to do this reliably is to mock any other code which should not be tested at the same time.

For example, consider this pseudo-code:

house_json(id):
    name = get_house_name(id)
    return json.format({'id': id, 'name': name})

This depends on the functionality of two other functions: get_house_name and json.format. To unit test this you'll have to mock both of them. First the normal case, testing that with a valid house ID we end up calling json.format with the expected parameters:

test_house_json_format():
    get_house_name = mock()
    json.format = mock()
    get_house_name.return_for(5) = 'foo'
    house_json(5)
    assert_called_once_with(json.format, {'id': 5, 'name': 'foo'})

Then to test that if we send an invalid house ID, the function throws an exception and the formatter is never called:

test_house_json_with_invalid_house_id():
    get_house_name = mock()
    json.format = mock()
    get_house_name.return(x) = lambda: raise InvalidHouseError(x)
    assert_raises(house_json, 5, InvalidHouseError)
    assert_not_called(json.format)
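
The two pseudo-code tests above could look roughly like this with Python's `unittest.mock`. This is a sketch under stated assumptions, not the answer's actual code: the collaborators are passed in as parameters so they can be replaced without patching, `format_json` stands in for the pseudo-code's `json.format` (here defaulting to `json.dumps`), and `InvalidHouseError` is defined locally.

```python
import json
from unittest.mock import Mock


class InvalidHouseError(Exception):
    """Hypothetical error raised for an unknown house ID."""


def house_json(house_id, get_house_name, format_json=json.dumps):
    # Collaborators are injected so tests can replace them with mocks.
    name = get_house_name(house_id)
    return format_json({'id': house_id, 'name': name})


def test_house_json_format():
    get_house_name = Mock(return_value='foo')
    format_json = Mock()
    house_json(5, get_house_name, format_json)
    get_house_name.assert_called_once_with(5)
    format_json.assert_called_once_with({'id': 5, 'name': 'foo'})


def test_house_json_with_invalid_house_id():
    get_house_name = Mock(side_effect=InvalidHouseError(5))
    format_json = Mock()
    try:
        house_json(5, get_house_name, format_json)
    except InvalidHouseError:
        pass  # the error is expected to propagate
    else:
        raise AssertionError("expected InvalidHouseError")
    # The formatter must not be reached when the lookup fails.
    format_json.assert_not_called()
```

Neither test touches a real name lookup or a real formatter, which is the isolation this answer argues for.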

Making expectations about the code explicit in your test code means it will be easy for others to understand its expected behaviour, and to change it when requirements change. For example, if the function should handle thrown errors and return an error code in JSON format instead, you'd change that in this test:

test_house_json_with_invalid_house_id():
    get_house_name = mock()
    json.format = mock()
    get_house_name.return(x) = lambda: raise InvalidHouseError(x)
    house_json(5)
    assert_called_once_with(json.format, {'invalid_house_error': 5})

Then you'd run the test to verify that it fails, fix house_json to make it pass again, and refactor to make it the simplest possible code which passes (red-green-refactor).

Once you've tested all the parts, you should add integration and acceptance tests without mocking to ensure that they all work together.

Integration tests can also serve the same purpose of ensuring that they all work together. Acceptance tests are more suited to verifying that the requirements are fulfilled. — Robert Harvey, Feb 01 '14 at 15:27
While I agree that a unit test should test one thing (the smallest possible unit of code), it shouldn't necessarily assume that every other part works correctly, mostly because a unit test should not have any interaction with other parts (and that's where mocks come in, as you pointed out). — Dainius, Feb 02 '14 at 08:26

Doc Brown · Answer 2 · 2014-02-04T11:48:41.567

First, it is a very good practice to make sure your automatic tests are independent of the order of execution (for example, when one test fails, you want to run exactly that test in your debugger without having to run 10 other tests before it).

But you asked something different:

is that if one's initial test sufficiently asserts that a certain module A works in a certain way [...], should one write subsequent tests with the assumption that the aforementioned module A is tested beforehand

Writing a test for a function X which relies on the correctness of function Y does not make your tests order-dependent. If your function Y is wrong, your tests for X may fail (as well as your tests for function Y), and if they fail, they always fail: whether you run the tests for function Y first, or don't run them at all, does not matter.
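
A small Python sketch of this point, with hypothetical functions: `normalize` plays the role of Y and `same_word` the role of X. The `run` helper is an assumption made for illustration, not a real test framework.

```python
def normalize(text):
    """Plays the role of function Y."""
    return text.strip().lower()


def same_word(a, b):
    """Plays the role of function X; relies on normalize being correct."""
    return normalize(a) == normalize(b)


def test_normalize():
    assert normalize("  Foo ") == "foo"


def test_same_word():
    assert same_word("Foo", " foo ")


def run(tests):
    """Runs the tests in the given order and records pass/fail by name."""
    results = {}
    for test in tests:
        try:
            test()
            results[test.__name__] = "pass"
        except AssertionError:
            results[test.__name__] = "fail"
    return results


forward = run([test_normalize, test_same_word])
backward = run([test_same_word, test_normalize])
assert forward == backward
```

Because the tests share no state, running them in either order gives the same results; a bug in `normalize` would fail both tests regardless of which ran first.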

edited Feb 04 '14 at 11:48

answered Feb 01 '14 at 16:31

Doc Brown


When you write a test that uses another module, your test isn't a unit test, so it is irrelevant to the question. To use another module in unit testing, you use a mock object to imitate the same behaviour, and a mock object always returns what the specification says. – Dainius Feb 02 '14 at 08:28
@Dainius: you missed my point. To make it clearer, I replaced "module" by "function". This will also fit better to the example added by the OP afterwards. – Doc Brown Feb 04 '14 at 11:52

score 3 · Answer 3 · answered Feb 01 '14 at 18:16

I would argue that you emphatically should not make your individual tests be entirely self-contained. It is not typically useful to try to write tests on Module B which will work even if Module A is buggy. In fact, I think it is often dangerous and unwise to attempt to do so.

A strict unit test will mock out all of the other classes besides the one being tested. Thus the test will specify the inputs and outputs to all objects/modules other than the one being tested. I do not think there is great value in writing these strict unit tests. There are cases to mock out modules, but in general one should default to using the actual module.

Why?

1) By running the actual implementation of Module A while testing Module B, you get additional testing on Module A for free. Bugs which you didn't catch while testing Module A may show up in Module B's use of it. You are throwing away valuable checks on the accuracy of Module A if you simply mock it.

2) Mocking out all of the external calls is obnoxious. You end up having to write lots of code in your test to specify the inputs and outputs of the various objects used. This is typically tedious and makes your tests harder to read and write.

3) If you mock out the calls to Module A from Module B, you are asserting that Module B made the calls you expected. You are not checking that these were the correct calls to make. For example, suppose you have a function like:

Foobar highestScoring() {
    // getFoobars() returns the Foobars sorted by score.
    return module_a.getFoobars().getFirst();
}

It seems pretty sensible, but what order does module_a sort them by? If it sorts lowest to highest, then this function is wrong. But if you mocked module_a, you'd have missed this, because your mock assumed that it sorted the other way around. This is a bug that you would have missed by trying to isolate Module B from Module A.
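
The same scenario as a Python sketch. `ModuleA`, `highest_scoring` and the two check functions are hypothetical names, and plain integers stand in for Foobars. The mock returns data in the order the author of the test expects, so the check passes, while the real module sorts the other way.

```python
from unittest.mock import Mock


class ModuleA:
    """Hypothetical Module A: returns scores sorted lowest to highest."""

    def __init__(self, scores):
        self._scores = list(scores)

    def get_foobars(self):
        return sorted(self._scores)


def highest_scoring(module_a):
    # Assumes get_foobars() returns the highest score first.
    return module_a.get_foobars()[0]


def check_with_mock():
    # The mock encodes the same ordering assumption as the code under test.
    fake = Mock()
    fake.get_foobars.return_value = [9, 5, 1]
    return highest_scoring(fake) == 9


def check_with_real_module():
    # The real module sorts ascending, so the first element is the lowest.
    return highest_scoring(ModuleA([5, 9, 1])) == 9
```

Here `check_with_mock()` returns True while `check_with_real_module()` returns False, which is the hidden mismatch described above.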

What are the advantages of mocking every module? There are certainly advantages to mocking certain modules. You typically want to mock modules which are slow, rapidly changing, or would cause side effects. But what does the practice of a strict unit test which mocks everything get you? The theory is that if Module A is buggy, then the failing tests will point to Module A, and not to everything that depends on Module A.

I don't think this is very helpful in practice. If you've actually broken Module A, it is typically because you've modified Module A. So you probably already know Module A is broken, because that's what you were modifying.

Actually, if we really wanted that benefit, unit testing platforms could add it by annotating or inferring a dependency between tests. So if Module A's tests fail, we wouldn't even bother running Module B's tests.

So in short, systematically mocking every module in your test requires a lot of work, creates places for bugs to hide, and gives you marginal benefits. Just don't do it.

answered Feb 01 '14 at 18:16

Winston Ewert


This is total nonsense. If you *don't* stub/mock dependencies then your tests aren't unit tests, they're integration tests. A **unit** test means it tests a **unit** of code, in isolation. And the example illustrates a deep ignorance of what unit testing is for. Unit tests don't prove that the program is *correct*, they prove that parts of the program behave according to a specification. The example is just bad code - it implicitly assumes that results will be returned highest-to-lowest, which is clearly not guaranteed. Mocking out-of-order results would and should lead to a failing test. – Aaronaught Feb 02 '14 at 01:42
@Aaronaught, yes, what I'm suggesting aren't unit tests (at least not by the strict technical definition; people often use the terminology loosely). But you gain almost nothing by adhering to a policy of always writing your tests as unit tests. – Winston Ewert Feb 02 '14 at 05:59
@WinstonEwert Passing unit tests and failing integration tests means that your specs are wrong/not thought through. And the reverse means one module is handling the broken cases of the other module, so it "knows" about things it shouldn't. So yes, you do gain something by doing both. – Izkata Feb 02 '14 at 06:04
Who said anything about "a policy of always writing your tests as unit tests"? The question was *about* unit tests. On my team we aim for 80% unit test coverage, 80% acceptance/integration test coverage, and have to go through a manual regression before every release. But unit tests are unit tests, we don't mix and match; for one thing, you're supposed to run unit tests on every CI build - every single unit test should pass, if one doesn't then the build is broken and nobody else checks in until the problem is fixed. To do *that* with integration tests would be slow and, well, insane. – Aaronaught Feb 02 '14 at 15:39
@Aaronaught, the OP does mention "unit tests," however he was contemplating writing a test on Module B which assumed that Module A worked, and didn't attempt to mock it. Thus he is asking essentially whether it's okay that his test on Module B isn't a strict unit test. I'm saying that's fine. – Winston Ewert Feb 02 '14 at 22:22
@Aaronaught, integration tests don't have to be slow. An integration test which doesn't invoke disk, network, or intense computation is usually fast enough to run along with all your unit tests. That's the kind of test I think you should write instead of systematically mocking every last piece. – Winston Ewert Feb 02 '14 at 22:27
@Aaronaught, actually where I work we do block check-ins until integration tests have been passed. We can get away with that because we have a server farm to run all the tests. – Winston Ewert Feb 02 '14 at 22:28
@Izkata, that's a very interesting point. I'll have to think about it. – Winston Ewert Feb 02 '14 at 22:31
@Aaronaught, let me put this another way. When you write unit tests do you mock out strings and containers? I'm assuming you don't. Instead, you assume they work correctly. I think for every case, you have to make a call whether to mock or not, but it isn't automatically wrong to not mock when writing your unit tests (even if perhaps they aren't really unit tests anymore). – Winston Ewert Feb 02 '14 at 23:00
Strings are essentially primitives in most modern languages, and containers are part of the standard library, so no, I don't mock those. Anything that reasonably falls under the umbrella of "user code" is worth substituting a test double for. And you must either check in very infrequently, have very few integration tests, or be swimming in cash; no business I know of has *developer*-dedicated server farms powerful enough to run thousands of Selenium tests (as an example) in the 5 minutes or less that's almost universally recommended as the max for continuous integration. – Aaronaught Feb 03 '14 at 01:26
@Aaronaught, my company falls in the swimming in cash category. But I'll grant we are a rather special case and I'm not supposed to discuss our internal process publicly. – Winston Ewert Feb 03 '14 at 01:50
@Aaronaught, regarding the examples, what if I wrote my own money or collection class? Would you think I should mock those? – Winston Ewert Feb 03 '14 at 01:52

jh12 · Answer 4 · 2014-02-02T01:36:40.507

The goal of unit testing is to test all paths within a class. This means testing all possible inputs (or a representative sample), but also all possible behaviors of dependencies, including dependency failures, i.e. how a method in module B handles all possible correct values returned by A, all possible exceptions thrown by A, and all possible incorrect return values from A, e.g. null. Testing module B by relying on a set of tests for A is valid so long as you recognize that it represents only a subset of behaviors of A (unless you can confirm that it is possible to exercise all paths in A using the range of inputs to B). To fully test B, you likely need to force the full range of potential return values/exceptions from A by mocking it.

It doesn't matter from the perspective of B whether A is correct or not. What matters is how B handles any implementation of A, i.e. to protect against current or future buggy implementations of A (or A's dependencies). The goal is to prove that B always does the correct thing for correct inputs and correct dependency behaviors, and also does something sensible (not crash or corrupt anything, informs the caller of the problem, returns control gracefully) when it receives bad input or encounters faulty behavior by a dependency.

So my response is unit tests can use other unit tests to generate behavior of dependencies, but that a mockup of dependencies is likely needed to define a complete set of unit tests, e.g. to include dependency failures. The particular implementation of dependencies, and their correctness, are irrelevant, only their API matters.

To say it slightly differently, an exception thrown by A is a valid unit test of B in that it tests an error handling path in B, that is unreachable by varying input to B.
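
As a sketch of forcing dependency behavior through a mock: `describe_house` is a hypothetical function in module B, and `lookup_name` stands in for module A. The choice of `LookupError` and of `None` as the "incorrect" return value are assumptions made for the example.

```python
from unittest.mock import Mock


def describe_house(house_id, lookup_name):
    """Hypothetical module B function; lookup_name stands in for module A."""
    try:
        name = lookup_name(house_id)
    except LookupError:
        return "unknown house"   # error-handling path in B
    if name is None:
        return "unnamed house"   # defensive path for a bad return value
    return f"house {name}"


def test_handles_exception_from_a():
    failing = Mock(side_effect=LookupError("no such id"))
    assert describe_house(5, failing) == "unknown house"


def test_handles_none_from_a():
    assert describe_house(5, Mock(return_value=None)) == "unnamed house"


def test_handles_valid_value_from_a():
    assert describe_house(5, Mock(return_value="foo")) == "house foo"
```

The first two paths may be unreachable through a correct implementation of A, which is why the mock is needed to exercise them.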

score 0 · Answer 5 · answered Feb 02 '14 at 01:06

Unit tests should be self-contained, but need not be order-independent. It should be possible to run a test in isolation from others, but a failure may then imply failure in a dependency, not in the test itself.

Take a simplistic example, a set of functions implementing a data type. Two of those functions are Parse() and Format(), converting from and to a string representation. It is a quite reasonable strategy to test the Format() function, and then use the Format() function in tests on the Parse() function. If the tests are run out of order then a fault in the Format() function could actually appear as a test failure in the Parse() function.
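
A minimal sketch of that strategy, assuming a hypothetical data type: a pair of integers written as "x,y". `format_point` and `parse_point` stand in for Format() and Parse().

```python
def format_point(point):
    """Stand-in for Format(): (x, y) -> "x,y"."""
    x, y = point
    return f"{x},{y}"


def parse_point(text):
    """Stand-in for Parse(): "x,y" -> (x, y)."""
    x, y = text.split(",")
    return int(x), int(y)


def test_format_point():
    assert format_point((1, 2)) == "1,2"


def test_parse_point_round_trip():
    # Relies on format_point being correct; if format_point is broken,
    # this test can fail even though parse_point is fine.
    assert parse_point(format_point((3, -4))) == (3, -4)
```

Each test can still be run on its own; only the interpretation of a failure in the round-trip test depends on knowing whether `test_format_point` passed.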

This is simplistic, but it is common to build up suites of increasingly complex tests based on the knowledge that prior tests have passed. In some cases mocking is a better strategy, but mock components themselves have to be tested, so even then you are depending on prior successful tests.
