6

Say you want to test an update capability across a CRUD API.

Test A updates a field, queries it, and asserts that it now has the value from the update. Knowledge of the starting value (and that it differs from the test value) comes from the way the initial data for the field is prepared: it is set by a utility component that initializes the test data, and the test author can determine its value by looking into the utility's source file when writing the test. This review should prevent writing a test that, for example, updates the value from "TESTDATA" to "TESTDATA".

Test B first queries the value at the start - directly from the field under test - and asserts that it is not the same as the test value, before performing and asserting the update.

Both tests confirm the same update behavior. A requires reviewing another file. B requires an extra assert.
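For concreteness, the two styles can be sketched in Python against a hypothetical in-memory field. `Field`, `setup_test_data`, and the literal values are stand-ins for the real suite, not its actual code:

```python
class Field:
    """Minimal stand-in for the field under test."""
    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value

    def set(self, value):
        self._value = value


def setup_test_data():
    # In the real suite this lives in a shared utility file; "TESTDATA"
    # is the hidden default the test author must look up.
    return Field("TESTDATA")


def test_a():
    field = setup_test_data()
    field.set("UPDATED")              # author must know "UPDATED" != the default
    assert field.get() == "UPDATED"


def test_b():
    field = setup_test_data()
    assert field.get() != "UPDATED"   # extra assert: guards the precondition
    field.set("UPDATED")
    assert field.get() == "UPDATED"


test_a()
test_b()
```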

Is one preferable? Why?

Argument for A

Code reviews will catch if the values are the same.

Argument for B

Good automated tests minimize what you need to read.

candied_orange
  • 102,279
  • 24
  • 197
  • 315
  • 77
    The principle that a test should have only one assert is one of the stupidest, in a long list of other stupid software principles. – user949300 Jun 13 '23 at 21:23
  • @DocBrown Neither value is created at runtime. As I said, in A you refer to a file to learn what default value is being updated. You do this as you write the test and decide what the update value will be and hopefully remember not to make them the same value. – candied_orange Jun 13 '23 at 23:11
  • @DocBrown better now? – candied_orange Jun 13 '23 at 23:23
  • For Test A, is the "other file" part of the test suite or the system under test? – Greg Burghardt Jun 14 '23 at 01:22
  • And are these tests persisting data to a database, or just variables in memory? – Greg Burghardt Jun 14 '23 at 01:24
  • @GregBurghardt good question. It’s a utility class used by many of the test suites. As far as I know the system under test doesn’t know about it. – candied_orange Jun 14 '23 at 01:27
  • @GregBurghardt both. It’s mocked and cached. So if I want to see the tests make changes to the real DB I have to do funny things. But I can query the mock like it’s real from the test. – candied_orange Jun 14 '23 at 01:29
  • I think I finally got it. See my answer. – Doc Brown Jun 14 '23 at 06:24
  • 5
    I would consider testing by updating the field to known value "1", checking it is now "1", then updating to known value "2" and testing it is now "2"? Does not rely on what the state prior to starting the test is, and while there are multiple asserts these are asserts on the same call so the reader needs no additional knowledge. – Joris Timmermans Jun 14 '23 at 09:01
  • I don't follow the requirement that "another file must be reviewed". In other words, why is the pre-change value in another file? Why not just stick that bit in the file with the test class? – João Mendes Jun 14 '23 at 09:41
  • 1
    @JoãoMendes it's not a requirement. It's an implementation detail. They have a widely used helper class that sets up the initial state. I'd love to do it your way. – candied_orange Jun 14 '23 at 12:08
  • @candied_orange: I tried to make the question a little bit more clearer - please have a 2nd look. – Doc Brown Jun 14 '23 at 12:21
  • @DocBrown thanks for the edit. If you're still in the mood, see if you can think of a better title for this question. – candied_orange Jun 14 '23 at 12:22
  • Another potential solution: add a separate test that asserts the initial value as mocked is what you expect it to be. This makes even more sense if the initial value is some sort of sensible default, not a random mocked value. Someone changes the default? An assert immediately triggers. – Joris Timmermans Jun 14 '23 at 12:44
  • 3
    @candied_orange: changing the title just now bears a certain risk to invalidate existing answers, I think it is ok as it is. – Doc Brown Jun 14 '23 at 13:11

7 Answers

25

Relying on some value which was coded in a different file (a reusable utility class somewhere else in the codebase) bears the risk that the value will be changed by someone else after the code review happened. It is quite unlikely that the person who changes the file will remember that, months or years ago, a test was written in another part of the code base which relies indirectly on the original value. In an unfortunate case, the "default value from that other file" later becomes equal to the "update value", which makes Test A pointless.

So this clearly shows that the approach of Test A is flawed: it ignores that the code base can evolve, and it introduces a hidden dependency. It is a kind of DRY violation, since the test unnecessarily duplicates the knowledge of what value was stored in "that other file".

Test B is IMHO fine, even if it uses two asserts - which I am ok with. Applying a rule like "strictly one assert per test" is IMHO dogmatic nonsense. When one follows the "arrange-act-assert" pattern, it can be perfectly valid to make a test more robust by having an extra "assert" statement for the "arrange" step, or multiple assert statements in the "assert" step itself for testing the outcome of a single operation. See also Is it OK to have multiple asserts in a single unit test?.

Still I would prefer Test C instead:

  • query the original value x from the field

  • modify x in some simple way (when it is an integer, add 1; when it is a string, append or change a character; when it is a boolean, invert it; etc.). Let's call the new value y

  • update the field to y

  • assert that the field has the new value y

This way, the test neither has to rely on some code review from the past nor has to use two asserts. It just verifies the update behaviour and avoids making any assumptions about prior values of the field.
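A minimal sketch of Test C in Python; the `Field` stub, the `setup_test_data` helper, and the `modify` function are hypothetical stand-ins for the real fixture:

```python
class Field:
    """Stand-in for the field under test."""
    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value

    def set(self, value):
        self._value = value


def setup_test_data():
    # stand-in for the shared utility; Test C does not care what it stores
    return Field("TESTDATA")


def modify(x):
    # derive a value guaranteed to differ from x
    if isinstance(x, bool):       # check bool first: bool is a subclass of int
        return not x
    if isinstance(x, int):
        return x + 1
    return x + "_changed"         # strings: append a suffix


def test_c():
    field = setup_test_data()
    x = field.get()               # query the original value
    y = modify(x)                 # y is known to differ from x
    field.set(y)                  # update the field to y
    assert field.get() == y       # single assert on the outcome


test_c()
```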

If one has concerns about modifying a potentially unknown value, I would probably add a simple preparation first: pick a fixed, hardcoded value x and set the field to x first. Here, one needs to use an extra assert to make sure the field really has got the initial value x before it is changed to y. And I definitely don't see the two asserts as a downside here.
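The hardened variant might look like this, again with a hypothetical `Field` stub: a fixed, hardcoded x is written first and verified with an extra arrange-step assert before updating to y:

```python
class Field:
    """Stand-in for the field under test."""
    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value

    def set(self, value):
        self._value = value


def setup_test_data():
    return Field("TESTDATA")      # hypothetical shared fixture


def test_c_hardened():
    field = setup_test_data()
    x = "first known value"       # fixed, hardcoded values,
    y = "second known value"      # distinct by inspection
    field.set(x)
    assert field.get() == x       # arrange-step assert: the precondition holds
    field.set(y)
    assert field.get() == y       # act/assert: the update really happened


test_c_hardened()
```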

Let me add that after I wrote all of the above, a simpler solution for Test C was suggested by @IMil.

Doc Brown
  • 199,015
  • 33
  • 367
  • 565
  • 7
    The downside of that approach is that you might run into constraints (business or technical) when altering an unknown value. Append `_test`? "Your input must be at max 10 characters". Change one character? "Your password must contain at least one digit". Add 1? Integer overflow. And for some data, modification is not an option, e.g. UUIDs. But as long as the chance of running into those constraints is reasonably low, this is also my preferred way. – Christoph Jun 14 '23 at 09:41
  • 2
    @Christoph: Sure, there's likely to be some constraints indeed. But any constraint violation should lead to update failure, and from there a test failure, so I'd consider it a non-problem even then. – Matthieu M. Jun 14 '23 at 11:39
  • 1
    @Christoph: see my edit. – Doc Brown Jun 14 '23 at 11:54
  • @MatthieuM. It depends. Imagine a system where tests are run in random order, and another test sets your "unknown value" to a string with maximal allowed length to assert that this is valid. Now your test will fail if it appends to the unknown value due to the length constraint. It wouldn't fail if the order was reversed. Is it a contrived example? Surely. But the point is, as long as you rely on possibly unknown data, you risk that your tests will fail randomly. – Christoph Jun 14 '23 at 12:45
  • 2
    @Christoph: I... didn't see anything in the OP, or this answer, that mentioned global data. I thought we were talking about an "example" object constructed _anew_ each test with some pre-initialized values that the test would then modify and assert the modification worked. Tests all interfering with global/external state are a whole other beast. – Matthieu M. Jun 14 '23 at 13:02
  • @MatthieuM. fair enough, I let my imagination carry me away. And yet: today my object might be pre-initialized with a constant value, tomorrow my colleague might replace it, using one of those Faker Library. Unknown data is unknown data, the source is just an implementation detail. But that is a discussion for another day ;-) – Christoph Jun 14 '23 at 13:19
  • 1
    @Christoph: True, but hopefully a faker library would _still_ be deterministic. Tests that fail randomly are a bane :( – Matthieu M. Jun 14 '23 at 14:19
  • yeah, I mean, you say updating to a random value makes the test flaky? Updating to the last value +1 is an overflow waiting to happen – Ewan Jun 14 '23 at 19:32
  • @Ewan: I think you misunderstood the scenario of the OP. To be fair, I didn't understand it initially, too. – Doc Brown Jun 14 '23 at 20:46
  • hmm it has had a couple of edits...... – Ewan Jun 14 '23 at 21:21
  • 3
    The comments are probably over-focussing on specific details here. As long as the type (with business or constraints) allows more than one value, it should be possible to come up with a pure total function that maps any allowed value to a different allowed value. If you really need to *absolutely* guarantee you never see a test failure caused by someone updating the default value to some crazy thing that the simple modification will fail on or map to an invalid value, then you can write such a function and use it as your modification. (That function likely needs its own tests though) – Ben Jun 15 '23 at 06:10
  • @DocBrown Whee! 100K. Thanks for helping me reach it. – candied_orange Jun 15 '23 at 18:38
  • @candied_orange: enjoy your gifts. – Doc Brown Jun 15 '23 at 21:45
9

The core question here seems to be whether you should test your test arrangement. In other words, who tests the testers?

This is highly subjective, and the answer is based on what you get out of it. You've pointed at a trivially assertable check, but it won't always be this easy. There will be a subset of tests where asserting the arrangement will become a disproportionate amount of the effort required.

If the update fails we can't detect the failure if the update was supposed to change "TESTDATA" to "TESTDATA".

The core issue I see here is a mistrust of how the initial state was arranged. If you mistrust your test designers to that degree, I am really concerned about how they're designing the rest of the tests as well, and you cannot cover that just by slapping on some more assertions.

The only way your bug would slip through is if, at the same time, the developer failed to implement the update logic and the tester did not arrange the test with correctly assertable values, yet still knew to write this otherwise complete test. That is an oddly specific scenario to try and catch.

So is it bad that you assert this? At face value, no. But the reason for doing so seems to be part of a larger problem (not trusting who wrote the test), and you're only patching one part (arranging the test data) and not the others (overall test design, assertion logic, ...); which is making me feel that this is a well hidden XY problem.

However, there are edge cases where you'd like to test your data and it's not a matter of developer mistrust, for example because you're using real world data or because you generated data because it was too much to handcraft it. In those cases, you can justify testing the data.
But I maintain that you shouldn't be adding those assertions to the same test. You should be testing the quality of the data separately from testing the behavior of your application, as one has nothing to do with the other. Just because I provide bad test data does not mean the app is broken, and vice versa.
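That separation could be sketched like this; `make_fixture` and its defaults are hypothetical stand-ins for the shared setup utility. The fixture's data gets its own test, and the behavior test never inspects it:

```python
def make_fixture():
    # stand-in for the shared setup utility and its documented defaults
    return {"name": "TESTDATA", "count": 0}


EXPECTED_DEFAULTS = {"name": "TESTDATA", "count": 0}


def test_fixture_defaults():
    # data-quality test: fails loudly if someone edits the shared defaults
    assert make_fixture() == EXPECTED_DEFAULTS


def test_update_name():
    # behavior test: trusts the fixture and asserts only the update
    record = make_fixture()
    record["name"] = "UPDATED"
    assert record["name"] == "UPDATED"


test_fixture_defaults()
test_update_name()
```

A broken fixture now fails its own test without implicating the update logic, and vice versa.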

Flater
  • 44,596
  • 8
  • 88
  • 122
  • 2
    I'll disagree with the XY problem... in that I do stupid mistakes all the time. A quick assert costs nothing. A quick assert is good. – Matthieu M. Jun 14 '23 at 11:36
  • 2
    I also disagree, I think it's a fairly common problem for higher level tests to have some shared state, ie an underlying database and then the ordering of the tests in the run, parallelising the tests, or the shear number of tests means you get collisions – Ewan Jun 14 '23 at 19:30
  • @Ewan: The point I was getting at more is that _the same test_ shouldn't be making those assertions. It's true that there are cases where testing the data can be justified, e.g. when you need so much data that you cannot feasibly handcraft it; but it shouldn't be smushed into the same test, it should be a separate test. Given it's shared state, as you point out, there's no harm in running the test separately then. I will amend the answer as I did not point this out before. – Flater Jun 14 '23 at 22:42
  • @MatthieuM.: Testing shouldn't invert on itself. That's not to say that you're forbidden to test it, I just disagree that it should be done in the test itself. Tests should have a single purpose and not get distracted by secondguessing their own work. – Flater Jun 14 '23 at 22:53
  • 1
    With the latest updates to the question I agree with you. The problem is the setup is done outside of the test in some non obvious way. Not the pragmatic extra assert hack or lack of – Ewan Jun 14 '23 at 23:06
8

If I understand correctly, the only problem with test A is that the initial value of the field might happen to be exactly what the test is trying to store there.

Test B indeed fixes this problem. But there's a simpler option.

Just have a parameterised test with a list of test cases. E.g. (in pseudocode) instead of

TestA() {
    field.SetValue("New value");
    Assert.AreEqual(field.getValue(), "New value");
}

you should have

[TestCases("New value", "Even newer value", "Completely new value")]
TestC(value) {
    field.SetValue(value);
    Assert.AreEqual(field.getValue(), value);
}

In this case it's obvious that even if the original value accidentally matches one test case, in other cases the value will be new.

IMil
  • 849
  • 1
  • 5
  • 7
  • 2
    Excellent. a lot simpler than my own suggestions. – Doc Brown Jun 14 '23 at 14:53
  • I don't see how this solves your problem; you still want _all_ the tests to pass. If you have even one failure, it doesn't matter that you have 100 passing tests – Ewan Jun 14 '23 at 19:35
  • 3
    @Ewan: when `SetValue` has a bug and does silently nothing, I want at least one test to fail. If it does not have this kind of bug, then I want all tests to pass. That's what this solution does. – Doc Brown Jun 14 '23 at 20:54
  • Elegant, but could you provide an example that won't leave the peer reviewers and maintenance coders wondering why they need more than one value? – candied_orange Jun 15 '23 at 15:39
  • @candied_orange I don't think there's a particular need for explanation, having a few test cases is perfectly normal. But you can leave a comment like "//several test cases needed to ensure that value differs from initial" – IMil Jun 15 '23 at 23:38
  • 1
    @IMil it's precisely because "having a few test cases is perfectly normal" that this test fails to make the point of its existence obvious. This is clever, sneaky code. So a comment is most certainly warranted. I meant to encourage an edit. Stuff like this shouldn't be buried in comments. Or have you had enough upvotes already? – candied_orange Jun 15 '23 at 23:50
7

Test B is preferable because the interface for the object under test is tested, rather than the internal implementation details. It will continue to work correctly when you change the internal implementation to use a different format, or whatever.

Test A requires a knowledge of the implementation. This can lead to you writing a duplicate of the class under test in your test project.

If you are writing tests for a repository for example, and use strategy A, your database setup and tear down code for the various test cases can easily exceed the complexity of the repository itself. A column added to a table can break all your tests.

PS. I would update to a random value to avoid the pre-check. But get value, update value, get value again, assert it's changed as expected is a common pattern.
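The pattern from the postscript might look like the sketch below; `InMemoryApi` and its methods are illustrative stand-ins for the service under test, and the random value makes a collision with the prior value vanishingly unlikely:

```python
import uuid


class InMemoryApi:
    """Illustrative stand-in for the service under test."""
    def __init__(self, value="whatever was there before"):
        self._value = value

    def get_value(self):
        return self._value

    def put_value(self, value):
        self._value = value


def test_put_roundtrip():
    api = InMemoryApi()
    before = api.get_value()
    new = uuid.uuid4().hex        # practically guaranteed to differ from `before`
    api.put_value(new)
    after = api.get_value()
    assert after == new           # the update took effect
    assert after != before        # and the value really changed


test_put_roundtrip()
```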

Ewan
  • 70,664
  • 5
  • 76
  • 161
  • 4
    I would try to avoid adding any randomness to my tests to keep them as deterministic as possible. Otherwise this is a good answer! – Christoph Jun 14 '23 at 08:14
  • @Christoph If you seed your random generator with a constant, it will be deterministic – João Mendes Jun 14 '23 at 09:21
  • 3
    @JoãoMendes: if one does not unnecessarily introduce a random generator for things which don't need one, the test will stay simpler, deterministic and less error prone. The suggestion of using random values in a test is always a warning sign for me - this always causes me to think - if we want "random data", but not "really random data", why don't we just pick "42" randomly and hardcode it instead overcomplicating the test with a random generator which produces "42" from a fixed seed of 0? – Doc Brown Jun 14 '23 at 09:28
  • 2
    @DocBrown S'fair. Actually, that's what I do, even, so, yeah. :) – João Mendes Jun 14 '23 at 09:37
  • @DocBrown actually I'm considering the random value solution. But in the form of a fixed UUID string. It may not be the best idea presented here but I sense it's the one least likely to get pushback. Not perfect but to fail I'd have to be incredibly unlucky or the victim of a malicious copy-n-paste. – candied_orange Jun 14 '23 at 17:13
  • "update to a random value" doesn't make the test itself random, it just eliminates coupling with the other tests via shared data which can be caused by hardcoded values. ie "add customer id 1, then continue with the test" fails when you run in with some other test that also uses customer id 1 in its setup. "add customerId NewGUID()" makes the test _less_ random – Ewan Jun 14 '23 at 19:21
  • @Ewan I'd rather hardcode a fixed UUID then generate a new one each time I run the test. Makes debugging a fail less of a mystery. – candied_orange Jun 14 '23 at 19:46
  • a fixed GUID saves you against accidental collision with another hardcoded human written value, but not against the test running twice, shared setup code, change in order of tests etc etc – Ewan Jun 14 '23 at 20:04
  • @Ewan Well the test invokes setup itself. So it'd have to run twice at the same time to be a problem. I'll admit I don't actually know if the mock is a singleton or each invocation is its own instance. Will look into that. – candied_orange Jun 14 '23 at 21:03
  • Looks like each invocation is its own instance. So I think I'm good even if the test is run in parallel. Correct me if I'm wrong. – candied_orange Jun 14 '23 at 21:14
  • initially i thought the preset data was being read from a csv or something. with the latest edit i think i would change my answer – Ewan Jun 14 '23 at 21:24
  • although... does it really make a difference? The key thing, I think, is that it's not being set in the arrange part of the test, but in some hidden place. This still seems to be true in Test B; it's just testing itself – Ewan Jun 14 '23 at 21:26
  • It's being set in the arrange part. But as a hidden default value you can only learn by referring to some other file. – candied_orange Jun 15 '23 at 13:40
2

Why do you need to know that the starting value is different than the test value? This immediately makes me suspicious that we are either:

  • Missing a requirement that deserves its own test.

  • OR the test is executing in an environment that results in non-deterministic tests.

If the starting value is not a missing requirement, and tests are deterministic, then I would argue the starting value is not important. Test A is fine. Why do reviewers need to see this other file? That other file is setup code, which should be unnecessary to read in order to understand the test case.

As a thought experiment, consider how much effort would be required to debug a test failure in both cases.

If Test A fails, how important is the starting value? Is knowing the starting value essential to finding the root cause of a test failure? Do you not have the ability to step through this unit test with an IDE so you can analyze the initial value?

For Test B, why on Earth would the first assert ever fail? I have written tests before where an assert is part of the setup — but only as a debugging tool for wonky or unstable tests. As soon as I root out the instability, I usually remove the assert in the test setup.

To me, asserting the initial value means that I, as the test author, trust the initial state of the system under test as much as I trust the system to change the state correctly (e.g. "I don't trust the initial state or the system under test"). The first assert in Test B feels funny, because I should at least trust the initial state of the system.

But funny feelings are not a good reason to remove an assert. It might be the reason to add that first assert in Test B.

Whichever you choose, go with the strategy that makes your life easier. The goal of a test is not merely to pass; it is to fail correctly, and if yours does that, the test is fine however it is written. But still, the first assert in Test B bugs me. Why do we need it? That's the thing to figure out. If you don't need it, get rid of it. If you do need it, then consider writing a test to associate the initial value with some requirement. If there is no requirement, but asserting the initial value is deemed necessary, add a comment above the assert stating why this was needed, and describe the conditions which should prompt its removal.

Greg Burghardt
  • 34,276
  • 8
  • 63
  • 114
  • The point of both A and B is to prove that an update happened. If the update fails we can't detect the failure if the update was supposed to change "TESTDATA" to "TESTDATA". It's perfectly deterministic. Stupid. But deterministic. Test B's first assert blows up in your face if you're this stupid. And provides assurance against this stupidity without making you look at other files. Test A just lets you shoot yourself in the foot if you don't. By which I mean, it passes quietly when it shouldn't. – candied_orange Jun 14 '23 at 02:11
  • @candied_orange: it almost seems like the test is not as deterministic as you think it is. I like to see the starting value hard-coded in the test for testing updates. I don't like relying on test utilities to set the initial values when those initial values can affect the outcome of the test. – Greg Burghardt Jun 14 '23 at 02:37
  • Well the test is calling the test utility. You could argue it's a hidden dependency. But the fact that coders have to refer to a different file before spotting that they've overused the expression "TESTDATA" makes me consider validating required preconditions. – candied_orange Jun 14 '23 at 03:00
  • "why on Earth would the first assert ever fail?" Very easily, if someone creates a new similar test via copy+paste, and doesn't modify the literal. Then the two tests run in sequence, and the second one is affected by the value stored in the first one. – Ben Voigt Jun 14 '23 at 15:52
1

Of the two approaches as written, I think approach B is closer to correct. At a high level, the purpose of setup code is to ensure the SUT† is in the correct state for the test. We normally achieve that through initializing values, setting up mocks, creating test files, frobnicating widgets, and the like, but asserting something is also a way of ensuring it. I don't consider complaints about multiple assertions valid here; to me, an assertion about the integrity of the test is fundamentally different to an assertion about the SUT itself.

That said, I think approach A can be made correct by making the helper code configurable. If the helper code picks an initial value, then the test cannot be validated by reading it; the helper and the test might pick the same initial and updated value, respectively. If the test picks both values, then reading the test makes it obvious whether they're the same or not. This approach has the added benefit of being generally applicable; if your REST API provides a means to retrieve data, testing it likely runs into the opposite problem (the helper and the test need to pick the same value), which can also be avoided in the same way.

// Approach A as written

@Test
void testGet() {
  commonHelper.setupInitialValue();
  var actual = makeGetRequest();
  assertThat(actual).isEqualTo("initial value");
}

@Test
void testPut() {
  commonHelper.setupInitialValue();
  makePutRequest("updated value");
  var actual = commonHelper.getCurrentValue();
  assertThat(actual).isEqualTo("updated value");
}

// Modified Approach A with configuration

@Test
void testGet() {
  commonHelper.setupInitialValue("initial value");
  var actual = makeGetRequest();
  assertThat(actual).isEqualTo("initial value");
}

@Test
void testPut() {
  commonHelper.setupInitialValue("initial value");
  makePutRequest("updated value");
  var actual = commonHelper.getCurrentValue();
  assertThat(actual).isEqualTo("updated value");
}

Of course, this approach still permits the possibility of a bug in the helper code. Where to draw the line there is a highly contextual decision call; since this question doesn't make sense in a context where the trade-off isn't (deemed) worth it, I believe that to be out of scope.

† System Under Test

-1

If your CRUD code is automatically generated, you should not be testing the CRUD functionality at all. Unless you have a reason to distrust the underlying generator, there is no need to spend cycles here.

If you need a resilient smoke test, generate a random value and then use the check-and-set procedure (the second version). Note that this version is twice as expensive on the underlying code.

There is no reason a test should have only one assert. If that were the case, every test would be a single function call, which it isn't by design. A test may also want to call out several different factors that it wants to demonstrate distinctly. For this, multiple asserts are necessary.

The important factor is the naming and intent of the test. If the name of the test cannot clearly state what its purpose is, then it is not a good test.

A test should not have a single assert, but it should have a single purpose.

  • this post is rather hard to read (wall of text). Would you mind [edit]ing it into a better shape? – gnat Jun 15 '23 at 10:45