
Say I have this code, which is a chain of public methods: public_c calls public_b, which calls public_a.


```python
def public_a(x):
    ...

def public_b(x):
    ...
    public_a(x)

def public_c(x):
    ...
    public_b(x)
```

Should I test all 3 methods individually? Part of me thinks that if I test public_c, then it "sort of" also tests public_b and public_a, within the limited scope exercised by public_c.

Part of me thinks that if any of these public methods is used in more than one place, it warrants an isolated test.

What's the recommended practice here?

James Lin
    Does this answer your question? [How should I test the functionality of a function that uses other functions in it?](https://softwareengineering.stackexchange.com/questions/225323/how-should-i-test-the-functionality-of-a-function-that-uses-other-functions-in-i) – gnat Apr 20 '23 at 07:54
  • @gnat Sort of; my main concern was writing repetitive/similar tests for `public_a/b/c`, but as @mmathis pointed out, that's okay and normal. – James Lin Apr 20 '23 at 09:45

2 Answers


Test against the interface, not the implementation. So, you'll have tests for each of the methods: public_a, public_b, and public_c.

When you put your tester hat on, you don't know that public_b calls public_a internally - all you know is what public_b is supposed to do. So you need to make sure that public_a does what it's supposed to do, and public_b does what it's supposed to do, and public_c does what it's supposed to do.
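A minimal sketch of what that looks like, with hypothetical bodies for the three functions (the question elides the real implementations) and one pytest-style test per function:

```python
# Hypothetical implementations, invented purely for illustration --
# the question elides the real bodies.
def public_a(s):
    return s.strip()

def public_b(s):
    return public_a(s).upper()

def public_c(s):
    return public_b(s) + "!"

# One test per public function, each asserting only that function's
# own contract -- none of them assumes who calls whom internally.
def test_public_a():
    assert public_a("  hi  ") == "hi"

def test_public_b():
    assert public_b("  hi  ") == "HI"

def test_public_c():
    assert public_c("  hi  ") == "HI!"
```

If public_b is later refactored to no longer call public_a, these tests keep guarding each contract unchanged.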

In your case, the tests may be quite repetitive - and that's OK. You're verifying functionality. In the event, however, that you refactor one or more methods so there isn't the method chaining any more, your tests are already written and should catch any problems with the refactoring.

Writing a test only for public_c may save some effort, but as you said, it only covers a subset of cases. And if the public_c test fails, do you know whether the failure came from logic specific to that method, or from logic in public_a or public_b? Guess you gotta write tests for public_a and public_b to find out where the error is...

mmathis

Your suggestion is essentially arguing that if your code calculates the length of the hypotenuse (i.e. Pythagoras' theorem), and you test this code, then you don't need to test the underlying methods which calculate the squares and square roots of numbers.
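To make that concrete, here is a sketch of such a decomposition (the helper names `square`, `square_root`, and `hypotenuse` are invented here), where each layer still keeps its own assertion:

```python
import math

# Hypothetical helpers, named here for illustration only.
def square(x):
    return x * x

def square_root(x):
    return math.sqrt(x)

def hypotenuse(a, b):
    # Pythagoras: c = sqrt(a^2 + b^2)
    return square_root(square(a) + square(b))

# Testing only hypotenuse() exercises the helpers indirectly,
# but each function still deserves its own direct test:
assert square(3) == 9
assert square_root(16) == 4.0
assert hypotenuse(3, 4) == 5.0
```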

Tests should not assume the implementation of the unit under test.

Otherwise, you'd open the door to selectively not testing some things "because you know that that part works". This goes against the purpose of having an automated testing suite that you can run to detect regressions in the future.

Tests don't exist to succeed, they exist to fail.

This one may seem counterintuitive at first. Let's use an analogy: we don't buy a smoke detector so the house doesn't catch on fire. We buy a smoke detector so that when the house is on fire, we know about it as soon as possible.

Similarly, we write tests because we want them to fail. That's their entire point, we use their failure to alert us that something's gone amiss in the codebase.

Scenario A: your Pythagorean theorem test fails. As we established, this is a complicated orchestration of several components (theorem calculator, square calculator, square root calculator). Which component failed?

You don't know. That's a problem. Not that you can't debug it, but I would question the quality of a test that cannot tell me specifically which part failed. This clearly means that we've not asserted our full process.

Scenario B: your Pythagorean theorem test succeeds. So now we know everything is working as intended, right?

Most likely, yes. I don't want you to think that you should never trust a test when it passes. However, it's possible that you're not catching a "two wrongs make a right" kind of scenario.

This requires a different example to explain. Suppose you have a complex process which parses a string from a particular source, removes every letter "e", and then returns the cleaned-up string.

Now consider this scenario:

  • [Bug] The censoring logic doesn't actually remove the letter "e"
  • [Bug] The parser logic doesn't store any character past the first 4 characters
  • [Test data] You happen to have used "abcde" as your test string.
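A sketch of how those two bugs cancel out (the parser and censor bodies are invented here; the answer only describes the bugs):

```python
# [Bug] Parser silently keeps only the first 4 characters.
def parse(raw):
    return raw[:4]

# [Bug] Censor was meant to remove every "e", but strips "E" instead.
def remove_e(s):
    return s.replace("E", "")

def clean(raw):
    return remove_e(parse(raw))

# With the test string "abcde", parse() happens to drop the trailing
# "e", so the broken remove_e() never has anything to remove -- the
# end-to-end test passes despite both bugs:
assert clean("abcde") == "abcd"

# Per-component tests would expose both bugs immediately:
# assert parse("abcdef") == "abcdef"   # fails: returns "abcd"
# assert remove_e("eve") == "v"        # fails: returns "eve" unchanged
```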

This is obviously a cherry-picked bug with a blatantly simple root cause, but I've encountered these kinds of scenarios several times in enterprise-grade applications, with much more convoluted root causes than this simple example.

Not having tests that individually assert each component makes it significantly harder to see that there is a problem, let alone to identify where it is located.


It's a trap!

In my experience as a developer and consultant over the years, it is very common for developers to suggest cutting corners when writing their test suites. The intention of cutting down the total effort required is good, and the developer may be putting in considerable, genuine effort to achieve what they consider an equally good test suite.

However, in a significant number of cases, developers fall into the trap of assuming that two test suites are equivalent if the happy path looks the same, but this is not the correct approach. Ideally, you should consider both paths; but if you're going to focus on one, it should be the unhappy path.

In essence, a test suite does not succeed because its tests all pass; it succeeds because none of its tests fail. (I'm aware this is semantically equivalent, but I'm trying to get you to focus on the failures rather than the successes.)

Flater
  • "Tests don't exist to succeed, they exist to fail." – another way to explain this is to view testing as hypothesis testing. A hypothesis is a falsifiable statement, e.g. "if I throw this apple, then it comes down according to some formula". You cannot prove that statement, only falsify it, e.g. at a specific time and place. Likewise, `lambda x, y: x + y` can only be "proven" to add two numbers by a theorem prover; tests can only falsify it (maybe on Mondays it doesn't work because Python has a […]). Although Software Development involves a formal system, it's mostly an empirical exercise. – Dr. Thomas C. King Apr 20 '23 at 14:22