How to ensure functionality is tested consistently?

Question

As an example, lets say I am working on creating an eCommerce solution. Part of the functionality of this solution will be to ensure that the final price of an order is correct. Lets say the final price will include the grand total of all products, the shipping costs and any tax.

Lets say I have two public methods:

/**
 * @return array
 */
public function viewAllOrders(){};

/**
 * @return array
 */
public function viewOrder($guid){};

Each method returns (amongst other things) the grand total of an order.

The problem I have run into is that because each of these methods is tested in separate tests, the tests could have different definitions on what the grand total should be.

For example, lets say for an order 'A' the product price is $100, shipping $10, tax $20 giving grand total $130.

A developer is writing the test for viewAllOrders. He writes the test correctly, and asserts that for order 'A' the grand total is $130.

Another developer writes the test for viewOrder. He thinks that grand total is just the price for the products, and forgets about tax and shipping. He now asserts that for order 'A' the grand total is $100.

Both developers write production code that ensures all the tests pass, but we now have a situation where order 'A' has a different grand total depending on the public method that you call! Errors have crept into our system (even though all tests are passing) because two tests disagree on the definition of grand total.

What are some good ways to ensure that we don't use different definitions of functionality across our tests?

I am adopting 'Classical/Statist' testing over mocking. The reason being is I'm using a dynamic language. See Are mocks in dynamic languages dangerous?. This means I cant just simply have a grandTotal() method that the other classes call and can then be mocked in tests.

I have also considered testing each method in the same test (thus ensuring the definition of grand total stays in one test). The problem with this is that you open yourself up to Assertion Roulette.

Also note I have trivialise this example. Most people do know what grand total is, but my problem is a general one, being explained with a simplified example. When you read 'grand total' this could equally be read as 'some value with definite meaning'

Just a last thought now you edited your question: You ensure that because your test of the calculatePrice method will give you the answer. That second developer ($100) is never going to get that price visible because the price calculation prevents that. — Luc Franken, Jul 29 '16 at 09:07
Your problem isn't testing. Your problem is putting business logic in your controllers and then duplicating that logic in multiple places. — RubberDuck, Jul 29 '16 at 17:01
[Listen to Frank](http://programmers.stackexchange.com/a/326118/142319) @Gaz_Edge. Create (and test) a single class that does the real work. Call that class from your controllers. — RubberDuck, Jul 29 '16 at 20:16

score 5 · Accepted Answer · answered Jul 29 '16 at 13:37

5

You ensure that by properly knowing your domain and designing for it.

In some other software, the calculation of a+b may be almost irrelevant to the domain. In your case, you are talking about the commerce domain, in which tax and shipping are vital things. Therefore, if a method simply does return cost + tax + shipping you have a horrible design flaw.

This is similar to the emphasis on context mentioned by the other answer, but I find domain more understandable. Why is it important to properly model your domain? Because your software is built around and for it.

Let's make this more concrete: Assume that you somehow managed to ensure that the two methods perform the calculations equally, but separately.

What will you do once you discover, that the tax rate is different for each country?
How do you ensure the correct price calculation for super-important customer X who gets a 3% discount on everything they order?
How do you adjust the calculations for next week's free shipping promotion?

If you're in the business of selling stuff, then price calculation is of such vital importance that having two (let alone more than that) places in your code for deciding it becomes a major liability.

Now, should you discover such a violation, then you do not want to simply fix the two calculations to be the same. You go to the underlying root cause of the problem and re-design your system.

You may want to check out the domain-driven design (DDD) literature on this. Most of that is going to help you avoid such problems in a more general way (bounded contexts and ubiquituous language come to mind immediately).

answered Jul 29 '16 at 13:37

Frank

14,407
3
41
66

1

"If you're in the business of selling stuff, then price calculation is of such vital importance that having two (let alone more than that) places in your code for deciding it becomes a major liability." Totally agree and that is my question: how do I get it in one place?! I do need viewOrder and viewAllOrders (and viewCustomerOrders, viewLastTenOrders etc). Using classical/statist TDD I cant just create some CalculatePrice class and then mock it when testing viewOrder, viewAllOrders etc etc so what should i do? – GWed Jul 29 '16 at 13:52
1

@Gaz_Edge you need to separate in your mind the style of TDD you're using from the design you come up with, they are not related. – guillaume31 Jul 29 '16 at 13:55
@guillaume31 i have no idea what that means sorry – GWed Jul 29 '16 at 14:00
Just because you refrain yourself from using mocks doesn't mean you can't create a small class dedicated to something. The Refactor step in TDD is the ideal moment to do extract that kind of class. – guillaume31 Jul 29 '16 at 14:04
Note that it may not apply to this particular case (I see price calculation as a domain object, not a small specialized class you can extract) but it's a general rule of thumb. – guillaume31 Jul 29 '16 at 14:05
But we refactor out production code that is similar. In this situation, the code will not be similar as two pieces of production code define grand total in different ways. Refactoring while using 'classical' TDD does not solve the problem, as ultimately the tests dont care about our implementation details (i.e. refactoring out code). We wont actually be able to refactor out the code as the two tests define grand total differently so one will fail. You could argue in this trivial case we would spot this error while refactoring, but that cant be assume true for more complex areas – GWed Jul 29 '16 at 14:14
I fail to see how 'mockist' TDD would solve this any better. – guillaume31 Jul 29 '16 at 14:29
cool - so its a problem in both classical and mockism approaches - how do we solve it then? – GWed Jul 29 '16 at 14:33
2

By having a priorly agreed upon domain lexicon with a single authority for calculating prices in it – guillaume31 Jul 29 '16 at 14:37
I still don't understand how this fixes my statist tests. Each test under classical/statist TDD asserts the state of the system. In this case, the state of grandTotal. The tests would not allow a common authority to be used as one test would always be wrong. The tests allow us to refactor out similarities into one single authority. We should not be refactoring out production code to test are tests are correct!!! – GWed Jul 29 '16 at 14:49

candied_orange · Answer 2 · 2016-07-29T15:19:59.527

1

This sounds like a semantic argument.

One says:

order 'A' the product price is $100, shipping $10, tax $20 giving total price $130.

The other says:

for order 'A' the total price is $100.

The difference? Context.

You don't have a situation where order A has a different total price. You have a situation where any order has both a sub-total and a grand total that includes things besides order A (shipping, taxes). If you call both these things total and expect them to be the same it's as silly as if you'd switched to order B and expected things to be the same.

What are some good ways to ensure that we don't use different definitions of functionality across our tests?

Use good names. Make the context clear. A test that only the author of the test understands is a bad test. Total just means some stuff has been added up. It's not a very useful name. Give me a hint about what's been totalled up. Subtotal tells me this is just the first wave of totalling. Grandtotal tells me this is the last of the totalling.

To help ensure that code performs as expected and is understandable you can have peer reviews and have peers write their own tests against it. A peer test shows how well the code is understood in a way so formal that it compiles.

edited Jul 29 '16 at 15:19

answered Jul 29 '16 at 12:21

candied_orange

102,279
24
197
315

lol this is always the danger of providing trivial examples to try and explain the problem. Lets say that i DO have the name grandTotal. My question above still applies. The best naming in the world will not solve the problem that their are two tests that could define grandTotal in a different way. Unless we name it totalOfProductsAndShippingAndTax, there is room for error. My question is could this be mitigated if there was some way to define in one place what grandTotal is, and use that in all tests? – GWed Jul 29 '16 at 13:39
@Gaz_Edge There is a way. Define methods called `subTotal()`, `calculateShipping()`, and `calculateTax()`, that depend on a collection of products. Add a `grandTotal()` method that uses them all. Now grandTotal is defined in one place. Anything that needs it can call it. – candied_orange Jul 29 '16 at 13:50
so viewOrder and viewAllOrder use the grandTotal() method. I get that. But the only way I can remove the definition of what grandTotal is from each test, is to mock grandTotal(). I can then pass any old junk (even just text!) and assert that gets returned. But as I've stated, I don't want to use mocks. Im using statist/classical TDD. How should I resolve in this instance? – GWed Jul 29 '16 at 13:56
You can avoid mocking grandTotal by calculating a grandTotal by hand. You'll need a test product whose price will never change, a test tax that will never change, and a test shiping rate that will never change. But yes you can avoid mocking. – candied_orange Jul 29 '16 at 14:09
Precisely my point! Grand total IS calculated by hand in each test. Calculated by two different developers. Even if we set the values on the product to the same, developer A uses shipping and tax when manually calcuating the grand total, developer B does not. Although i havent expressed it clearly, this is the whole point of my question - grand total is calculated manually in the tests by developers. If one gets it right, while another gets it wrong, we have a problem. How do we solve this problem? – GWed Jul 29 '16 at 14:23
1

If a developer is getting the calculation wrong you find it in a code review. It also helps if more that one developer is writing tests against the same code. I've been in shops where you write tests against your code then hand it off to a peer reviewer who adds their own tests. The coder has to make the tests pass or explain what's wrong with the new tests. – candied_orange Jul 29 '16 at 14:32
good response - getting somewhere now. That would certainly mitigate the risk and is currently the best answer to my actual question. – GWed Jul 29 '16 at 14:35
@Gaz_Edge if you extract the price calculation into a method or class, and not mock it but use the real function, then exactly one of the two tests will fail. But this is good! Obviously you found something where tests don't agree, so now you have to look at them and fix the one wrong test. Which will be the one with the wrong manual calculation. Tests are not "more important" than the production code. If tests break, you have to look at both the production code and the tests and decide which one is wrong. – jhyot Aug 01 '16 at 18:08

guillaume31 · Answer 3 · 2016-07-29T15:56:17.913

Flesh out a language shared between all developers and business experts (see Ubiquitous Language in Domain Driven Design). Model your domain concepts precisely and unambiguously around that language. The authority for price calculation should be in one central place.
Since you seem to use an Outside-in TDD approach, when you approach the core layer (the domain), you'll already have the names of the domain objects thanks to that model. The first developer to need a given domain object (or part of it) can implement it. Team members who subsequently need it will use that implementation.

(edit)

This might differ from an extreme "TDD as if you meant it" approach, as I don't recommend refactoring your domain objects out of the production code as you go. Instead, I believe that when you hit the edge of the domain, you should pause your current test and go and create another unit test for the domain object you need. Implement it completely before going back to the outer layer test. Unlike more technical code, modelling rich business concepts and rules cannot be done in the heat of a minutes-long refactor step. They are too important to be improvised.

As a side note, I guess that a more extremist TDD practice tends to imply pair programming all along and rotating pairs frequently, thus generating much fewer opportunities for conflicting/duplicate implementations. Which might work just as well.

Im not sure why people arent getting this - possibly I'm on a totally wrong path. In production code, put the logic in one place. BUT this can ONLY happen if the tests allow it. With my example, we would never be able to refactor the code out. This is nothing to do with production code. The logic on production could live in a 100 different places. It doesnt matter so long as the tests pass. But my point is the logic for the TESTS needs to be in one place. Thats more important than having the production logic in one place. — GWed, Jul 29 '16 at 14:31
Forget about refactoring, I didn't mention it in my answer, on purpose. It was just a general remark in a previous comment. I'm not recommending refactoring out the domain objects out as you go - they are way too important for that. Their shape should be agreed upon *beforehand* so that you don't come across the kind of problem you're describing. — guillaume31, Jul 29 '16 at 14:45
ok , sorta see what you mean. Are you able to explain further in your answer? — GWed, Jul 29 '16 at 14:50

score 1 · Answer 4 · answered Jul 29 '16 at 18:24

Project requirements, the code and their associated unit tests can get mixed up as you pointed out two functions that should return the same value but were interpreted differently by the programmers.

Hopefully every unit test has some business rule to justify its existence. A code review may catch this, but you're past that in this project.

Final user testing should be where this gets found. When you're messing with other people's money, you better have testers that know what they're doing. One thing is to compare things that should be the same. I always push back on requests that seem to have a slightly different interpretation of appears to the same value. I don't want anyone looking at two reports and asking me why the "Big Bad Totals" don't match.

Once this discrepancy is caught, you would be able to go through the code and find two functions operating under different assumptions about how to calculate something. You never know, there may be a place where a different function is required. A little refactoring should make the name of this function a little clearer and the make sure you're only calling it from places it's appropriate.

The best way to avoid these problems is to get it right the first time, but that rarely happens in big projects. You have to be diligent and have many levels of testing. They're not all going to be automated, so the resources need to be there or you're going to miss things.

Good answer. Sometimes there's no logical solution. Just test the crap out of it. — GWed, Jul 29 '16 at 19:31

How to ensure functionality is tested consistently?

4 Answers4