
We're currently working on a medium-to-large PHP/MySQL project. We're doing unit testing with PHPUnit & QUnit, and we have two full-time testers who manually test the application. Our test (mock) data is currently created with SQL scripts.

We have a problem maintaining the scripts for the test data. The business logic is pretty complex, and one "simple" change in the test data often produces several bugs in the application (which are not real bugs, just the product of invalid data). This has become a big burden for the whole team because we are constantly creating and changing tables.

I don't really see the point of maintaining the test data in the scripts, because everything can be added manually in the application in about 5 minutes through the UI. Our PM disagrees and says that having a project we can't deploy with test data is bad practice.

Should we abandon maintenance of the scripts with test data and just let the testers test the application without data? What's the best practice?

Christian P

3 Answers


Yes, having unit tests and data mock-ups is a best practice, and the project manager is correct. The fact that a "simple" change in the test data often produces bugs is the core of the problem.

The code needs improvement. Not improving it (saying "hey, we don't need tests") is not a fix; it simply adds technical debt. Break the code down into smaller, more testable units, because being unable to change one unit without breaking others is a problem.

Start refactoring. Keep each improvement small so it stays manageable. Look for anti-patterns such as God classes/methods and violations of DRY and the single-responsibility principle.

Finally, look into TDD to see if it works for the team. TDD works well for ensuring all your code is testable (because you write the tests first) and for staying lean by writing just enough code to pass the tests (minimizing over-engineering).
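
As a purely hypothetical sketch of that test-first flow (the class, method, and discount rule below are invented for illustration), a TDD cycle starts with a failing test like this and then implements DiscountCalculator with just enough code to make it pass:

    <?php

    use PHPUnit\Framework\TestCase;

    // Hypothetical business rule: orders of 10,000 cents or more get a 10%
    // discount. In TDD this test is written first and fails until
    // DiscountCalculator is implemented.
    class DiscountCalculatorTest extends TestCase
    {
        public function testLargeOrdersGetTenPercentDiscount()
        {
            $calculator = new DiscountCalculator();

            // 15000 cents qualifies for the discount, so we expect 13500.
            $this->assertSame(13500, $calculator->applyDiscount(15000));
        }

        public function testSmallOrdersAreNotDiscounted()
        {
            $calculator = new DiscountCalculator();

            $this->assertSame(5000, $calculator->applyDiscount(5000));
        }
    }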

In general, if a series of complex business-logic processes produces a set of data, then I view this as a report. Encapsulate the report. Run the report and use the resulting object as input to the next test.
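
A rough sketch of that idea, with every name invented for illustration: the business logic itself produces the data set, the result is wrapped in a report object, and that object feeds the next test instead of a hand-maintained SQL fixture:

    <?php

    // Both classes below are invented; the point is the shape, not the names.
    class InvoiceBatchReport
    {
        private $invoices;

        public function __construct(array $invoices)
        {
            $this->invoices = $invoices;
        }

        public function invoices(): array
        {
            return $this->invoices;
        }
    }

    class BillingRun
    {
        // Runs the (complex) business rules and returns the result as a report.
        public function execute(array $customers): InvoiceBatchReport
        {
            $invoices = [];
            foreach ($customers as $customer) {
                // ... the real billing rules would build each invoice here ...
                $invoices[] = ['customer' => $customer, 'total' => 0];
            }
            return new InvoiceBatchReport($invoices);
        }
    }

    // In a test, the report from one step becomes the input of the next:
    // $report = (new BillingRun())->execute($testCustomers);
    // $paymentProcessor->settle($report->invoices());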

P.Brian.Mackey
  • I need to clarify things a little bit: "A simple change in the test data produces bugs" - the problem here isn't in the application; the app works fine when the data is valid (and you can't add invalid data manually). The issue is that invalid test data can produce errors when we work on that data. So do we need to test the test data as well? – Christian P Oct 10 '11 at 13:51
  • Don't get tripped up on a red herring. The fact that the test data introduces a bug is a different issue altogether. Removing tests is not a fix, and "governing the government" is something else entirely as well. The problem is the code: it is not testable, because you are telling us that you are unable to make changes without breaking things. That is why you need to improve the code. – P.Brian.Mackey Oct 10 '11 at 13:57
  • Maybe you misunderstood my question. We have working unit tests, and every new piece of functionality we write has unit tests. I'm not suggesting that we remove tests that are not passing or that we stop writing tests. I'm just suggesting that we don't use the scripts for creating mock data in the database, because the manual testers are doing the same thing. – Christian P Oct 10 '11 at 14:00
  • "I don't really see the point of maintaining the test data in the scripts" <-- Dropping test support is what I am addressing, not deletion of old tests. It's a bad idea. You are decreasing reproducibility and coupling yourself to a UI that is part of the very thing you are trying to test and adapt to change. Decouple yourself from the UI. Keep the data automation. – P.Brian.Mackey Oct 10 '11 at 14:02
  • But how do we tackle the problem of invalid mock data? If we continue to create mock data for the database, how do we check whether the mock data is OK or not? If a business rule requires that value X=2 and the database accepts X=100, how do we check the integrity of the test data when the business rule is complex? – Christian P Oct 10 '11 at 14:12
  • I appreciate your answers, but I don't think that the data we're talking about can be viewed as reports. In theory it MAY be viewed as a report, but it's not a report. It's just data from which (in the future) the reports will be generated, but that's another story. – Christian P Oct 10 '11 at 15:37
  • The report is just a concrete representation of the abstraction. The important thing is to encapsulate. Beyond that, we are getting into requirements-specific concepts that can really only be applied to your particular situation and are therefore off topic. – P.Brian.Mackey Oct 10 '11 at 15:51
  • If you can't put invalid data in through the UI, why can't the thing generating your test data run through the same checks? It sounds like some of your validation code may be at the UI level when it should be at the business level. – psr Oct 10 '11 at 20:01
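
A sketch of what psr's suggestion might look like (all names here are hypothetical, and the X=2 rule is taken from the comments above): the business rule lives in one validator, and the loader refuses to insert any generated record that fails it, so invalid mock data is rejected at generation time instead of surfacing later as phantom bugs:

    <?php

    // Hypothetical validator holding the business rule from the comments:
    // X must equal 2, even though MySQL would happily accept X = 100.
    class OrderValidator
    {
        public function isValid(array $order): bool
        {
            return isset($order['x']) && $order['x'] === 2;
        }
    }

    class TestDataLoader
    {
        private $validator;

        public function __construct(OrderValidator $validator)
        {
            $this->validator = $validator;
        }

        public function load(array $records): void
        {
            foreach ($records as $record) {
                if (!$this->validator->isValid($record)) {
                    throw new InvalidArgumentException(
                        'Refusing to insert invalid test data: ' . json_encode($record)
                    );
                }
                // ... the actual INSERT into the test database goes here ...
            }
        }
    }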

You are mixing two different concepts. One is verification, which is based on unit testing and peer reviews. This can be done by the developers themselves, without test data, and its intent is to verify that a set of requirements is met.

The second one is validation, and this is done by QA (your testers). For this step you do need test data, since the testers do not need any knowledge of the programming in the application, only its intended use cases. Its objective is to validate that the application behaves as intended in a production environment.

Both processes are important and necessary to deliver a quality product to the customer. You can't rely on unit tests alone. What you need to figure out is a reliable way to handle your test data and ensure it's valid.

EDIT: OK, I get what you are asking. The answer is yes, because the testers' job is not to generate the test data, just to test the application. You need to build your scripts in a way that allows easier maintenance and ensures valid data is inserted. Without the test data, the testers will have nothing to test. Having said that, if you have access to the testing environment, I don't see why you can't insert the test data manually rather than by using scripts.
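
One way such scripts might be structured for easier maintenance (a sketch only; the builder and its fields are invented) is to keep a single canonical, valid record in code and let each script override just the fields it needs, so a schema or rule change is absorbed in one place instead of in dozens of SQL scripts:

    <?php

    // Hypothetical builder: the default field values form one known-valid record.
    class CustomerBuilder
    {
        private $fields = [
            'name'   => 'Test Customer',
            'status' => 'active',
            'x'      => 2, // the rule from the comments: X must be 2
        ];

        public function with(string $field, $value): self
        {
            $clone = clone $this;
            $clone->fields[$field] = $value;
            return $clone;
        }

        public function build(): array
        {
            return $this->fields;
        }
    }

    // A script only has to state what is special about its data:
    // $inactive = (new CustomerBuilder())->with('status', 'inactive')->build();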

AJC
  • Maybe I stated my question poorly by mentioning unit testing together with the test data. I understand that validation is not the same as unit testing. My issue is that the test data we are creating with the scripts can be created through the UI in 5 minutes. To insert this data in the application you don't need to know programming; you just need to follow the test cases. – Christian P Oct 10 '11 at 13:38
  • @christian.p check my update regarding your clarification of the question. – AJC Oct 10 '11 at 15:05
  • So your solution is to abandon the scripts and just add the test data manually through the UI? What about the answer that P.Brian.Mackey provided and his points about coupling the data to the UI? – Christian P Oct 10 '11 at 15:35
  • @christian.p Well, I agree that you should use scripts, BUT there is no formality or rule that says you HAVE to. The main reasons to use scripts to generate mock data are speed (automation) and access (to the test environment). If you have access and it IS faster and easier to do it manually, there is no reason you can't do so (BUT keep a log of the data you tested with). – AJC Oct 10 '11 at 15:39
  • Every tester has their own testing environment, and the testers completely drop the DB several times a day, so it's impractical to manually add test data, but we can ask them politely to add some data for testing. – Christian P Oct 10 '11 at 15:45
  • @christian.p Then you have just answered your own question. It's impractical for you, and it would be impractical for them; like I said, that is not their job and it would just cause delays on their part. Come up with a better way to generate mock data: set it up ONCE in one DB and generate the script from that DB, or have others access it and get the data from there. – AJC Oct 10 '11 at 15:57

This is a very common problem, and a very difficult one as well. Automated tests that run against a database (even an in-memory database such as HSQLDB) are usually slow, non-deterministic and, since a test failure only indicates that there is a problem somewhere in your code or in your data, not very informative.

In my experience, the best strategy is to focus on unit tests for business logic. Try to cover as much of your core domain code as possible. If you get this part right, which is itself quite a challenge, you will achieve the best cost-benefit ratio for automated tests. As for the persistence layer, I normally invest much less effort in automated tests and leave it to dedicated manual testers.
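
A minimal sketch of that separation, assuming an invented repository interface and domain service: the business rule is exercised against a PHPUnit mock, so no MySQL instance (and no SQL fixture) is involved:

    <?php

    use PHPUnit\Framework\TestCase;

    // The interface and service are invented; the technique is standard
    // PHPUnit mocking of a dependency so domain logic is tested in isolation.
    interface CustomerRepository
    {
        public function find(int $id): array;
    }

    class LoyaltyService
    {
        private $customers;

        public function __construct(CustomerRepository $customers)
        {
            $this->customers = $customers;
        }

        public function isEligible(int $customerId): bool
        {
            $customer = $this->customers->find($customerId);
            return $customer['orders'] >= 10;
        }
    }

    class LoyaltyServiceTest extends TestCase
    {
        public function testCustomersWithTenOrdersAreEligible()
        {
            $repository = $this->createMock(CustomerRepository::class);
            $repository->method('find')->willReturn(['orders' => 10]);

            $this->assertTrue((new LoyaltyService($repository))->isEligible(42));
        }
    }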

But if you really want (or need) to automate persistence tests, I would recommend reading Growing Object-Oriented Software, Guided by Tests. This book has a whole chapter dedicated to persistence tests.

Otavio Macedo