3

In my case, I inherited a poorly engineered codebase, and I have been tasked with increasing its integration test coverage. The usual pattern for such tests is:

  1. Create/populate a test database with specific test data
  2. Run the test
  3. Delete the test data or nuke the DB
  4. Repeat from step 1 until there are no more tests.
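
In code, that pattern would look roughly like this (only a sketch; the `users` table and the seed data are hypothetical):

```php
<?php

namespace Tests\Feature;

use Illuminate\Support\Facades\DB;
use Tests\TestCase;

class ExampleIntegrationTest extends TestCase
{
    protected function setUp(): void
    {
        parent::setUp();
        // 1. Populate the test database with specific test data.
        DB::table('users')->insert([
            'name'  => 'Test User',
            'email' => 'test@example.com',
        ]);
    }

    public function testUserLookup(): void
    {
        // 2. Run the test against the known data.
        $count = DB::table('users')->where('email', 'test@example.com')->count();
        $this->assertSame(1, $count);
    }

    protected function tearDown(): void
    {
        // 3. Delete the test data so the next test starts clean.
        DB::table('users')->where('email', 'test@example.com')->delete();
        parent::tearDown();
    }
}
```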

The project itself is a Laravel API into which some logic originally implemented in CodeIgniter has been poorly migrated (many times I came across business logic sitting in MVC controllers). No migration scripts have been implemented either.

I am the extra member of a team of 2 people, and I noticed that the existing tests rely on pre-existing data; due to the lack of migration scripts, the workflow above is not followed.

As a result, the results of the existing database integration tests are not consistent: depending on the execution sequence, some tests either pass or fail.

Also, the database has been left as-is, with the very same schema that CodeIgniter uses, and the code has not been fully migrated from CodeIgniter to Laravel, so I inherited quite a mess. Not to mention that the Laravel migration scripts that do exist do not cover the whole database.

So I wonder:

  • What's the point of having integration tests if we don't have the right tools (e.g. creating a test DB on the fly)?
  • Should I spend time building a way to create a test database on the fly from a snapshot of the existing schema, and refactor all tests to use it?
  • Should I gradually do small-scale redesigns (without telling anyone) whilst I implement the tests? If yes, what procedure should I follow?
Dimitrios Desyllas

2 Answers

9

Tests should be independent of each other and reproducible.

This can be done with:

  • a complete database setup for every test, as you described;
  • a predefined database content where commits are not allowed and all changes are rolled back at the end, so the database is not changed (see the sketch below); or
  • replacing the repository/database implementation with mocks/fakes.

If you want to use "predefined database-content" you should have a test-database-setup script so the database can be easily set up and loaded on a developer database engine.
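
In Laravel, the rollback approach is available out of the box via the `DatabaseTransactions` trait, which wraps each test in a transaction and rolls it back afterwards. A minimal sketch (the route and table names are hypothetical):

```php
<?php

namespace Tests\Feature;

use Illuminate\Foundation\Testing\DatabaseTransactions;
use Tests\TestCase;

// Each test runs inside a DB transaction that Laravel rolls back
// afterwards, so the predefined content is left untouched.
class UserApiTest extends TestCase
{
    use DatabaseTransactions;

    public function testUserCanBeCreated(): void
    {
        // Hypothetical endpoint; adjust to the real routes of the API.
        $response = $this->postJson('/api/users', [
            'name'  => 'Jane Doe',
            'email' => 'jane@example.com',
        ]);

        $response->assertStatus(201);

        // Visible here, but gone once the transaction is rolled back.
        $this->assertDatabaseHas('users', ['email' => 'jane@example.com']);
    }
}
```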

k3b
  • On one hand, yes. On the other hand, what about errors that only creep up if the database has been used for a while? – gnasher729 Jul 02 '19 at 16:41
  • 4
    @gnasher729: You need both kinds of tests. This answer works well for automated tests, but I think you still need some manual regression testing/smoke testing. And that should be done against a more persistent data set. – Greg Burghardt Jul 02 '19 at 16:49
  • 1
    `what about errors that only creep up if the database has been used for a while?` then you get the details and set up a regression test. Bugs of that kind are going to happen whether or not you run a database during UT. – Laiv Jul 03 '19 at 10:48
  • @Laiv: And to add to your comment, I believe the first thing you should do with a production bug report is write an automated test to replicate it, if possible. – Greg Burghardt Jul 03 '19 at 15:27
2

In addition to k3b's answer:

> If you want to use "predefined database-content" you should have a test-database-setup script so the database can be easily set up and loaded on a developer database engine.

It's not that simple. You cannot simply rely on an existing DB that could come and go, or change at any time for unknown reasons and at the hands of different actors¹.

No, you should deploy your own DB. Whether once per test or once per whole suite is up to you. This is important if you want your tests to be deterministic and to get behaviour as similar as possible to what is expected in production. It's important for CI too: the builds might run in dedicated environments that don't have access to the shared DB. Think of running builds in the cloud. In my opinion, these tests should behave like unit tests in this respect: they run any time, in any environment.
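
As a rough illustration in Laravel terms, a base test case could rebuild a dedicated test database once per suite. This is only a sketch: the `mysql_testing` connection name is an assumption, and it presumes the migrations actually cover the whole schema, which the asker noted is not yet the case:

```php
<?php

namespace Tests;

use Illuminate\Support\Facades\Artisan;

// Rebuild a dedicated test database once for the whole suite.
// Assumes a "mysql_testing" connection is defined in config/database.php
// and points at a disposable database; the connection name is made up.
abstract class DatabaseTestCase extends TestCase
{
    protected static $migrated = false;

    protected function setUp(): void
    {
        parent::setUp(); // boots the application first

        if (!static::$migrated) {
            // Drop all tables and re-run every migration on the test DB.
            Artisan::call('migrate:fresh', ['--database' => 'mysql_testing']);
            static::$migrated = true;
        }
    }
}
```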

You must consider matching the DB engine in vendor and version too, for more reliable outputs².

However, there's the problem of timing. Deploying DBs slows down test execution, and time is a limited and valuable resource you don't want to waste. Consequently, you have to balance integration tests with other kinds of tests.

So, when would I run an on-the-fly DB instead of mocks or stubs?

  • When CI and CD timings are not constraining or critical.
  • When I need accuracy and precision in configurations and set-ups, overall system behaviour, approximate performance, etc.
  • When there are several teams (or devs) working on the same application, feature, task, etc.
  • When running tests in distributed environments.
  • When I want to keep test code complexity at bay³.
  • When the benefits offset the costs by a large margin.

How to shift from one approach to another depends on the context, the team, and the resources at hand. I would not start small-scale redesigns without letting others know: the change of paradigm is important enough that the others should know about it and embrace the idea as soon as possible.


¹: Say there are other teams running tests on the same DB, or tests running for a different version of the same application.

²: You might think that deploying a fake or lightweight DB could do the job, but in my experience they never behave exactly like the product they stand in for. In some cases, I got unexpected and unpleasant behaviours in production that I could not detect during tests.

³: Introducing mocks increases the complexity of the test code. It's also a possible source of bugs, because we usually never test the test code. We could accidentally give the mock the wrong behaviour, which is why mocking 3rd-party components is not advisable.

Laiv
  • Well, the team is 2 people, and when I brought up a way to schedule a redesign plan I was rejected. The code is a pretty big mess as well. So I thought of following a sneaky way to do it. – Dimitrios Desyllas Jul 03 '19 at 12:37
  • 1
    On the specific issue, creating and destroying a DB on the fly is also helpful for running the tests in parallel, if only I can find a way for each test to have its own database name. – Dimitrios Desyllas Jul 03 '19 at 12:50
  • 1
    You will contribute to the mess if you start making decisions alone. Imagine 2 novelists working on the same book but each writing whatever they think without sharing ideas, plots, characters, etc. – Laiv Jul 03 '19 at 12:50
  • 1
    That said, there are changes you can make, like making the code more testable or ready for a quick paradigm change. Improve the production code gradually; we are talking about little changes here, not revamping the whole design of the application. Perhaps you could do a proof of concept to get real estimates of the cost of migrating from one paradigm to another, so you can try convincing the stakeholders again, this time with arguments backed by numbers. – Laiv Jul 03 '19 at 13:02
  • If only I could upvote again.... – Dimitrios Desyllas Jul 04 '19 at 07:57
  • Well, in integration tests you do not mock unless you want to mock. Mocking is done in unit tests to check that the business logic works under certain conditions; in integration tests you check the whole unit. Also, for testing the tests you can do mutation testing as well. – Dimitrios Desyllas Jul 04 '19 at 08:05