
I have a fairly large piece of software which takes certain file types and visualizes them, creating a host of buttons for manipulating the plotted image. I find bugs, or pieces of code that don't actually work, about once a week, but I'm struggling to understand how I can write tests for this software.

I understand how tests are important for projects like libraries and APIs: you simply write tests that exercise those functions.

But what about visualization software? It seems to require a different approach due to the visual element(s) involved.

Do I need to write a test program or test harness that runs and manually calls every operation that can be performed on the data?

What approach should I use to start writing tests, both to validate that I fixed the bugs and to alert me if the code breaks again?


There is a related but not duplicate question regarding when you should unit test. Since I'm discovering bugs, I want to write tests to help prevent the software from regressing again.

Need4Sleep

3 Answers


There are a couple of things you can do to make testing software like that easier. First, try to abstract as much as you can into layers that aren't visual. That will let you just write standard unit tests on those lower layers. For example, if you have a button that performs a certain calculation, make sure you have a way to perform that calculation in a unit test with a regular function call.
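For example (a minimal sketch in Python with pytest; `moving_average` and the surrounding names are hypothetical stand-ins for whatever calculation one of your buttons triggers):

```python
# plotting_core.py -- pure computation, no GUI imports, so it is unit-testable
def moving_average(values, window):
    """Return the moving averages of `values` over the given window size."""
    if not 1 <= window <= len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# gui.py would only wire the button to this function, e.g.:
#   button.on_click(lambda: plot(moving_average(current_data, window=5)))


# test_plotting_core.py -- a plain pytest unit test, no UI involved
def test_moving_average_smooths_data():
    assert moving_average([1, 2, 3, 4], window=2) == [1.5, 2.5, 3.5]
```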

The other suggestion for testing graphics-heavy programs is to create some output that a tester can easily verify manually. A Minecraft example:

[image: Minecraft test output]

I've also worked on projects that had a bunch of tests that rendered something on the screen, then prompted the tester to manually verify it matched the description. This makes sure you don't forget test cases later.

Testing is often extremely difficult if you didn't originally design with testing in mind. Just work on putting your most fragile code under test first. When you find a bug, make a test that will catch that bug if it happens again. That often prompts writing related tests while you're at it.
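For example, a regression test for a bug you just fixed might look like this (a sketch; `load_series` and the module name are hypothetical):

```python
# test_loader.py -- pins the fixed bug down so it cannot silently return
from myviz.io import load_series   # hypothetical module and function


def test_empty_file_yields_empty_series(tmp_path):
    # Regression test: an empty input file used to crash the loader.
    empty = tmp_path / "empty.dat"
    empty.write_text("")
    assert load_series(empty) == []   # must not raise


def test_comment_only_file_yields_empty_series(tmp_path):
    # A related case, written "while you're at it".
    f = tmp_path / "comments.dat"
    f.write_text("# header only\n")
    assert load_series(f) == []
```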

Karl Bielefeldt
  • +1 and if the test data renders a static visual, manually verify the result the first time, then save the screenshot for programmatic comparison thereafter – Steven A. Lowe Jun 23 '15 at 19:53
  • +1 Having hard-coded sample data that produces all variations of output already solves 90% of the problems related to testing visual output. After that, you can have a trained monkey compare before/after for possible bugs. – Euphoric Apr 20 '23 at 11:14

Everything has an interface. When I put my testing hat on, I use a specific world-view to write a test:

  • If something exists, it can be measured.
  • If it can't be measured, it doesn't matter. If it does matter, I just haven't found a way to measure it yet.
  • Requirements prescribe measurable properties, or they are useless.
  • A system fulfils a requirement when it transitions from a not-expected state to the expected state prescribed by the requirement.
  • A system consists of interacting components, which may be subsystems. A system is correct when all components are correct and the interaction between components is correct.

In your case, your system has three main parts:

  • some kind of data or images, which can be initialized from files
  • a mechanism to display the data
  • a mechanism to modify the data

Incidentally, that sounds very much like the original Model-View-Controller architecture to me. Ideally, these three elements exhibit loose coupling – that is, you define clear boundaries between them with well-defined (and thus well-testable) interfaces.

A complex interaction with the software can be translated into small steps that can be phrased in terms of the elements of the system we are testing. For example:

I load a file with some data. It displays a graph. When I drag a slider in the UI, the graph becomes all wobbly.

This seems easy to test manually and difficult to test automatically. But let's translate that story into our system:

  • The UI provides a mechanism to open a file: the Controller is correct.
  • When I open a file, the Controller issues an appropriate command to the Model: the Controller–Model interaction is correct.
  • Given a test file, the model parses this into the expected data structure: the Model is correct.
  • Given a test data structure, the View renders the expected output: the View is correct. Some test data structures will be normal graphs, others will be wobbly graphs.
  • The View is fed the Model's current data structure: the View–Model interaction is correct.
  • The UI provides a slider to make the graph wobbly: the Controller is correct.
  • When the slider is set to a specific value, the Controller issues the expected command to the Model: the Controller–Model interaction is correct.
  • When receiving a test command regarding wobbliness, the Model transforms a test data structure into the expected result data structure: the Model is correct.

Grouped by component, we end up with the following properties to test (a sketch of a few such tests follows the list):

  • Model:
    • parses files
    • responds to file-open command
    • provides access to data
    • responds to make-wobbly command
  • View:
    • renders data
  • Controller:
    • provides file-open workflow
    • issues file-open command
    • provides make-wobbly workflow
    • issues make-wobbly command
  • whole system:
    • the connection between the components is correct.
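As a sketch of what a couple of these granular tests could look like (Python with pytest; `Model` and its methods are hypothetical names, but each test measures exactly one property from the list, so a failure points directly at the broken part):

```python
from myviz.model import Model   # hypothetical


def test_model_parses_files(tmp_path):
    # Model: parses files.
    f = tmp_path / "graph.dat"
    f.write_text("1 2\n2 4\n")
    model = Model()
    model.open(f)
    assert model.data == [(1.0, 2.0), (2.0, 4.0)]


def test_model_responds_to_make_wobbly_command():
    # Model: responds to make-wobbly command.
    model = Model()
    model.data = [(1.0, 2.0), (2.0, 4.0)]
    model.make_wobbly(amount=0.5)
    assert model.data != [(1.0, 2.0), (2.0, 4.0)]   # data was transformed
```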

If we do not decompose the problem of testing into smaller subtests, testing becomes really difficult and really fragile. The above story could also be tested in one piece as “when I load a specific file and set the slider to a specific value, a specific image is rendered”. This is fragile, since it breaks when any element in the system changes.

  • It breaks when I change the controls for wobbliness (e.g. handles on the graph instead of a slider in a control panel).
  • It breaks when I change the output format (e.g. the rendered bitmap is different because I changed the default colour of the graph, or because I added anti-aliasing to make the graph look smoother). Note that in both of these cases the behaviour is still correct; only the rendered pixels changed.

Granular tests also have the really big advantage that they allow me to evolve the system without fear of breaking any feature. Since all required behaviour is measured by a complete test suite, the tests will notify me should anything break. Since they are granular, they will point me to the problem area. E.g. if I accidentally change the interface of any component, only the tests of that interface will fail and not any other test that happens to indirectly use that interface.

If testing is supposed to be easy, the system requires a suitable design. For example, it is problematic when I hard-wire components in a system: if I want to test the interaction of a component with other components, I need to replace those other components with test stubs that let me log, verify, and choreograph that interaction. In other words, I need some dependency injection mechanism, and static dependencies should be avoided. When testing a UI, it is a great help when that UI is scriptable.
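As a sketch of such a stub (again Python, all names hypothetical): if the Controller receives its Model through the constructor instead of constructing it internally, a test can inject a stub that records the interaction:

```python
from myviz.controller import Controller   # hypothetical


class StubModel:
    """Stands in for the real Model and logs every command it receives."""

    def __init__(self):
        self.commands = []

    def open(self, path):
        self.commands.append(("open", path))

    def make_wobbly(self, amount):
        self.commands.append(("make_wobbly", amount))


def test_slider_issues_make_wobbly_command():
    stub = StubModel()
    controller = Controller(model=stub)   # dependency injected, not hard-wired
    controller.on_slider_moved(0.7)       # hypothetical UI callback
    assert stub.commands == [("make_wobbly", 0.7)]
```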


Of course, most of that is just a fantasy of an ideal world where everything is decoupled and easily testable and flying unicorns spread love and peace ;-) While anything is fundamentally testable, it is often prohibitively difficult in practice, and you have better uses of your time. However, systems can be engineered for testability, and even testing-agnostic systems typically feature internal APIs or contracts that can be tested (if not, I bet your architecture is crap and you have written a big ball of mud). In my experience, even small amounts of automated testing effect a noticeable increase in quality.

amon

Start with a known, hand-crafted test file that produces an expected image. Keep an expected output image to compare against, and check every pixel: each should match its expected value. Anything that's "off" signifies a bug in your code.
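A sketch of such a pixel-for-pixel comparison, assuming the program can save its output as a PNG and using Pillow for the comparison (`render_file` is a hypothetical stand-in for your program's entry point):

```python
from PIL import Image, ImageChops

from myviz import render_file   # hypothetical entry point


def test_known_file_renders_reference_image(tmp_path):
    out = tmp_path / "actual.png"
    render_file("tests/data/known_input.dat", out)

    actual = Image.open(out).convert("RGB")
    expected = Image.open("tests/data/expected_output.png").convert("RGB")

    # The difference image is black everywhere iff every pixel matches,
    # in which case getbbox() returns None.
    assert ImageChops.difference(actual, expected).getbbox() is None
```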

Expand your test file so that the output image changes and exercises every feature of your software.

Scripting is handy for that sort of black-box testing: a simple script executes the latest build of your software against the known input and compares the result with the expected output.
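Such a script could be as small as this (a sketch; the executable path, flags, and file names are assumptions):

```python
import filecmp
import subprocess

# Run the latest build against the known input...
subprocess.run(
    ["./build/myviz",
     "--input", "tests/data/known_input.dat",
     "--output", "actual.png"],
    check=True,
)

# ...and fail loudly if the output drifts from the reference image.
if not filecmp.cmp("actual.png", "tests/data/expected_output.png", shallow=False):
    raise SystemExit("output image differs from the expected reference")
print("black-box test passed")
```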

Unit testing, on the other hand, should be white-box testing, where you take the smallest possible chunk of software, typically a single function, and check that it behaves as you expect, e.g. that it returns the expected pixel colour. At that level your code behaves just like a library, with APIs to all the other sections of your code.

If it's all tucked into one .c file with all the functionality shoved into main(), then you've got bigger issues than how to test.

Philip