8

People say that "talking about TDD hardly works; if you want to convince someone to use TDD, show them results". However, I'm already getting great results without TDD. Showing me that people who use TDD get good results won't be convincing; I want to see that people who have written code both with and without TDD get better results with TDD.

Despite all of this, I'm interested in giving TDD a try. However, I'm not convinced I will gain anything from it. If it does prove useful, I will try to push it to the rest of my team.

My main question is this: would TDD serve any purpose for code whose correctness I can already prove?

Obviously, neither one is a silver bullet. Your proof might be wrong because you missed a detail, and your tests might fail to spot a bug you never thought to test for. In the end, we're human; nobody can write 100% bug-free code forever. We can only strive to get as close as possible.

However, would TDD actually save any time on code whose correctness has already been proven? That is, code where the developer has identified every valid state (and its range) of the state machine the code operates on, all of them are accounted for, and the error checking is whitelist-style: every exception is passed up to a higher-level handler, so that nothing unexpected can leak out without both displaying a (reasonably) relevant message to the client and sending a log notification to an admin.
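
To make that style concrete, here is a rough Python sketch of the pattern I mean; the states and names are made up for illustration and are not our actual code:

```python
class UnexpectedStateError(Exception):
    """Raised when the module encounters a state it does not whitelist."""

# The module recognizes only a fixed, whitelisted set of states.
_TRANSITIONS = {
    "new": "created",
    "active": "updated",
    "closed": "archived",
}

def handle(state: str) -> str:
    """Handle only explicitly recognized states; everything else escapes upward."""
    if state in _TRANSITIONS:
        return _TRANSITIONS[state]
    raise UnexpectedStateError(f"unrecognized state: {state!r}")

def upper_handler(state: str) -> str:
    """Catch anything unexpected, notify an admin, and give the client a sane message."""
    try:
        return handle(state)
    except UnexpectedStateError as err:
        print(f"[admin log] {err}")  # stand-in for a real notification channel
        return "Something went wrong; the team has been notified."
```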

Answers with real-life examples would be better.


Some clarifications:

  • This question is not about whether you can prove code correctness or not. Let's assume by default that not all code can be proven correct within a reasonable timeframe, but that some pieces of code can be. For example, it's very easy to prove the correctness of a FizzBuzz module (see the short sketch after this list), but not so easy for a cloud-based data syncing service.

  • Within this constraint, the question asks the following: start with the assumption that a codebase is divided into two parts: [I] parts that have been proven correct, and [II] parts that have not been proven correct but have been manually tested to work.

  • I want to apply TDD practices to this codebase, which did not have them until now. The question asks the following: should TDD be applied to every single module, or would it be enough to apply it only to the modules that were not proven correct?

  • "Proven correct" means that you can consider this module completely functional-style, i.e., it does not rely on any global or outer state outside itself, and has entirely its own API for I/O that other modules that interact with it must follow. It is not possible to "break this module" by changing code outside the module, at worst you can misuse it and get formatted error messages returned to you.

  • Obviously, every rule has exceptions: compiler bugs in new compiler versions may introduce bugs into this module, but the same bugs could be introduced into the tests that covered it, resulting in a false sense of safety from tests that no longer work as intended. The bottom line is that tests are not a magical solution; they're another layer of protection, and this question asks whether that layer of protection is worth the effort in the specific case of a module that was proven correct (assume that it was indeed).
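
For illustration, this is roughly the kind of trivially provable module, and the kind of test for it, that the question is about (a made-up sketch, not our real code):

```python
def fizzbuzz(n: int) -> str:
    """Small enough that every input provably falls into exactly one branch."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

# The question: is a test like this still worth writing once the module above
# has been reasoned about and proven correct?
def test_fizzbuzz_hits_every_branch():
    assert fizzbuzz(15) == "FizzBuzz"
    assert fizzbuzz(9) == "Fizz"
    assert fizzbuzz(10) == "Buzz"
    assert fizzbuzz(7) == "7"
```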

Kylee
    Proving _"code correctness"_ might become harder than you think it is. – πάντα ῥεῖ Apr 02 '18 at 17:56
  • I'm well aware of the difficulty. There have been pieces of code that I spent multiple days on, just writing the theoretical documents leading up to the proof (especially for async, threaded backend code). Nonetheless, it's not any more of an impossible mission than writing full-coverage tests for the same kinds of problems. – Kylee Apr 02 '18 at 17:59
  • 35
    While I enjoyed learning about your workplace and your work history, I almost stopped reading several times because well, sometimes what's really important is that, as a way of maximising your results, and, in order to keep your reader's attention so that they can, in an effort to strengthen your knowledge as well as the community's, help you, you must... *get to the point*. Please consider shortening your question to its most salient points. – MetaFight Apr 02 '18 at 19:07
  • 2
    TDD's greatest strength is not finding bugs, it's enforcing good application architecture. Badly designed and hard to maintain code will be difficult to unit test – TheCatWhisperer Apr 02 '18 at 19:17
  • 10
    What you describe is not a process of "proving correctness" of code. That kind of process is drastically different. Also, I find it hard to accept that you can build code in a way that can be "proved correct" just by looking at it. I've seen plenty of code that seemed trivial and "correct", only to be totally crushed when put under a robust automated test. – Euphoric Apr 02 '18 at 19:44
  • I must've phrased my post incorrectly. I did not say that it's trivial nor fast to prove code correctness, I said that reducing code complexity has helped a great deal in reducing bugs that were left undetected before pushing to the master build, by making it much easier to reason about all the possible paths the code can take; and that I'm interested in adding TDD style practices, to reduce the chance of bugs slipping even further. Something that would be invaluable in certain places, but potentially not as useful in others [this is the citations/answers needed part] – Kylee Apr 02 '18 at 21:19
  • I don't think you've made a good case for adding TDD to your process. Every technique has a cost, and you claim that your existing process already obtains the benefits that TDD can provide (reducing bugs, improving code reliability, etc), making TDD a technique that will add costs without providing new benefits. – Robert Harvey Apr 02 '18 at 21:35
  • Question: What is the role of QA in your organization, if any? Comment: Man, your question is really long. – John Wu Apr 02 '18 at 21:35
  • 2
    Are you experienced with TDD? Because if you're not, I think getting some experience in TDD will answer your own question. Your approach of being strict about the way you write code is excellent, but it works because you know it works, and you know it works because you've done it. – Robert Harvey Apr 02 '18 at 21:44
  • The organization has no dedicated QA team. Most employees are either managers, customer relations, or graphics/design people. The actual dev/programmer team is expected to test its own code. Originally, the codebase was textbook spaghetti that "kinda worked", but had lots of bugs. The number of bugs dropped rapidly once we started discussing issues together and imposing clear rules on what not to do to avoid problems. Generally, switching to more modular and simpler-to-reason-about code (at the cost of more planning and less writing) has made the biggest impact on reducing bugs so far. – Kylee Apr 02 '18 at 21:48
  • If the only testing is self-testing, and developers are pushing their own code directly to production, that is not a great paradigm for a commercial software offering. I'd say there are a few other practices to introduce before you hunker down and spend a zillion hours on unit tests. – John Wu Apr 03 '18 at 03:59
  • 2
    https://thedailywtf.com/articles/classic-wtf-the-proven-fix – cbojar Apr 03 '18 at 11:45
  • 8
    "Beware of bugs in the above code; I have only proved it correct, not tried it." -Donald Knuth. – Neil Apr 03 '18 at 13:01
  • 1
    Your edit sounds like this: "Let's all assume unicorns exist. What would be the best way to get unicorn blood?" – Euphoric Apr 03 '18 at 13:37
  • 6
    Your question makes no sense. TDD means that the tests drive the development. In other words, you have no design, no architecture, no code, *unless* you have a test for it. So, how in the world are you "applying TDD to code that has been proven correct" when by the very *definition* of TDD, *there is no code to prove correct*? – Jörg W Mittag Apr 03 '18 at 16:21
  • 1
    @JörgWMittag, That is a very common **misconception** about TDD. TDD is not intended to drive design. It is intended to drive _implementation_. That is a very important distinction. If you develop with no idea of where you intend to go, you deserve what you get. – Berin Loritsch Apr 04 '18 at 12:49
  • 1
    @Kylee: What tool/process do you use to prove code correctness? Is it automated so code is proven correct on every build? – JacquesB Apr 21 '18 at 19:10
  • Proving your code correct is nice for academia. – Pieter B May 01 '18 at 08:25

7 Answers

20

Yes.

Proofs are fine when they're available, but even at the best of times they only prove that a single bit of code will work as expected (for all inputs? accounting for interruptions in the middle of any operation? what about running out of memory? disk failure? network failure?).

What happens when it changes?

Tests are great because they serve as an implied contract about what the code should do. They provide some scaffolding so that your new intern can go in and make changes with some level of confidence. All via quick, clear results: pass or fail.
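
For instance, here is a minimal sketch of such an implied contract, assuming pytest and a hypothetical `parse_price` function (not taken from the question's codebase): each test states one expectation and gives an instant pass/fail answer.

```python
from decimal import Decimal

def parse_price(text: str) -> Decimal:
    """Stand-in implementation under test, for illustration only."""
    return Decimal(text.strip().lstrip("$"))

# Each test documents one piece of the contract and fails loudly if it is broken.
def test_parse_price_accepts_a_leading_dollar_sign():
    assert parse_price("$19.99") == Decimal("19.99")

def test_parse_price_ignores_surrounding_whitespace():
    assert parse_price("  7.50 ") == Decimal("7.50")
```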

And frankly, I can coach an intern to write viable unit tests in a few months. I doubt that anyone on my team (myself included) could create proofs that guarantee anything meaningful for non-trivial code; let alone do it quickly and accurately.

Robbie Dee
Telastyn
  • The same code that would be non-trivial to prove would also be non-trivial to test. A network failure or a disk failure can take a lot of forms, how can you be confident that your tests covered every single possible scenario? They will most likely cover every single scenario you could think of, but not necessarily every single scenario in real life. You can't just blindly assume every change will be non-breaking just because you have tests in place. It's just another protection layer. Tests are important, but they're not a silver bullet. – Kylee Apr 02 '18 at 21:15
  • 6
    @Kylee - no offense, but you seem to be wildly underestimating the effort and skill necessary to do a proof of correctness (or wildly overestimating the effort and skill necessary to toss together some automated tests). – Telastyn Apr 02 '18 at 21:37
  • Maybe you're right, and I most likely do overestimate the effort in putting together tests due to being inexperienced with it (honestly, it's less the "putting together tests" that I'm concerned about, and more about actually running the tests. For large & rapidly changing projects, that in itself is not a negligible time sink). Either way, this has nothing to do with the question. The question clearly states, assume that you have code that was already proven correct, is there still a point to write unit tests for it? – Kylee Apr 02 '18 at 22:02
  • @kylee - imo, yes, for the reasons stated in the answer. And really unit tests take a few dozen milliseconds to run. Even with thousands, it’s fine. – Telastyn Apr 02 '18 at 22:38
  • 1
    `For large & rapidly changing projects` the more prone to change, the more necessary the tests, since changing code has many more chances to fail due to new bugs or unexpected behaviours than code that barely changes. It's a matter of probability. Even if it doesn't change often, after a while the knowledge obtained during development can get lost or fall into oblivion. Tests are also materialised knowledge that can cut down the learning curve significantly. Is coding tests time-consuming? Yes. Does it make the project more expensive? **No, in the long run it makes it cheaper**. – Laiv Apr 04 '18 at 06:38
  • 3
    @Kylee Even if you could prove that a piece of code is correct, at best you would be in the situation that no one is allowed to change that code or add features to it; to do so would invalidate your proof. As already highlighted in this answer, tests allow you to change your code with confidence. Even if the person changing it is an inexperienced intern. By the way, even if some block of code is 100% correct does not mean you will never need to change it (e.g. to add a new feature needed by a customer or to satisfy some real world constraint not considered in your proof). – Brandin Apr 09 '18 at 12:48
5

We don't know. We cannot answer your question.

While you spend a lot of time explaining that the process you have now seems to work to everyone's satisfaction, you are telling us only a small sliver of what is actually happening.

From my experience, what you are describing is extremely rare, and I'm skeptical that it is your process and approach to coding that is actually the cause of the low bug count in your applications. There might be many other factors influencing your applications, and you are telling us nothing about those factors.

So, without knowing your exact development environment and culture, we don't know whether TDD will help you or not. And we could spend days discussing and arguing about it.

There is only one recommendation we can give you: try it out. Experiment. Learn it. I know you are trying to spend the least amount of effort to decide, but that is not possible. If you really want to know whether TDD will work in your context, the only way to find out is to actually do TDD. If you actually learn it and apply it to your application, you can compare it with your non-TDD process. It might be that TDD actually has advantages and you decide to keep it. Or it may turn out that TDD doesn't bring anything new and only slows you down, in which case you can fall back to your previous process.

Euphoric
5

The main purpose of (unit) tests is safeguarding code, making sure it will not break unnoticed because of later changes. When the code is first written, it will get a lot of attention and it will be scrutinized. And you may have some superior system for that.

Six months later, when someone else is working on something seemingly unrelated, it may break and your super-duper code-correctness-prover will not notice it. An automatic test will.
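
As a rough, hypothetical illustration: a test like the one below pins the agreed behavior down, so a later change elsewhere that quietly alters it fails the build instead of going unnoticed.

```python
def apply_discount(total_cents: int, percent: int) -> int:
    """Business rule that was carefully reviewed when it was first written."""
    return total_cents - (total_cents * percent) // 100

def test_ten_percent_discount_rounds_in_the_customers_favor():
    # If someone "improves" the rounding six months from now, this fails immediately.
    assert apply_discount(999, 10) == 900
```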

Martin Maat
  • 1
    This is something that is far too often overlooked. Yes, there is pain to go through to make sure the code has unit tests now. But this will pay for itself many times over when a junior dev does a simple change and unit tests immediately flag up that they have broken something else. I've been the junior dev decades ago who has broken something and the whole release has been delayed. While my colleagues were very tolerant at the time and had my back when managers started jumping up and down, the whole scenario could, in hindsight, have been avoided if unit tests had been in place. – Robbie Dee Apr 05 '18 at 08:04
5

I want to apply TDD practices to this codebase, which did not have them until now.

This is the hardest way to learn TDD. The later you test, the more it costs to write tests and the less you get out of writing them.

I'm not saying it's impossible to retrofit tests into an existing code base. I'm saying doing so isn't likely to make anyone into a TDD believer. This is hard work.

It's actually best to practice TDD the first time on something new and at home. That way you learn the real rhythm. Do this right and you'll find it addictive.

The question asks the following: should TDD be applied to every single module,

That is structural thinking. You shouldn't say things like "test every function, or class, or module". Those boundaries are not important to testing, and they should be able to change anyway. TDD is about establishing a testable behavioral need and not caring how it's satisfied. If it weren't, we couldn't refactor.
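
A rough illustration with made-up names: the test below is tied to a behavior and goes through a public entry point, so the classes and modules behind it can be reorganized without touching the test.

```python
def register_user(email: str, users: dict) -> str:
    """Public entry point; how it is implemented internally is free to change."""
    if email in users:
        return "already-registered"
    users[email] = {"email": email}
    return "registered"

# Named after the behavioral need, not after any class or module.
def test_registering_the_same_email_twice_is_rejected():
    users = {}
    assert register_user("a@example.com", users) == "registered"
    assert register_user("a@example.com", users) == "already-registered"
```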

or would it be enough to apply it only to the modules that were not proven correct?

It's enough to apply them where you find a need for them. I'd start with new code. You'll get much more back from testing early than from late. Don't do this at work until you've practiced enough to master it at home.

When you've shown TDD is effective with the new code at work and feel confident enough to take on the old code, I'd start with the proven code. The reason is that you'll be able to see right away whether the tests you're writing are taking the code in a good direction.

My main question is this: would TDD serve any purpose for code whose correctness I can already prove?

Tests don't just prove correctness. They show intent. They show what is needed. They point out a path to change. A good test says there are several ways to write this code and get what you want. They help new coders see what they can do without breaking everything.

Only once you have that down should you wander into the unproven code.

A warning against zealots: You sound like you've achieved success and so will be unlikely to jump in headfirst. But others looking to prove themselves will not be so reserved. TDD can be overdone. It's amazingly easy to create a suite of tests that actually hurts refactoring because they lock down trivial and meaningless stuff. How does this happen? Because people looking to show off tests just write tests and never refactor. Solution? Make them refactor. Make them deal with feature changes. The sooner the better. That will show you the useless tests quickly. You prove flexibility by flexing.

A warning against structural categorizing: Some people will insist that a class is a unit. Some will call any test with two classes an integration test. Some will insist that you can't cross boundary x and call it a unit test. Rather than care about any of that, I advise you to care about how your test behaves. Can it run in a fraction of a second? Can it be run in parallel with other tests (side effect free)? Can it be run without starting up or editing other things to satisfy dependencies and preconditions? I put these considerations ahead of whether it talks to a DB, file system, or network. Why? Because these last three are only problems because they cause the other problems. Group your tests together based on how you can expect them to behave, not the boundaries they happen to cross. Then you know what you can expect each test suite to do.

I've seen people say they don't want to use TDD because it would have too much overhead, and TDD supporters defend it by saying that once you get used to writing TDD all the time, there isn't much overhead.

That question already has answers here.

candied_orange
1

Test Driven Development is more about prototyping and brainstorming an API than about testing. The tests created are often poor quality and eventually have to be thrown out. The main advantage of TDD is determining how an API will be used, before writing the API implementation. This advantage can also be obtained in other ways, for example by writing API documentation before the implementation.
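
For example, here is a sketch of that effect with a hypothetical `RateLimiter` API: the test is written first and pins down the calling convention (constructor arguments, method names, return values) before any implementation exists.

```python
# Written first: this forces a decision about how the API will be used.
def test_rate_limiter_allows_only_the_configured_number_of_calls():
    limiter = RateLimiter(max_calls=2)
    assert limiter.allow() is True
    assert limiter.allow() is True
    assert limiter.allow() is False

# Minimal implementation added afterwards to make the test pass.
class RateLimiter:
    def __init__(self, max_calls: int) -> None:
        self.max_calls = max_calls
        self.calls = 0

    def allow(self) -> bool:
        self.calls += 1
        return self.calls <= self.max_calls
```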

Correctness proofs are always more valuable than tests. Tests don't prove anything. However, in order to use correctness proofs productively, it helps to have an automated proof checker, and you will need to work using contracts of some sort (design by contract or contract based design).

In the past, when working on critical sections of code, I would attempt manual correctness proofs. Even informal proofs are more valuable than any automated tests. But you still need the tests, unless you can automate your proofs, as people will break your code in the future.

Automated tests do not imply TDD.

Frank Hileman
  • Brainstorming an API and design by contract are already how I approach problems, so I'd like to say that I like your answer, but other sources say "In test-driven development, each new feature begins with writing a test". You said things that I agree with, but did not convince me to use the TDD approach. "Automated tests do not imply TDD", OK, but then, what does TDD imply, as opposed to simply following general best practices? – Kylee Apr 02 '18 at 18:32
  • 1
    While I agree that TDD often involves brainstorming an API, IMO, this is sometimes a **bad thing**. The API shouldn't be designed (necessarily) for testability, it should be designed to be used smoothly by your clients, and to make sense to the classes / code involved. All too often I've seen a test writer say "lets add `String getSomeValue()` here so we can test it" when that makes no sense for the overall design. Sure, you could remove that function later, but, in my experience, that is rare. – user949300 Apr 02 '18 at 18:43
  • 1
    @user949300, There is a large overlap between designing a testable API and one designed to be used smoothly by your clients. Adding unnecessary code for the sake of the test shows you have a bad design. All too often, API writers forget about debugging what is going wrong with an API. Writing something testable AND useful to the user forces you to think about those things... without leaking implementation details into your interface. – Berin Loritsch Apr 02 '18 at 18:46
  • @user949300 The number one complaint about TDD is probably the way the API is modified for "testability", in the unit sense, such that things that are usually encapsulated are exposed. The number two complaint is probably that it doesn't scale well over time. – Frank Hileman Apr 02 '18 at 21:17
  • @Kylee Automated tests are older than the buzzword "TDD". The difference is that automated tests are simply what you think should be tested -- there is no dogma, nor specific order in which you write them. Nor is there any emphasis on unit vs integration tests. – Frank Hileman Apr 02 '18 at 21:20
  • @FrankHileman It was my understanding that TDD is the philosophy of writing tests for code before writing the code itself, to make sure that it meets new requirements (by checking that it passes the tests), which is obviously different than just writing automated tests when you think you need them. Did I get this wrong? I'm very in favor of testing in general, and don't trust code to work until it's tested somehow; the question at hand is about the workflow methodology when you approach adding new code based on the planning you did prior to writing it. – Kylee Apr 02 '18 at 22:11
  • @Kylee You are correct. The only advantage I have seen with regards to TDD is the one I mentioned in my answer, but there are other ways to achieve the same effect. We used to call it planning or documentation. – Frank Hileman Apr 03 '18 at 16:35
0

A) You reading the code and convincing yourself that it's correct isn't remotely close to proving it's correct. Otherwise why write tests at all?

B) When you change the code you want to have tests run that demonstrate the code is still correct or not.

Josh
  • this seems to merely repeat points made (and much better explained) in [top answer](https://softwareengineering.stackexchange.com/a/368727/31260) that was posted few weeks earlier – gnat Apr 24 '18 at 07:51
  • @gnat It's a more succinct response for something that doesn't need a novel written about it. Please try to contribute something here. – Josh May 01 '18 at 19:28
-1

I will caveat by saying that once you are used to using TDD effectively, it will save you time in the end-game. It takes practice to learn how to use TDD effectively, and it doesn't help when you are under a time crunch. When learning how to best make use of it, I recommend starting on a personal project where you have more leeway and less schedule pressure.

You'll find that your initial progress is slower while you are experimenting more and getting your API written. Over time, your progress will be quicker as your new tests start passing without changing code, and you have a very stable base to build from. In the late game, code that is not built using TDD requires you to spend far more time in the debugger than should be necessary, trying to figure out what is going wrong. You also run a greater risk of new changes breaking something that used to work. Assess the effectiveness of TDD vs. not using it by total time to completion.

That said, TDD is not the only game in town. You can use BDD, which provides a standard way of expressing the behavior of a full-stack application, and assess the correctness of the API from there.

Your whole argument hinges on "proving code correctness", so you need something that defines code correctness. If you aren't using an automated tool to define what "correct" means, then the definition is very subjective. If your definition of correct is based on the consensus of your peers, that can change on any given day. Your definition of correct needs to be concrete and verifiable, which also means it should be able to be evaluated by a tool. Why not use one?

The #1 win from using automated testing of any sort is that you can quickly and efficiently verify your code remains correct even when OS patches are applied. Run your suite to make sure everything is passing, then apply the patch and run the suite again. Even better, make it part of your automated build infrastructure. Now you can verify your code remains correct after merging code from multiple developers.

My experience using TDD has led me to the following conclusions:

  • It's great for new code, difficult for changing legacy systems
  • You still have to know what you are trying to accomplish (i.e. have a plan)
  • Slow to start, but saves time later
  • Forces you to think about how to validate correctness and debugging from a user perspective

My experience using BDD has led me to the following conclusions:

  • It works for both legacy and new code
  • It validates the whole stack, and defines the specification
  • Slower to get up and running (helps to have someone who knows the toolset)
  • Fewer behaviors need to be defined than unit tests

Definition of Correct: Your code complies with requirements. This is best verified with BDD, which provides a means of expressing those requirements in a human-readable fashion and verifying them at run time.
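
As a loose sketch only: real BDD tooling (Cucumber, SpecFlow, behave, and so on) binds a plain-language feature file to step code, but the Given/When/Then shape it enforces looks roughly like this hypothetical example.

```python
def test_overdrawing_an_account_is_rejected():
    # Given an account with a balance of 50
    account = {"balance": 50}

    # When the customer tries to withdraw 80
    def withdraw(acct: dict, amount: int) -> str:
        if amount > acct["balance"]:
            return "rejected"
        acct["balance"] -= amount
        return "ok"

    result = withdraw(account, 80)

    # Then the withdrawal is rejected and the balance is unchanged
    assert result == "rejected"
    assert account["balance"] == 50
```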

I am not talking about correctness in terms of mathematical proofs, which is not possible. And I am tired of having that argument.

Berin Loritsch
  • A wonderful post that addresses an issue that made me consider TDD to begin with: the current definition of "code correctness" is indeed the consensus of the peers. My concern is that as the team changes in the future, this method will stop working. But how would TDD solve this? While old tests may save new team members from breaking old functionality easily, they may still write incomplete tests for future features, which would still lead to problems. In the end, both methods rely on trusting your team. I've not heard of BDD before, so I'm going to check that out as well. Thanks. – Kylee Apr 02 '18 at 19:23
  • 2
    "you can verify your code remains correct" Nope. Not even close. You can at most claim that the test that you've run still passes. – Bent Apr 02 '18 at 20:48
  • @Kylee, BDD addresses that concern better. BDD has you write a spec that is verifiable. The spec is written in your natural language with means of creating hooks to actual test code that verifies the spec. That bridges two concerns, communicating the actual requirements and enforcing them. – Berin Loritsch Apr 03 '18 at 12:32
  • 1
    @Bent, Please don't argue about "correctness" in terms of mathematical proofs. That's not the topic of conversation or what I intended to convey. However, based on experience people who post comments like yours tend to have that in mind. You can absolutely verify that code complies with requirements. That is the working definition of correct that I am talking about. – Berin Loritsch Apr 03 '18 at 12:34
  • The trouble is, you're driving to a definition of "correct", where the requirement is that the code passes the tests you've defined. Any other set of inputs is undefined behaviour, and the output is arbitrary. This doesn't really match what most users would consider to be the requirements. – Simon B Apr 04 '18 at 12:27
  • @SimonB, it's a bit more refined than that, particularly if you are using BDD. You are defining correctness in terms of complying with the stated specifications. If your tests violate the specification, then the test is wrong. How else are you going to define a consistently repeatable concept of 'Correct'? – Berin Loritsch Apr 04 '18 at 12:46
  • Great answer +1. Don't understand the down votes personally unless it is the usual dog piling. BDD is absolutely a great approach as it defines a common language between the developers and business and by extension marries the test to a requirement. Especially important if the GWT style hasn't been adhered to in the TDD process. – Robbie Dee Apr 05 '18 at 08:14