421

It seems reasonable to me that if a serious bug is found in production by end-users, a failing unit test should be added to cover that bug, thus intentionally breaking the build until the bug is fixed. My rationale for this is that the build should have been failing all along, but wasn't due to inadequate automated test coverage.

Several of my colleagues have disagreed, saying that a failing unit test shouldn't be checked in. I agree with this viewpoint in terms of normal TDD practices, but I think that production bugs should be handled differently - after all, why would you want to allow a build to succeed with known defects?

Does anyone else have proven strategies for handling this scenario? I understand intentionally breaking the build could be disruptive to other team members, but that entirely depends on how you're using branches.

MattDavey
  • 7,096
  • 3
  • 31
  • 32

12 Answers

428

Our strategy is:

Check in a failing test, but annotate it with @Ignore("fails because of Bug #1234").

That way, the test is there, but the build does not break.

Of course, you note the ignored test in the bug DB, so the @Ignore is removed once the bug is fixed. This also serves as an easy check for the bug fix.
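For concreteness, here is a minimal sketch of what such a checked-in-but-ignored regression test might look like with JUnit 4 (the class name, test method, bug number, and the deliberately buggy helper are all hypothetical stand-ins for real production code):

    import static org.junit.Assert.assertEquals;

    import org.junit.Ignore;
    import org.junit.Test;

    public class Bug1234RegressionTest {

        // Hypothetical stand-in for the production code under test; in a real
        // project this would live in the main source tree.
        static double roundToCents(double amount) {
            return Math.floor(amount * 100) / 100; // bug: truncates instead of rounding
        }

        // The test pins down the behaviour reported from production. It is
        // checked in, but ignored so the build stays green; the @Ignore message
        // points back to the bug tracker, and is removed as part of the fix.
        @Ignore("fails because of Bug #1234")
        @Test
        public void roundsHalfCentsUp() {
            assertEquals(10.35, roundToCents(10.345), 0.0001);
        }
    }

In JUnit 5 the equivalent annotation is @Disabled("..."), and TestNG offers @Test(enabled = false); the important part is that the reason string points back at the bug tracker entry.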

The point of breaking the build on failing tests is not to somehow put the team under pressure - it's to alert them to a problem. Once the problem is identified and filed in the bug DB, there's no point in having the test run for every build - you know that it will fail.

Of course, the bug should still be fixed. But scheduling the fix is a business decision, and thus not really the dev's concern... To me, once a bug is filed in the bug DB, it's no longer my problem, until the customer/product owner tells me they want it fixed.

sleske
  • 10,095
  • 3
  • 29
  • 44
  • 154
    +1 I think you hit the nail on the head with *"scheduling the fix is a business decision"* - as a developer it's not my judgement call to decide if a bug breaks the build. – MattDavey Jan 20 '12 at 11:43
  • 25
    I think this is a very sensible solution. Especially if the next guy to check in some minor code suddenly gets a "failed test" notice and thinks it's something he has done. – Holger Jan 20 '12 at 13:08
  • 15
    "To me, once a bug is filed in the bug DB, it's no longer my problem"... +1 – Jake Berger Jan 20 '12 at 15:38
  • @Matt That's the only acceptable answer. No engineer at an auto plant runs from his desk and wrecks an active assembly line, neither should a developer. Starting or stopping production/delivery is a business decision. – anon Jan 21 '12 at 01:29
  • 20
    @anon Except at Toyota. A line worker sees a defect, pulls the andon cord, the entire plant comes to a halt, management comes over, and the line isn't restarted until the problem is fixed. Google Andon Cord. It's not a new concept. See: http://www.startuplessonslearned.com/2008/09/test-driven-development-as-andon-cord.html – Christopher Mahan Jan 21 '12 at 05:10
  • 1
    Not all bugs are created equal. This less restrictive method should be applied to 99% of them. If a user can't even open the app successfully, get it fixed; there's nothing to ship. – JeffO Jan 21 '12 at 16:06
  • @jberger Funny, I was just thinking the opposite. It is absolutely your problem! Fix that bug _now_ unless it is really that costly (> four hours' work). A good answer nonetheless. – Andres Jaan Tack Jan 21 '12 at 19:32
  • 5
    @AndresJaanTack: This will of course depend on your methodology, but in general I'd disagree. At least in a business setting, scheduling work is a business decision - and that *includes bug fixing*. Sometimes, a new feature (or releasing on a certain date) may be more valuable than fixing a (minor) bug - in that case the bug fix must be deferred. "Fixing the bug now" would be inappropriate in that situation, because it delays more important work. – sleske Jan 21 '12 at 20:52
  • @sleske Yes, I see what you mean. To make my point, though: is quality important? If not, then hiding bugs in a bug tracker is an _excellent_ method. – Andres Jaan Tack Jan 21 '12 at 21:08
  • Or the Israeli Air Force.... http://www.israelnationalnews.com/News/News.aspx/151883#.TxyMpPknZtw – JoelFan Jan 22 '12 at 22:24
  • 2
    @sleske true but then you could also argue that writing a test for the bug should also be deferred until there is a business need to fix it. – jk. Jan 23 '12 at 11:28
  • @jk.: No, writing tests to document the behaviour of the software is part of the development process. At least on our team, having good test coverage is part of the "definition of done" for any task, and it's the devs' decision to add tests if they feel more tests are needed. – sleske Jan 23 '12 at 11:33
  • 1
    @sleske the question states that the test is to be added after the bug has been found by users, so this isn't a test which would have been added as part of the original task – jk. Jan 23 '12 at 12:40
  • @jk.: True. Still, in that case a dev would be tasked with investigating the bug, and then writing the (failing) test would be part of the bug investigation. – sleske Jan 23 '12 at 12:54
  • 1
    If it is part of the bug investigation rather than the fix, I think I'd prefer to attach the test to the bug report and not have it in the unit tests; else you have to go back and change the unit tests if the business decision after the investigation is to not fix the bug – jk. Jan 23 '12 at 12:57
  • 1
    @jk I think the point is that there *should* have been a test there to start with. The task is not to add new unit tests, rather it's to patch leaks in the existing test coverage. – MattDavey Jan 24 '12 at 08:15
  • @MattDavey OK, but is the result of the test a fail or a pass? In the most extreme case that will depend on the business decision that is yet to be made – jk. Jan 24 '12 at 08:18
  • IMHO, the unit tests should be created by the developer fixing the bug. The person doing triage might provide suggestions on unit tests (a test list), but I don't think that person should be dictating to the developer. – John Saunders Jan 25 '12 at 19:00
  • I don't like the idea of "ignored" tests for known bugs. When people do that they tend to focus on other stuff, and bugs are serious. If there is a known bug and it makes trouble, I would like to NEVER ignore it. A breaking build means something is wrong and must be fixed. I don't like the idea of prioritizing bugs when you have tests for it - fix it asap. – Hugo Tavares Mar 12 '12 at 02:33
  • 1
    @HugoTavares: I feel with you about not wanting to ignore bugs - but please see above. _Scheduling a bug is a business decision_ - it's not just your decision as a dev if a bug is really serious. There may be situations where a new feature is more important than a (minor) bug. – sleske Mar 12 '12 at 11:08
110

Why would you want to allow a build to succeed with known defects?

Because sometimes, you have time constraints. Or the bug is so minor that it isn't really worth delaying the shipment of the product for a few days needed to unit test and fix it.

Also, what's the point in breaking the build intentionally every time you find a bug? If you found it, fix it (or assign it to the person who will fix it), without disturbing your whole team. If you want to remember to fix a bug, then you need to mark it as very important in your bug tracking system.

thegreendroid
  • 359
  • 2
  • 13
Arseni Mourzenko
  • 134,780
  • 31
  • 343
  • 513
  • I see your point, and generally I agree - but in this case we're talking about a serious bug that made it into production and was encountered by end users :s – MattDavey Jan 20 '12 at 11:21
  • 3
    See the second paragraph of my answer. – Arseni Mourzenko Jan 20 '12 at 11:25
  • 8
    I understand, I think the point is summed up in your first paragraph - it's not up to the developer to judge the seriousness of the bug, or whether it's a show-stopper, that's a decision for the wider business. – MattDavey Jan 20 '12 at 11:40
  • 4
    The question is: what is the priority of this bug? It could be "OMG, fix it now", it could be "yeah, that's annoying, we should fix it at some point", or it could be something in the middle. But not all bugs will hit the same place on that spectrum. – Zachary K Jan 21 '12 at 16:59
58

Tests are there to ensure that you don't (re-)introduce problems. The list of failing tests isn't a substitute for a bug tracking system. There is some validity in the POV that failing tests aren't only an indication of bugs - they are also an indication of a development process failure (from carelessness to a badly identified dependency).

AProgrammer
  • 10,404
  • 1
  • 30
  • 45
  • 20
    *"list of failing tests isn't a substitute for a bug tracking system"* +1, also a very good point :) – MattDavey Jan 20 '12 at 12:16
  • 1
    I'd suggest having regression unit tests enter the code base as part of the bugfix, not earlier. – yrk Jan 20 '12 at 12:37
  • 6
    @yarek: The regression tests can go into the codebase at any time, they just need to be ignored until the bug is fixed. I usually develop them while diagnosing the problem, because they can then double as a debugging aid. – sleske Jan 20 '12 at 12:42
  • This is a good example of why "breaking the build" becomes a toxic part of a workplace where CI devolves into merely "Blame Driven Development". I have sat in lots of meetings where the PHB starts whinging about "carelessness" as if that's why the build broke. In such an environment, you would hardly intentionally check in something that breaks the build. Otherwise the PHB will get upset. Break the build, wear the cone of shame. What a crappy practice. – Warren P Feb 09 '12 at 17:16
  • @WarrenP, there are sometimes other issues, but let's be clear: carelessness is the first reason why builds break. If you know that something breaks the build, why check it in? – AProgrammer Feb 09 '12 at 18:02
24

"Break the build" means to prevent a build from completing successfully. A failing test doesn't do that. It is an indication that the build has known defects, which is exactly correct.

Most build servers track the status of tests over time, and will assign a different classification to a test that's been failing since it was added vs a regression (test that used to pass and no longer does), and also detect the revision in which the regression took place.

Ben Voigt
  • 3,227
  • 21
  • 24
  • 12
    This isn't always correct, often teams consider a failing test as a broken build--in fact most of the teams I've seen lately do (It's a typical agile practice). With most agile teams a failing test is a case where you stop the line--the entire team attacks the problem and solves it. I suppose I could take your post to mean that the response has to be based on your practices, in which case it's totally accurate. – Bill K Jan 21 '12 at 00:03
  • 2
    I always consider a failing test to mean the build has failed. – John Saunders Jan 25 '12 at 19:01
  • @JohnSaunders: But that isn't what it means. As I said in my answer, it means "the build has known defects". – Ben Voigt Jan 25 '12 at 19:50
  • I suppose there are different definitions. The one I use is "the build fails". Perhaps this is because I use TFS, which includes defect tracking, that I don't need a definition that means "there are known defects". "Known defects" are expected, until they are prioritized to be fixed. – John Saunders Jan 25 '12 at 19:53
  • @JohnSaunders: Without tests, TFS isn't doing a very good job of tracking defects. – Ben Voigt Jan 25 '12 at 20:06
  • 1
    Who said there were no tests? Where do you get that? I mean that my first step after finding the bug is not to prevent the build from succeeding, but rather to create a detailed bug report. When the bug is being fixed, there should first be a failing unit test. – John Saunders Jan 25 '12 at 21:10
  • @JohnSaunders: Ahh, but you assume that the test won't pass until a developer specifically works on that bug. That's a very dangerous assumption. Often work necessary for fixing another bug will cause this test to pass. By your own words, "first there should be a failing unit test". And a detailed bug report should include a repro case, which is what a test is. – Ben Voigt Jan 25 '12 at 21:51
  • 1
    I have little problem with the immediate creation of a failing test. I just don't want it checked in to the set of tests that will run upon build. I also want the developer who fixes the bug to be able to ignore that test. In most places I've worked, the people who create the bug report will not be creating unit tests. – John Saunders Jan 25 '12 at 22:04
17

I would argue that the failing test should be added, but not explicitly as "a failing test."

As @BenVoigt points out in his answer, a failing test doesn't necessarily "break the build." I guess the terminology can vary from team to team, but the code still compiles and the product can still ship with a failing test.

What you should ask yourself in this situation is,

What are the tests meant to accomplish?

If the tests are there just to make everyone feel good about the code, then adding a failing test just to make everyone feel bad about the code doesn't seem productive. But then, how productive are the tests in the first place?

My assertion is that the tests should be a reflection of the business requirements. So, if a "bug" has been found that indicates a requirement is not properly met, then it is also an indication that the tests do not properly or fully reflect the business requirements.

That is the bug to be fixed first. You're not "adding a failing test." You're correcting the tests to be a more accurate reflection of the business requirements. If the code then fails to pass those tests, that's a good thing. It means the tests are doing their job.

The priority of fixing the code is to be determined by the business. But until the tests are fixed, can that priority truly be determined? The business should be armed with the knowledge of exactly what is failing, how it is failing, and why it is failing in order to make their decisions on priority. The tests should indicate this.

Having tests which don't fully pass isn't a bad thing. It creates a great artifact of known issues to be prioritized and handled accordingly. Having tests which don't fully test, however, is a problem. It calls into question the value of the tests themselves.

To say it another way... The build is already broken. All you're deciding is whether or not to call attention to that fact.

David
  • 1,560
  • 9
  • 16
  • 1
    Your assertion is wrong; unit tests don't necessarily map directly to business requirements, whereas functional or end-to-end tests probably do, but the OP was talking about unit tests/TDD. – John Buchanan Jan 20 '12 at 18:13
  • @JohnBuchanan: What are the tests meant to validate, if not that the software is doing what it is supposed to do? (That is, that it meets the requirements.) There are, as you state, other forms of tests outside of unit tests. But I fail to see the value in unit tests which don't validate that that software meets the needs of the business. – David Jan 20 '12 at 18:17
  • 1
    @JohnBuchanan - he didn't say "the tests ARE a reflection of the business requirements", he said "SHOULD BE". Which is true, but debatable. You're right in claiming that this isn't always the case - some people write unit tests that don't map to business requirements - although (in my opinion) they're wrong to do so. If you want to claim that David's assertion is wrong, you could say something about why you think so. – Dawood ibn Kareem Jan 21 '12 at 00:29
14

In our test automation team, we check in failing tests as long as they fail due to a defect in the product rather than a defect in the test. That way we have proof for the dev team that hey, they broke it. Breaking the build is highly frowned upon, but that's not the same as checking in perfectly compilable but failing tests.

Yamikuronue
  • 702
  • 3
  • 13
  • 4
    I think @MattDavey's idea is an excellent one, and I have argued for it in the past. I have always met a stone wall of resistance - "you should never break the build!". The idea that the build is *already broken* in this situation seems impossible for people to grasp. Sadly, this is just another case where a good idea (automatic testing and clean builds) has devolved into a cargo-cult practice whose adherents know the rule but not the reason. – Tom Anderson Jan 21 '12 at 10:07
  • 3
    One idea I came up with is that the QA team (if they're technical enough to write tests) should be allowed to write failing tests for bugs, and check them in. The developers' obsession with the green bar would then lead to absolutely prioritising fixing bugs over adding features, which is the correct way to do development. – Tom Anderson Jan 21 '12 at 10:09
  • But these should not be unit tests that will run during the build. If your environment contains a test management system for QA (like Microsoft Test Manager), then certainly, one or more test cases should be added and linked to the bug, but this would not prevent the build from succeeding - it would simply be a test case that has to pass before the bug is considered "Done". – John Saunders Jan 25 '12 at 19:03
8

Writing a test that you know will fail until the bug is fixed is a good idea; it's the basis of TDD.

Breaking the build is a bad idea. Why? Because it means that nothing can move on until it is fixed. It essentially blocks all other aspects of development.

Example 1
What if your application is very large, with many components? What if those components are worked on by other teams with their own release cycles? Tough! They have to wait for your bug fix!

Example 2
What if the first bug is hard to fix and you find another bug with a higher priority? Do you also break the build for the second bug? Now you can't build until both are fixed. You have created an artificial dependency.

There is no logical reason why failing a test should stop a build. This is a decision that the dev team needs to make (perhaps with management discussion) weighing up the pros and cons of releasing a build with known bugs. This is very common in software development, pretty much all major software is released with at least some known issues.

gnat
  • 21,442
  • 29
  • 112
  • 288
Qwerky
  • 1,582
  • 8
  • 17
5

It depends on the bug, of course, but generally if something went wrong that wasn't identified during manual or automated testing, then that implies there is a gap in your coverage. I would definitely encourage figuring out the root cause and slapping a unit test case on top of the problem.
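As a sketch of that idea, assuming JUnit 4 (the issue number, class, and helper are hypothetical): once the root cause is known, the regression test targets it directly rather than the surface symptom, closing the coverage gap at the level where the defect lived.

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    public class OrderParserRegressionTest {

        // Hypothetical code under test. The production incident was a defect on
        // orders with an empty item list - a case no existing test exercised.
        static int countItems(String csvLine) {
            if (csvLine == null || csvLine.isEmpty()) {
                return 0; // the fix: previously "".split(",") made this return 1
            }
            return csvLine.split(",").length;
        }

        // Regression test added on top of the root cause, named after the issue
        // so the link back to the report is obvious.
        @Test
        public void issue2048_emptyOrderHasZeroItems() {
            assertEquals(0, countItems(""));
            assertEquals(0, countItems(null));
        }
    }

Naming the test after the issue also makes it easy to find when the hot fix is merged back to the mainline.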

If it's a production issue that is planned for a hot fix from a maintenance branch, you also need to ensure that the fix works on the mainline, and that a developer can't erroneously blow away the fix with an over-zealous merge conflict resolution.

Additionally, depending on your release policy, the presence of newly updated unit tests can help confirm that a developer has actually fixed the problem, rather than merely changing it, although this depends on having encoded the correct requirements in the new unit tests.

Keith Brings
  • 486
  • 2
  • 5
5

It depends on the role the tests are supposed to play and how their results are supposed to affect the build system/process adopted. My understanding of breaking the build is the same as Ben's, and at the same time we should not knowingly check in code that breaks existing tests. If those tests came in "later", it might be "okay" to ignore them to avoid unnecessary disruption to others, but I find the practice of ignoring failing tests (so that they appear to pass) rather disturbing (especially so for unit tests), unless there is a way to indicate such tests as neither red nor green.

prusswan
  • 201
  • 2
  • 9
  • *"unless there is a way to indicate such tests as neither red nor green"* Just for the record: Most unit testing frameworks do distinguish red, green and ignored tests. At least JUnit and TestNG do (they report "xx test, x failed, x ignored"). – sleske Jan 21 '12 at 16:52
  • @sleske that would be ideal, I just want to be sure that ignoring failures does not automatically turn them into successes – prusswan Jan 21 '12 at 16:57
  • Aren't there YELLOW builds? (Red/Green/Yellow) in Hudson/Jenkins, Cruise Control, and all the other biggies? – Warren P Feb 09 '12 at 17:18
  • @Warren P it works when people ignore tests correctly, but some ignore tests by making them green – prusswan Feb 09 '12 at 23:15
5

One problem with adding a known-to-fail test to the build is that your team may get into the habit of ignoring failing tests because they expect the build to fail. It depends on your build system, but if a failing test doesn't always mean "something has just broken", then it's easy to pay less attention to failing tests.

You don't want to help your team get into that mindset.

So I agree with sleske that you should add the test, but mark it as 'ignored' for the purpose of the automatic build, until the bug is fixed.

Wilka
  • 173
  • 6
4

While I think you should in some way 'check in' the bug as a test, so that when you fix it it does not happen again, and in some way prioritize it, I think it's best not to break the build (/the tests). The reason is that subsequent build-breaking commits will be hidden behind your broken test. So by checking in a broken test for this bug, you demand that the whole team set their work aside to fix this bug. If that does not happen, you might end up with breaking commits which are not traceable as such.

Therefore I would say it's better to commit it as a pending test, and make it a priority in your team not to have pending tests.
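One way to express such a "pending" test, assuming JUnit 4.12 or later (the bug number and names are hypothetical): a deliberately failed assumption makes the runner report the test as skipped rather than passed or failed, so it stays visible in reports without breaking the build.

    import static org.junit.Assert.assertTrue;

    import org.junit.Assume;
    import org.junit.Test;

    public class PendingBugTest {

        // A "pending" test: the failed assumption makes JUnit report the test
        // as skipped (neither red nor green) until the guard line is removed.
        @Test
        public void issue1234_pendingUntilBugIsFixed() {
            Assume.assumeTrue("pending: Bug #1234 not fixed yet", false);
            assertTrue(buggyFeatureWorks()); // unreachable until the assumption is lifted
        }

        // Hypothetical stand-in for the behaviour the real test would exercise.
        private boolean buggyFeatureWorks() {
            return false;
        }
    }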

markijbema
  • 379
  • 1
  • 6
4

Another option is to check the failing test into a separate branch in your source control system. Depending on your practices, this may or may not be feasible. We sometimes open a new branch for ongoing work, such as fixing a bug that is not trivial.

Ola Eldøy
  • 211
  • 2
  • 4