22

I am a software engineer at a medium-sized company. We have a fairly robust testing platform running on TeamCity. It runs unit tests on every check-in, plus a daily unit test/BVT run.

The problem is that we have a great many broken unit tests.

Quite often, I bring up the pointlessness of unit tests if they are constantly breaking and unmaintained. Being unable to see if a change has caused a regression removes most of the value of a unit testing platform.

I would like to get a seed planted that will create a culture of good habits - fixing tests when they're broken, seeing them as valuable, prioritizing the fixing of tests along with other work.

I've tried bribery (baked goods!), just plain asking, and speaking to team leads. Everyone says that it's a good idea, but I seem to be the only one doing anything about it.

What is the best way to get started on encouraging others to fix their tests, and prioritize test fixing within their sprints?

If there is a less subjective way to ask this, I would be happy to accept any tips.

Codeman
  • 2
    automatic nerf gun aimed at the guy breaking the build... – ratchet freak Nov 14 '13 at 01:31
  • We actually do have an automatic nerf gun... but the build isn't broken, just the unit tests :) – Codeman Nov 14 '13 at 01:34
  • 7
    Breaking the unit tests should imply breaking the build. ;) – Jeroen Vannevel Nov 14 '13 at 01:41
  • That might be part of the problem! – Codeman Nov 14 '13 at 01:43
  • sprints of only testing – ratchet freak Nov 14 '13 at 02:39
  • This is all about buy-in, and technical solutions will only work if you have team support. Breaking the build if tests fail just leads to tests being removed "it failed, so it's obviously broken" unless your team change their attitude towards unit tests. There's a lot of other questions here about this problem. – Móż Nov 14 '13 at 02:40
  • If you were using Maven then your build tool would punish your team for having broken tests. Of course then I have seen people circumvent this by commenting out broken tests :( – maple_shaft Nov 14 '13 at 03:04
  • @maple_shaft: That's why you also fail if the coverage drops too low. – Aaronaught Nov 14 '13 at 03:22
  • 2
    @Ӎσᶎ: Buy-in is important, but you can't get buy-in for resolving an issue until people are actually *aware* of the issue. In this case the buy-in need only *initially* come from team leads and managers. Developer buy-in can come later and generally will, naturally, when the build system has been set up to make individual developers pay for their own mistakes. – Aaronaught Nov 14 '13 at 03:25
  • 2
    If doughnuts failed, you're toast. :-) – Ross Patterson Nov 14 '13 at 11:12
  • @JeroenVannevel is right. In our TeamCity setup, the build is red if any tests fail. – Ross Patterson Nov 14 '13 at 11:13
  • The build IS red. It just still allows the deployment to go on. – Codeman Nov 14 '13 at 22:26

2 Answers

28

Make it so that it's impossible to actually release anything without fixing the tests.

  1. Fail the build if any tests fail.
  2. Fail the build if any tests are ignored.
  3. Fail the build if test coverage goes below a certain level (so people can't just delete tests to work around it).
  4. Use the CI server to do your release builds, and only allow builds from the server's build drop to be promoted to UAT/staging/production/whatever.
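
The four rules above can be sketched as a single promotion gate. This is only an illustration, not TeamCity configuration: `RunSummary`, `gate_violations`, `can_promote` and the 80% default are hypothetical names and values, and a real setup would read these numbers from the CI server's own reports.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunSummary:
    """Hypothetical summary of one CI test run (not a TeamCity API)."""
    failed: int
    ignored: int
    coverage_percent: float


def gate_violations(summary: RunSummary, min_coverage: float = 80.0) -> List[str]:
    """Return the reasons this build should fail; an empty list means it passes."""
    reasons = []
    if summary.failed > 0:  # rule 1: any failing test fails the build
        reasons.append(f"{summary.failed} test(s) failed")
    if summary.ignored > 0:  # rule 2: ignored tests fail the build too
        reasons.append(f"{summary.ignored} test(s) ignored")
    if summary.coverage_percent < min_coverage:  # rule 3: coverage floor
        reasons.append(
            f"coverage {summary.coverage_percent:.1f}% is below {min_coverage:.1f}%"
        )
    return reasons


def can_promote(summary: RunSummary, min_coverage: float = 80.0) -> bool:
    """Rule 4: only a build with no violations may be promoted."""
    return not gate_violations(summary, min_coverage)
```

The coverage floor is what stops someone from "fixing" a red build by deleting or commenting out the failing tests.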

The fact of the matter is, if your build is broken for more than about 15 minutes at a time (and that includes failing tests), then you aren't doing continuous integration.

The "nuclear option" is to have your source control server refuse commits/checkins from any user other than the one who broke the build. Obviously an admin needs to be able to override this temporarily if said person goes on holiday - but, if everybody knows that the whole team is screwed until they fix their tests, then they'll resolve it damn quick.

A good policy (which is even better when it's automated) is to revert the source to the last known stable commit after 15 minutes of the build failing. In other words, if you can't fix it, or don't know what caused the build or test to break, then revert it and work locally until it's resolved - never ever make other developers twiddle their thumbs while you grind away at a problem they don't care about.
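
As a rough sketch of the timing rule only (the actual revert depends on your source control and CI tooling), the decision could look like this. `should_revert` and `REVERT_AFTER` are hypothetical names, and the sketch assumes the CI server can tell you when the build first went red.

```python
from datetime import datetime, timedelta
from typing import Optional

REVERT_AFTER = timedelta(minutes=15)  # the window suggested above


def should_revert(broken_since: Optional[datetime], now: datetime,
                  limit: timedelta = REVERT_AFTER) -> bool:
    """True once the build has been red for at least `limit`.

    `broken_since` is None while the build is green.
    """
    if broken_since is None:
        return False
    return now - broken_since >= limit
```

A scheduled job could call this every minute or so and trigger the revert to the last known stable commit when it returns `True`.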

P.S. If you already have a lot of tests failing, you can use a "trailing threshold" in CI. Set it up so that the build only fails if there are more test failures than last time. This, along with a coverage rule, will force developers to eventually improve the test situation if they want to be able to keep working.
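
A minimal sketch of that trailing threshold, with hypothetical function names. One assumption beyond what I wrote above: `next_baseline` never lets the allowed number of failures go back up, so a red build cannot raise the limit for the next run.

```python
def trailing_threshold_ok(previous_failures: int, current_failures: int) -> bool:
    """Pass only if this run has no more failing tests than the last one."""
    return current_failures <= previous_failures


def next_baseline(previous_failures: int, current_failures: int) -> int:
    """Ratchet (an assumption): the allowed failure count can only go down."""
    return min(previous_failures, current_failures)
```

For example, starting with 40 known failures, a run with 38 passes and the limit becomes 38; a later run with 39 fails the build even though it is still better than where you started.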

P.P.S. I realize this might seem draconian to some, but it's all down to your culture. If you get to a point where people just don't leave the build broken or tests failing (my team almost never does, although I occasionally have to remind them), then you don't need to continue with the strictest set of rules. That said, IMO you should always fail the build on a broken unit test. Integration/browser tests can fail sometimes.

Aaronaught
  • 1
    While all your technical hints are useful, I think the most valuable part of your answer is that “It's all your culture” because more than a discipline problem, it is a problem of perceived utility of the test. I would rather put it to the front than in a P.P.S. – user40989 Nov 14 '13 at 12:20
  • @user40989: I hear you. Culture is something you have to cultivate, though. If you want people to understand how important tests are, you sometimes have to make it so that people *can't* ignore them. Once people get used to a high level of code coverage and green tests, they won't *want* to go back, and then your own developers will enforce it for new recruits. Well, hopefully. An anal-retentive team lead and/or build master and/or manager helps. :) – Aaronaught Nov 14 '13 at 12:40
  • FWIW: Our entire release process is now automated and people wouldn't *think* of having broken tests. The team lead does a merge up to master, then starts a release build, and sends a promotion request to the sysadmins who literally push a button to deploy from the build artifacts and run automated browser and API tests. Nobody complains about this process, ever, because it *saves time* - we used to spend 2 weeks fussing over a release, now it's basically a handwave. This is what I mean by cultivating the culture - everyone knows that the extra 2 minutes to fix a test will save 2 hours later. – Aaronaught Nov 14 '13 at 12:45
10

Unit tests that fail are not the problem. They are a symptom.

The real problem is in the culture. You need to tread gently: here be dragons. You cannot change the culture by yourself, and being the squeaky wheel will, in the end, make you an outcast. Literally.

I suggest that you try to find a senior person to champion the cause and lead the way. If that fails, try raising it in a general developers' meeting, without pointing fingers or calling names. Another alternative is to take responsibility yourself for doing a proper job: just fix some more tests every time you do a check-in. Keep a chart on the wall showing how many tests fail over time. Others will see it: maybe they'll opt in.

There is no easy answer.

andy256
  • Being the squeaky wheel made me team lead. Maybe there was something wrong with your squeak? In all seriousness, though, that speaks to a very *different* culture problem, not with the dev team but with the management of the company. If management's response to a burning fire is to put on sunglasses, then just get the hell out of there. But if you're actually a dev *shop*, as opposed to an enterprise IT department churning out software from a cost center, it's fairly likely that managers care about things like how frequently and safely you can release to market. – Aaronaught Nov 14 '13 at 12:57