151

I've seen the book Working Effectively with Legacy Code recommended a few times. What are the key points of this book?

Is there much more to dealing with legacy code than adding unit/integration tests and then refactoring?

Armand
  • 6,508
  • 4
  • 37
  • 53
  • 10
    [Short review](http://www.dmst.aueb.gr/dds/pubs/Breview/2005-CR-Legacy/html/review.html) by [Diomidis Spinellis](http://en.wikipedia.org/wiki/Diomidis_Spinellis) – yannis Nov 28 '11 at 07:51
  • 2
    Of course the point is to add tests and then refactor. The book is largely about *how* you manage to get impossibly convoluted code under test, and there are a lot of eye-openers on that point. Let's just say that Feathers doesn't take any prisoners! – Kilian Foth Nov 28 '11 at 07:57
  • 9
    Perhaps you should just read the book – HLGEM Nov 28 '11 at 18:48
  • Nice review here: http://www.andreaangella.com/2014/03/book-working-effectively-with-legacy.html – rohancragg Nov 03 '14 at 15:42
  • @yannis Link for Short review has gone and had no redirect. Here is the new link to the [review](https://www.productivecsharp.com/2014/03/book-working-effectively-with-legacy-code/) by same guy Andrea Angella. – Animesh Jan 10 '20 at 07:29

5 Answers5

177

The key problem with legacy code is that it has no tests. So you need to add some (and then more...).

This in itself would take a lot of work, as @mattnz noted. But the special problem of legacy code is that it was never designed to be testable. So typically it is a huge convoluted mess of spaghetti code, where it is very difficult or downright impossible to isolate small parts to be unit tested. So before unit testing, you need to refactor the code to make it more testable.

However, in order to refactor safely, you must have unit tests to verify that you haven't broken anything with your changes... This is the catch 22 of legacy code.

The book teaches you how to break out of this catch by making the absolute minimal, safest changes to the code just to enable the first unit tests. These aren't meant to make the design nicer - only to enable unit tests. In fact, sometimes they do make the design uglier or more complex. However, they allow you to write tests - and once you have unit tests in place, you are free to make the design better.

There are lots of tricks to make the code testable - some are sort of obvious, some are not at all. There are methods I would have never thought about myself, without reading the book. But what is even more important is that Feathers explains what precisely makes a code unit testable. You need to cut dependencies and introduce barriers into your code, but for two distinct reasons:

  • sensing - in order to check and verify the effects of executing a piece of code, and
  • separation - in order to get the specific piece of code into a test harness first of all.

Cutting dependencies safely can be tricky. Introducing interfaces, mocks and Dependency Injection is clean and nice as a goal, just not necessarily safe to do at this point. So sometimes we have to resort to subclassing the class under test in order to override some method which would normally e.g. start a direct request to a DB. Other times, we might even need to replace a dependency class/jar with a fake one in the test environment...

To me, the most important concept brought in by Feathers is seams. A seam is a place in the code where you can change the behaviour of your program without modifying the code itself. Building seams into your code enables separating the piece of code under test, but it also enables you to sense the behaviour of the code under test even when it is difficult or impossible to do directly (e.g. because the call makes changes in another object or subsystem, whose state is not possible to query directly from within the test method).

This knowledge allows you to notice the seeds of testability in the nastiest heap of code, and find the minimal, least disruptive, safest changes to get there. In other words, to avoid making "obvious" refactorings which have a risk of breaking the code without you noticing - because you don't yet have the unit tests to detect that.

Péter Török
  • 46,427
  • 16
  • 160
  • 185
  • 9
    Note when the above answer says _"unit tests"_ it in fact means _"automated tests"_. For a legacy app, a large fraction of the initially useful automated tests will in fact be _**integration tests**_ (where the test logic executes a larger slab of the overall code and may produce failures due to defects in many different places), rather than true _unit tests_ (that aim to analyze just one module and execute far smaller portions of the code each). – Lutz Prechelt Jan 18 '19 at 08:40
111

Quick ways to get the key points of Working Effectively With Legacy Code

MarkJ
  • 2,827
  • 1
  • 18
  • 18
  • 4
    The MP3 link on that Hanselminutes page is broken, but the one on http://www.hanselminutes.com/165/working-effectively-with-legacy-code-with-michael-feathers is not - http://s3.amazonaws.com/hanselminutes/hanselminutes_0165.mp3. – Peter Mortensen Feb 22 '14 at 08:09
  • Thanks rosston for fixing the PDF link. Looks like objectmentor.com has gone - maybe "Uncle Bob" went out of business? – MarkJ Feb 18 '16 at 14:07
  • 1
    I'm not sure what's happened to object mentor, but these days Uncle Bob works for 7th Light. – Jules Feb 18 '16 at 22:03
45

I work on a code base of millions of lines of code, some dating back to the 1980's. It's just software, so it's just a matter of writing a few unit tests, so you can go and just refactor it, and make it so much better.

The key word here is just - it's a four-letter word that does not belong in any programmer's vocabulary, let alone one who is working on legacy systems.

How long do you think it takes to write a unit test, to test one hour worth of development effort? For discussion sake, let's say another hour.

How much time is invested in that million-line, 20-year-old legacy system? Let's say, 20 developers for 20 years times 2000 hours / annum (they worked pretty hard). Let's now pick a number - you have new computers and new tools, and you are so much more clever than the guys who wrote this piece of $%^^ in the first place - let's say you are worth 10 of them. Have you got 40 man years, well, have you...?

So the answer to your question is there is much much more. For instance, that routine that's 1000 lines (I have a few that are over 5000), it is overly complex and is a piece of spaghetti. It would only (yet another four letter word) take a couple of days to re-factor it into a few 100 line routines and a few more 20 lines helpers, right? WRONG. Hidden in those 1000 lines is 100 bug fixes, each one an undocumented user requirement or an obscure edge case. It's 1000 lines because the original 100-line routine did not do the job.

You need to work with the mind set "if it ain't broke, don't fix it". When it is broke, you need to be very careful when you fix it - as you make it better, that you don't accidentally change something else. Note that "broke" may include code that is unmaintainable, but working correctly, that depends on the system and its use. Ask "what happens if I screw this up and make it worse", because one day you will, and you will have to tell the bosses' boss why you chose to do that.

These systems can always be made better. You will have a budget to work to, a timeline, whatever. If you don't - go and make one. Stop making it better when the money/time has run out. Add a feature, give yourself time to make it a little better. Fix a bug - again, spend a little extra time and make it better. Never deliver it worse than it was when you started.

gnat
  • 21,442
  • 29
  • 112
  • 288
mattnz
  • 21,315
  • 5
  • 54
  • 83
  • 2
    thanks for the tips! Are these yours or from the book? – Armand Nov 28 '11 at 08:50
  • 3
    Probably a bit of both - I read the book after a few years of doing this work, and should probably read it agian. Like any good book, it will make you challange some of what you are currently doing, re-enforce most of what you do, but does not have all the answers for your specific set of problems. – mattnz Nov 28 '11 at 19:14
  • 7
    "It's 1000lines because the orginal 100line routine did not do the job." So very rarely does this appear to actually be the case. More often than not it's 1,000 lines simply because the original developer rolled up his/her sleeves and started coding before sparing even a moment for planning or design. – Stephen Touset Aug 22 '12 at 15:16
  • @Stephen: Are you saying that people who wrote a majority of the code in large critical systems used today did not do planning and design. More importantly, are you saying you do a better job and are better than them. Is that not just a little arrogant. – mattnz Aug 22 '12 at 23:06
  • 3
    No. I'm saying that in most of the cases (I've personally encountered), 1,000 line routines are clear indications that people started writing code before thinking about how to write a proper abstraction. Thousand line routines are virtually by definition too complicated – are you saying that a thousand-line routine with hundreds of hidden, uncommented bugfixes is the hallmark of a responsible developer? – Stephen Touset Aug 23 '12 at 04:53
  • 19
    If you believed every post on this site, everyone has to deal with 1000 line spaghetti code, yet no one ever wrote it. In my experience, 1000 (and 10000) line routines are the hall mark of developers who are doing the best they can with what they have, to deliver what is required of them by the boss who pays their wages. I find it insulting and arrogant the way so many developers feel free to comment from the sidelines with no knowledge of the circumstances, while never having to expose there own work to the community to critique. – mattnz Aug 24 '12 at 01:41
  • 1
    Sorry for the late comment, but *I've* written 1000 line routines, albeit they weren't spaghetti code. So I'll own up to that. Yes, it was structured, not OO code. And yes, we had severe time constraints. Generally, those routines were the simplified versions of what I had started with from the design. The routines I'm thinking of were modeling very complex processes, and I commented liberally as to why things were being done. OTOH, I have also paid my karmic dues and _fixed_ 1000+ line routines, too. –  Jan 16 '13 at 17:08
  • @mattnz: I would have been a much worse programmer in the 1980s than today because I wouldn't have been able to read books like Refactoring and others on TDD. So, it's not arrogance I think, just an acceptance that best practice moves forward. I do not blame anyone for writing code that was the best they could do under the circumstances. I hope that 30 years from now things have improved again and the stuff we write today is seen as kind of crap (because some very clever people discover new paradigms of programming that makes things better). – Sean Sep 08 '14 at 07:15
  • 1
    I work in a system with 1000 line routines, written by my boss whom I report to directly. There is a ton of functionality that's there because of business requirements - and a ton that even he occasionally doesn't remember the function of. Add global state to sweeten the deal... At the end of the day, there is no replacement for good design. I've refactored many of these methods, and have to say, it is worth the time and effort. – Bryan Rayner Jan 16 '16 at 15:19
  • +1 for timeboxing +1 for a need to get yourself a budget for improvement +10 for putting the two in a single statement – przemo_li Jul 05 '19 at 13:09
23

There are two key points to take away from the book.

  1. Legacy code is any code that does not have test coverage.
  2. Whenever you have to change legacy code, you should make sure it has coverage.

As other responders have pointed out, trying to pre-emptively update your existing legacy code is a fool's errand. Instead, whenever you have to make a change to legacy code (for a new feature or a bug fix), take the time to remove its legacy status.

Peter Mortensen
  • 1,050
  • 2
  • 12
  • 14
Michael Brown
  • 21,684
  • 3
  • 46
  • 83
9

In a nut shell that's true - adding tests and refactoring is what's it all about.

But the book gives you many different techniques to do so with code that is very difficult to test and refactor safely.

Oded
  • 53,326
  • 19
  • 166
  • 181