
I recently watched "All the Little Things" from RailsConf 2014. During this talk, Sandi Metz refactors a function that includes a large nested if-statement:

def tick
    if @name != 'Aged Brie' && @name != 'Backstage passes to a TAFKAL80ETC concert'
        if @quality > 0
            if @name != 'Sulfuras, Hand of Ragnaros'
                @quality -= 1
            end
        end
    else
        ...
    end
    ...
end

The first step is to break the function into several smaller ones:

def tick
    case name
    when 'Aged Brie'
        return brie_tick
    ...
    end
end

def brie_tick
    @days_remaining -= 1
    return if quality >= 50

    @quality += 1
    @quality += 1 if @days_remaining <= 0
end

What I found interesting was the way these smaller functions were written. brie_tick, for example, was not written by extracting the relevant parts of the original tick function, but from scratch by referring to the test_brie_* unit tests. Once all of these unit tests passed, brie_tick was considered done. Once all of the small functions were done, the original monolithic tick function was deleted.

Unfortunately, the presenter seemed unaware that this approach led to three of the four *_tick functions being wrong (and the other was empty!). There are edge cases in which the behaviour of the *_tick functions differs from that of the original tick function. For example, @days_remaining <= 0 in brie_tick should be < 0 - so brie_tick does not work correctly when called with days_remaining == 1 and quality < 50.
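The divergence is easy to demonstrate. Below is a condensed, hypothetical distillation of just the brie quality logic (not the talk's full code), contrasting the original guard with the rewritten one:

```ruby
# Condensed sketch of the Aged Brie logic, for illustration only.

# Original behaviour: quality rises by 1 per day, and by one more only
# once the sell-by date has passed (days_remaining < 0 after the
# decrement). Each increment is capped at 50.
def original_brie(days_remaining, quality)
  days_remaining -= 1
  quality += 1 if quality < 50
  quality += 1 if days_remaining < 0 && quality < 50
  [days_remaining, quality]
end

# The rewritten brie_tick from the talk, with the off-by-one:
# <= 0 fires the bonus increment a day early.
def rewritten_brie(days_remaining, quality)
  days_remaining -= 1
  return [days_remaining, quality] if quality >= 50

  quality += 1
  quality += 1 if days_remaining <= 0
  [days_remaining, quality]
end

original_brie(1, 10)   # => [0, 11]
rewritten_brie(1, 10)  # => [0, 12] -- diverges on the boundary
```

For any other value of days_remaining the two agree; only the sell-by boundary exposes the rewrite's bug.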

What has gone wrong here? Is this a failure of testing - because there were no tests for these particular edge cases? Or a failure of refactoring - because the code should have been transformed step-by-step rather than rewritten from scratch?

user200783
  • I'm not sure I get the question. Of course it is OK to rewrite code. I am not sure what you mean specifically by "is it okay to *simply* rewrite code." If you're asking "Is it okay to rewrite code without putting a lot of thought into it," the answer is no, just as it is not OK to *write* code in that manner. – John Wu Oct 24 '18 at 04:55
  • This happens often due to test plans mainly focused on testing use cases of success and very little (or not at all) on covering error use cases or sub-use cases. So it's mainly a leak of coverage. A leak of testing. – Laiv Oct 24 '18 at 06:38
  • @JohnWu - I was under the impression that refactoring was generally done as a series of small transformations to the source code ("extract-method" etc.) rather than by simply rewriting code (by which I mean writing it again from scratch without even looking at the existing code, as done in the linked presentation). – user200783 Oct 24 '18 at 09:01
  • @JohnWu - Is rewriting from scratch an acceptable refactoring technique? If not, it's disappointing to see such a well-regarded presentation on refactoring take that approach. OTOH if it is acceptable, then unintended changes in behaviour can be blamed on missing tests - but is there any way to be confident that tests cover all possible edge cases? – user200783 Oct 24 '18 at 09:07
  • @User200783 Well that is a bigger question, isn't it (how do I ensure my tests are comprehensive?) Pragmatically, I would probably run a code coverage report before making any changes, and carefully examine any areas of code that don't get exercised, ensuring the development team is mindful of them as they rewrite the logic. – John Wu Oct 24 '18 at 15:46

4 Answers


Is this a failure of testing - because there were no tests for these particular edge cases? Or a failure of refactoring - because the code should have been transformed step-by-step rather than rewritten from scratch?

Both. Refactoring using only the standard steps from Fowler's original book is definitely less error-prone than a rewrite, so it is often preferable to stick to those kinds of baby steps. Even if there are no unit tests for every edge case, and even if the environment does not provide automated refactorings, a single code change like "introduce explaining variable" or "extract function" has a much smaller chance of changing the behavioural details of the existing code than a complete rewrite of a function.

Sometimes, however, rewriting a section of code is what you need or want to do. And if that's the case, you need better tests.

Note that even when using a refactoring tool, there is always some risk of introducing errors when you change code, regardless of whether you take smaller or bigger steps. That's why refactoring always needs tests. Note also that tests can only reduce the probability of bugs, never prove their absence - nevertheless, techniques like code inspection and branch-coverage analysis can give you a high level of confidence, and in the case of rewriting a section of code, it is often worth applying them.

Doc Brown
  • Thanks, that makes sense. So, if the ultimate solution to undesirable changes in behaviour is to have comprehensive tests, is there any way to be confident that tests cover all possible edge cases? It would be possible to have 100% coverage of `brie_tick` while still never testing the problematic `@days_remaining == 1` case - for example, by testing only with `@days_remaining` set to `10` and `-10`. – user200783 Oct 24 '18 at 09:20
  • You can never be absolutely certain that the tests cover all possible edge cases, since it isn't feasible to test with all possible inputs. But there are lots of ways to gain more confidence in tests. You could look into *mutation testing*, which is a way to test the effectiveness of the tests. – bdsl Oct 24 '18 at 10:30
  • In this instance, the missed branches could have been caught with a code coverage tool while developing the tests. – cbojar Oct 24 '18 at 14:44
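To make the coverage point concrete, here is a hypothetical, condensed version of the buggy delta logic, with a pair of tests that achieve full branch coverage yet never exercise the boundary where `<= 0` and `< 0` disagree:

```ruby
# Condensed stand-in for the rewritten brie logic (illustrative only):
# returns how much quality should increase on one tick.
def brie_quality_delta(days_remaining)
  days_remaining -= 1
  delta = 1
  delta += 1 if days_remaining <= 0   # bug: should be < 0
  delta
end

# Two tests, 100% branch coverage: the conditional is both taken and
# not taken. Both pass against the buggy code.
raise unless brie_quality_delta(10)  == 1  # branch not taken
raise unless brie_quality_delta(-10) == 2  # branch taken

# Yet the boundary still escapes: the original <-gated logic gives 1 here.
brie_quality_delta(1)  # => 2
```

A mutation-testing tool makes this visible: mutating `<= 0` to `< 0` produces a mutant that both tests still pass, and a surviving mutant is exactly the signal that a boundary test is missing.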

What has gone wrong here? Is this a failure of testing - because there were no tests for these particular edge cases? Or a failure of refactoring - because the code should have been transformed step-by-step rather than rewritten from scratch?

One of the things that is really hard about working with legacy code: acquiring a complete understanding of the current behavior.

Legacy code without tests that constrain all of its behaviors is a common pattern in the wild. That leaves you with a guess: are the unconstrained behaviors free variables, or underspecified requirements?

From the talk:

Now this is real refactoring according to the definition of refactoring; I'm going to refactor this code. I'm going to change its arrangement without altering its behavior.

This is the more conservative approach; if the requirements may be underspecified, if the tests do not capture all of the existing logic, then you have to be very very careful about how you proceed.

For certain, you can assert that if the tests are inadequately describing the behavior of the system, that you have a "failure of testing". And I think that's fair -- but not actually useful; this is a common problem to have in the wild.

Or a failure of refactoring - because the code should have been transformed step-by-step rather than rewritten from scratch?

The issue isn't quite that the transformations should have been step-by-step, but rather that the choice of refactoring tool (a human at a keyboard rather than guided automation) wasn't well aligned with the test coverage, given the higher error rate.

This could have been addressed either by using refactoring tools with higher reliability or by introducing a wider battery of tests to improve the constraints on the system.
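One way to widen that battery of tests without hand-writing every edge case is a golden-master (characterization) test: treat the legacy tick as the oracle and sweep the input space, comparing old and new behaviour. A sketch, with illustrative names not taken from the talk:

```ruby
# Golden-master sketch: old_impl and new_impl are callables taking
# (days_remaining, quality) and returning the resulting quality.
# Raises on the first input where the two implementations disagree.
def check_against_golden_master(old_impl, new_impl)
  (-5..5).each do |days|
    (0..50).each do |quality|
      old_result = old_impl.call(days, quality)
      new_result = new_impl.call(days, quality)
      unless old_result == new_result
        raise "divergence at days=#{days}, quality=#{quality}: " \
              "#{old_result.inspect} vs #{new_result.inspect}"
      end
    end
  end
end
```

Run against the monolithic tick before it is deleted, a sweep like this flags boundary divergences such as the `<= 0` vs `< 0` bug automatically, with no need to guess which edge cases matter.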

So I think your conjunction is poorly chosen; AND not OR.

VoiceOfUnreason

Refactoring should not change the externally visible behaviour of your code. That’s the goal.

If your unit tests fail, that indicates you changed the behaviour. But passing unit tests is never the goal in itself; it only helps, to a greater or lesser degree, in reaching the real goal. If refactoring changes the externally visible behaviour and all unit tests still pass, then your refactoring has failed.

In this case, the passing unit tests only give you a false sense of success. So what went wrong? Two things: the refactoring was careless, and the unit tests were not very good.

gnasher729

If you define "correct" to be "the tests pass", then by definition it is not wrong to change untested behaviour.

If a particular edge behaviour should be defined, add a test for it; if not, it's OK not to care what happens there. If you are really pedantic, you can write a test that trivially passes in that edge case, to document that you don't care what the behaviour is.
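Such a "pin" test might look like this - a condensed, hypothetical stand-in for the brie logic, not the talk's code:

```ruby
# Condensed stand-in for the contested brie logic: the quality delta
# for one tick (illustrative only).
def brie_delta(days_remaining)
  days_remaining -= 1
  days_remaining <= 0 ? 2 : 1
end

# A deliberately loose test for the contested boundary: it documents
# that the edge case was considered, while declaring either outcome
# acceptable rather than constraining the behaviour.
result = brie_delta(1)
raise unless [1, 2].include?(result)
```

The test cannot fail on the `<=`-vs-`<` question, which is precisely the point: it records a decision not to care, rather than an untested gap.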

Caleth