55

I was cruising around the programming blogosphere when I happened upon this post about GOTO's:

http://giuliozambon.blogspot.com/2010/12/programmers-tabu.html

Here the writer talks about how "one must come to the conclusion that there are situations where GOTOs make for more readable and more maintainable code" and then goes on to show an example similar to this:

if (Check#1)
{
    CodeBlock#1
    if (Check#2)
    {
        CodeBlock#2
        if (Check#3)
        {
            CodeBlock#3
            if (Check#4)
            {
                CodeBlock#4
                if (Check#5)
                {
                    CodeBlock#5
                    if (Check#6)
                    {
                        CodeBlock#6
                        if (Check#7)
                        {
                            CodeBlock#7
                        }
                        else
                        {
                            rest - of - the - program
                        }
                    }
                }
            }
        }
    }
}

The writer then proposes that using GOTO's would make this code much easier to read and maintain.

I personally can think of at least 3 different ways to flatten it out and make this code more readable without resorting to flow-breaking GOTO's. Here are my two favorites.

1 - Nested Small Functions. Take each if and its code block and turn it into a function. If the boolean check fails, just return. If it passes, then call the next function in the chain. (Boy, that sounds a lot like recursion, could you do it in a single loop with function pointers?)

2 - Sentinal Variable. To me this is the easyest. Just use a blnContinueProcessing variable and check to see if it is still true in your if check. Then if the check fails, set the variable to false.

How many different ways can this type of coding problem be refactored to reduce nesting and increase maintainability?

gnat
  • 21,442
  • 29
  • 112
  • 288
saunderl
  • 655
  • 1
  • 6
  • 7
  • 2
    None. `Check#1` is not a valid boolean expression in any language I'm aware of. – Edward Strange Feb 14 '11 at 20:39
  • 1
    This seems like it would be better served if it were asked on SO. – Walter Feb 14 '11 at 20:39
  • 1
    That is always a question in my mind, when I should use SO and when I should post here. Others seem to think I get it wrong a lot. :-) – saunderl Feb 14 '11 at 20:42
  • 23
    Crazy Eddie - Pseudo Code Dude. Not so literal. – saunderl Feb 14 '11 at 20:43
  • 2
    pseudo-code that doesn't say anything. Knowing what is being checked would be of some minor importance to how one might refactor it. – Edward Strange Feb 14 '11 at 20:53
  • Your code as posted checks for each variable and runs all code blocks up through the highest variable passed, and runs `rest-of-the-program` iff checks 1 through 6 succeed and 7 fails. Is that what you intended? Just checking. – David Thornley Feb 14 '11 at 20:55
  • 1
    @Crazy...not really. The bottom line is that there is a lot of state being maintained and maintained poorly at that. No matter what is being checked this code still screams state machine. – Pemdas Feb 14 '11 at 20:57
  • For clarification: is `CodeBlock#1` an actual executable block of code whose details have been omitted for simplicity, OR is it a *label* for the if-statement and nothing more? – FrustratedWithFormsDesigner Feb 14 '11 at 20:57
  • David Thornly - Yep, that was how the original Bloggers' code worked. – saunderl Feb 15 '11 at 18:54
  • 1
    @Walter: I think codereview.SE would be more appropriate than SO. – Chris Jun 19 '11 at 16:55
  • 1
    http://www.codinghorror.com/blog/2006/01/flattening-arrow-code.html – Job Jun 20 '11 at 02:21
  • 1
    Code blocks like this are quite common in parsers. I have never found a way to make it cleaner - every possible alternative has a large degree of suck-factor. – quickly_now Jun 20 '11 at 07:48

11 Answers11

58

That is called "Arrow Code" because of the shape of the code with proper indenting.

Jeff Atwood had a good blog post on Coding Horror about how to flatten out the arrows:

Flattening Arrow Code

Read the article for the full treatment, but here are the major points..

  • Replace conditions with guard clauses
  • Decompose conditional blocks into seperate functions
  • Convert negative checks into positive checks
  • Always opportunistically return as soon as possible from the function
JohnFx
  • 19,052
  • 8
  • 65
  • 112
46

It is really hard to tell without knowing how the different checks interact. Rigorous refactoring might be in order. Creating a topology of objects that execute the correct block depending on their type, also a strategy pattern or state pattern might do the trick.

Without knowing what to do best I would consider two possible simple refactorings that could be further refactored by extracting more methods.

The first one I don't realy like since I always like as litle exit points in a method (preferably one)

if (!Check#1)
{ 
    return;
}
CodeBlock#1

if (!Check#2)
{
    return;
}
CodeBlock#2
...

The second one remove's the multiple returns but also adds a lot of noise. (it basicly only removes the nesting)

bool stillValid = Check#1
if (stillValid)
{
  CodeBlock#1
}

stillValid = stillValid && Check#2
if (stillValid)
{
  CodeBlock#2
}

stillValid = stillValid && Check#3
if (stillValid)
{
  CodeBlock#3
}
...

This last one can be refactored nicely into functions and when you give them good names the result might be reasonable';

bool stillValid = DoCheck1AndCodeBlock1()
stillValid = stillValid && DoCheck2AndCodeBlock2()
stillValid = stillValid && DoCheck3AndCodeBlock3()

public bool DoCheck1AndCodeBlock1()
{
   bool returnValid = Check#1
   if (returnValid)
   {
      CodeBlock#1
   }
   return returnValid
}

All in all there are most likely way better options

KeesDijk
  • 8,918
  • 4
  • 35
  • 41
  • 2
    At this moment example1 exits when check#1 passes... should probably read if(!check#1){return;}codeblock#1; – Inca Jun 19 '11 at 11:07
  • 4
    Readability of the first refactoring is unbeatable. It also help reduce unneeded indentation. I use is as much as I can. – Ando Jun 19 '11 at 12:05
  • The first refactoring is not quite correct, you can get to check2 without check1 being true, and it seems that both have to be true. – Marcelo Jun 20 '11 at 02:21
  • 1
    @Inca and @Marcelo Hernández Rishmawy corrected, thanks ! – KeesDijk Jun 20 '11 at 09:52
  • That why I love Ruby: return if !check1 – Codism Aug 20 '15 at 16:22
  • "I always like as litle exit points in a method (preferably one)" - this is a prime example of cargo cult programming. The "single exit" schtick is based on limitations from the time of COBOL and is absolutely ridiculous in any modern language. – Davor Ždralo Oct 10 '17 at 09:27
29

I know some people will argue that it's a goto, but return; is the obvious refactoring, i.e.

if (!Check#1)
{ 
        return;
}
CodeBlock#1
if (!Check#2)
{
    return;
}
CodeBlock#2
.
.
.
if (Check#7)
{
    CodeBlock#7
}
else
{
    rest - of - the - program
}

If it really is just a bunch of guard checks before running the code, then this works fine. If it's more complicated than that, then this will only make it a bit simpler, and you need one of the other solutions.

Richard Gadsden
  • 737
  • 5
  • 12
  • 1
    I don't think this is valid at all... based on the logic shown above, *up to* all 7 code blocks could execute. In yours, *only one* code block could ever execute. – FrustratedWithFormsDesigner Feb 14 '11 at 20:56
  • You only need to tweak your logic and re-organize, return if `!Check#1`. – Nim Feb 14 '11 at 21:07
  • 1
    @FrustratedWithFormsDesigner Yes, this doesn't match the original. So it should be more like this... – Richard Gadsden Feb 14 '11 at 21:11
  • 2
    @moz Uhh, then what is refactoring? Adjusting whitespace? This example works just like the original. – Jay Feb 14 '11 at 21:40
  • @Jay - I edited it per the suggestion - click on the edited link to see how I got it wrong the first time. – Richard Gadsden Feb 16 '11 at 10:42
  • Yes I agree. This is the conclusion my current understanding of software development reaches.I do have a method that has 11 returns and one final 'Ok'. in my case, its a web api method so rightly, the caller needs to know if the method failed and why so it can display the message to the client. I feel it can be improved upon but all the 11 checks are necessary to do the final "Add to table" – Andy Nov 16 '16 at 18:26
11

That spaghetti code seems like the perfect candidate for refactoring it into a state machine.

Pemdas
  • 5,385
  • 3
  • 21
  • 41
  • 1
    Oh, a state machine ... I like that idea. In a way, the code is a Deterministic finite-state machine (DFA) – saunderl Feb 14 '11 at 20:45
6

This may already have been mentioned, but my "go-to" (pun intended) answer to the "arrowhead antipattern" shown here is to reverse the ifs. Test the opposite of the current conditions (easy enough with a not operator) and return from the method if that is true. No gotos (though pedantically speaking, a void return is little more than a jump to the line of calling code, with the trivial extra step of popping a frame off the stack).

Case in point, here's the original:

if (Check#1)
{
    CodeBlock#1
    if (Check#2)
    {
        CodeBlock#2
        if (Check#3)
        {
            CodeBlock#3
            if (Check#4)
            {
                CodeBlock#4
                if (Check#5)
                {
                    CodeBlock#5
                    if (Check#6)
                    {
                        CodeBlock#6
                        if (Check#7)
                        {
                            CodeBlock#7
                        }
                        else
                        {
                            rest - of - the - program
                        }
                    }
                }
            }
        }
    }
}

Refactored:

if (!Check#1) return;

CodeBlock#1

if (!Check#2) return;

CodeBlock#2

if (!Check#3) return;

CodeBlock#3

if (!Check#4) return;

CodeBlock#4

if (!Check#5) return;

CodeBlock#5

if (!Check#6) return;

CodeBlock#6

if (Check#7)
    CodeBlock#7
else
{
    //rest of the program
}

At each if, we basically check to see if we should continue. It works exactly the same way as the original with only one level of nesting.

If there's something beyond the end of this snippet that should also be run, extract this code into its own method and call it from wherever this code currently lives, before proceeding to the code that would come after this snippet. The snippet itself is long enough, given sufficient actual LOC in each code block, to justify splitting out several more methods, but I digress.

KeithS
  • 21,994
  • 6
  • 52
  • 79
  • 1
    While I don't subscribe to the "only one exit point" philosophy, I believe functions should when practical be divided into three sections; only the middle section should have side-effects, and it should not have any returns. To my mind, a middle-section `return` is worse than a `goto`, since an outdented label the latter will all attention to the fact that there's a `goto` lurking about. – supercat Mar 27 '14 at 15:46
  • "since an outdented label the latter will all attention to the fact that there's a goto lurking about." @supercat what does that mean? – reggaeguitar Feb 08 '19 at 18:04
  • @reggaeguitar: Strike "the latter", or else said "Is worse than a `goto`/label combination, since an outdented label used by the latter"... – supercat Feb 08 '19 at 18:13
2

Using an OO approach a composite pattern where leaf is simple condition and component a union of simple elements make this kind of code extensible and adaptable

guiman
  • 2,088
  • 13
  • 17
1

Personally, I like wrapping these if statements into separate functions which return a bool if the function succeeds.

The structure looks as follows:


if (DoCheck1AndCodeBlock1() && 
    DoCheck2AndCodeBlock2() && 
    DoCheck3AndCodeBlock3()) 
{
   // ... you may perform the final operation here ....
}

The only drawback is that these functions will have to output data using attributes passed by reference or member variables.

gogisan
  • 11
  • 1
1

If you've routinely got logic that actually requires this pyramid of if checks, you are probably (metaphorically) using a wrench to hammer nails. You'd be better served doing that kind of twisted, complicated logic in a language that supports that kind of twisted and complicated logic with better structures than linear if/else if/else-style constructs.

Languages that might be better-suited to this kind of structure could include SNOBOL4 (with its bizarre dual-GOTO style branching) or logic languages like Prolog and Mercury (with their unification and backtracking capabilities, not to mention their DCGs for rather succinct expression of complicated decisions).

Of course if this is not an option (because most programmers are, tragically, not polyglots) the ideas others have come up with are good like using various OOP structures or breaking up the clauses into functions or even, if you're desperate, and don't mind the fact that most people find them unreadable, using a state machine.

My solution, however, remains reaching for a language that permits you to express what logic you're trying to express (assuming this is commonplace in your code) in an easier, more readable fashion.

JUST MY correct OPINION
  • 4,002
  • 1
  • 23
  • 22
1

From here:

Quite often when I'm looking at a set of pac-man ifs I find that if I just draw out something like a truth table of all the conditions involved I can work out a much better route to resolving the problem.

That way you can also assess whether there is a better method, how you might break it down further and ( and this is a big one with this kind of code ) whether there are any holes in the logic.

Having done that you can probably break it down into a couple of switch statements and a couple of methods and save the next poor mook who has to go through the code a whole lot of problems.

Jim G.
  • 8,006
  • 3
  • 35
  • 66
0

It depends a lot on the size of each code block and the total size, but I tend to split either by a straight "extract method" on a block, or by moving the unconditional parts into separate methods. But the caveats @KeesDijk mentioned apply. The former approach gives you a series of functions each like

if (Check#1)
{
    CodeBlock#1
    NextFunction
}

Which can work well but does lead to code bloat and the "method only used once" antipattern. So I generally prefer this approach:

if (CheckFunctionOne)
{
    MethodOneWithDescriptiveName
    if (CheckFunctionTwo)
    {
        MethodTwoWithDescriptiveName
        ....

With appropriate use of private variables and parameter passing it can be done. Having glanced at literate programming can help here.

0

If you want a possible object orientated solution, the code looks like a canonical example for refactoring using the chain of responsibility design pattern.

FinnNk
  • 5,799
  • 3
  • 28
  • 32