Refactoring for loop (clean code aspect)

Question

I have to refactor a "for" loop which iterates over an array and does two independent things with the array element. For example:

doThisOrOtherStuff(array) {
 for(int i=0; i<array.length; i++) {
  element = array[i];
  id = element.getId();
  if(id < 0) {
   doThisStuff(element);
  } else {
   doOtherStuff(element);
  }
 }
}

My problem is, that I want to have a function, which only does one thing. Right now my function does two things based on the decision at "if". If I separate it in two functions, I have to iterate over the array twice. I mean:

doThisStuff(array) {
 for(int i=0; i<array.length; i++) {
  element = array[i];
  id = element.getId();
  if(id < 0) {
   doThisStuff(element);
  } 
 }
}

doOtherStuff(array) {
 for(int i=0; i<array.length; i++) {
  element = array[i];
  id = element.getId();
  if(id >= 0) {
   doOtherStuff(element);
  } 
 }
}

This is not optimal from the performance point of view. Could someone give me advice, how to separate the different responsibilities in different functions in such a case?

Possible duplicate of [Is there a design pattern to remove the need to check for flags?](https://softwareengineering.stackexchange.com/questions/306314/is-there-a-design-pattern-to-remove-the-need-to-check-for-flags) — gnat, Nov 15 '17 at 08:56
Thank you all for the answers! I think all of them is acceptable. As I can accept only one, I accepted that one, which was most suitable in my case. — Milan Tenk, Nov 15 '17 at 13:58
They're called "loops", not "cycles". The title of this question confused me greatly. — Sebastian Redl, Jan 25 '18 at 16:33
So much ["painting the bikeshed"](https://en.wikipedia.org/wiki/Law_of_triviality) going on in this post. — Robert Harvey, Jan 25 '18 at 22:29

score 7 · Answer 1 · answered Nov 15 '17 at 08:57

Of course it's not optimal from a performance point of view. Very few solutions are optimal, and even if they are it would be hard to prove it.

But clean code isn't about optimal ratings on any single dimension. It's always about a trade-off: for instance, how much time can I afford to spend executing this code vs. how much time will it take to write vs. (this is the biggie!) How much time will it take to understand the code if it ever needs to change?

It is very unlikely that the minimal gain from eliminating a second iteration over a data set will outweigh the cost of having to attend to two different things at the same time when dealing with this code (and performance gains should always be measured, never assumed!). The right thing to do is to separate responsibilities. If it does turn out that you do need extra speed you can easily refactor for speed in one place, but you can't easily undo the result of premature optimisation.

I wholeheartedly agree on this point. Unless the two actions are related or unless *serious* optimizations problem stem from having to cycle an array twice (that'd be some array), readability should win over optimization. — Neil, Nov 15 '17 at 09:55
+1 for trade-off and for measure. It's by no means obvious that one iteration doing twice the work will be faster than two iterations. Even if it is, it may very well be negligible. — doubleYou, Jan 25 '18 at 17:26

score 3 · Accepted Answer · edited Jan 25 '18 at 19:41

3

Imagine this function :

function updateOrCreateAll(array) {
  for (var i = 0; i < array.length; i++) {
    var element = array[i];
    var id = element.getId();
    if (id < 0) {
      create(element);
    }
    else {
      update(element);
    }
  }
}

Now this function has 2 responsibilities, first is to iterate through the array, second is to choose whether to create or to update. Refactor this function to only have 1 responsibility anymore :

function updateOrCreateAll(array) {
  for (var i = 0; i < array.length; i++) {
    updateOrCreate(array[i]);
  }
}

function updateOrCreate(element) {
  return element.getId() < 0 ? create(element) : update(element);
}

Now updateOrCreateAll has one responsibility which is to apply updateOrCreate to all elements on the array. The conditional is the responsibility of updateOrCreate, and create/update have only one responsibility too.

edited Jan 25 '18 at 19:41

Dherik

2,406
20
33

answered Nov 15 '17 at 10:28

Steve Chamaillard

1,691
1
11
20

The code is identical. You have only added a method call and made cosmetic changes. "Or" indicates the method (same as if statement previously) can do one of two things. SRP is definitely worthless... – Frank Hileman Nov 15 '17 at 22:29
8

You're clearly missing the whole point of refactoring, which is to keep the same logic but change the form of it to make it more maintainable and more readable. That's exactly what happens there. Also SRP means one responsibility, people often confuse this with one action which is the reason why most people feel like they're not respecting SRP whenever they write a controller method using the MVC pattern. – Steve Chamaillard Nov 15 '17 at 23:21
I would not find it more or less readable either way. And there is no software engineering in this question... – Frank Hileman Nov 15 '17 at 23:22
1

Well this is a toy example, and here it looks more or less like a matter of style, but SRP is about separating axes of change (and should really be informed by change request history), so that you can minimize/control interdependencies between the two. So while SRP could be better defined, it definitely isn't worthless - but shouldn't be applied blindly. – Filip Milovanović Nov 16 '17 at 13:39
1

As for readability - this particular example isn't the best demonstration, but extracting the "guts" of a method into a separate method *can* significantly improve readability for a dev who is new (or returning) to the project and just wants to quickly understand what the method does conceptually, without going into (or really caring about) the details of how it does it. – Filip Milovanović Nov 16 '17 at 13:39
@FilipMilovanović This is probably a poor example for that argument. The calling method was not large to start with, and the called method is used in only one place. Whether it is more or less readable depends on how well one can understand the meaning of the method name. Most developers I work with, would consider the creation of trivial methods to make the code less readable, but it is entirely subjective. – Frank Hileman Nov 16 '17 at 23:25
@FrankHileman: "This is probably a poor example" - I agree with that (I did say that it's "a toy example", and that "this particular example isn't the best demonstration"). Your other points make sense as well. – Filip Milovanović Nov 17 '17 at 08:38
Depending on language (OP didn't specify) the updateOrCreateAll may be unnecessary. For example, in JavaScript, it could be something like `array.forEach(updateOrCreate);` – user949300 Jan 25 '18 at 20:11

score 2 · Answer 3 · answered Nov 15 '17 at 09:10

As the functions You are calling are doing something with element, maybe you can move the functions into the element?

Then You can simplify the code by removing the if statement and just calling element->doSomething()

There are many things which can be done with this. Source: https://refactoring.guru/refactoring/techniques/simplifying-conditional-expressions

Vadim Samokhin · Answer 4 · 2017-11-15T09:35:29.143

Probably you could first split your array based on some condition, thus you have two arrays. Then you iterate each of them and no if needed, since the condition already holds true in each of them.

In php it could go like that:

function split(array $a, Closure $c)
{
    $arrayWhereConditionHoldsTrue = [];
    $arrayWhereConditionDoesNotHoldTrue = [];

    foreach ($a as $e) {
        if ($c($e)) {
            $arrayWhereConditionHoldsTrue[] = $e;
        } else {
            $arrayWhereConditionDoesNotHoldTrue[] = $e;
        }
    }

    return [$arrayWhereConditionHoldsTrue, $arrayWhereConditionDoesNotHoldTrue];
}

$isPositive =
    function ($n) {
        return $n > 0;
    };

$arrays =
    split(
        [1, -2, 3, -4, 5, -6],
        $isPositive
    )
;

$positive = $arrays[0];
$negative = $arrays[1];

print_r($positive);
print_r($negative);

function doStuffWithPositiveNumbers(array $positive)
{

}

function doStuffWithNegativeNumbers(array $negative)
{

}

But if we're talking about an objects, I'm with Robert Andrzejuk. You could introduce an interface that is conformed by all elements in array and use its polymorphic method:

interface Number
{
    public function doSmth();
}

class Positive implements Number
{
    public function doSmth()
    {
    }
}

class Negative implements Number
{
    public function doSmth()
    {
    }
}

function doStuff(array $numbers) {
    foreach ($numbers as $number) {
        $number->doSmth();
    }
}

It might be a bit much to have a Positive and Negative class, but I suppose it entirely depends on the amount of code being run inside those methods. — Neil, Nov 15 '17 at 10:39
It's just an example :) I guess the OP didn't mean positive and negative numbers as well. — Vadim Samokhin, Nov 15 '17 at 10:40

score 1 · Answer 5 · answered Nov 15 '17 at 11:33

1

Simply remove the conditional with polymorphism and loop once.

class DoThis : Base 
{
     public void Process() {} // DoThis logic 
}

class DoThat : Base 
{
     public void Process() {} // DoThat logic 
}


void ProcessItems(List<Base> items) 
{
     foreach(var i in items)
     {
          i.Process();
     } 
}

Your code is cleaner and just as effecient

answered Nov 15 '17 at 11:33

Ewan

70,664
5
76
161

In this case polymorphism doesn't seem to apply. All the elements have the same type; it's only the value of field `id` that varies. – Andres F. Jan 25 '18 at 17:20
the OP is using the id field to determine the 'type' of the element. If they used two types instead, then the if would disappear and the correct function called as requested – Ewan Jan 25 '18 at 17:24
I don't think so. `Id` is a terrible name to denote type. Without further comment from the OP, I wouldn't assume he/she is telling types apart. Furthermore, his/her question could be generalized to mean "do `f` if this number is negative or `g` if it's positive"; no types involved. – Andres F. Jan 25 '18 at 17:28
?? im saying _add_ the types to achieve the requested effect. – Ewan Jan 25 '18 at 18:34
Understood. It just doesn't seem to be a good idea to introduce ad hoc types in order to implement this kind of function. The particular function the OP seems to be asking about is a higher-order `map` (or actually a side-effectful `foreach`), with `map(f, array)` where `f` is the `if-else`. Polymorphism isn't the answer. – Andres F. Jan 26 '18 at 13:55
so if the example had used 'type' or 'make' or something instead of id. it would be the answer. seems like you are reading too much into it to me – Ewan Jan 26 '18 at 19:33

score 0 · Answer 6 · 2018-01-25T17:23:34.307

Personally like the first one best, shrug, that just does a simple branch on id inside the loop. Polymorphism is usually the straightforward alternative to conditional branching with if/switch, but may or may not be unwieldy if your elements IDs are dynamic and change.

Just to be a bit wacky and since this question seems to be language-agnostic, here you go:

// Applies 'true_func' to elements in the the random-access sequence
// that satisfies 'pred' or 'false_func' otherwise.
applyIf(array, pred, true_func, false_func)
{
    for (int i=0; i < array.length; i++) {
        if (pred(array[i]))
            true_func(array[i]);
        else
            false_func(array[i]);
    }
}

// Returns true if the element has a negative ID.
hasNegativeId(element) {
    return element.getId() < 0;
}

And in the caller:

applyIf(array, hasNegativeId, doThisStuff, DoOtherStuff);

Or simply:

applyIf(array, function(element) {element.getId() < 0;}, doThisStuff, DoOtherStuff);

I don't actually like this solution. I might even hate it. But it does dice up your simple function into two even simpler ones: one uber simple unary predicate that returns true if an element has a negative ID, and one very generally-applicable algorithm (applyIf) that calls one function or another on each element in a random-access sequence (of which arrays are but one type) based on a predicate.

Of course the client still has to compose this stuff together, but the individual functions all do something even simpler now. You can make it even more generalized by depending on a forward iterator/enumerable instead of a random-access sequence, allowing applyIf to work on any data structure which is forward iterable, including singly-linked lists, doubly-linked lists, binary search trees, arrays, hash tables, tries, quadtrees, you name it (as long as it can be iterated through sequentially).

It's playing to the kind of uber generality and code reuse mindset since that applyIf function can be made to work for just about anything with any combination of predicates and functions to apply, so it has much, much more reusability than a function hard-coded to check IDs (the kind of thing that might go into a standard library). But I still don't like it and like your plain old first solution better. These days I don't care as much about these uber flexible solutions. I like that plain old loop with the if inside unless you have to redundantly check for negative IDs all over the place (then I'd consider something else). We could generalize it further:

// Calls 'true_func' if pred(element) is true,
// 'false_func' otherwise.
iif(pred, true_func, false_func) {
    return function(element) {
         return pred(element) ? true_func(element): false_func(element);
    }
}

// Calls 'func' for each element in the container.
forEach(container, func) {
    for (it = container.begin(); it != container.end(); ++it)
        func(it.element())
}

At which point we can do this:

forEach(array, iif(hasNegativeId, doThisStuff, doOtherStuff));

There are ways to keep going further and dicing things up and generalizing them even more, but I'll stop there. Again I don't really like this. Just kinda putting it out there for people who get a kick out of dicing their code up into the teeniest, most reusable functions (something I used to enjoy but not so much these days).

The main reason I don't like this solution is that functional programming loses so much of its elegance if you use it to cause side effects. If doThisStuff and doOtherStuff cause side effects, which I'm assuming they do since you don't use a return value, then you only get the difficulty associated with functional programming of complex control flow combined with the side effects. The combo of those two can make things really difficult to reason about. So you often need simple control flow if you want to cause side effects, or no side effects if you want complex control flow (not a combination of the two). That's why I like your simple loop with an if statement inside the best. It's so easy to debug and follow in terms of control flow.

score 0 · Answer 7 · answered Jan 25 '18 at 17:28

You’ve not specified the language, but if yours supports Multicast Delegates you can leverage them to process the array “all at once” and still achieve a nice code separation.

Some psuedo code:

public class ListenerA
{
    public void Register(Foo foo)
    {
        foo.ProcessRequest += Process
    }

    public void Process(ProcessRequestData data)
    {
        if (data.getId < 0)
        {
            doStuff();
        }
    }
}

public class ListenerB
{
    public void Register(Foo foo)
    {
        foo.ProcessRequest += Process
    }

    public void Process(ProcessRequestData data)
    {
        if (data.getId >= 0)
        {
            doStuff();
        }
    }
}

public class Foo
{
    public Event ProcessRequest;
    public void Process()
    {
        for(int i=0; i<array.length; i++) 
        {
            if (null != ProcessRequest)
            {
                ProcessRequest(element);
            }
    }
}

// in some controller like class

listenerA.Register(foo);
listenerB.Register(foo);

foo.Process();

The beauty to this approach is that it’s completely closed to modification. If you have a need in the future to add a 3rd processor, you just create a new one and Register it.

score 0 · Answer 8 · answered Jan 25 '18 at 21:12

You just changed from 11 lines of code to 18 lines of code. Plus, wherever you had a call to one function, you now need two. Congratulations. But you are not going far enough. Look at Zapadio's answer. Now I ask you: Which version would I prefer if I ever have the (mis)fortune of maintaining this code, your 11 lines, your 18 lines, or Zapadio's 250 lines (rough estimate)?

Your function has one responsibility: To iterate through all the array elements, and call the appropriate one of two further functions for each element. Your function does indeed one thing.

Now if you turn 11 lines into 250, what are you going to do if you have actually a non-trivial problem? The "single responsibility" principle is probably one of the most misunderstood principles. When you hire a taxi, what would you think if there was a dozen drivers, one handling the kludge, one the gears, one the brakes, one the gas pedal, one the windscreen wipes, one the left indicator, one the right indicator, one opening the door, one putting your luggage in the boot? The ONE taxi driver has a single responsibility: Getting people and their luggage safely and efficiently to their destination.

PS. Now assume it is necessary to do all these function calls in the right order. Which is not unusual, many things are order dependent. Now you are stuck.

Refactoring for loop (clean code aspect)

8 Answers8