-1

I just let my brain run wild and the results are again chaotic. It is kind of bothering me how in the very basics, programming itself can be pretty redundant.

Let's look at this snippet of code and analyze what it does

public function __construct($data = array(), $strict = false, $prefix = null) {
    if (!$data) {
        return;
    }

    foreach ($data as $key => $val) {
        if ($prefix) {
            if (($pos = strpos($key, $prefix)) === 0) {
                $key = substr($key, 0, $pos);
            } else {
                continue;
            }
        }

        if ($strict === false || property_exists($this, $key)) {
            $this->$key = $val;
        }
    }
}

When constructing this object we can provide an array to fill in it's properties, however there is some filtering we can do. If we enable the strict parameter it will only fill in keys it already has, and if we supply a prefix parameter it will only fill in properties that correspond to keys starting with a certain prefix.

See, however, that we have n*2 if-statements evaluated in the given loop at the very least. Meaning if we have 100 elements in the $data array there will be at least 200 if-statements evaluated. Wouldn't it be better if we could reduce that?

public function __construct($data = array(), $strict = false, $prefix = null) {
    if (!$data) {
        return;
    }

    if ($strict === false) {
        if ($prefix) {
            foreach ($data as $key => $val) {
                if (($pos = strpos($key, $prefix)) === 0) {
                    $this->{substr($key, 0, $pos)} = $val;
                }
            }
        } else {
            foreach ($data as $key => $val) {
                $this->$key = $val;
            } 
        }
    } else { 
        if ($prefix) {
            foreach ($data as $key => $val) {
                if (property_exists($this, $key) && ($pos = strpos($key, $prefix)) === 0) { 
                    $this->{substr($key, 0, $pos)} = $val;
                } 
            }
        } else { 
            foreach ($data as $key => $val) {
                if (property_exists($this, $key)) {
                    $this->$key = $val;
                } 
            }
        }
    }
}

As we can clearly see, readability has gone to hell but nevertheless we have reduced the amount of expressions evaluated from 200 to 2 in the best case scenario when $strict === false and prefix === null - that is a 99% increase in efficiency, theoretically.

Mostly everything I said so far is theoretical, so the question is - Is there any actual benefit to example 2 opposed to example 1 regarding program execution?

Given that I have not graduated in computer science I'm not as familiar on programming (and php) internals and how code is evaluated and executed on a deeper level, but I have read this and that on how processors can shortcut redundant if-statements but as I said, am not as familiar to answer the question myself.

Telastyn
  • 108,850
  • 29
  • 239
  • 365
php_nub_qq
  • 2,204
  • 4
  • 17
  • 25
  • 9
    Please review question http://programmers.stackexchange.com/questions/206374. The gain involved in avoiding execution of some `if` statements is **much, much, much** smaller than the loss in maintainability of your code base. – Kilian Foth Oct 30 '15 at 11:59
  • 2
    (1) Is this code path hot enough to trade maintainability for speed? Usually performance of ~99% of code does not matter much, only ~1% is worth optimizing _after you have profiled the code_ and found the hot functions. (2) Maybe there's a way to split this function in two and reap the both the maintainability and performance gains? – 9000 Oct 30 '15 at 12:15
  • @9000 it is not a question about this function concretely, but about how programming works in its very base. We say that we sacrifice performance for maintainability and readability but why is this necessary, why is there no way to have the best out of both. – php_nub_qq Oct 30 '15 at 12:19
  • @php_nub_qq Abstraction inherently has cost. Most of that time, much of that cost ends up being performance. – nanny Oct 30 '15 at 13:24
  • 6
    I am not sure what you are looking for, but there is no "fundamental physical law" saying each performance optimization reduces maintainability. This is simply based on experience for "most cases". In fact, in some *rare* cases, one might find a solution for a given programming problem which is short,efficient and maintainable. But when one starts to micro-optimize a piece of code by throwing the DRY priciple overboard, to avoid repeated evaluation of some conditionals, the maintainability of the code will be reduced. – Doc Brown Oct 30 '15 at 13:24
  • "that is a 99% increase in efficiency, theoretically" Not really, because either way you're going to be doing several more operations which are probably more expensive. – Ben Aaronson Oct 30 '15 at 16:09
  • @DocBrown is right. The supposed tradeoff between performance and maintainability is merely anecdotal. But there's a bigger issue: don't look at performance by taking a magnifying glass to the code. Instead, get it running and then let *it* tell *you* what takes time. The bigger the code is, the more likely there are multiple ways to speed it up. Want a surprise? Take a look at [*this case*](http://scicomp.stackexchange.com/a/1870/1262) where the speedup factor is almost 3 orders of magnitude. That's like 20 minutes to 1 second. – Mike Dunlavey Oct 30 '15 at 16:26
  • Possible duplicate of [Should you sacrifice code readability with how efficient code is?](http://programmers.stackexchange.com/questions/43151/should-you-sacrifice-code-readability-with-how-efficient-code-is) – gnat Oct 30 '15 at 16:49
  • I recommend reading this SO question and its accepted answer: **[Why is processing a sorted array faster than an unsorted array?](http://stackoverflow.com/q/11227809/439793)** While the premise of the question is not exactly the same, the answer demonstrates a very important point with regards to modern compilers and how they optimize code like yours. Remember, PHP is still compiled and optimized even if it is technically a "script." The line between "compiled" and "interpreted" is very fuzzy in modern times. –  Oct 30 '15 at 17:26

4 Answers4

4

There is a great book called Clean Coder. It copes a lot of questions like this and contains some awesome tips for those fundamental issues you are talking about.

For your question, there is a rule called Beware of Optimizations!

The explaination starts with: Focus on writing comprehensible code. Optimized code is often anything but readable... and I personally agree on that.

If you don't need to optimize, don't do it. If you run into performance issues, or if it is probable that you would run into performance issues in the near futur, then you have no choice than to optimize your code for performance (even if that means making your code less readable).

Clean coder: http://clean-code-developer.com/Red-Grade.ashx#Beware_of_Optimizations!_2

Stefan Woehrer
  • 527
  • 3
  • 11
3

This is not ideal code because there are two distinct operations being performed in this one function. There is copying a key only if it exists on the target (filter), called "$strict". Then there is matching the key to a prefix value and transforming the key to that value (filter + map). (Note that this latter one especially looks like it has flawed logic, but I used your code verbatim in my answer anyway.)

There should be 2 separate functions which perform their specific operation on ONE item. Then you could simplify the constructor code and even increase maintainability. I haven't used PHP in some time, so I'm not sure if this will run, but here goes.

private function choosePrefix($prefix, $key) {
    if (($pos = strpos($key, $prefix)) === 0) {
        return substr($key, 0, $pos);
    } else {
        return false;
    }
}

private function strictCopy($target, $key, $value) {
    if (property_exists($target, $key)) {
        $target->$key = $value;
    }
}

Note that separating things into functions adds a lot of overhead (well, not in all languages, but in this case). However, it has the advantage of describing the intention of what you are doing! You'll see below.

Secondly, you are treating the object as though it is a collection (dictionary). This feels perfectly natural in PHP, but in most other languages, it is common to represent collection data as a property of the class instead of key/values being intermixed with concrete properties.

$data = $this->data;
$data[$key] = $value;

Thirdly, what solves your specific gripe is choosing your processing functions before running the loop and then running the chosen functions on the data inside the loop. This is pretty common in functional languages, and I believe now available in PHP, if a bit awkwardly. Here is an example... but again, been a while since I've used PHP.

private function passThru($dummy, $key) { return $key; }
private function choosePrefix($prefix, $key) {
    if (($pos = strpos($key, $prefix)) === 0) {
        return substr($key, 0, $pos);
    } else {
        return false;
    }
}

private function strictFilter($target, $key) {
    if ($key !== false && property_exists($target, $key)) {
        return $key;
    } else {
        return false;
    }
}

public function __construct($data = array(), $strict = false, $prefix = null) {
    if (!$data) {
        return;
    }
    // ternary form of if/then/else
    $chooseFn = ($prefix ? 'choosePrefix' : 'passThru');
    $filterFn = ($strict ? 'strictFilter' : 'passThru');

    foreach ($data as $key => $val) {
        $key = $this->$chooseFn($prefix, $key);
        $key = $this->$filterFn($this, $key);
        if ($key !== false) {
            $this->$key = $value;
        }
    }
}

Now if we look at the constructor, without having to actually mentally work through the code, I can see that I might skip some keys, otherwise I will copy the key/value to this object. Giving the operations names by putting them in separate functions helps readability here.

Kasey Speakman
  • 4,341
  • 19
  • 26
  • This suggestion just swaps the conditions with function calls which are more expensive than an if statement. – php_nub_qq Oct 30 '15 at 15:39
  • @php_nub_qq, Actually look at the code, especially at the end. I have removed the $prefix and $strict checks from inside the loop. Instead I decide up front how to process it, then loop. The remaining if branches are because they are required. Whether or not this is marginally more efficient is irrelevant. Your logic will cause time complexity problems long before extra `if` statements in a loop. The important factor is the human one. Is your code comprehensible when you come back to it in a year? – Kasey Speakman Oct 30 '15 at 15:51
1

In answer to this question from the comments:

We say that we sacrifice performance for maintainability and readability but why is this necessary, why is there no way to have the best out of both.

Code is a construct that is used to express a sequence of steps which represents a task to be carried out, or to express a set of expressions to calculate a value. In order for this logic to be of any use it has to be executed at some point, so the reality of this execution has to be taken into account. This is where optimization comes in.

Optimization means adding to or changing the logic of your task to account for details of the mechanism of execution, in order to speed things up. Often times this complicates the code, e.g. by breaking abstractions that are used by your code to get at and change implementation details.

Maintainable code has the properties of (among others) clarity and simplicity; removing these in your code, e.g. by breaking the abstractions that buy them, will cause the loss of some maintainability.

This doesn't necessarily mean that all optimization breaks maintainability. It can be possible to write maintainable code taking into account the particulars of the execution mechanism, or to apply different abstractions to do so. (Or to use abstractions properly in the first place.)

paul
  • 2,074
  • 13
  • 16
1

There needn't be any trade-off. I unfortunately don't know php well enough to write in it, so I'll use pseudo-code instead. Your algorithm can look like:

if($prefix) {
    $data = $data filtered to only include items prefixed with $prefix;
    $data = $data mapped to remove $prefix;
}

if($strict) {
    $data = $data filtered to only include items where $item.$key already exists;
}

foreach($item in $data) {
    add $item to $this;
}

This allows you to not repeat your logic or your checks, and in a language that provides you with declarative collection manipulation (like C#'s LINQ or Java 8's Stream API), would be very clean and clear.

Note that, realistically because checking booleans is so cheap, this would probably still be less performant than your version, but that's just because you chose an operation that's so cheap as to be pointless to optimize. Hopefully this addresses a more general point of performance vs cleanliness: often, it's illusive.

Ben Aaronson
  • 6,883
  • 1
  • 30
  • 30