11

I've been reading Martin Fowler's Refactoring. It is generally excellent but one of Fowler's recommendations seems to be causing a little trouble.

Fowler recommends that you replace temporary variables with a query, so instead of this:

double getPrice() {
    final int basePrice = _quantity * _itemPrice;
    final double discountFactor;
    if (basePrice > 1000) discountFactor = 0.95;
    else discountFactor = 0.98;
    return basePrice * discountFactor;
}

you pull out into a helper method:

double basePrice() {
    return _quantity * _itemPrice;
}

double getPrice() {
    final double discountFactor;
    if (basePrice() > 1000) discountFactor = 0.95;
    else discountFactor = 0.98;
    return basePrice() * discountFactor;
}

In general I agree except that one reason I use temporary variables is when a line is too long. For example:

$host = 'https://api.twilio.com';
$uri = "$host/2010-04-01/Accounts/$accountSid/Usage/Records/AllTime";
$response = Api::makeRequest($uri);

If I tried inlining that, the line would go longer than 80 characters.

Alternately I end up with chains of code, which are themselves not that much easier to read:

$params = MustacheOptions::build(self::flattenParams($bagcheck->getParams()));

What are some strategies for reconciling the two?

gnat
  • 21,442
  • 29
  • 112
  • 288
Kevin Burke
  • 463
  • 5
  • 13
  • 11
    80 chars is about 1/3rd of 1 of my monitors. are you sure that sticking to 80 char lines is still worthwhile for you? – jk. Jan 16 '13 at 10:36
  • 10
    yes, see for example http://programmers.stackexchange.com/questions/604/is-the-80-character-limit-still-relevant-in-times-of-widescreen-monitors – Kevin Burke Jan 16 '13 at 16:59
  • Your `$host` and `$uri` example is kind of contrived, though - unless the host was being read from a setting or other input, I'd prefer them being on the same line, even if it does wrap or go off the edge. – Izkata Jan 21 '13 at 04:07
  • 5
    No need to be so dogmatic. The book is a list of techniques that can be used when they help, not a set of rules you have to apply everywhere, every time. The point is to make your code more maintainable & easier to read. If a refactor doesn't do this, you don't use it. – Sean McSomething Jan 24 '13 at 19:41
  • While I think an 80 character limit is a little excessive, a similar limit (100?) is reasonable. For example, I often like programming on portrait oriented monitors, so extra long lines can be annoying (at least if they are common). – Thomas Eding Sep 06 '13 at 18:09

7 Answers7

16

How To
1. Line length restrictions exist so you can see + understand more code. They are still valid.
2. Emphasize judgement over blind convention.
3. Avoid temp variables unless optimizing for performance.
4. Avoid using deep indentation for alignment in multi-line statements.
5. Break long statements into multiple lines along idea boundaries:

// prefer this
var distance = Math.Sqrt(
    Math.Pow(point2.GetX() - point1.GetX(), 2) + // x's
    Math.Pow(point2.GetY() - point1.GetY(), 2)   // y's
);

// over this
var distance = Math.Sqrt(Math.Pow(point2.GetX() -
    point1.GetX(), 2) + Math.Pow(point2.GetY() -
    point1.GetY(), 2)); // not even sure if I typed that correctly.

Reasoning
The main source of my (debugging) problems with temp variables is that they tend to be mutable. Namely, I'll assume they are one value when I write the code, but if the function is complex, some other piece of code changes their state halfway through. (Or the converse, where the state of the variable remains the same but the result of the query has changed).

Consider sticking with queries unless you are optimizing for performance. This keeps any logic you've used to calculate that value in a single place.

The examples you've given (Java and ...PHP?) both allow for multi-line statements. If the lines get long, break them up. The jquery source takes this to extremes. (The first statement runs to line 69!) Not that I necessarily agree, but there are other ways to make your code readable than using temp vars.

Some Examples
1. PEP 8 Style Guide for python (not the prettiest example)
2. Paul M Jones on the Pear Style Guide (middle of the road argument)
3. Oracle line length + wrapping conventions (useful stratagems for keeping to 80 chars)
4. MDN Java Practices (emphasizes programmer judgement over convention)

Zachary Yates
  • 1,776
  • 13
  • 16
  • 1
    The other part of the problem is that a temporary variable often outlives its value. Not a problem in small scope blocks, but in larger ones, yeah, a big problem. – Ross Patterson Jan 16 '13 at 12:10
  • 9
    If you are worried about the temporary being modified, put a const on it. – Thomas Eding Jan 17 '13 at 22:52
3

I think, the best argument for using helper methods instead of temporary variables is the human readability. If you, as a human, have more trouble reading the helper-method-chain than the temporary varialbe, i see no reason why you should extract them.

(please correct me if i'm wrong)

mhr
  • 660
  • 6
  • 18
3

I don't think you need to strictly follow 80 character guidelines or that ever local temp variable should be extracted. But long lines and local temps should be investigated for better ways of expressing the same idea. Basically, they indicate that a given function or line is too complicated and we need to break it down. But we need to be careful, because breaking a task into pieces in a bad way only makes the situation more complicated. So I am to break things down into resuable and simple components.

Let me look at the examples you posted.

$host = 'https://api.twilio.com';
$uri = "$host/2010-04-01/Accounts/$accountSid/Usage/Records/AllTime";
$response = Api::makeRequest($uri);

My observation is that all twilio api calls will start with "https://api.twilio.com/2010-04-1/", and thus there is a very obvious reusable function to be had:

$uri = twilioURL("Accounts/$accountSid/Usage/Records/AllTime")

In fact, I figure that the only reason to generate a URL is to make the request, so I'd do:

$response = TwilioApi::makeRequest("Accounts/$accountSid/Usage/Records/AllTime")

In fact, many of the urls actually start with the "Accounts/$accountSid", so I'd probably extract that as well:

$response = TwilioApi::makeAccountRequest($accountSid, "Usage/Records/AllTime")

And if we make the twilio api an object that holds the account number, we could do something like:

$response = $twilio->makeAccountRequest("Usage/Records/AllTime")

Using a $twilio object has the benefit of making unit testing easier. I can give the object a different $twilio object that doesn't actually call back to twilio, which will be faster and won't do strange things to twilio.

Let's look at the other one

$params = MustacheOptions::build(self::flattenParams($bagcheck->getParams()));

Here I'd think about either:

$params = MustacheOptions::buildFromParams($bagcheck->getParams());

or

$params = MustacheOptions::build($bagcheck->getFlatParams());

or

$params = MustacheOptions::build(flatParams($backCheck));

Depending on which is the more reusable idiom.

Winston Ewert
  • 24,732
  • 12
  • 72
  • 103
1

Actually, I disagree with the eminent Mr. Fowler on this in the general case.

The advantage of extracting a method from formerly inlined code is code reuse; the code in the method is now severed from its initial usage, and can now be used in other places in code without being copied and pasted (which would necessitate making changes in multiple places if the general logic of the copied code ever had to change).

However, of equal, often greater conceptual value is "value reuse". Mr. Fowler calls these extracted methods to replace temp variables "queries". Well, what's more efficient; querying a database each of the multiple times you need a particular value, or querying once and storing the result (assuming that the value is static enough that you wouldn't expect it to change)?

For almost any calculation beyond the relatively trivial one in your example, in most languages it's cheaper to store the result of one calculation than to keep calculating it. Therefore, the general recommendation to recompute on-demand is disingenuous; it costs more developer time and more CPU time, and saves a trivial amount of memory, which in most modern systems is the cheapest resource of those three.

Now, the helper method, in conjunction with other code, could be made "lazy". It would, when first run, initialize a variable. All further calls would return that variable until the method were told explicitly to recompute. This could be a parameter to the method, or a flag set by other code that changes any value this method's computation depends on:

double? _basePrice; //not sure if Java has C#'s "nullable" concept
double basePrice(bool forceCalc)
{
   if(forceCalc || !_basePrice.HasValue)
      return _basePrice = _quantity * _itemPrice;
   return _basePrice.Value;
}

Now, for this trivial calculation that's even more work performed than saved, and so I'd generally recommend sticking with the temp variable; however, for more complex calculations that you would generally want to avoid running multiple times, and that you need in multiple places in code, this is the way you'd do it.

KeithS
  • 21,994
  • 6
  • 52
  • 79
1

Helper methods have a place, but you have to be careful about ensuring consistency of data, and unnecessary increase in scope of variables.

For example, your own example cites:

double getPrice() {
    final double discountFactor;
    if (basePrice() > 1000) discountFactor = 0.95;      <--- first call
    else discountFactor = 0.98;
    return basePrice() * discountFactor;                <--- second call
}

Clearly both _quantity and _itemPrice are global variables (or at least class level) and therefore there is a potential for them to be modified outside of getPrice()

Therefore there is a potential for the first call to basePrice() to return a different value from the second call!

Therefore, I would suggest that helper functions can be useful to isolate complex mathematics, but as a replacement for local variables, you need to be careful.


You also have to avoid reductio ad absurdum - should the calculation of discountFactor be hived off to a method? So your example becomes:

double getPrice()
{
    final double basePrice      = calculateBasePrice();
    final double discountFactor = calculateDiscount( basePrice );

    return basePrice * discountFactor;
}

Partitioning beyond a certain level actually makes the code less readable.

Andrew
  • 2,018
  • 2
  • 16
  • 27
  • +1 for making code less readable. Over partitioning can hide the business problem that the source code is trying to solve. There could be special cases where a coupon is applied in getPrice(), but if that is hidden deep in a chain of function calls then the business rule is also hidden. – Reactgular Jan 24 '13 at 16:18
0

If you happen to work in a language with named parameters (ObjectiveC, Python, Ruby, etc), temp vars are less useful.

However, in your basePrice example, the query may take a while to execute and you may want to store the result in a temp variable for future use.

Like you, though, I use temp variables for clarity and line length considerations.

I've also seen programmers do the following in PHP. It's interesting and great for debugging, but it's a little strange.

$rs = DB::query( $query = "SELECT * FROM table" );
if (DEBUG) echo $query;
// do something with $rs
Dimitry
  • 361
  • 1
  • 7
0

The rationale behind this recommendation is that you want to be able to use the same precomputation elsewhere in your application. See Replace Temp with Query in refactoring patterns catalog:

The new method can then be used in other methods

    double basePrice = _quantity * _itemPrice;
    if (basePrice > 1000)
        return basePrice * 0.95;
    else
        return basePrice * 0.98;

           https://i.stack.imgur.com/mKbQM.gif

    if (basePrice() > 1000)
        return basePrice() * 0.95;
    else
        return basePrice() * 0.98;
...
double basePrice() {
    return _quantity * _itemPrice;
}

Thus, in your host and URI example, I would only apply this recommendation if I plan to reuse the same URI or host definition.

If this is the case, due to namespace, I won't define a global uri() or host() method, but a name with more information, like twilio_host(), or archive_request_uri().

Then, for line length issue, I see several options:

  • Create a local variable, like uri = archive_request_uri().

Rationale: In the current method, you want the URI to be the one mentioned. The URI definition is still factorized.

  • Define a local method, like uri() { return archive_request_uri() }

If you often use the Fowler's recommendation, you'd know that the uri() method is the same pattern.

If due to language choice you need to access local method with a 'self.', I'd recommend the first solution, for increased expressiveness (in Python, I'd define the uri function inside the current method).

gnat
  • 21,442
  • 29
  • 112
  • 288