37

I often find myself struggling to decide which of these two ways to use when I require to use common data across some methods in my classes. What would be a better choice?

In this option, I can create an instance variable to avoid the need of having to declare additional variables, and also to avoid defining method parameters, but it may be not so clear where those variables are being instantiated/modified:

public class MyClass {
    private int var1;

    MyClass(){
        doSomething();
        doSomethingElse();
        doMoreStuff();
    }

    private void doSomething(){
        var1 = 2;
    }

    private void doSomethingElse(){
        int var2 = var1 + 1;
    }

    private void doMoreStuff(){
        int var3 = var1 - 1;
    }
}

Or just instantiating local variables and passing them as arguments?

public class MyClass {  
    MyClass(){
        int var1 = doSomething();
        doSomethingElse(var1);
        doMoreStuff(var1);
    }

    private int doSomething(){
        int var = 2;
        return var;
    }

    private void doSomethingElse(int var){
        int var2 = var + 1;
    }

    private void doMoreStuff(int var){
        int var3 = var - 1;
    }
}

If the answer is that they are both correct, which one is seen/used more often? Also, if you can provide additional pros/cons for each option would be very valuable.

MetaFight
  • 11,549
  • 3
  • 44
  • 75
carlossierra
  • 1,395
  • 2
  • 13
  • 17
  • 1
    Reletad (duplicate?): http://programmers.stackexchange.com/questions/164347/method-flags-as-arguments-or-as-member-variables – Martin Ba Apr 22 '16 at 12:43
  • 1
    I don't think anyone has yet pointed out that putting intermediate results in instance variables can make concurrency more difficult, on account of the possibility for contention between threads for these variables. – sdenham Apr 24 '16 at 18:50

8 Answers8

110

I'm surprised this hasn't mentioned yet...

It depends if var1 is actually part of your object's state.

You assume that both of these approaches are correct and that it's just a matter of style. You are wrong.

This is entirely about how to properly model.

Similarly, private instance methods exist to mutate your object's state. If that's not what your method is doing then it should be private static.

MetaFight
  • 11,549
  • 3
  • 44
  • 75
  • could you care to explain further? – carlossierra Apr 21 '16 at 17:30
  • Hmm, although you could mark fields as [`transient`](http://docs.oracle.com/javase/specs/jls/se6/html/classes.html#8.3.1.3) _"to indicate that they are not part of the persistent state of an object."_ – mucaho Apr 21 '16 at 19:51
  • 7
    @mucaho: `transient` doesn't have anything to do with this, because `transient` is about *persistent* state, as in parts of the object that get saved when you do something like serialize the object. For example, an ArrayList's backing store is `transient` even though it's utterly crucial to the state of the ArrayList, because when you're serializing an ArrayList, you only want to save the part of the backing store that holds actual ArrayList elements, not the free space at the end reserved for further element additions. – user2357112 Apr 21 '16 at 20:56
  • 4
    Adding to the answer - if `var1` is needed for a couple of methods but not part of the state of `MyClass`, it might be time to put `var1` and those methods into another class to be used by `MyClass`. – Mike Partridge Apr 21 '16 at 21:16
  • 1
    @CandiedOrange No? This answer is spot on. If it's part of the state of the object, it should be stored. If it's something *derived* from the state of the object, it's not *part* of the state and shouldn't be stored. – jpmc26 Apr 22 '16 at 06:07
  • @jpmc26 if it is stored it is part of the state. Doesn't matter what it's useful lifetime is. Next you'll tell me i'm not allowed to initialize `i` outside of the `while` loop I want to use it in. – candied_orange Apr 22 '16 at 06:18
  • 5
    @CandiedOrange We're talking about correct modeling here. When we say "part of the object's state," we're not talking about the literal pieces in the code. We're discussing a *conceptual* notion of state, what *should* be the object's state to model the concepts properly. The "useful lifetime" is likely the biggest deciding factor in that. – jpmc26 Apr 22 '16 at 06:21
  • @jpmc26 yes you are. And that's sad. Because the issue isn't about state. I could get exactly the same effect by referring to a public static method or reading and writing from a file as I'm getting from having `var1` at the class level. To these methods it's nothing but a side effect. It's not the only way to get a side effect. So questioning `var1`'s right to exist simply misses the point entirely. I can code these methods either way, with, or without it. Whether I should is a darn good question but conceptual state isn't an answer for this. It's a distraction. – candied_orange Apr 22 '16 at 07:06
  • 1
    @candiedorange Consider an enumerator for an array. Next and Previous needs to know the index. Coceptually this should be part of the objects state, so you should implement it as a private variable. You could instead pass this value to the method, the methods would work but the enumerator would become pointless. – Taemyr Apr 22 '16 at 07:30
  • @Taemyr Ah, a useful example. Something that actually does something. Consider an enumerator with public `next` and `previous` methods that call private methods. What conceptual justification do you offer for telling me those private methods should or should not use the heap or the stack to read and write the index? It's the very fact that this can be done in either style, completely invisible to the outside world, that makes this answer frustrating. – candied_orange Apr 22 '16 at 07:54
  • 4
    @CandiedOrange Next and Previous are obviously public methods. If the value of var1 is only relevant across a sequence of private methods it's clearly not part of the persistent state of the object and thus should be passed as arguments. – Taemyr Apr 22 '16 at 08:26
  • "Similarly, private instance methods exist to mutate your object's state. If that's not what your method is doing then it should be private static." A private method can depend on state without changing it, in which case you can't implement it as static. – Taemyr Apr 22 '16 at 08:28
  • 1
    @Taemyr I'm struggling to think of a place in which a private instance method depending on but not mutating state is anything but dead code. Maybe you're thinking that something could do a calculation based on several fields and return the result, but IMO that should be static too; otherwise you have to step into another method to understand what's influencing what, which makes debugging a pain. – Nic Apr 22 '16 at 11:12
  • 1
    @QPaysTaxes Helper functions that are used in multiple public functions. – Taemyr Apr 22 '16 at 11:31
  • 1
    @Taemyr So... "*something could do a calculation based on several fields and return the result*"? Unless I'm misunderstanding you. – Nic Apr 22 '16 at 12:08
  • @QPaysTaxes let's be clear on one thing: program state > object's state. They are not the same. Both can be manipulated by private methods. A private method that doesn't return and doesn't mutate it's host object's state still has a whole system it could be talking to. Private helper methods can do a lot more than "mutate your object's state". Private helper methods are just an abstraction. They are called from within some other method which could have any accessibility. – candied_orange Apr 22 '16 at 13:05
  • @CandiedOrange Oh, I see. I'm not sure why you feel the need to be condescending; you could easily have clarified what you meant in the first place, but chose not to. A simple example like saving to a file or the like would have done perfectly well. In the future, please try to refrain from being rude when you're asked a simple question. – Nic Apr 22 '16 at 16:03
  • 1
    @QPaysTaxes I appologize. I didn't mean to offend. I am however confused. I have been trying very hard to clarify what I mean. Even offered an answer that you are welcome to downvote and comment on. If you understand now I'd love to hear your opinion. The popularity of this answer is making me think subroutines are forbidden in stateless objects which I'd never heard before. If you asked me a question, simple or otherwise, I can't find it. Please clarify. I'm flying in the face of popular opinion here and you're taking the time to argue with me. Which I respect and am grateful for. – candied_orange Apr 22 '16 at 20:05
  • 2
    @CandiedOrange You did clarify what you meant, after two comments. I'm saying that you easily could have in the first reply. Also, yeah, I forgot that there's more than just the object; I do agree that sometimes private instance methods have a purpose. I just couldn't bring any to mind. EDIT: I just realized you weren't, in fact, the first person to reply to me. My bad. I was confused as to why one reply was a curt, one-sentence, non-reply, while the next actually explained things. – Nic Apr 22 '16 at 21:16
  • @QPaysTaxes that's fine. Now if you agree that private instance methods exist **for more than just** to mutate your object's state, that some class variables do not exist to express state, that doing so might sometimes be a good design then how should we pick between these styles? – candied_orange Apr 22 '16 at 22:57
18

I don't know which is more prevalent, but I would always do the latter. It more clearly communicates the data flow and lifetime, and it doesn't bloat every instance of your class with a field whose only relevant lifetime is during initialization. I would say the former is just confusing, and makes code review significantly more difficult, because I have to consider the possibility that any method might modify var1.

Derek Elkins left SE
  • 6,591
  • 2
  • 13
  • 21
7

You should reduce the scope of your variables as much as possible (and reasonable). Not only in methods, but generally.

For your question that means it depends whether the variable is part of the object's state. If yes, it's OK to use it in that scope, i.e. the whole object. In this case go with the first option. If no, go with the second option as it reduces the visibility of variables and thus overall complexity.

Andy
  • 1,305
  • 7
  • 13
5

What Style is Better (Instance Variable vs. Return Value) in Java

There's another style - use a context/state.

public static class MyClass {
    // Hold my state from one call to the next.
    public static final class State {
        int var1;
    }

    MyClass() {
        State state = new State();
        doSomething(state);
        doSomethingElse(state);
        doMoreStuff(state);
    }

    private void doSomething(State state) {
        state.var1 = 2;
    }

    private void doSomethingElse(State state) {
        int var2 = state.var1 + 1;
    }

    private void doMoreStuff(State state) {
        int var3 = state.var1 - 1;
    }
}

There are a number of benefits to this approach. The state objects can change independently of the object for example, giving much wiggle room for the future.

This is a pattern that also works well in a distributed/server system where some details must be preserved across calls. You can store user details, database connections, etc. in the state object.

OldCurmudgeon
  • 778
  • 5
  • 11
3

It's about side effects.

Asking whether var1 is part of state misses the point of this question. Sure if var1 must persist, it has to be an instance. Either approach can be made to work whether persistence is needed or not.

The side effect approach

Some instance variables are only used to communicate between private methods from call to call. This kind of instance variable can be refactored out of existence but it doesn't have to be. Sometimes things are clearer with them. But this is not without risk.

You are letting a variable out of its scope because it is used in two different private scopes. Not because it's needed in the scope you're placing it. This can be confusing. The "globals are evil!" level of confusing. This can work but it just won't scale well. It only works in the small. No big objects. No long inheritance chains. Don't cause a yo yo effect.

The functional approach

Now, even if var1 must persist nothing says you have to use if for every transient value it might take on before it reaches the state you want preserved between public calls. That means you can still set a var1 instance using nothing but more functional methods.

So part of state or not, you can still use either approach.

In these examples 'var1' is so encapsulated nothing besides your debugger knows it exists. I'm guessing you did that deliberately because you don't want to bias us. Fortunately I don't care which.

The risk of side effects

That said, I know where your question is coming from. I've worked under miserable yo yo'ing inheritance that mutates an instance variable at multiple levels in multiple methods and gone squirrelly trying to follow it. This is the risk.

This is the pain that drives me to a more functional approach. A method can document its dependencies and output in its signature. This is a powerful, clear approach. It also lets you change what you pass the private method making it more reusable within the class.

The upside of side effects

It's also limiting. Pure functions have no side effects. That can be a good thing but it's not object oriented. A big part of object orientation is the ability to refer to a context outside of the method. Doing that without leaking globals all over here and gone is OOP's strength. I get the flexibility of a global but it's nicely contained in the class. I can call one method and mutate every instance variable at once if I like. If I do that I'm obligated to at least give the method a name that makes clear what it's up to so people won't be surprised when that happens. Comments can help as well. Sometimes these comments are formalized as "post conditions".

The downside of functional private methods

The functional approach makes some dependencies clear. Unless your in a pure functional language it can't rule out hidden dependencies. You don't know, looking just at a methods signature, that it isn't hiding a side effect from you in the rest of it's code. You just don't.

Post conditionals

If you, and everyone else on the team, reliably documents the side effects (pre/post conditions) in comments, then the gain from the functional approach is much less. Yeah I know, dream on.

Conclusion

Personally I tend towards functional private methods in either case if I can, but honestly it's mostly because those pre/post conditional side effect comments don't cause compiler errors when they're outdated or when methods are called out of order. Unless I really need the flexibility of side effects I'd rather just know that things work.

  • 1
    "Some instance variables are only used to communicate between private methods from call to call." That's a code smell. When instance variables are used this way, it is a sign the class is too large. Extract that variables and methods into a new class. – kevin cline Apr 21 '16 at 23:31
  • 2
    This really doesn't make any sense. While the OP *could* write the code in the functional paradigm (and this would generally be a good thing), that obviously isn't the context of the question. Trying to tell the OP that they can avoid storing the object's state by changing paradigms isn't really relevant... – jpmc26 Apr 22 '16 at 07:18
  • @kevincline the OP's example has only one instance variable and 3 methods. Essentially, it already has been extracted into a new class. Not that the example does anything useful. Class size has nothing to do with how your private helper methods communicate with each other. – MagicWindow Apr 22 '16 at 14:48
  • @jpmc26 OP has already shown that they can avoid storing `var1` as a state variable by changing paradigms. Class scope is NOT just where state is stored. It is also an enclosing scope. That means there are two possible motivations to put a variable at the class level. You can claim doing it only for the enclosing scope and not state is evil but I say it's a trade off with [arity](https://en.wikipedia.org/wiki/Arity) which is also evil. I know this because I've had to maintain code that does this. Some was done well. Some was a nightmare. The line between the two isn't state. It's readability. – MagicWindow Apr 22 '16 at 18:25
  • The state rule enhances readability: 1) avoid having interim results of operations look like state 2) avoid hiding the dependencies of methods. If the arity of a method is high, then it either irreducibly and genuinely represents the complexity of that method or the current design has unnecessary complexity. – sdenham Apr 24 '16 at 17:11
1

The first variant looks non-intuitive and potentially dangerous to me (imagine for whatever reason someone makes your private method public).

I would much rather instantiate your variables upon class construction, or pass them in as arguments. The latter gives you the option of using functional idioms and not relying on the state of the containing object.

Brian Agnew
  • 4,676
  • 24
  • 19
1

There already are answers talking about object state and when the second method is preferred. I just want to add one common use case for the first pattern.

The first pattern is perfectly valid when all what your class does is that it encapsulates an algorithm. One use case for this is that if you wrote the algorithm to a single method it would be too large. So you break it down into smaller methods make it a class and make the sub methods private.

Now passing all the state of the algorithm via parameters might get tedious so you use the private fields. It is also in agreement with the rule in the first paragraph because it is basically a state of the instance. You only have to keep in mind and properly document that the algorithm is not reentrant if you use private fields for this. This shouldn't be an issue most of the time but it can possibly bite you.

Honza Brabec
  • 602
  • 4
  • 11
0

Let's try an example that does something. Forgive me since this is javascript not java. The point should be the same.

Visit https://blockly-games.appspot.com/pond-duck?lang=en, click the javascript tab, and paste this:

hunt = lock(-90,1), 
heading = -135;

while(true) {
  hunt()  
  heading += 2
  swim(heading,30)
}

function lock(direction, width) {
  var dir = direction
  var wid = width
  var dis = 10000

  //hunt
  return function() {
    //randomize() //Calling this here makes the state of dir meaningless
    scanLock()
    adjustWid()
    if (isSpotted()) {
      if (inRange()) {
        if (wid <= 4) {
          cannon(dir, dis)
        }
      }
    } else {
      if (!left()) {
        right()
      }
    }
  }

  function scanLock() {
    dis = scan(dir, wid);
  }

  function adjustWid() {
    if (inRange()) {
      if (wid > 1)
        wid /= 2;
    } else {
      if (wid < 16) {
        wid *= 2; 
      }
    }
  }

  function isSpotted() {
    return dis < 1000;
  }

  function left() {
    dir += wid;
    scanLock();
    return isSpotted();
  }

  function right() {
    dir -= wid*2;
    scanLock();
    return isSpotted()
  }

  function inRange() {
    return dis < 70;
  }

  function randomize() {
    dir = Math.random() * 360
  }
}

You should notice that dir, wid, and dis are not getting passed around a lot. You should also notice that the code in the function that lock() returns looks a lot like pseudo code. Well it's actual code. Yet it's very easy to read. We could add passing and assigning but that would add clutter you would never see in pseudo code.

If you wish to argue that not doing the assign and passing is fine because those three vars are persistent state then consider a redesign that assigns dir a random value every loop. Not persistent now, is it?

Sure, now we could drop dir down in scope now but not without being forced to clutter our pseudo like code with passing and setting.

So no, state is not why you decide to use or not use side effects rather than passing and returning. Nor do side effects alone mean your code is unreadable. You don't get the benefits of pure functional code. But done well, with good names, they can actually be nice to read.

That isn't to say they can't turn into a spaghetti like nightmare. But then, what can't?

Happy duck hunting.

candied_orange
  • 102,279
  • 24
  • 197
  • 315
  • 1
    Inner/outer functions are a different paradigm than what OP describes. - I agree that the trade off between local variables in an outer function contra passed arguments to the inner functions is similar to private variables on a class contra passed argumetns to private functions. However if you do this comparison the lifetime of the object would correspond to a single run of the function, so the variables in the outer function is part of the persistent state. – Taemyr Apr 22 '16 at 11:29
  • @Taemyr the OP describes a private methods called within a public constructor which seems a lot like inner's within outer's to me. The state isn't really 'persistent' if you step on it every time you enter a function. Sometimes state is used as a shared place methods can write and read. Doesn't mean it needs to be persisted. The fact that it is persisted anyway isn't something anything HAS to depend on. – candied_orange Apr 22 '16 at 13:18
  • 1
    It's persistent even if you step on it every time you dispose of the object. – Taemyr Apr 22 '16 at 13:41
  • @Taemyr The memory allocation persists. But no logical or conceptual dependency exists. Simply allocating memory doesn't mean the memorys value or state can be expressed as part of the object's state. If X is set in A ,only ever read in B, and A & B are always called in order with no time delay then please explain why I should think of X as state? It only has logical existence between them and can't be accessed. Yet this can be used to do work. Are you saying it should never be done this way? – candied_orange Apr 22 '16 at 20:29
  • 1
    Good points, but expressing them in JavaScript really obfuscates the discussion. Also, if you strip away the JS syntax, you end up with the context object being passed around (the closure of the outer function) – fdreger Apr 23 '16 at 10:48
  • @fdreger You're right of course. if you think of `lock()` as the constructor and `hunt()` as a public method it's not that different from java. `dir`,`wid`, and `dis` are the instance variables or state. Calling randomize() at the start of `hunt()` would keep `dir`s state from having any meaning. If I decide to call `randomize()` at the start of `hunt()` do I have to rewrite all the methods that use `dir` to accept it as a parameter since now it's not really state? – candied_orange Apr 23 '16 at 14:51
  • Constancy of an attribute's value is not what people mean by persistence in this discussion, which is that the attribute is part of the complete description of the object - e.g. the water level is part of of a reservoir's state. The concept of class in Java etc. is predicated on the premise that a type's semantics can be represented by a fixed set of such attributes (I am not defending this use of 'persistence'; I am trying to explain it.) As @fdreger points out, your wid, dis and dir are valid state attributes, so this isn't the counter-example against the state rule that you are seeking. – sdenham Apr 24 '16 at 18:15
  • @sdenham it is when I refactor to call `randomize()` as the first step in the `hunt()` function. Now `dir` is no longer stateful. When I make that change am I obligated to push dir down out of `hunt()`'s scope and end up having to pass it to everything that uses it? I already mentioned this in the 4th paragraph. – candied_orange Apr 24 '16 at 19:20
  • Randomizing the value of an attribute does not rule out that attribute from being an element of state. For example, the number showing uppermost on a die after it has been rolled is part of its state. – sdenham Apr 27 '16 at 17:14
  • @sdenham it rules it out as being persistent state for the object. Really every bit of data and every line of code is system state. But thinking that way is not that helpful when trying to answer this question. At issue is if anything that doesn't represent the persistent state of the object has any business being declared at the class level. If you think nothing else does then once I use `randomize()` and make `dir`s state irrelevant then it's in the wrong place and must be moved. If you think the class level is for more things than persistent object state then it's fine left where it is. – candied_orange Apr 27 '16 at 18:12
  • Your first sentence is mistaken. A persistent attribute can have a variable value - see the reservoir example in my first comment here. I (and others here) disagree with your last sentence because one can generally distinguish between the state attributes of an object and the intermediate results of an operation, and values that are only the latter should not be bound to objects' member attributes without good reason. Simply avoiding putting them in method argument lists is not enough IOHO. The 'state rule' is a rule of thumb, not an inviolate law. BTW I did not vote you down. – sdenham Apr 27 '16 at 19:55
  • @sdenham Nothing depends on the persisting value. So it's persistence isn't needed. That means there is no logical requirement for it to be at the class level. That means it could be pushed down from the class level if it's going to be randomized. But doing that also means it has to be passed into all methods that would use it. That's not nothing. That's arity. Your reservoir would only fit this example if the water level changed every time you looked at it. If that was the case is there any need for it to hold water when you aren't looking? – candied_orange Apr 27 '16 at 20:19
  • Before posting, you might have asked yourself whether there are not obvious examples that avoid your latest objection - e.g. object: capacitor, attribute: charge. More importantly, this repetition of your values argument shows you still don't get the distinction between attributes and their values. Until you do, you will not see why others (@metaFight, @jpmc26, @Taemyr) see a difference between the attributes of a class of objects and transient intermediate results in an operation. Just because both may vary, it does not follow that they are equivalent; there are semantic and lifetime issues. – sdenham Apr 28 '16 at 12:45
  • 1
    @sdenham I'm arguing that there is a difference between the attributes of a class and transient intermediate results of an operation. They are not equivalent. But one may be stored in the other. It certainly doesn't have to be. The question is if it ever should be. Yes there are semantic and lifetime issues. There are also arity issues. You don't think they're important enough. That's fine. But to say that using the class level for anything other than object state is incorrect modeling is insisting on one way of modeling. I've maintained professional code that did it the other way. – candied_orange Apr 28 '16 at 13:42