43

I'm writing a program in Java where at one point I need to load a password for my keystore. Just for fun, I tried to keep my password in Java as short as possible by doing this:

//Some code
....

KeyManagerFactory keyManager = KeyManagerFactory.getInstance("SunX509");
Keystore keyStore = KeyStore.getInstance("JKS");

{
    char[] password = getPassword();
    keyStore.load(new FileInputStream(keyStoreLocation), password);
    keyManager.init(keyStore, password);
}

...
//Some more code

Now, I know in this instance that's kinda dumb. There are a bunch of other things I could've done, most of them actually better (I could've not used a variable at all).

However, I was curious if there's a case where doing this wasn't so dumb. The only other thing I can think of is if you wanted to reuse common variable names like count, or temp, but good naming conventions and short methods make it unlikely that would be useful.

Is there a case where using blocks only to reduce variable scope makes sense?

Lord Farquaad
  • 569
  • 5
  • 10
  • Possible duplicate of [How would you know if you've written readable and easily maintainable code?](https://softwareengineering.stackexchange.com/questions/141005/how-would-you-know-if-youve-written-readable-and-easily-maintainable-code) – gnat Nov 29 '17 at 15:00
  • 13
    @gnat ....what? I don't even know how to edit my question to be less similar to that dupe target. What part of this led you to believe these questions are the same? – Lord Farquaad Nov 29 '17 at 15:07
  • 22
    The usual reply to this is that your anonymous block should be a method with a significant name instead, which gives you automatic documentation *and* the reduced scope you wanted. Offhand, I can't think of a situation where extracting a method isn't better than inserting a naked block. – Kilian Foth Nov 29 '17 at 15:08
  • 4
    There are a couple of similar questions that ask this for C++: [Is it bad practice to create blocks of code?](https://softwareengineering.stackexchange.com/q/360123) and [Structure Code with curly brackets](https://softwareengineering.stackexchange.com/q/357450). Maybe one of those contains an answer to this question, though C++ and Java are very different in this respect. – amon Nov 29 '17 at 15:09
  • @LordFarquaad, I don't think you need to delete it. If it gets closed as a duplicate, it's still useful to keep as it then provides another way of directing folk to the original questions and their answers. – David Arno Nov 29 '17 at 16:19
  • In this example, if you added a final line, `password = null;`, it **might** make a little sense that you were paranoid about somebody sniffing through the executing code somehow. Not sure if it would actually help, but maybe somebody with security expertise could comment? – user949300 Nov 29 '17 at 17:20
  • 2
    Possible duplicate of [Is it bad practice to create blocks of code?](https://softwareengineering.stackexchange.com/questions/360123/is-it-bad-practice-to-create-blocks-of-code) – Panzercrisis Nov 29 '17 at 20:21
  • In your specific case, I don't think Java requires the JVM to clean up that password when exiting there. I would not be surprised if the compiler implementation and/or JVM implementation did clean that up when you expect, but I don't think that's to be counted on, especially if it were a security concern. – Aaron Nov 29 '17 at 22:05
  • 4
    @gnat Why is that question even open? It's one of the broadest things I have *ever* seen. Is it an attempt at a canonical question? – jpmc26 Nov 29 '17 at 23:49
  • Also relevant, on SE: [Is it wrong to use braces for variable scope purposes?](https://stackoverflow.com/questions/3189366/is-it-wrong-to-use-braces-for-variable-scope-purposes/3189889#3189889). (Full disclosure: I have an answer on that question). – Mac Nov 30 '17 at 03:46
  • @jpmc26 not canonical, no. Business as usual, see [Recent Trouble With Popularity](https://softwareengineering.meta.stackexchange.com/q/7525/31260) – gnat Nov 30 '17 at 06:36
  • @KilianFoth Surely extracting a method does the opposite of "reducing scope"? You're going from e.g. a `temp` var polluting a single method/block, to a `newlyExtractedMethod` polluting the entire class (including the original scope that `temp` was in) – Warbo Nov 30 '17 at 14:30

7 Answers7

36

First, speaking to the underlying mechanics:

In C++ scope == lifetime b/c destructors are invoked on the exit from the scope. Further, an important distinction in C/C++ we can declare local objects. In the runtime for C/C++ the compiler will generally allocate a stack frame for the method that is as large as may be needed, in advance, rather than allocating more stack space on entry to each scope (that declares variables). So, the compiler is collapsing or flattening the scopes.

The C/C++ compiler may reuse stack storage space for locals that don't conflict in usage lifetime (it generally will use analysis of the actual code to determine this rather than scope, b/c that is more accurate than scope!).

I mention C/C++ b/c Java's syntax, i.e. curly braces and scoping, is at least in part derived from that family. And also because C++ came up in question comments.

By contrast, it is not possible to have local objects in Java: all objects are heap objects, and all locals/formals are either reference variables or primitive types (btw, the same is true for statics).

Further, in Java scope and lifetime are not exactly equated: being in/out of scope is a mostly a compile time concept going to accessibility and name conflicts; nothing really happens in Java on scope exit with regard to cleanup of variables. Java's garbage collection determines (the ending point of the) lifetime of objects.

Also Java's bytecode mechanism (the output of the Java compiler) tends to promote user variables declared within limited scopes to the top level of the method, because there is no scope feature at the bytecode level (this is similar to C/C++ handling of the stack). At best the compiler could reuse local variable slots, but (unlike C/C++) only if their type is the same.

(Though, to be sure the underlying JIT compiler in the runtime could reuse the same stack storage space for two differently typed (and different slotted) locals, with sufficient analysis of the bytecode.)

Speaking to the programmer advantage, I would tend to agree with others that a (private) method is a better construct even if used only once.

Still, there is nothing really wrong with using it to deconflict names.

I have done this in rare circumstances when making individual methods is a burden. For example, when writing an interpreter using a large switch statement within a loop, I might (depending on factors) introduce a separate block for each case to keep the individual cases more separate from each other, instead of making each case a different method (each of which is only invoked once).

(Note that as code in blocks, the individual cases have access to the "break;" and "continue;" statements regarding the enclosing loop, whereas as methods would require returning booleans and the caller's use of conditionals to gain access to these control flow statements.)

Erik Eidt
  • 33,282
  • 5
  • 57
  • 91
  • Very interesting answer (+1) ! However one complement to the C++ part. If password is a string in a block, the string object will allocate space somewhere on the free store to for the variable size part. When leaving the bloc, the string will be destroyed, resulting in the deallocation of the variable part. But as it's not on the stack, nothing guarantees that it will be overwritten soon. So better manage the confidentiality by overwriting each of its character before the string gets destroyed ;-) – Christophe Nov 29 '17 at 23:16
  • @Christophe, I'd be careful of how you do that. Overwriting the characters doesn't produce any side effects. If the string is being destroyed right after, the compiler is free to ignore the overwriting. This is why Windows has a [`SecureZeroMemory`](https://blogs.msdn.microsoft.com/oldnewthing/20130529-00/?p=4223) function. – chris Nov 30 '17 at 03:43
  • I never knew C had methods. How interesting! – tchrist Nov 30 '17 at 03:53
  • @tchrist, oh, was speaking about Java here, sorry for the misunderstanding. – Erik Eidt Nov 30 '17 at 03:59
  • 1
    @ErikEidt I was trying to draw attention to the problem of speaking of "C/C++" as though it were actually a "thing", but it is not: *"In the runtime for C/C++ the compiler will generally allocate a stack frame for the **method** that is as large as may be needed"* but C does not have methods at all. It has functions. These are different, especially in C++. – tchrist Nov 30 '17 at 04:01
  • @tchrist, ok, you're right, my bad! C does not have methods, but functions! – Erik Eidt Nov 30 '17 at 06:19
  • 7
    "_By contrast, it is not possible to have local objects in Java: all objects are heap objects, and all locals/formals are either reference variables or primitive types (btw, the same is true for statics)._" - Note that this is not entirely true, especially not for method local variables. [Escape analysis](https://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#escapeAnalysis) can allocate objects to the stack. – Boris the Spider Nov 30 '17 at 07:42
  • Last 4 paragraphs are on point. But the first part, while entertaining is less relevant. I suggest restructuring to put more emphasis on actual answer and less on the historic aspect. – Agent_L Nov 30 '17 at 10:26
  • “In C++ scope == lifetime” — noooooo. That’s true only for a small subset of objects (automatic variables). In fact, the distinction between scope and lifetime is *crucial* in C++. And, as others have mentioned, the rules are quite different for C and C++; and speaking of “C/C++” is seriously misguided in this context. – Konrad Rudolph Nov 30 '17 at 11:05
  • 1
    “*At best the compiler could reuse local variable slots, but (unlike C/C++) only if their type is the same.*” is simply wrong. At bytecode level, local variables can get assigned and reassigned with whatever you want, the only constraint is that the *subsequent use* is compatible to the latest assigned item’s type. Reusing local variable slots is the norm, e.g. with `{ int x = 42; String s="blah"; } { long y = 0; }`, the `long` variable will reuse the slots of *both* variables, `x` and `s` with all commonly used compilers (`javac` and `ecj`). – Holger Nov 30 '17 at 14:02
  • 1
    @Boris the Spider: that’s an often repeated colloquial statement that doesn’t really match what’s going on. Escape Analysis may enable Scalarization, which is the decomposition of the object’s fields into variables, which are then subject to further optimizations. Some of them might be eliminated as unused, others may get folded with the source variables used to initialize them, others get mapped to CPU registers. The result rarely matches what people might imagine when saying “allocated on the stack”. – Holger Nov 30 '17 at 14:54
  • [To demonstrate the reuse of local variables in a stack frame in Java (Ideone)](https://ideone.com/UX6x15)… – Holger Nov 30 '17 at 15:29
  • @BoristheSpider, you are very correct in that the underlying runtime may locate an object on the stack, which is cool. Yes, under the covers, optimizations are possible (e.g to place instantiated objects on the stack, and maybe even elide the reference variables that refer to them). Yet still, local variables we can declare in the Java language are references (or primitives); we cannot declare variables that are themselves objects, we can only instantiate objects via `new` . – Erik Eidt Dec 14 '17 at 00:19
  • If a variable goes out of scope, doesn't that signal to Java's GC that it is ready to be garbage collected? – moonman239 Jul 02 '18 at 08:33
  • @moonman239, no, it goes the other way around: the GC detects what is active (in scope) only during GC collection operations, which are triggered by low memory (which is caused by memory allocation). This means that it doesn't need a signal for variables going out of scope; the program can run free between GC collection operations. – Erik Eidt Jul 02 '18 at 14:23
  • @moonman239, Further, nested scopes -- those inside a method -- apply only during compile time: the Java bytecode mechanism looses these scopes, flattening them all to the method level. So, the JVM doesn't even see the original inner scopes; it only sees flattened bytecode methods. – Erik Eidt Jul 02 '18 at 14:25
  • I would not introduce a private methods in order to avoid nested scopes in all cases. They also document how long a local variable is used in the code. So even if using them has no influence on the compiled code it is still a useful documentation tool. And if you see it like this it doesn't always make sense to move the few lines to separate method. Or do you always move a few lines of code into a separate method in oder to place a code comment over that code section. – Jonny Dee Oct 24 '18 at 06:22
27

In my opinion it would be more clear to pull the block out into its own method. If you left this block in, I would hope to see a comment clarifying why you're putting the block there in the first place. At that point it just feels cleaner to have the method call to initializeKeyManager() which keeps the password variable limited to the scope of the method and shows your intent with the code block.

Casey Langford
  • 444
  • 3
  • 7
  • 2
    I think your answer skirts by a key usecase, it’s an intermediate step during a refactor. It can be useful to limit the scope and know your code still works just prior to extracting a method. – RubberDuck Nov 29 '17 at 22:00
  • 19
    @RubberDuck true, but in that case the rest of us should never see it so it doesn't matter what we think. What an adult coder and an a consenting compiler get up to in the privacy of their own cubicle is no one else's concern. – candied_orange Nov 30 '17 at 01:51
  • @candied_orange I fear I get it not :( Care to elaborate ? – doubleOrt Jul 10 '18 at 20:27
  • @doubleOrt I meant intermediate refactoring steps are real and necessary but certainly don't need to be enshrined in source control where everyone can look at it. Just finish refactoring before you check in. – candied_orange Jul 10 '18 at 21:12
  • @candied_orange Oh, thought you were arguing and not extending RubberDuck's argument. – doubleOrt Jul 10 '18 at 21:38
17

They can be useful in Rust, with it's strict borrowing rules. And generally, in functional languages, where that block returns a value. But in languages like Java? While the idea seems elegant, in practice it's rarely used so the confusion from seeing it will dwarf the potential clarity of limiting the variables' scope.

There is one case though where I find this useful - in a switch statement! Sometimes you want to declare a variable in a case:

switch (op) {
    case ADD:
        int result = a + b;
        System.out.printf("%s + %s = %s\n", a, b, result);
        break;
    case SUB:
        int result = a - b;
        System.out.printf("%s - %s = %s\n", a, b, result);
        break;
}

This, however, will fail:

error: variable result is already defined in method main(String[])
            int result = a - b;

switch statements are weird - they are control statements, but due to their fallthrough they don't introduce scopes.

So - you can just use blocks to create the scopes yourself:

switch (op) {
    case ADD: {
        int result = a + b;
        System.out.printf("%s + %s = %s\n", a, b, result);
    } break;
    case SUB: {
        int result = a - b;
        System.out.printf("%s - %s = %s\n", a, b, result);
    } break;
}
Idan Arye
  • 12,032
  • 31
  • 40
6
  • General case:

Is there a case where using blocks only to reduce variable scope makes sense?

It is generally considered good practice to keep methods short. Reducing the scope of some variables can be a sign that your method is quite long, sufficiently so that you think the scope of the variables needs to be reduced. It this case, it's probably worth creating a separate method for that part of the code.

The cases that I would find problematic is when that part of code uses more than a handful of arguments. There are situations where you could end up with small methods that each take a lot of parameters (or need workaround to simulate returning multiple values). The solution to that particular problem is to create another class that keeps hold of the various variables you use, and implements implements all the steps you have in mind in its relatively short methods: this is a typical use-case for using a Builder pattern.

That said, all these coding guidelines can be applied loosely sometimes. There are odd cases sometimes, where the code can be more readable if you keep it in a slightly longer method, instead of splitting it into small sub-methods (or builder patterns). Typically, this works for problems that are medium-sized, quite linear, and where too much splitting actually hinders readability (especially if you need to jump between too many methods to understand what the code does).

It can make sense, but it's very rare.

  • Your use case:

Here is your code:

KeyManagerFactory keyManager = KeyManagerFactory.getInstance("SunX509");
Keystore keyStore = KeyStore.getInstance("JKS");

{
    char[] password = getPassword();
    keyStore.load(new FileInputStream(keyStoreLocation), password);
    keyManager.init(keyStore, password);
}

You're creating a FileInputStream, but you're never closing it.

One way of solving this is to use a try-with-resources statement. It scopes the InputStream, and you can also use this block to reduce the scope of other variables if you wish (within reason, since these lines are related):

KeyManagerFactory keyManager = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
Keystore keyStore = KeyStore.getInstance("JKS");

try (InputStream fis = new FileInputStream(keyStoreLocation)) {
    char[] password = getPassword();
    keyStore.load(fis, password);
    keyManager.init(keyStore, password);
}

(It'a also often better to use getDefaultAlgorithm() instead of SunX509, by the way.)

In addition, while the advice to separate chunks of code into separate methods is generally sound. It's rarely worth it if it's for just 3 lines (if anything, that's a case where creating a separate method hinders the readability).

Bruno
  • 327
  • 1
  • 5
  • "Reducing the scope of some variables can be a sign that your method is quite long" - It _can_ be a sign but it also _can_ and IMHO _should_ be used in order to _document_ how long a local variable's value is used. In fact, for languages not supporting nested scopes (e.g. Python) I'd appreciate if a corresponding code comment stated the "end of life" of a variable. Should the same variable be reused later it would be easy to figure out if it must (or should) have been initialized with a new value somewhere before (but after the "end of life" comment). – Jonny Dee Oct 24 '18 at 07:13
2

In my humble opinion, it's generally something to avoid largely for production code because you're generally not tempted to do it except in sporadic functions that perform disparate tasks. I tend to do it in some scrap code used to test things, but find no temptation to do it in production code where I've put some thought in advance as to what each function should do, since then the function will naturally have a very limited scope with respect to its local state.

I've never really seen examples of anonymous blocks being used like this (not involving conditionals, a try block for a transaction, etc) to reduce scope in a meaningful way in a function which didn't beg the question of why it couldn't be divided further into simpler functions with reduced scopes if it actually really benefited practically from a genuine SE standpoint from the anonymous blocks. It's usually eclectic code that's doing a bunch of loosely related or very unrelated things where we're most tempted to reach for this.

As an example, if you are trying to do this to reuse a variable named count, then it suggests you're counting two disparate things. If the variable name is going to be as short as count, then it makes sense to me to tie it to the context of the function, which could potentially just be counting one type of thing. Then you can instantly look at the function's name and/or documentation, see count, and instantly know what it means in the context of what the function is doing without analyzing all the code. I don't often find a good argument for a function to count two different things reusing the same variable name in ways that make anonymous scopes/blocks so appealing compared to the alternatives. That's not so suggest that all functions have to count only one thing. I'm saying I see little practical engineering benefit to a function reusing the same variable name to count two or more things and using anonymous blocks to limit scope of each individual count. If the function is simple and clear, it's not the end of the world to have two differently named count variables with the first one possibly having a few more lines of scope visibility than ideally required. Such functions generally are not the source of errors lacking such anonymous blocks to reduce the minimal scope of its local variables even further.

Not a Suggestion for Superfluous Methods

This is not to suggest you forcefully create methods either just to reduce scope. That's arguably just as bad or worse, and what I'm suggesting shouldn't call for a need for awkward private "helper" methods any more than a need for anonymous scopes. That's thinking too much about the code as it is now and how to reduce the scope of variables than thinking about how to conceptually solve the problem at the interface level in ways that yield clean, short visibility of local function states naturally without deep nesting of blocks and 6+ levels of indentation. I agree with Bruno that you can hinder code readability by forcefully shoving 3 lines of code into a function, but that's starting with the assumption that you are evolving the functions you create based on your existing implementation, rather than designing the functions without getting tangled up in implementations. If you do it the latter way, I have found little need for anonymous blocks that serve no purpose beyond reducing variable scope within a given method unless you're trying zealously to reduce the scope of a variable by just a few less lines of harmless code where the exotic introduction of these anonymous blocks arguably contributes as much intellectual overhead as they remove.

Trying to Reduce Minimum Scopes Even Further

If reducing local variable scopes to the absolute minimum was worthwhile, then there should be a wide acceptance of code like this:

ImageIO.write(new Robot("borg").createScreenCapture(new Rectangle(Toolkit.getDefaultToolkit().getScreenSize())), "png", new File(Db.getUserId(User.handle()).toString()));

... as that causes the minimum visibility of state by not even creating variables to refer to them in the first place. I don't want to come off as dogmatic but I actually think the pragmatic solution is to avoid anonymous blocks when possible just as it is to avoid the monstrous line of code above, and if they seem so absolutely necessary in a production context from the perspective of correctness and maintaining invariants within a function, then I definitely think how you are organizing your code into functions and designing your interfaces is worth re-examining. Naturally if your method is 400 lines long and a variable's scope is visible to 300 lines of code more than needed, that could be a genuine engineering problem, but it's not necessarily a problem to solve with anonymous blocks.

If nothing else, utilizing anonymous blocks all over the place is exotic, not idiomatic, and exotic code carries the risk of being hated by others, if not yourself, years later.

The Practical Usefulness of Reducing Scope

The ultimate usefulness of reducing the scope of variables is to allow you to get state management correct and keep it correct and allow you to easily reason about what any given portion of a codebase does -- to be able to maintain conceptual invariants. If a single function's local state management is so complex that you have to forcefully reduce scope with an anonymous block in code that's not meant to be finalized and good, then again, that's a sign to me that the function itself needs to be re-examined. If you have difficulty reasoning about the state management of variables in a local function scope, imagine the difficulty of reasoning about the private variables accessible to every method of an entire class. We can't use anonymous blocks to reduce their visibility. To me it helps to begin with the acceptance that variables will tend to have a slightly wider scope than they ideally need to have in many languages, provided it's not getting out of hand to the point where you'll have difficulty maintaining invariants. It's not something to solve so much with anonymous blocks as I see it as accept from a pragmatic standpoint.

  • As a final caveat, as I mentioned I still utilize anonymous blocks from time to time, often in main entry point methods for scrap code. Someone else mentioned it as a refactoring tool and I think that's fine, since the result is intermediary and intended to be further refactored, not finalized code. I'm not discounting their usefulness entirely, but if you are trying to write code that is widely accepted, well-tested, and correct, I hardly see the need for anonymous blocks. Getting obsessed about using them can shift away the focus from designing clearer functions with naturally small blocks. –  Nov 29 '17 at 23:05
2

Is there a case where using blocks only to reduce variable scope makes sense?

I think that motivation is sufficient reason to introduce a block, yes.

As Casey noted, the block is a "code smell" that indicates you may be better off extracting the block into a function. But you may not.

John Carmark wrote a memo on inlined code in 2007. His concluding summary

  • If a function is only called from a single place, consider inlining it.
  • If a function is called from multiple places, see if it is possible to arrange for the work to be done in a single place, perhaps with flags, and inline that.
  • If there are multiple versions of a function, consider making a single function with more, possibly defaulted, parameters.
  • If the work is close to purely functional, with few references to global state, try to make it completely functional.

So think, make a choice with purpose, and (optional) leave evidence describing your motivation for the choice you made.

VoiceOfUnreason
  • 32,131
  • 2
  • 42
  • 79
1

My opinion may be controversial, but yes, I think that there are certainly situations where I would use this kind of style. However, "just to reduce the variable's scope" is not a valid argument, because that's what you are already doing. And doing something just to do it is not a valid reason. Also note that this explanation has no meaning if your team has already resolved whether this kind of syntax is conventional or not.

I am primarily a C# programmer, but I have heard that in Java, returning a password as char[] is to allow you to reduce the time the password will stay in the memory. In this example, it will not help, because the garbage collector is permitted to collect the array from the moment it is not in use, so it matters not if the password leaves the scope. I am not arguing whether it is viable to clear the array after you are done with it, but in this case, if you want to do it, scoping the variable makes sense:

{
    char[] password = getPassword();
    try{
        keyStore.load(new FileInputStream(keyStoreLocation), password);
        keyManager.init(keyStore, password);
    }finally{
        Arrays.fill(password, '\0');
    }
}

This is really similar to the try-with-resources statement, because it scopes the resource and performs a finalization on it after you are done. Again please note that I am not arguing for handling passwords in this manner, just that if you decide to do so, this style is reasonable.

The reason for this is that the variable is no longer valid. You have created it, used it, and invalidated its state to contain no meaningful information. It makes no sense to use the variable after this block, so it is reasonable to scope it.

Another example I can think of is when you have two variables that have a similar name and meaning, but you work with one and then with another, and you want to keep them separate. I have written this code in C#:

{
    MethodBuilder m_ToString = tb.DefineMethod("ToString", MethodAttributes.Public | MethodAttributes.Virtual | MethodAttributes.Final, typeofString, Type.EmptyTypes);
    var il = m_ToString.GetILGenerator();
    il.Emit(OpCodes.Ldstr, templateType.ToString()+":"+staticType.ToString());
    il.Emit(OpCodes.Ret);
    tb.DefineMethodOverride(m_ToString, m_Object_ToString);
}
{
    PropertyBuilder p_Class = tb.DefineProperty("Class", PropertyAttributes.None, typeofType, Type.EmptyTypes);
    MethodBuilder m_get_Class = tb.DefineMethod("get_Class", MethodAttributes.Public | MethodAttributes.Virtual | MethodAttributes.Final, typeofType, Type.EmptyTypes);
    var il = m_get_Class.GetILGenerator();
    il.Emit(OpCodes.Ldtoken, staticType);
    il.Emit(OpCodes.Call, m_Type_GetTypeFromHandle);
    il.Emit(OpCodes.Ret);
    p_Class.SetGetMethod(m_get_Class);
    tb.DefineMethodOverride(m_get_Class, m_IPattern_get_Class);
}

You can argue that I can simply declare ILGenerator il; at the top of the method, but I also don't want to reuse the variable for different objects (kinda functional approach). In this case, the blocks make it easier to separate the jobs that are performed, both syntactically and visually. It also tells that after the block, I am done with il and nothing should access it.

An argument against this example is using methods. Maybe yes, but in this case, the code is not that long, and separating it into different methods would also need to pass along all the variables that are used in the code.

IS4
  • 215
  • 1
  • 7