17

I'm a long-time Java developer and finally, after majoring, I have time to study it properly in order to take the certification exam... One thing that has always bothered me is String being "final". I do understand it when I read about the security issues and related stuff... But, seriously, does anyone have a real example of that?

For instance, what would happen if String weren't final? It isn't in Ruby, and I haven't heard any complaints coming from the Ruby community... And I'm aware of StringUtils and related classes, which you have to either implement yourself or search the web for, just to get the behavior (4 lines of code) you want.

wleao

6 Answers

25

The main reason is speed: final classes can't be extended, which allows the JIT to do all kinds of optimizations when handling strings - there is never a need to check for overridden methods.

Another reason is thread safety: Immutables are always thread safe because a thread has to completely build them before they can be passed to someone else - and after building, they can't be changed anymore.

Also, the inventors of the Java runtime always wanted to err on the side of safety. Being able to extend String (something I often do in Groovy because it's so convenient) can open a whole can of worms if you don't know what you're doing.

Aaron Digulla
  • @JarrodRoberson: That is true. I believe the point is `String` itself is immutable and making it final ensures that anything that is a `String` is immutable, because no mutable subclass can be slipped in. – back2dos Nov 08 '11 at 22:25
  • It's an incorrect answer. Other than for verification required by the JVM spec, HotSpot only uses `final` as a quick check that a method is not overridden when compiling code. – Tom Hawtin - tackline Nov 27 '12 at 18:06
  • @TomHawtin-tackline: References? – Aaron Digulla Nov 28 '12 at 08:28
  • You seem to imply that immutability and being final are somehow related. "Another reason is thread safety: Immutables are always thread safe b...". The reason an object is immutable or not is not because it is or isn't final. – Tulains Córdova Jul 10 '14 at 18:32
  • @user61852 if it is not final, it cannot be guaranteed to be immutable, since you can create a mutable subclass. See [this answer](http://stackoverflow.com/a/2068812/2626593) on SO. – abl Jul 10 '14 at 21:05
  • In addition, the String class _is_ immutable irrespective of the `final` keyword on the class. The character array it contains _is_ final, making it immutable. Making the class itself final closes the loop as @abl mentions since a mutable subclass cannot be introduced. –  Jul 11 '14 at 21:15
  • @Snowman To be precise, the character array being final only makes the array reference immutable, not the array's contents. – Michał Kosmulski Jun 17 '15 at 16:09
  • @MichałKosmulski yep, I thought the JVM locked down that internal `value` field but I guess not. Probably requires a security manager to be in place, which will not typically be there in most realistic scenarios. –  Jun 17 '15 at 20:07
11

There's another reason why Java's String class needs to be final: it is important for security in some scenarios. If the String class weren't final, the following code would be vulnerable to subtle attacks by a malicious caller:

 public String makeSafeLink(String url, String text) {
     if (!url.startsWith("http:") && !url.startsWith("https:"))
         throw new SecurityException("only http/https URLs are allowed");
     return "<a href=\"" + escape(url) + "\">" + escape(text) + "</a>";
 }

The attack: a malicious caller could create a subclass of String, EvilString, where EvilString.startsWith() always returns true but where the value of EvilString is something evil (e.g., javascript:alert('xss')). Due to the subclassing, this would evade the security check. This attack is known as a time-of-check-to-time-of-use (TOCTTOU) vulnerability: between the time when the check is done (that the url starts with http/https) and the time when the value is used (to construct the html snippet), the effective value can change. If String wasn't final, then TOCTTOU risks would be pervasive.

So if String is not final, it becomes tricky to write secure code if you can't trust your caller. Of course, that's exactly the position that the Java libraries are in: they might be invoked by untrusted applets, so they don't dare trust their caller. This means that it would be unreasonably tricky to write secure library code, if String weren't final.
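Since java.lang.String can't actually be subclassed, the attack can only be sketched with a stand-in. The following is a minimal illustration, not the real thing: Text, EvilText and LinkDemo are hypothetical names that do not appear above, Text plays the role of a non-final String, and escaping is left out for brevity.

```java
// Hypothetical stand-in for a non-final String (java.lang.String itself cannot be extended).
class Text {
    private final String value;

    Text(String value) { this.value = value; }

    boolean startsWith(String prefix) { return value.startsWith(prefix); }

    String value() { return value; }
}

// The attacker's subclass: it lies about its prefix but keeps the evil payload.
class EvilText extends Text {
    EvilText(String value) { super(value); }

    @Override
    boolean startsWith(String prefix) { return true; }
}

final class LinkDemo {
    // Same shape as makeSafeLink, but taking the subclassable Text type.
    static String makeLink(Text url) {
        if (!url.startsWith("http:") && !url.startsWith("https:")) {
            throw new SecurityException("only http/https URLs are allowed");
        }
        // Escaping omitted: the point is only that the check was bypassed.
        return "<a href=\"" + url.value() + "\">link</a>";
    }
}
```

Passing new EvilText("javascript:alert('xss')") to LinkDemo.makeLink gets past the check, while a plain Text with the same value is rejected with a SecurityException.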

D.W.
  • Of course, it's still possible to pull off that TOCTTOU vuln by using reflection to modify the `char[]` inside the string, but it requires some luck in the timing. – Peter Taylor Nov 07 '11 at 08:27
  • @Peter Taylor, it's more complicated than that. You can only use reflection to access private variables if the SecurityManager gives you permission to do so. Applets and other untrusted code are not given this permission, and thus cannot use reflection to modify the private `char[]` inside the string; therefore, they cannot exploit the TOCTTOU vulnerability. – D.W. Nov 08 '11 at 08:10
  • I know, but that still leaves applications and signed applets - and how many people are actually scared off by the warning dialog for a self-signed applet? – Peter Taylor Nov 08 '11 at 08:52
  • @Peter, that's not the point. The point is that writing trusted Java libraries would be a lot more difficult, and there would be a lot more reported vulnerabilities in Java libraries, if `String` wasn't final. As long as some of the time this threat model is relevant, then it becomes important to defend against. As long as it is important for applets and other untrusted code, that is arguably enough to justify making `String` final, even if it isn't needed for trusted applications. (In any case, trusted applications can already bypass all security measures, regardless whether it's final.) – D.W. Nov 08 '11 at 21:47
  • "The attack: a malicious caller could create a subclass of String, EvilString, where EvilString.startsWith() always returns true..." -- not if startsWith() is final. – Nova Jun 11 '18 at 22:38
  • @Erik, Sure. But that's got its own issues. If every method and every field of `String` were either final or private, then those risks would go away. But that would greatly reduce the value of making `String` non-final; once you eliminate the possibility to override any of its methods, the ability to subclass it becomes less useful. Alternatively, if you allow subclassing and only make some of the methods final, then you open a range of possibilities for sneaky TOCTTOU vulnerabilities, and it becomes challenging to reason about all the ways that might go awry. – D.W. Jun 11 '18 at 22:49
4

If it weren't, multithreaded Java apps would be a mess (even worse than they actually are).

IMHO the primary advantage of final (immutable) strings is that they are inherently thread-safe: they require no synchronization (writing that code sometimes looks fairly trivial, but more often than not it is far from that). If they were mutable, guaranteeing the correctness of things like multithreaded windowing toolkits would be very hard, and those things are strictly necessary.
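As a rough sketch of what "no synchronization" means in practice, here is one String handed to several threads with no locking at all. SharedStringDemo and totalLength are names I made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SharedStringDemo {
    // Every task reads the same String instance; no locks are needed because
    // no thread can change it (trim() returns a new String instead).
    static int totalLength(String shared, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Integer> task = () -> shared.trim().length();
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                results.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Doing the same with a shared StringBuilder that the tasks also modify would need explicit synchronization to be safe.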

1

Another advantage of String being final is that it ensures predictable results, just as immutability does. String, though a class, is meant to be treated in a few scenarios like a value type (the compiler even supports this via String blah = "test"). Therefore, if you make a string with a certain value, you expect it to have certain behavior that is inherent to any string, similar to the expectations you'd have with integers.

Think about this: what if you subclassed java.lang.String and overrode the equals or hashCode methods? Suddenly two Strings "test" might no longer equal each other.

Imagine the chaos if I could do this:

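// Hypothetical: this does not compile today, precisely because String is final.
public class Password extends String {
    public Password(String text) {
        super(text);
    }

    // Claims to be equal to anything it is compared with.
    @Override
    public boolean equals(Object o) {
        return true;
    }
}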

some other class:

public boolean isRightPassword(String suppliedString) {
    return suppliedString != null && suppliedString.equals("secret");
}

public boolean login(String username, String password) {
    return isRightUsername(username) && isRightPassword(password);
}

public void loginTest() {
    boolean success = login("testuser", new Password("wrongpassword"));
    // success is true, despite the password being wrong
}
Brandon
1

Background

There are three places where final can show up in Java:

  • Making a class final prevents all subclassing of the class.
  • Making a method final prevents subclasses from overriding it.
  • Making a field final prevents it from being reassigned later.

Misconceptions

There are optimizations that happen around final methods and fields.

A final method makes it easier for HotSpot to optimize via inlining. However, HotSpot does this even if the method isn't final as it works on the assumption that it hasn't been overridden until proven otherwise. More about this on SO

A final variable can be aggressively optimized, and more about that can be read in the JLS section 17.5.3.

However, with that understanding, one should be aware that neither of these optimizations is about making a class final. There is no performance gain from making a class final.

The final aspect of a class has nothing to do with immutability either. One can have an immutable class (such as BigInteger) that is not final, or a class that is mutable and final (such as StringBuilder). The decision about whether a class should be final is a question of design.
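To make the BigInteger and StringBuilder point concrete, here is a small sketch. DriftingBigInteger and FinalVsImmutableDemo are hypothetical names of my own; the subclass compiles only because BigInteger is not declared final.

```java
import java.math.BigInteger;

// Hypothetical subclass that adds mutable state to an otherwise immutable class.
class DriftingBigInteger extends BigInteger {
    private int drift = 0;

    DriftingBigInteger(String val) { super(val); }

    void drift() { drift++; }

    @Override
    public int intValue() { return super.intValue() + drift; }
}

final class FinalVsImmutableDemo {
    // Code holding a plain BigInteger reference sees the value change under it.
    static int[] observe() {
        DriftingBigInteger n = new DriftingBigInteger("10");
        BigInteger asBase = n;
        int before = asBase.intValue();
        n.drift();
        int after = asBase.intValue();
        return new int[] { before, after };
    }

    // StringBuilder is final, yet the same instance is mutated in place.
    static String mutateFinalClass() {
        StringBuilder sb = new StringBuilder("abc");
        sb.append("def");
        return sb.toString();
    }
}
```

So final and immutable really are separate properties; String just happens to be both.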

Final Design

Strings are one of the most used data types. They are found as keys to maps, they store user names and passwords, they are what you read in from a keyboard or a field on a web page. Strings are everywhere.

Maps

The first thing to consider about what would happen if you could subclass String is that someone could construct a mutable String class that would otherwise appear to be a String. This would mess up Maps everywhere.

Consider this hypothetical code:

// MyString is a hypothetical mutable subclass of String.
Map<String, Integer> t = new TreeMap<>();
Map<String, Integer> h = new HashMap<>();
MyString one = new MyString("one");
MyString two = new MyString("two");

t.put(one, 1); h.put(one, 1);
t.put(two, 2); h.put(two, 2);

one.prepend("z");

This is a problem with using a mutable key in a Map in general, but the thing that I'm trying to get at here is that suddenly a number of things about the Map break. The Entry is no longer at the right spot in the map. In a HashMap, the hash value has (should have) changed and thus it's no longer at the right entry. In the TreeMap, the tree is now broken because one of the nodes is on the wrong side.

Since using a String for these keys is so common, this behavior should be prevented by making the String final.
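The HashMap half of this can be reproduced today with any mutable key type. The sketch below uses a hypothetical MutableKey class (my own stand-in for MyString, since String can't be extended) whose equals and hashCode follow its current text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mutable key standing in for a mutable String subclass.
final class MutableKey {
    private String text;

    MutableKey(String text) { this.text = text; }

    void prepend(String prefix) { text = prefix + text; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && ((MutableKey) o).text.equals(text);
    }

    @Override
    public int hashCode() { return text.hashCode(); }
}

final class MutableKeyDemo {
    // Returns { found before mutation, found after mutation, found via a fresh "one" key }.
    static boolean[] run() {
        Map<MutableKey, Integer> h = new HashMap<>();
        MutableKey one = new MutableKey("one");
        h.put(one, 1);

        boolean foundBefore = h.containsKey(one);
        one.prepend("z"); // the key is now "zone", but it was stored under the hash of "one"
        boolean foundAfter = h.containsKey(one);
        boolean foundByOldValue = h.containsKey(new MutableKey("one"));

        return new boolean[] { foundBefore, foundAfter, foundByOldValue };
    }
}
```

After the prepend, the entry is still in the map (its size is still 1), but it can be reached neither through the mutated key nor through a fresh key with the old text.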

You may be interested in reading Why is String immutable in Java? for more about the immutable nature of Strings.

Nefarious strings

There are a number of nefarious options for Strings. Consider if I made a String that always returned true when equals was called... and passed that into a password check? Or made it so that assignments to MyString would send a copy of the String to some email address?

These are very real possibilities when you have the ability to subclass String.

Java.lang String optimizations

Earlier I mentioned that final doesn't make String faster. However, the String class (and other classes in java.lang) makes frequent use of package-level protection of fields and methods to allow other java.lang classes to tinker with the internals rather than going through the public API for String all the time. Examples include getChars without range checking, the lastIndexOf that is used by StringBuffer, and the constructor that shares the underlying array (note that that's a Java 6 thing that was changed because of memory issues).

If someone made a subclass of String, it wouldn't be able to share those optimizations (unless it was part of java.lang too, but that's a sealed package).

It's harder to design something for extension

Designing something to be extendable is hard. It means that you've got to expose parts of your internals for something else to be able to modify.

An extendable String could not have had its memory leak fixed. Those parts of it would need to have been exposed to subclasses and changing that code would then mean the subclasses would break.

Java prides itself on backwards compatibility, and by opening up core classes to extension, one loses some of that ability to fix things while still maintaining compatibility with third-party subclasses.

Checkstyle has a rule that it enforces (that really frustrates me when writing internal code) called "DesignForExtension" which enforces that every nonprivate, nonstatic method of a class that can be subclassed is either:

  • Abstract
  • Final
  • Empty implementation

The rationale for which is:

This API design style protects superclasses against being broken by subclasses. The downside is that subclasses are limited in their flexibility, in particular they cannot prevent execution of code in the superclass, but that also means that subclasses cannot corrupt the state of the superclass by forgetting to call the super method.

Allowing implementation classes to be extended means that the subclasses can possibly corrupt the state of the class it is based off of and make it so that various guarantees that the superclass gives are invalid. For something as complex as String, it is almost a certainty that changing part of it will break something.
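A small sketch of that design style, assuming a made-up Account class (none of these names come from Checkstyle or from String): the methods that guard the state are final, and the only thing a subclass can override is a hook with an empty body.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical superclass written in the "design for extension" style.
class Account {
    private long balance;

    // final: a subclass cannot skip the validation or the balance update.
    public final void deposit(long amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        balance += amount;
        afterDeposit(amount);
    }

    public final long balance() { return balance; }

    // Empty hook: overriding it cannot corrupt the superclass state,
    // and forgetting to call super loses nothing.
    protected void afterDeposit(long amount) { }
}

// A subclass can only add behavior at the hook.
class AuditedAccount extends Account {
    final List<Long> log = new ArrayList<>();

    @Override
    protected void afterDeposit(long amount) { log.add(amount); }
}
```

The flexibility cost mentioned in the rationale is visible here: AuditedAccount has no way to veto or change a deposit, it can only observe it.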

Developer hubris

It's part of being a developer. Consider the likelihood that each developer will create their own String subclass with their own collection of utils in it. But now these subclasses cannot be freely assigned back to each other.

WleaoString foo = new WleaoString("foo");
MichaelTString bar = foo; // This doesn't work.

This way leads to madness. Casting to String all over the place and checking to see if the String is an instance of your String class or if not, making a new String based on that one and... just, no. Don't.

I'm sure you can write a good String class... but leave writing multiple string implementations to those crazy people who write C++ and have to deal with std::string and char* and something from boost and SString, and all the rest.

Java String magic

There are several magical things that Java does with Strings. These make it easier for a programmer to deal with but introduce some inconsistencies in the language. Allowing subclasses on String would take some very significant thought about how to deal with these magical things:

  • String Literals (JLS 3.10.5)

    Having code that allows one to do:

    String foo = "foo";
    

    This is not to be confused with boxing of the numeric types like Integer. You can't do 1.toString(), but you can do "foo".concat(bar).

  • The + operator (JLS 15.18.1)

    No other reference type in Java allows for an operator to be used on it. The String is special. The string concatenation operator also works at the compiler level so that "foo" + "bar" becomes "foobar" when it is compiled rather than at runtime.

  • String Conversion (JLS 5.1.11)

    All objects can be converted into Strings just by using them in a String context.

  • String Interning (JavaDoc)

    The String class has access to the pool of Strings that allows it to have canonical representations of the object, which is populated at compile time with the String literals.

Allowing a subclass of String would mean that these bits with the String that make it easier to program would become very difficult or impossible to do when other String types are possible.
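The four items above can be observed directly. This is only a sketch with names of my own (StringMagicDemo and its methods); the == comparisons are deliberate, since the point is object identity rather than equality.

```java
final class StringMagicDemo {
    // String literals: two identical literals refer to the same pooled instance.
    static boolean literalsShareOneInstance() {
        String a = "foo";
        String b = "foo";
        return a == b;
    }

    // The + operator on constants is folded by the compiler into one literal.
    static boolean constantConcatIsFolded() {
        return ("foo" + "bar") == "foobar";
    }

    // String conversion: any object can be used in a String context.
    static String stringConversion() {
        Object o = Integer.valueOf(42);
        return "answer: " + o;
    }

    // Interning: a freshly built String is a distinct object until intern()
    // maps it back to the pooled, canonical instance.
    static boolean internReturnsPooledInstance() {
        String fresh = new String("foo");
        return fresh != "foo" && fresh.intern() == "foo";
    }
}
```

Each boolean method returns true, and stringConversion() yields "answer: 42". It is hard to see what any of these should do if "foo" could be an instance of some arbitrary String subclass.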

0

If String were not final, every programmer's tool chest would contain its own "String with just a few nice helper methods". Each of these would be incompatible with all other tool chests.

I used to consider it a silly restriction. Today I am convinced this was a very sound decision. It keeps Strings being Strings.

  • `final` in Java is not the same thing as `final` in C#. In this context, it is the same thing as C#'s `readonly`. See http://stackoverflow.com/questions/1327544/what-is-the-equivalent-of-javas-final-in-c – Arseni Mourzenko Jul 31 '11 at 21:16
  • @MainMa: Actually, in this context it seems that it's the same as C#'s sealed, as in `public final class String`. – configurator Sep 02 '11 at 16:02