140

The Java team has done a ton of great work removing barriers to functional programming in Java 8. In particular, the changes to the java.util Collections do a great job of chaining transformations into very fast streamed operations. Considering how good a job they have done adding first class functions and functional methods on collections, why have they completely failed to provide immutable collections or even immutable collection interfaces?

Without changing any existing code, the Java team could at any time add immutable interfaces that are the same as the mutable ones, minus the setter methods and make the existing interfaces extend from them, like this:

                  ImmutableIterable
     ____________/       |
    /                    |
Iterable        ImmutableCollection
   |    _______/    /          \   \___________
   |   /           /            \              \
 Collection  ImmutableList  ImmutableSet  ImmutableMap  ...
    \  \  \_________|______________|__________   |
     \  \___________|____________  |          \  |
      \___________  |            \ |           \ |
                  List            Set           Map ...

Sure, operations like List.add() and Map.put() currently return a boolean or previous value for the given key to indicate whether the operation succeeded or failed. Immutable collections would have to treat such methods as factories and return a new collection containing the added element - which is incompatible with the current signature. But that could be worked-around by using a different method name like ImmutableList.append() or .addAt() and ImmutableMap.putEntry(). The resulting verbosity would be more than outweighed by the benefits of working with immutable collections, and the type system would prevent errors of calling the wrong method. Over time, the old methods could be deprecated.

Wins of immutable collections:

  • Simplicity — reasoning about code is simpler when the underlying data does not change.
  • Documentation — if a method takes an immutable collection interface, you know it isn't going to modify that collection. If a method returns an immutable collection, you know you can't modify it.
  • Concurrency — immutable collections can be shared safely across threads.

As someone who has tasted languages which assume immutability, it is very hard to go back to the Wild West of rampant mutation. Clojure's collections (sequence abstraction) already have everything that Java 8 collections provide, plus immutability (though maybe using extra memory and time due to synchronized linked-lists instead of streams). Scala has both mutable and immutable collections with a full set of operations, and though those operations are eager, calling .iterator gives a lazy view (and there are other ways of lazily evaluating them). I don't see how Java can continue to compete without immutable collections.

Can someone point me to the history or discussion about this? Surely it's public somewhere.

GlenPeterson
  • 14,890
  • 6
  • 47
  • 75
  • 9
    Related to this - Ayende blogged recently about collections and immutable collections in C#, with benchmarks. http://ayende.com/blog/tags/performance - tl;dr - immutability is **slow**. – Oded Dec 18 '13 at 15:05
  • 24
    with your hierarchy I can give you a ImmutableList and then change it on you when you don't expect it which can break a lot of things, as is you only have `const` collections – ratchet freak Dec 18 '13 at 15:16
  • 1
    also related: http://programmers.stackexchange.com/q/77897/25768 – ratchet freak Dec 18 '13 at 15:25
  • 19
    @Oded Immutability is slow, but so is locking. So is maintaining a history. Simplicity/Correctness is worth speed in many situations. With small collections, speed is not an issue. Ayende's analysis is based on the assumption that you don't need history, locking, or simplicity and that you are working with a large data set. Sometimes that's true, but it's not a one-is-always-better thing. There are trade-offs. – GlenPeterson Dec 18 '13 at 15:36
  • 1
    @ratchetfreak True, but we have the same problem with the existing interfaces. Isn't it better to know that only the caller can modify the parameter (not the function)? Having immutable interfaces allows you to document, and the compiler to verify, that some of your functions have no side effects. Unless a function's arguments are extremely restricted, you can usually create some kind of unsafe input. Immutable collections won't prevent all mischief, but they prevent some kinds of mischief - and that's a win! Thx for the link - interesting, but Guava, Clojure, and Scala now disprove it! – GlenPeterson Dec 18 '13 at 15:49
  • Of course there are trade-offs - for his use case, immutability causes unacceptable performance degradation. Just saying that the discussion there is related, not that it is the holy grail of all truth. – Oded Dec 18 '13 at 15:49
  • 3
    @Oded technically it isn't immutability that slows things down but *copy-on-write*. But we can expect these classes to get used like this (see the proliferation of string concatenation in loops). – ratchet freak Dec 18 '13 at 15:52
  • 5
    @GlenPeterson that's what defensive copies and `Collections.unmodifiable*()` are for. but don't treat these as immutable when they are not – ratchet freak Dec 18 '13 at 15:56
  • 1
    Don't know exactly what you mean by Scala's "not yet much streaming/lazy-evaluation." You can convert any collection to a [Stream](http://www.scala-lang.org/api/current/#scala.collection.immutable.Stream), or get an iterator which works much the same way, but without the memoization. [Source](http://www.scala-lang.org/api/current/#scala.io.Source)s also evaluate lazily. – Karl Bielefeldt Dec 18 '13 at 16:32
  • @ratchetfreak True. You have to initialize them within a lexical closure for them to be truly immutable: http://glenpeterson.blogspot.com/2013/07/immutable-java-with-lists-and-other.html – GlenPeterson Dec 18 '13 at 16:33
  • @KarlBielefeldt - Sweet!!! I've been wondering how to do that! I'll have to look into that in more depth soon... – GlenPeterson Dec 18 '13 at 16:36
  • 15
    Eh? If your function takes an `ImmutableList` in that diagram, people can pass in a mutable `List`? No, that's a **very** bad violation of LSP. – Telastyn Dec 18 '13 at 16:59
  • 1
    "The Java team has done a ton of great work removing barriers to functional programming in Java 8.": As a side comment, I would like to ask why one would want to do functional programming in Java. First, Java was not designed for it. Second, there exist viable alternatives already. – Giorgio Dec 19 '13 at 16:17
  • 1
    @Giorgio: 2 words: Legacy systems. When you have 10's or 100's of thousands of lines of Java code, it becomes very difficult to port it to the coolest new language that comes along. 3 years later when an even cooler newer language comes along, are you supposed to port it all over again? It's just not feasible. Functional programming brings great things to Java, even if Java doesn't bring such great things to it. – GlenPeterson Dec 19 '13 at 19:37
  • 1
    @GlenPeterson: You can write the new code in Clojure or Scala and integrate it with legacy Java code. I have tried this out (with Scala) and it worked pretty smoothly. So you do not need to port the old code: you can just call it from your new code. – Giorgio Dec 19 '13 at 22:36
  • The only sensible answer to this question is that Project Management decided not to do it. / Interfaces are a poor match for value types, so you'd want a distinct package of types, and probably language support for (non-trivial) value types. / Compete against what? Ruby? – Tom Hawtin - tackline Dec 20 '13 at 01:53
  • @Tom And they are considering [adding value types](http://cr.openjdk.java.net/~jrose/values/values-0.html) in a future version. – David Conrad Feb 04 '15 at 22:14
  • Turns out there is a list of alternative immutable collection implementations for Java here: http://stackoverflow.com/questions/8575723/whats-a-good-persistent-collections-framework-for-use-in-java/32923847#32923847 – GlenPeterson Oct 03 '15 at 15:03
  • Kotlin does this suggestion, it provides compile-time difference between readonly and mutable interfaces to java collections and can enforce at compile time the contract (but not enforce immutability obviously) – Jayson Minard Dec 27 '15 at 01:27

7 Answers7

119

Because immutable collections absolutely require sharing to be usable. Otherwise, every single operation drops a whole other list into the heap somewhere. Languages that are entirely immutable, like Haskell, generate astonishing amounts of garbage without aggressive optimizations and sharing. Having collection that's only usable with <50 elements is not worth putting in the standard library.

Further more, immutable collections often have fundamentally different implementations than their mutable counterparts. Consider for example ArrayList, an efficient immutable ArrayList wouldn't be an array at all! It should be implemented with a balanced tree with a large branching factor, Clojure uses 32 IIRC. Making mutable collections be "immutable" by just adding a functional update is a performance bug just as much as a memory leak is.

Furthermore, sharing isn't viable in Java. Java provides too many unrestricted hooks to mutability and reference equality to make sharing "just an optimization". It'd probably irk you a bit if you could modify an element in a list, and realize you just modified an element in the other 20 versions of that list you had.

This also rules out huge classes of very vital optimizations for efficient immutability, sharing, stream fusion, you name it, mutability breaks it. (That'd make a good slogan for FP evangelists)

daniel gratzer
  • 11,678
  • 3
  • 46
  • 51
  • 22
    My example talked about immutable *interfaces*. Java could provide a full suite of both mutable and immutable *implementations* of those interfaces that would make the necessary trade-offs. It's up to the programmer to choose mutable or immutable as appropriate. Programmers have to know when to use a List vs. Set now. You generally don't need the mutable version until you have a performance issue, and then it may only be necessary as a builder. In any case, having the immutable interface would be a win on its own. – GlenPeterson Dec 18 '13 at 17:21
  • 4
    I read your answer again and I think you are saying that Java has a fundamental assumption of mutability (e.g. java beans) and that the collections are just the tip of the iceberg and carving off that tip won't solve the underlying problem. A valid point. I might accept this answer and speed up my adoption of Scala! :-) – GlenPeterson Dec 18 '13 at 17:28
  • 2
    Good points. This is why frege (a pure functional language in the spirit of Haskell that compiles to Java) has its own lists, maps, etc. The existing collections in Java are unusable in pure functional code due to the reasons you point out. – Ingo Dec 19 '13 at 10:00
  • 8
    I'm not sure immutable collections require the ability to share common parts to be useful. The most common immutable type in Java, an immutable collection of characters, used to allow sharing but doesn't anymore. The key thing that makes it useful is the ability to quickly copy data from a `String` into a `StringBuffer`, manipulate it, and then copy the data into a new immutable `String`. Using such a pattern with sets and lists might be as good as using immutable types that are designed to facilitate the production of slightly-changed instances, but could still be better... – supercat Jan 17 '14 at 19:30
  • 1
    ...than having to constantly make defensive copies of mutable types all the time. – supercat Jan 17 '14 at 19:31
  • 4
    It is entirely possible to make an immutable collection in Java using sharing. The items stored in the collection are *references* and their referents may be mutated - so what? Such behavior already breaks existing collections such as HashMap and TreeSet, yet those are implemented in Java. And if multiple collections contain references to the same object, it's entirely expected that modifying the object will cause a change visible when it is viewed from all collections. – Reinstate Monica Aug 13 '15 at 21:51
  • 4
    jozefg, it is entirely possible to implement efficient immutable collections on JVM with structural sharing. Scala and Clojure have them as part of their standard library, both implementations are based on Phil Bagwell's HAMT (Hash Array Mapped Trie). Your statement regarding Clojure implementing immutable data structures with BALANCED trees is completely wrong. – sesm Sep 06 '15 at 20:54
  • This answer doesn't make any sense to me as a Scala developer. Immutable collections are inherently sharable. Make an immutable collection, read it across threads. Scala does it but has to implement its own collections library to achieve this. To paraphrase a meme, "why not both?" It's up to the developer to choose, not the collections library. You choose an immutable implementation until you it is optimal to use a mutable one. – JasonG Dec 14 '16 at 01:06
  • 1
    Performance is important. YES! But nobody underlines the fact that an Immutable type help to declare our intent to NOT modify a value. For me immutable means "read only". For example if a method declare to take an immutable collection as argument, then as a caller I can make the assumption it will not be mutated (special care of collection of mutable objects of course...) – mcoolive Feb 17 '17 at 13:20
  • If you were right, then Google should trash half of Guava. They provide great many immutable collections and none of them is sharable/persistent (both names are pretty stupid terms for "efficiently allowing to make a modified copy"). – maaartinus Aug 23 '17 at 22:40
  • It doesn't irk me a bit to work with my own immutable set of data, which may be a modified variation of the set of data used down the call stack or on another thread. Actually, it gives me a very warm and cosy and safe feeling. If the function down the call stack wants to see my modification, I`ll need to return it. This provides for clarity in the call contract, and gives the caller control over the data that is kept and discarded. It is not possible for some function far up the call stack to decide for me what variation of the set is relevant to me... – dsmith Dec 12 '17 at 21:56
  • ... This is why we say that this style of coding is 'easy to reason about'. – dsmith Dec 12 '17 at 21:56
  • You can get very good performance characteristics out of functional immutable collections without aggressive optimisations. In the SO link below I compare 2 array modules in Erlang that have good constant factor performance on append, lookup and updates. The two modules are written in Erlang without any special optimisations. https://stackoverflow.com/questions/12395074/is-there-a-module-that-implements-an-efficient-array-type-in-erlang – dsmith Dec 12 '17 at 22:23
  • @JasonG, the collection hierarchy Scala chose was a mistake though: https://youtu.be/TS1lpKBMkgg?t=14m28s – aioobe Mar 13 '18 at 08:47
  • I'm not sure that Paul Phillip's opinion is completely unbiased :) – JasonG Mar 13 '18 at 19:02
86

A mutable collection is not a subtype of an immutable collection. Instead, mutable and immutable collections are sibling descendants of readable collections. Unfortunately, the concepts of "readable", "read-only", and "immutable" seem to get blurred together, even though they mean three different things.

  • A readable collection base class or interface type promises that one may read items, and does not provide any direct means of modifying the collection, but does not guarantee that code receiving the reference cannot cast or manipulate it in such a way as to permit modification.

  • A read-only collection interface doesn't include any new members, but should only be implemented by a class which promises that there is no way to manipulate a reference to it in such a way as to mutate the collection nor receive a reference to something that could do so. It does not, however, promise that the collection won't be modified by something else which has a reference to the internals. Note that a read-only collection interface may not be able to prevent implementation by mutable classes, but can specify that any any implementation, or class derived from an implementation, which allows mutation shall be considered an "illegitimate" implementation or derivative of an implementation.

  • An immutable collection is one which will always hold the same data as long as any reference to it exists. Any implementation of an immutable interface which does not always return the same data in response to a particular request is broken.

It is sometimes useful to have strongly-associated mutable and immutable collection types which both implement or derive from the same "readable" type, and to have the readable type include AsImmutable, AsMutable, and AsNewMutable methods. Such a design can allow code which wants to persist the data in a collection to call AsImmutable; that method will make a defensive copy if the collection is mutable, but skip the copy if it's already immutable.

supercat
  • 8,335
  • 22
  • 28
  • 2
    Great answer. Immutable collections can give you a pretty strong guarantee related to thread-safeness and how you can reason about them as time goes by. A Readable/Read-only collection does not. In fact, to honor the liskov substition principle, Read-Only and Immutable should probably be abstract base type with final method and private members to ensure that no derived class can destroy the garantee given by the type. Or they should be fully concrete type that either wrap a collection (Read-Only), or always take a defensive copy (Immutable). This is how guava ImmutableList does it. – Laurent Bourgault-Roy Dec 23 '13 at 15:12
  • 1
    @LaurentBourgault-Roy: There are advantages to both sealed and inheritable immutable types. If one doesn't want to allow an illegitimate derived class to break one's invariants, sealed types can offer protection against that while inheritable classes offer none. On the other hand, it may be possible for code that knows something about the data it holds to store it *much* more compactly than would a type that knows nothing about it. Consider, for example, a ReadableIndexedIntSequence type which encapsulates a sequence of `int`, with methods `getLength()` and `getItemAt(int)`. – supercat Dec 23 '13 at 16:34
  • Indeed. No silver bullet here. And even a sealed class can be defeated using reflection. – Laurent Bourgault-Roy Dec 23 '13 at 16:39
  • 1
    @LaurentBourgault-Roy: Given a `ReadableIndexedIntSequence`, one could produce an instance of an array-backed immutable type by copying all the items into an array, but suppose that a particular implementation simply returned 16777216 for length and `((long)index*index)>>24` for each item. That would be a legitimate immutable sequence of integers, but copying it to an array would be a huge waste of time and memory. – supercat Dec 23 '13 at 16:39
  • 1
    I fully agree. My solution give you *correctness* (up to a point), but to get performance with large dataset, you must have persistent structure and design for immutability from the beginning. For small collection though you can get away with taking an immutable copy from time to time. I remember that Scala did some analysis of various program and found that something like 90% of the lists instanciated were 10 or less items long. – Laurent Bourgault-Roy Dec 23 '13 at 16:43
  • 1
    @LaurentBourgault-Roy: The fundamental question is whether one trusts people not to produce broken implementations or derived classes. If one does, and if the interfaces/base classes provide asMutable/asImmutable methods, it may be possible to improve performance by many orders of magnitude [e.g. compare the cost of calling `asImmutable` on an instance of the above-defined sequence versus the cost of constructing an immutable array-backed copy]. I would posit that having interfaces defined for such purposes is probably better than trying to use ad-hoc approaches; IMHO, the biggest reason... – supercat Dec 23 '13 at 16:51
  • let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/12124/discussion-between-supercat-and-laurent-bourgault-roy) – supercat Dec 23 '13 at 16:51
  • 1
    @LaurentBourgault-Roy: Since writing the above, I've come to realize a rather annoying "feature" of Java's memory model which distinguishes between immutable and effectively-immutable fields: if an immutable type uses a backing store of a mutable type to hold lazily-generated content, making it thread-safe will require extra work not only on the part of code generating the content (not a problem, since such code would only execute once, and the the extra work will likely be cheap relative to the cost of generating the content) but will also require extra work whenever the content is *read*. – supercat Dec 06 '14 at 16:41
  • 1
    @LaurentBourgault-Roy: Since the cases where lazy content generation is most appropriate are those where many things will either be read zero times or thousands, it's often good to focus on minimizing the cost of reading already-generated data. Unfortunately, the way the JLS is written, such an objective would be inconsistent with a desire to make all immutable classes inherently thread-safe. From a hardware perspective, on every platform I've read of or can imagine, the creator of the data would have to do extra work to ensure safety but the reader would not. Unfortunately, Java provides... – supercat Dec 06 '14 at 17:01
  • 1
    @LaurentBourgault-Roy: No mechanism via which the method creating the object can force the generated machine code to do the necessary work there, without imposing extra obligations on the part of any code reading it. – supercat Dec 06 '14 at 17:02
  • @LaurentBourgault-Roy In general, you have to assume people aren't doing evil things using reflection. If they are, you're already screwed, since you can't even trust that numbers will behave sanely: http://codegolf.stackexchange.com/a/28818/7786 . Either trust that they won't, or change languages. – Patrick M Aug 12 '15 at 06:39
  • 1
    This should be the accepted answer, as it answers explicitly why the provided hierarchy cannot work based on basic object-oriented principles. – BeeOnRope Feb 20 '16 at 03:17
  • @BeeOnRope It only explains, why the proposed hierarchy is wrong. But with a slight modification reflecting this answer, it could work. I'm still missing why no attempt were done, especially when there are so many open source immutable collections in widespread use (mainly Guava). – maaartinus Aug 23 '17 at 22:55
  • 1
    @maaartinus - I agree. There are upsides and downsides and early Java erred on the side of less complexity in general and also perhaps the usefulness of immutable collections wasn't as apparent 20 years ago. – BeeOnRope Aug 23 '17 at 22:57
  • 1
    @maaartinus: A more general problem with Java is that it fails to distinguish between references that exist to encapsulate state stored in private mutable objects, those that exist to encapsulate state stored in objects that will never be mutated (whether or not they are of mutable type), references that "own" mutable objects to which other entities also have access, and references that identify mutable objects owned by other entities. A language that distinguished those kinds of references, and has separate methods for equality and equivalence (the latter being a stronger and... – supercat Aug 23 '17 at 23:02
  • ...permanent condition), could automatically implement methods for cloning and for equality/equivalence testing in most cases. Treating all references the same way may simplify the language, but makes it necessary to manually write a lot of code that would otherwise not be necessary. – supercat Aug 23 '17 at 23:03
  • Very good answer. It might be worth adding that a mutable collection derived from an immutable collection would be a violation of the Liskov Substitution Principle's history constraint. – bdsl Dec 21 '22 at 15:10
  • @bdsl: I'm not familiar enough with the terminology to know the phrase "history constraint". Are you suggesting Liskov's terminology would be clearer than "Any implementation of an immutable interface which does not always return the same data in response to a particular request is broken"? I suppose there are some corner cases where the meanings could be slightly different if an interface has a means of reporting "resource unavailable"; the fact that one attempt to read data from an immutable interface succeeds and another reports "resource unavailable" would not necessarily... – supercat Dec 21 '22 at 16:00
  • ...imply that the implementation was broken, if the possibility of the resource being unavailable had been accounted for in the interface. If code asks for item #5 of an item held on a CD-ROM, and then asks for that item at a later time while the CD is being taken out for periodic cleaning, an implementation that returns item #5 would be correct, but so would one that reports "resource temporarily unavailable", even though the latter response doesn't match the first. – supercat Dec 21 '22 at 16:03
  • @supercat I think your terminology is very clear. I just wanted to let people know that they can find many other people agreeing with you, and going into more detail on the general principles involved, by searching for "liskov history constraint". – bdsl Dec 21 '22 at 16:04
  • The interface that reads from a CD on demand shouldn't be thought of as like an immutable data structure held inside the programs memory. It's an I/O mechanism, not a collection - unless you want to say the computer should be treated as out-of-order and not to be used any time the CD is taken out. – bdsl Dec 21 '22 at 16:07
  • @bdsl: There may also be situations where an "add-only" collection interface might be usable in much the same was as an immutable interface, if it specifies that any item which is ever observed to exist will never change, but changes to the original may or may not be reflected in snapshots thereof. This should be a parent interface of one that guarantees such extensions won't happen, since some kinds of wrappers would uphold a stronger immutability guarantee if the wrapped objects do, but would fail to uphold a weaker guarantee if wrapped objects only uphold that. – supercat Dec 21 '22 at 16:18
  • @bdsl: Treating external resources as immutable collections is often useful, and in many cases it may be tolerable to design things so that a resource that goes off line will trigger a cascade of exceptions that inelegantly terminates a program, but in some cases, especially with interactive programs, it may be more useful to have a program announce "Resource unavailable" to the user and then skip any operations which relied upon its contents, but allow actions that don't depend upon the contents to proceed anyway. – supercat Dec 21 '22 at 16:25
17

The Java Collections Framework does provide the ability to create a read-only version of a collection by way of six static methods in the java.util.Collections class:

As someone has pointed out in the comments to the original question, the collections returned may not be considered immutable because even though the collections cannot be modified (no members can be added or removed from such a collection), the actual objects referenced by the collection can be modified if their object type allows it.

However, this problem would remain regardless of whether code returns a single object, or an unmodifiable collection of objects. If the type allows its objects to be mutated, then that decision was made in the design of the type and I don't see how a change to the JCF could alter that. If immutability is important, then the members of a collection should be of an immutable type.

Arkanon
  • 334
  • 2
  • 7
  • 5
    The design of the unmodifiable collections would have been greatly enhanced if the wrappers included an indication of whether the thing being wrapped was already immutable, and there were `immutableList` etc. factory methods which would return a read-only wrapper around a copy of a passed-in list *unless the passed-in list was already immutable*. It would be easy to create user-defined types like that but for one problem: there would be no way for `joesCollections.immutableList` method to recognize that it shouldn't need to copy the object returned by `fredsCollections.immutableList`. – supercat Jul 14 '14 at 16:27
11

This is a very good question. I enjoy entertaining the idea that of all the code written in java and running on millions of computers all over the world, every day, around the clock, about half the total clock cycles must be wasted doing nothing but making safety copies of collections returned by functions, which are garbage-collected milliseconds later.

A percentage of java programmers are aware of the existence of the unmodifiableCollection() family of methods of the Collections class, but most don't bother with it, and I can't blame them: an interface which pretends to be read-write but will slap you in the face with an UnsupportedOperationException if you make the mistake of invoking any of its 'write' methods is quite an evil thing to have!

Now, an interface like Collection which would be missing the add(), remove() and clear() methods would not be an "ImmutableCollection" interface; it would be an "UnmodifiableCollection" interface. As a matter of fact, there could never be an "ImmutableCollection" interface, because immutability is a nature of an implementation, not a characteristic of an interface. I know, that's not very clear; let me explain.

Suppose someone hands you such a read-only collection interface; is it safe to pass it to another thread? If you knew for sure that it represents a truly immutable collection, then the answer would be "yes"; unfortunately, since it is an interface, you do not know how it is implemented, so the answer has to be a no: for all you know, it may be an unmodifiable (to you) view of a collection which is in fact mutable, (like what you get with Collections.unmodifiableCollection(),) so attempting to read from it while another thread is modifying it would result in reading corrupt data.

So, what you have essentially described is a set of not "Immutable", but "Unmodifiable" collection interfaces. It is important to understand that "Unmodifiable" simply means that whoever has a reference to such an interface is prevented from modifying the underlying collection not because the underlying collection is necessarily immutable, but simply because the interface lacks any modification methods.

Now, I happened to be present at Devoxx Belgium 2017 where several people from Oracle gave talks, and one of them was Brian Goetz. I went to him after his talk and asked him precisely this: why are there no unmodifiable interfaces in Java? I must say that the answer he gave me was not particularly convincing: he said that they are not doing it due to a security concern: if you have a mutable collection class implementing a mutable collection interface which extends an unmodifiable collection interface, and you hand out the unmodifiable interface, whoever holds that interface can up-cast it to the mutable interface and modify the original collection.

This is not convincing because it describes a very simplistic scenario which can very easily be taken care of with some simple extra measures: If you have a security concern you can instantiate and hand-out an unmodifiable collection adaptor which wraps (delegates to) your mutable collection without implementing the mutable collection interface. As a matter of fact, you are already doing this structurally when you use the unmodifiableCollecton() family of methods, the only problem is that the interface you are handing out is secretly unmodifiable, meaning that it obtusely exposes mutation methods, but invoke any of them and you die. Thus, unmodifiable interfaces could eliminate this existing obtuseness while keeping everything else structurally the same if need be.

But of course, things would not have to be kept the same. The next step would be to add immutable collection classes, which implement the unmodifiable collection interfaces. When there is a need to guarantee immutability, (such as the case is, for example, when passing a collection from one thread to another,) such collections would be passed around as immutable classes, not as unmodifiable interfaces, so that the receiver knows for sure that what they have in their hands is immutable.

So, in order to have a complete set of collections in java, (or any other declarative imperative language,) we would need the following:

  1. A set of unmodifiable collection interfaces.

  2. A set of mutable collection interfaces, extending the unmodifiable ones.

  3. A set of mutable collection classes implementing the mutable interfaces, and by extension also the unmodifiable interfaces.

  4. A set of immutable collection classes, implementing the unmodifiable interfaces, but mostly passed around as classes, so as to guarantee immutability.

I have implemented all of the above for fun, and I am using them in projects, and they work like a charm.

The reason why they are not part of the java runtime is probably because it was thought that this would be too much / too complex / too difficult to understand.

Personally, I think that what I described above is not even enough; one more thing that appears to be needed is a set of mutable interfaces & classes for structural immutability. (Which may simply be called "Rigid" because the prefix "StructurallyImmutable" is too damn long.)

Mike Nakis
  • 32,003
  • 7
  • 76
  • 111
  • Good points. Two details: 1. Immutable collections require certain method signatures, specifically (using a List as an example): `List add(T t)` - all "mutator" methods must return a new collection that reflects the change. 2. For better or worse, Interfaces often represent a *contract* in addition to a signature. Serializable is one such interface. Similarly, Comparable requires that you correctly implement your `compareTo()` method to work correctly and ideally be compatible with `equals()` and `hashCode()`. – GlenPeterson Jun 04 '15 at 16:48
  • Oh, I did not even have mutate-by-copy immutability in mind. What I wrote above refers to plain simple immutable collections that *really* have no methods like `add()`. But I suppose that if mutator methods were to be added to the immutable classes, then they would need to return *also* immutable classes. So, if there is a problem lurking there, I do not see it. – Mike Nakis Jun 04 '15 at 17:16
  • Is your implementation publicly available? I should have asked this months ago. Anyway, mine is: https://github.com/GlenKPeterson/UncleJim – GlenPeterson Nov 10 '15 at 14:11
  • @GlenPeterson awesome! C-:= no, mine is not publicly available, though I should do that. I just need to find the time. Other concerns seem to always take precedence. – Mike Nakis Nov 10 '15 at 19:18
  • 4
    `Suppose someone hands you such a read-only collection interface; is it safe to pass it to another thread?` Suppose someone passes you an instance of a mutable collection interface. Is it safe to invoke any method on it? You don't know that that the implementation doesn't loop forever, throw an exception, or completely disregards the interface's contract. Why have a double standard specifically for immutable collections? – Doval Jun 22 '16 at 14:59
  • Even in a read-only context, mutable and unmodifiable collections are not thread safe, but immutable collections are thread safe. The point I am making is that if you have an immutable collection class you can pass it between threads, while if you have an unmodifiable collection interface you cannot pass it between threads. – Mike Nakis Jun 22 '16 at 15:28
  • 1
    IMHO your reasoning against mutable interfaces is wrong. You may write a mutable implementation of immutable interfaces, and then it breaks. Sure. But that's your fault as you are violating the contract. Just stop doing that. It's no different from breaking a `SortedSet` by subclassing the set with a non-conforming implementation. Or by passing an inconsistent `Comparable`. Nearly anything can be broken if you want. I guess, that's what @Doval meant by "double standards". – maaartinus Aug 23 '17 at 23:12
  • 1
    @maaartinus the thing is, unmodifiable (read-only) interfaces have a usefulness even outside the context of immutability. A class may legitimately expose an umodifiable interface of a mutable collection that it contains, and mutate the collection by other means. So, the unmodifiable collection interface cannot possibly impose a contract that says that the backing collection must be immutable, because there are legitimate usages where the backing collection is not immutable. – Mike Nakis Aug 24 '17 at 06:34
  • Also, there is the issue of guarantees: an immutable collection class is one of a kind, you write it once, you test it thoroughly, and then you use it all over your code base without having to worry whether it is correctly written. When you have an immutable interface, you have absolutely no guarantees as to what is going on behind the scenes. – Mike Nakis Aug 24 '17 at 06:35
  • @GlenPeterson here is my library: https://github.com/mikenakis/Public/tree/master/tyraki unfortunately, it is not as well-documented as yours yet. I can tell you a couple of things right from the start: I have no such thing as persistent collections. I have no support whatsoever for checked exception lambdas. The library is also small, but it does not exactly depend on nothing. – Mike Nakis Dec 20 '22 at 12:30
  • It depends on a couple of other libraries (also in `mikenakis/Public`) which provide some stuff I commonly use in all my projects, and also stuff for ensuring that the collection is being used in a safe threading context and that map keys are immutable. – Mike Nakis Dec 20 '22 at 12:30
3

Immutable collections can be deeply recursive, compared to eachother, and not unreasonably inefficient if object equality is by secureHash. This is called a merkle forest. It can be per collection or within parts of them like an (self balancing binary) AVL tree for a sorted map.

Unless all java objects in these collections have a unique id or some bitstring to hash, the collection has nothing to hash to uniquely name itself.

Example: On my 4x1.6ghz laptop, I can run 200K sha256s per second of the smallest size that fits in 1 hash cycle (up to 55 bytes), compared to 500K HashMap ops or 3M ops in a hashtable of longs. 200K/log(collectionSize) new collections per second is fast enough for some things where data integrity and anonymous global scalability is important.

0

This is not immutable, but Read-Only. I.e. a type to represent an collection with only read operations. This is not a bad idea, and was done in C# collection framework. I think Java creators should consider this as a quite useful thing to do in future versions of Java.

kan
  • 402
  • 5
  • 12
-3

Performance. Collections by their nature can be very large. Copying 1000 elements to a new structure with 1001 elements instead of inserting a single element is just plain horrible.

Concurrency. If you have several threads running they may want to get the current version of the collection and not the version that was passed 12 hours ago when the thread started.

Storage. With immutable objects in a multi threaded environment you can end up with dozens of copies of the "same" object at different points of its life cycle. Doesnt matter for a Calendar or Date object but when its a collection of 10,000 widgets this will kill you.

James Anderson
  • 18,049
  • 1
  • 42
  • 72
  • 12
    Immutable collections only require copying if you can’t share because of pervasive mutability like Java has. Concurrency is generally easier with immutable collections because they don’t require locking; and for visibility you can always have a mutable reference to an immutable collection (common in OCaml). With sharing, updates can be essentially free. You may do logarithmically more allocations than with a mutable structure, but on update, many expired subobjects can be freed immediately or reused, so you don’t necessarily have higher memory overhead. – Jon Purdy Dec 19 '13 at 01:32
  • 4
    Couple problems. The collections in Clojure and Scala are both immutable, but support light-weight copies. Adding element 1001 means copying less than 33 elements, plus making a few new pointers. If you share a mutable collection across threads, you have all kinds of synchronization issues when you change it. Operations like "remove()" are nightmarish. Also, immutable collections can be built mutably, then copied once into an immutable version safe to share across threads. – GlenPeterson Dec 19 '13 at 02:33
  • 4
    Using concurrency as an argument against immutability is unusual. Duplicates as well. – Tom Hawtin - tackline Dec 20 '13 at 01:41
  • If several threads need to access the latest version then a classic immutable object does not work. Instead you have the performance overhead and complication of a "pretend" immutable object. – James Anderson Dec 20 '13 at 03:27
  • 4
    A little bit miffed about the down votes here. The OP asked why they did not implement immutable collections, and, I provided a considered answer to the question. Presumably the only acceptable answer among the fashion conscious is "because they made a mistake". I actually have some experience with this having to refactor large chunks of code using the otherwise excellent BigDecimal class purely because of poor perfomance due to immutability 512 times that of using a double plus some messing around to fix the decimals. – James Anderson Dec 20 '13 at 03:33
  • 3
    @JamesAnderson: My problems with your answer: "Performance" - you could say that real life immutable collections always implement some form of sharing and reuse to avoid exactly the issue you describe. "Concurrency" - the argument boils down to "If you want mutability, then an immutable object does not work." I mean that if there is a notion of "latest version of the same thing", then something needs to mutate, either the thing itself, or something that owns the thing. And in "Storage", you seem to say that mutability is sometimes not desired. – jhominal Dec 23 '13 at 15:42
  • 1
    @JamesAnderson: In short, your reasons 2 and 3, are more reasons to avoid using immutability - they are not reasons for Java designers to avoid implementing it. (A good reason would be that, e.g., designing performant immutable collections requires non-trivial work) – jhominal Dec 23 '13 at 15:44
  • 2
    @JamesAnderson: About `BigDecimal`: do you recognize that, if the `BigDecimal` was not immutable, then nobody could use it to materialize *values*, in the way one can use `int`, `double`, and any other number type? Because if I get the value of a property `getLength()` in a `BigDecimal`, I would not like being told "Warning, the value that you get can be changed at any time." - that could force me to do additional defensive copies. (Sometimes I would like to know the new length after it changes! But it cannot be the job of `BigDecimal`, which is supposed to represent purely a number value.) – jhominal Dec 23 '13 at 15:49
  • 2
    @jhominal: IMHO, the proper pattern in many cases should be to have pairs of related mutable and immutable classes, analogous to `String` and `StringBuilder`. A `BigIntAccumulator` class could support a mutating `Add` method which was substantially faster than a non-mutating `Plus` method which had to construct a new instance. If the accumulator type allocated some extra storage for temporary calculations it could also support other mutating operations like `MultiplyBy`. – supercat Aug 13 '15 at 06:32