I simply wanted to know if the internal implementation of the TreeSet class in java is Red Black Binary Search Tree. Ideally, I would think that it is a threaded RB-BST, since it supports iteration over the elements stored in it, and this would be much more efficient in terms of space complexity if the tree were threaded. Am I correct?
2 Answers
if the internal implementation of the TreeSet class in java is Red Black Binary Search Tree
Yes.
According to TreeSet Oracle doc:
TreeSet is a NavigableSet implementation backed by TreeMap instance.
public class TreeSet<E>
extends AbstractSet<E>
implements NavigableSet<E>, Cloneable, java.io.Serializable
{
private transient NavigableMap<E,Object> map;
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
public TreeSet() {
this(new TreeMap<E,Object>());
}
// SOME CODE ,i.e Other methods in TreeSet
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
// SOME CODE ,i.e Other methods in TreeSet
}
The TreeMap class source code should be referred to from JDK, the Java doc mentions
A Red-Black tree based {@link NavigableMap} implementation. The map is sorted according to the {@linkplain Comparable natural ordering} of its keys, or by a {@link Comparator} provided at map creation time, depending on which constructor is used.
So the answer to this question is yes.
it supports iteration over the elements stored in it, and this would be much more efficient in terms of space complexity if the tree were threaded
To my knowledge, threaded tree is not used in JDK. I'm not sure how space complexity could affect the transverse performance in practice --- it's worth implementing it (example) and profile it.
In TreeMap class, rebalancing operations are implemented during insertion and deletion to maintain the "optimal" height of the red black tree.
Iterator
In Java TreeMap class, there are several iterators (EntryIterator
, ValueIterator
, NavigableMapIterator
, SubMapIterator
,..etc) that all extends a PrivateEntryIterator
:
abstract class PrivateEntryIterator<T> implements Iterator<T> {
Entry<K,V> next;
Entry<K,V> lastReturned;
int expectedModCount;
PrivateEntryIterator(Entry<K,V> first) {
expectedModCount = modCount;
lastReturned = null;
next = first;
}
public final boolean hasNext() {
return next != null;
}
final Entry<K,V> nextEntry() {
Entry<K,V> e = next;
if (e == null)
throw new NoSuchElementException();
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
next = successor(e);
lastReturned = e;
return e;
}
final Entry<K,V> prevEntry() {
Entry<K,V> e = next;
if (e == null)
throw new NoSuchElementException();
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
next = predecessor(e);
lastReturned = e;
return e;
}
public void remove() {
if (lastReturned == null)
throw new IllegalStateException();
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
// deleted entries are replaced by their successors
if (lastReturned.left != null && lastReturned.right != null)
next = lastReturned;
deleteEntry(lastReturned);
expectedModCount = modCount;
lastReturned = null;
}
}
and the key methods successor
(and predecessor
) are implemented in Entry
class with left, right, and parent node:
static <K,V> TreeMap.Entry<K,V> successor(Entry<K,V> t) {
if (t == null)
return null;
else if (t.right != null) {
Entry<K,V> p = t.right;
while (p.left != null)
p = p.left;
return p;
} else {
Entry<K,V> p = t.parent;
Entry<K,V> ch = t;
while (p != null && ch == p.right) {
ch = p;
p = p.parent;
}
return p;
}
}

- 3,132
- 6
- 16
- 33
The JRE Specification does not prescribe any particular implementation strategy, it only specifies the API. Therefore, every implementor is free to implement it in any way they wish, and to change it at any time for any reason without warning.
In other words: we don't, can't, and shouldn't know.
Maybe IBM has the same implementation as RedHat, maybe they don't. Maybe Azul has the same implementation as Excelsior, maybe they don't. Maybe, all the different implementations that Oracle has, all have the same implementation, maybe they don't. (Actually, considering the wildly different application domains, I would be surprised if they were all identical.)
Even if two vendors have the same implementation today, one of them could change it tomorrow.

- 101,921
- 24
- 218
- 318
-
Is there any specification regarding the lookup time and such other things. Otherwise how would we know when to use it and when not to? – ShayakSarkar Aug 08 '20 at 08:10
-
3@ShayakSarkar: The documentation for [TreeSet](https://docs.oracle.com/javase/7/docs/api/java/util/TreeSet.html) says that it is "A NavigableSet implementation based on a TreeMap." TreeMap says it is "A Red-Black tree based NavigableMap implementation. NavigableMap says it is "A SortedMap extended with navigation methods returning the closest matches for given search targets," and so on. – Robert Harvey Aug 08 '20 at 13:53
-
3This answer is totally incomplete and borderline useless without at least citing some JDK implementations and the data structures they happen to currently use. If OP's motivation is to try to connect concepts like "I just learned about Red-Black trees, I'm curious if that's what I've been using in Java all this time", saying "it's implementation dependent and you shouldn't need to know" is pointless. – Alexander Sep 07 '20 at 20:50