A Guide to TreeSet in Java

1. Overview

In this article, we’ll have a look at an integral part of the Java Collections Framework and one of the most popular Set implementations – the TreeSet.

2. Intro to TreeSet

Simply put, the TreeSet is a sorted collection that extends the AbstractSet class and implements the NavigableSet interface.

Here’s a quick summary of the most important aspects of this implementation:

  • It stores unique elements
  • It doesn’t preserve the insertion order of the elements
  • It sorts the elements in ascending order
  • It’s not thread-safe

In this implementation, objects are sorted and stored in ascending order according to their natural order. The TreeSet uses a self-balancing binary search tree, more specifically a Red-Black tree.

Simply put, being a self-balancing binary search tree, each node of the binary tree comprises of an extra bit, which is used to identify the color of the node which is either red or black. During subsequent insertions and deletions, these “color” bits helps in ensuring that the tree remains more or less balanced.

So, let’s create an instance of a TreeSet:

Set<String> treeSet = new TreeSet<>();

2.1. TreeSet With a Constructor Comparator Param

Optionally, we can construct a TreeSet with a constructor that lets us define the order in which the elements get sorted by using a Comparable or Comparator:

Set<String> treeSet = new TreeSet<>(Comparator.comparing(String::length));

Although TreeSet isn’t thread-safe, it can be synchronized externally using the Collections.synchronizedSet() wrapper:

Set<String> syncTreeSet = Collections.synchronizedSet(treeSet);

Alright, now that we have a clear idea of how to create a TreeSet instance, let’s have a look at the common operations we have available.

3. TreeSet add()

The add() method, as expected, can be used for adding elements to a TreeSet. If an element was added, the method returns true, otherwise – false.

The contract of the method states that an element will be added only when the same isn’t already present in the Set.

Let’s add an element to a TreeSet:

@Test
public void whenAddingElement_shouldAddElement() {
    Set<String> treeSet = new TreeSet<>();

    assertTrue(treeSet.add("String Added"));
 }

The add method is extremely important as the implementation details of the method illustrate how the TreeSet works internally, how it leverages the TreeMap’s put method to store the elements:

public boolean add(E e) {
    return m.put(e, PRESENT) == null;
}

The variable m refers to an internal backing TreeMap (note that TreeMap implements NavigateableMap):

private transient NavigableMap<E, Object> m;

Therefore, the TreeSet internally depends on a backing NavigableMap which gets initialized with an instance of TreeMap when an instance of the TreeSet is created:

public TreeSet() {
    this(new TreeMap<E,Object>());
}

4. TreeSet contains()

The contains() method is used to check if a given element is present in a given TreeSetIf the element is found, it returns true, otherwise false.

Let’s see the contains() in action:

@Test
public void whenCheckingForElement_shouldSearchForElement() {
    Set<String> treeSetContains = new TreeSet<>();
    treeSetContains.add("String Added");

    assertTrue(treeSetContains.contains("String Added"));
}

5. TreeSet remove()

The remove() method is used to remove the specified element from the set if it’s present.

If a set contained the specified element, this method returns true.

Let’s see it in action:

@Test
public void whenRemovingElement_shouldRemoveElement() {
    Set<String> removeFromTreeSet = new TreeSet<>();
    removeFromTreeSet.add("String Added");

    assertTrue(removeFromTreeSet.remove("String Added"));
}

6. TreeSet clear()

If we want to remove all the items from a set, we can use the clear() method:

@Test
public void whenClearingTreeSet_shouldClearTreeSet() {
    Set<String> clearTreeSet = new TreeSet<>();
    clearTreeSet.add("String Added");
    clearTreeSet.clear();
 
    assertTrue(clearTreeSet.isEmpty());
}

7. TreeSet size()

The size() method is used to identify the number of elements present in the TreeSet. It’s one of the fundamental methods in the API:

@Test
public void whenCheckingTheSizeOfTreeSet_shouldReturnThesize() {
    Set<String> treeSetSize = new TreeSet<>();
    treeSetSize.add("String Added");
 
    assertEquals(1, treeSetSize.size());
}

8. TreeSet isEmpty()

The isEmpty() method can be used to figure out if a given TreeSet instance is empty or not:

@Test
public void whenCheckingForEmptyTreeSet_shouldCheckForEmpty() {
    Set<String> emptyTreeSet = new TreeSet<>();
    
    assertTrue(emptyTreeSet.isEmpty());
}

9. TreeSet iterator()

The iterator() method returns an iterator iterating in the ascending order over the elements in the Set. Those iterators are fail-fast.

We can observe the ascending iteration order here:

@Test
public void whenIteratingTreeSet_shouldIterateTreeSetInAscendingOrder() {
    Set<String> treeSet = new TreeSet<>();
    treeSet.add("First");
    treeSet.add("Second");
    treeSet.add("Third");
    Iterator<String> itr = treeSet.iterator();
    while (itr.hasNext()) {
        System.out.println(itr.next());
    }
}

Additionally, TreeSet enables us to iterate through the Set in descending order.

Let’s see that in action:

@Test
public void whenIteratingTreeSet_shouldIterateTreeSetInDescendingOrder() {
    TreeSet<String> treeSet = new TreeSet<>();
    treeSet.add("First");
    treeSet.add("Second");
    treeSet.add("Third");
    Iterator<String> itr = treeSet.descendingIterator();
    while (itr.hasNext()) {
        System.out.println(itr.next());
    }
}

The Iterator throws a ConcurrentModificationException if the set is modified at any time after the iterator is created in any way except through the iterator’s remove() method.

Let’s create a test for this:

@Test(expected = ConcurrentModificationException.class)
public void whenModifyingTreeSetWhileIterating_shouldThrowException() {
    Set<String> treeSet = new TreeSet<>();
    treeSet.add("First");
    treeSet.add("Second");
    treeSet.add("Third");
    Iterator<String> itr = treeSet.iterator();
    while (itr.hasNext()) {
        itr.next();
        treeSet.remove("Second");
    }
}

Alternatively, if we had used the iterator’s remove method, then we wouldn’t have encountered the exception:

@Test
public void whenRemovingElementUsingIterator_shouldRemoveElement() {
 
    Set<String> treeSet = new TreeSet<>();
    treeSet.add("First");
    treeSet.add("Second");
    treeSet.add("Third");
    Iterator<String> itr = treeSet.iterator();
    while (itr.hasNext()) {
        String element = itr.next();
        if (element.equals("Second"))
           itr.remove();
    }
 
    assertEquals(2, treeSet.size());
}

There’s no guarantee on the fail-fast behavior of an iterator as it’s impossible to make any hard guarantees in the presence of unsynchronized concurrent modification.

10. TreeSet first()

This method returns the first element from a TreeSet if it’s not empty. Otherwise, it throws a NoSuchElementException.

Let’s see an example:

@Test
public void whenCheckingFirstElement_shouldReturnFirstElement() {
    TreeSet<String> treeSet = new TreeSet<>();
    treeSet.add("First");
   
    assertEquals("First", treeSet.first());
}

11. TreeSet last()

Analogously to the above example, this method will return the last element if the set is not empty:

@Test
public void whenCheckingLastElement_shouldReturnLastElement() {
    TreeSet<String> treeSet = new TreeSet<>();
    treeSet.add("First");
    treeSet.add("Last");
    
    assertEquals("Last", treeSet.last());
}

12. TreeSet subSet()

This method will return the elements ranging from fromElement to toElement. Note that fromElement is inclusive and toElement is exclusive:

@Test
public void whenUsingSubSet_shouldReturnSubSetElements() {
    SortedSet<Integer> treeSet = new TreeSet<>();
    treeSet.add(1);
    treeSet.add(2);
    treeSet.add(3);
    treeSet.add(4);
    treeSet.add(5);
    treeSet.add(6);
    
    Set<Integer> expectedSet = new TreeSet<>();
    expectedSet.add(2);
    expectedSet.add(3);
    expectedSet.add(4);
    expectedSet.add(5);

    Set<Integer> subSet = treeSet.subSet(2, 6);
 
    assertEquals(expectedSet, subSet);
}

13. TreeSet headSet()

This method will return elements of TreeSet which are smaller than the specified element:

@Test
public void whenUsingHeadSet_shouldReturnHeadSetElements() {
    SortedSet<Integer> treeSet = new TreeSet<>();
    treeSet.add(1);
    treeSet.add(2);
    treeSet.add(3);
    treeSet.add(4);
    treeSet.add(5);
    treeSet.add(6);

    Set<Integer> subSet = treeSet.headSet(6);
 
    assertEquals(subSet, treeSet.subSet(1, 6));
}

14. TreeSet tailSet()

This method will return the elements of a TreeSet which are greater than or equal to the specified element:

@Test
public void whenUsingTailSet_shouldReturnTailSetElements() {
    NavigableSet<Integer> treeSet = new TreeSet<>();
    treeSet.add(1);
    treeSet.add(2);
    treeSet.add(3);
    treeSet.add(4);
    treeSet.add(5);
    treeSet.add(6);

    Set<Integer> subSet = treeSet.tailSet(3);
 
    assertEquals(subSet, treeSet.subSet(3, true, 6, true));
}

15. Storing Null Elements

Before Java 7, it was possible to add null elements to an empty TreeSet.

However, that was considered a bug. Therefore, TreeSet no longer supports the addition of null.

When we add elements to the TreeSet, the elements get sorted according to their natural order or as specified by the comparator. Hence adding a null, when compared to existing elements, results in a NullPointerException since null cannot be compared to any value:

@Test(expected = NullPointerException.class)
public void whenAddingNullToNonEmptyTreeSet_shouldThrowException() {
    Set<String> treeSet = new TreeSet<>();
    treeSet.add("First");
    treeSet.add(null);
}

Elements inserted into the TreeSet must either implement the Comparable interface or at least be accepted by the specified comparator. All such elements must be mutually comparable, i.e. e1.compareTo(e2) or comparator.compare(e1, e2) mustn’t throw a ClassCastException.

Let’s see an example:

class Element {
    private Integer id;

    // Other methods...
}

Comparator<Element> comparator = (ele1, ele2) -> {
    return ele1.getId().compareTo(ele2.getId());
};

@Test
public void whenUsingComparator_shouldSortAndInsertElements() {
    Set<Element> treeSet = new TreeSet<>(comparator);
    Element ele1 = new Element();
    ele1.setId(100);
    Element ele2 = new Element();
    ele2.setId(200);
    
    treeSet.add(ele1);
    treeSet.add(ele2);
    
    System.out.println(treeSet);
}

16. Performance of TreeSet

When compared to a HashSet the performance of a TreeSet is on the lower side. Operations like addremove and search take O(log n) time while operations like printing n elements in sorted order require O(n) time.

TreeSet should be our primary choice if we want to keep our entries sorted as a TreeSet may be accessed and traversed in either ascending or descending order, and the performance of ascending operations and views is likely to be faster than that of descending ones.

The Principle of Locality – is a term for the phenomenon in which the same values, or related storage locations, are frequently accessed, depending on the memory access pattern.

When we say locality:

  • Similar data is often accessed by an application with similar frequency
  • If two entries are nearby given an ordering, a TreeSet places them near each other in the data structure, and hence in memory

TreeSet being a data-structure with greater locality we can, therefore, conclude in accordance to the Principle of Locality, that we should give preference to a TreeSet if we’re short on memory and if we want to access elements that are relatively close to each other according to their natural ordering.

In case data needs to be read from the hard drive (which has greater latency than data read from the cache or memory) then prefer TreeSet as it has greater locality

17. Conclusion

In this article, we focus on understanding how to use the standard TreeSet implementation in Java. We saw its purpose and how efficient it is regarding usability given its ability to avoid duplicates and sort elements.

As always, code snippets can be found over on GitHub.