Apache Commons Collections SetUtils

1. Overview

In this article, we’ll explore the SetUtils API of the Apache Commons Collections library. Simply put, these utilities perform common operations on Set data structures in Java.

2. Dependency Installation

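To use SetUtils in our project, we need to add the following dependency to our project’s pom.xml file:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-collections4</artifactId>
    <version>4.1</version>
</dependency>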

Alternatively, if our project is Gradle-based, we add the dependency to our project’s build.gradle file and make sure that mavenCentral() is listed in its repositories section:

implementation 'org.apache.commons:commons-collections4:4.1'

3. Predicated Set

The predicatedSet() method of SetUtils lets us define a condition that every element inserted into a set must meet. It accepts a source Set object and a predicate.

We can use this to easily validate that all elements of a Set satisfy a certain condition, which can be handy when developing a third-party library/API.

If the validation fails for any element, an IllegalArgumentException will be thrown. In the snippet below, strings that do not start with ‘L’ cannot be added through the returned validatingSet, so they never reach the underlying sourceSet that way. Elements added directly to sourceSet bypass the check:

Set<String> validatingSet
  = SetUtils.predicatedSet(sourceSet, s -> s.startsWith("L"));

The library also has predicatedSortedSet() and predicatedNavigableSet() for working with SortedSet and NavigableSet respectively.
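To make the failure mode concrete, here is a minimal sketch of the check in action. The class name PredicatedSetSketch and the helper accepts() are made up for this illustration; only SetUtils.predicatedSet() comes from the library.

```java
import java.util.HashSet;
import java.util.Set;

import org.apache.commons.collections4.SetUtils;

class PredicatedSetSketch {

    // Hypothetical helper: tries to add the value through a validating set
    // that only accepts strings starting with "L".
    // Returns true if the element was stored, false if the predicate rejected it.
    static boolean accepts(String value) {
        Set<String> sourceSet = new HashSet<>();
        Set<String> validatingSet
          = SetUtils.predicatedSet(sourceSet, s -> s.startsWith("L"));
        try {
            validatingSet.add(value);
        } catch (IllegalArgumentException e) {
            // The predicate returned false, so nothing was added
            return false;
        }
        // Accepted elements end up in the backing set
        return sourceSet.contains(value);
    }

    public static void main(String[] args) {
        System.out.println(accepts("London")); // starts with "L"
        System.out.println(accepts("Paris"));  // does not start with "L"
    }
}
```

Catching the exception is only done here to turn the outcome into a boolean. In real code we would usually let the IllegalArgumentException propagate to the caller that supplied the invalid element.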

4. Union, Difference, and Intersection of a Set

The library has methods that compute the union, difference, and intersection of sets.

The difference() method takes two Set objects and returns an unmodifiable SetUtils.SetView object. The returned SetUtils.SetView contains the elements that are in set a but not in set b:

Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 5));
Set<Integer> b = new HashSet<>(Arrays.asList(1, 2));
SetUtils.SetView<Integer> result = SetUtils.difference(a, b);
assertTrue(result.size() == 1 && result.contains(5));

Note that trying to perform write operations, like add() or addAll(), on the returned SetUtils.SetView will throw an UnsupportedOperationException.

To modify the returned result, we need to call the toSet() method of the returned SetUtils.SetView to obtain a writable Set object:

Set<Integer> mutableSet = result.toSet();
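The two behaviors described above can be sketched side by side. The class SetViewSketch and its helper methods are names chosen for this example, not part of the library. The sketch assumes the same sets a and b as in the snippet above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

import org.apache.commons.collections4.SetUtils;

class SetViewSketch {

    // Builds the difference of {1, 2, 5} and {1, 2}
    static SetUtils.SetView<Integer> sampleDifference() {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 5));
        Set<Integer> b = new HashSet<>(Arrays.asList(1, 2));
        return SetUtils.difference(a, b);
    }

    // Returns true if a write on the view is refused
    static boolean viewIsReadOnly() {
        try {
            sampleDifference().add(7);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Copies the view into a regular set and modifies the copy
    static Set<Integer> writableCopyWith(int extra) {
        Set<Integer> mutableSet = sampleDifference().toSet();
        mutableSet.add(extra);
        return mutableSet;
    }

    public static void main(String[] args) {
        System.out.println(viewIsReadOnly());
        System.out.println(writableCopyWith(7));
    }
}
```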

The union() method of SetUtils does exactly what it sounds like: it returns all the elements of set a and set b. It also returns an unmodifiable SetUtils.SetView object:

Set<Integer> expected = new HashSet<>(Arrays.asList(1, 2, 5));
SetUtils.SetView<Integer> union = SetUtils.union(a, b);
assertTrue(SetUtils.isEqualSet(expected, union));

Take note of the isEqualSet() method used in the assert statement. It is a convenient static method of SetUtils that checks whether two sets are equal.

To get the intersection of two sets, i.e. the elements that are present in both set a and set b, we’ll use the SetUtils.intersection() method. This method also returns a SetUtils.SetView object:

Set<Integer> expected = new HashSet<>(Arrays.asList(1, 2));
SetUtils.SetView<Integer> intersect = SetUtils.intersection(a, b);
assertTrue(SetUtils.isEqualSet(expected, intersect));

5. Transforming Set Elements

Let’s take a look at another useful method, SetUtils.transformedSet(). This method accepts a Set object and a Transformer. It returns a set backed by the source set that applies the transform() method of the Transformer interface to every element added through it.

The transforming logic is defined in the transform() method of the Transformer interface. The code snippet below multiplies every element added to the set by 2:

Set<Integer> a = SetUtils.transformedSet(new HashSet<>(), e -> e * 2);
a.add(2);
assertEquals(a.toArray()[0], 4);

The transformedSet() method is pretty handy: it can even be used to convert elements of a set, say from String to Integer. Just make sure that the output type of the transformer is a subtype of the set’s element type, so a conversion like that needs a set declared with a common supertype such as Object.

If we are working with a SortedSet or NavigableSet instead of a HashSet, we can use transformedSortedSet() or transformedNavigableSet() respectively.

Note that a new HashSet instance is passed to the transformedSet() method. In situations where an existing, non-empty Set is passed to the method, the pre-existing elements will not be transformed.

If we want to transform pre-existing elements (and those added thereafter), we need to use the transformedSet() method of org.apache.commons.collections4.set.TransformedSet:

Set<Integer> source = new HashSet<>(Arrays.asList(1));
Set<Integer> newSet = TransformedSet.transformedSet(source, e -> e * 2);
assertEquals(newSet.toArray()[0], 2);
assertEquals(source.toArray()[0], 2);

Note that the elements of the source set are transformed in place, and the returned newSet is backed by that same source set.
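To compare the two factory methods directly, the sketch below starts both from a source set containing 1 and then adds 5 through the decorator. The class TransformSketch and its method names are illustrative only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

import org.apache.commons.collections4.SetUtils;
import org.apache.commons.collections4.set.TransformedSet;

class TransformSketch {

    // SetUtils.transformedSet(): only elements added through the decorator are transformed
    static Set<Integer> decorateOnly() {
        Set<Integer> source = new HashSet<>(Arrays.asList(1));
        Set<Integer> doubled = SetUtils.transformedSet(source, e -> e * 2);
        doubled.add(5); // stored as 10
        return source;  // the pre-existing 1 is left untouched
    }

    // TransformedSet.transformedSet(): pre-existing elements are transformed as well
    static Set<Integer> decorateAndConvert() {
        Set<Integer> source = new HashSet<>(Arrays.asList(1));
        Set<Integer> doubled = TransformedSet.transformedSet(source, e -> e * 2);
        doubled.add(5); // stored as 10
        return source;  // the pre-existing 1 has become 2
    }

    public static void main(String[] args) {
        System.out.println(decorateOnly());
        System.out.println(decorateAndConvert());
    }
}
```

Both methods return the source set rather than the decorator to show that the decorator writes through to it.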

6. Set Disjunction

The SetUtils library provides a static method for finding set disjunctions. The disjunction of set a and set b is the set of all elements that belong to exactly one of them, that is, the elements of a that are not in b together with the elements of b that are not in a.

Let’s see how to use the disjunction() method of the SetUtils library:

Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 5));
Set<Integer> b = new HashSet<>(Arrays.asList(1, 2, 3));
SetUtils.SetView<Integer> result = SetUtils.disjunction(a, b);
assertTrue(
  result.toSet().contains(5) && result.toSet().contains(3));
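One way to check our understanding of disjunction() is to rebuild the same result from the operations covered earlier: the union of both sets minus their intersection. The following is a sketch under that assumption, with DisjunctionSketch and its methods being hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

import org.apache.commons.collections4.SetUtils;

class DisjunctionSketch {

    // Elements that are in exactly one of the two sets, computed by the library
    static Set<Integer> disjunctionOf(Set<Integer> a, Set<Integer> b) {
        return SetUtils.disjunction(a, b).toSet();
    }

    // The same result assembled by hand: (a union b) minus (a intersection b)
    static Set<Integer> unionMinusIntersection(Set<Integer> a, Set<Integer> b) {
        return SetUtils.difference(
          SetUtils.union(a, b), SetUtils.intersection(a, b)).toSet();
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 5));
        Set<Integer> b = new HashSet<>(Arrays.asList(1, 2, 3));
        // Both approaches should agree on the result
        System.out.println(
          SetUtils.isEqualSet(disjunctionOf(a, b), unionMinusIntersection(a, b)));
    }
}
```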

7. Other Methods in SetUtils Library

There are other methods in the SetUtils library that make processing of set data a breeze:

  • We can use synchronizedSet() or synchronizedSortedSet() to get a thread-safe Set. However, as stated in the docs, we must manually synchronize on the returned set while iterating over it to avoid non-deterministic behavior.
  • We can use SetUtils.unmodifiableSet() to get a read-only set. Note that an attempt to add elements to the returned Set object will throw an UnsupportedOperationException.
  • There is also the SetUtils.emptySet() method, which returns a type-safe, immutable empty set.
  • The SetUtils.emptyIfNull() method accepts a nullable Set object. It returns an empty, read-only Set if the supplied Set is null; otherwise, it returns the supplied Set.
  • SetUtils.orderedSet() will return a Set object that maintains the order in which elements are added.
  • SetUtils.hashCodeForSet() can generate a hashcode for a set in such a way that two sets with the same elements will have the same hashcode.
  • SetUtils.newIdentityHashSet() will return a Set that uses == to match an element instead of the equals() method. Such a set deliberately breaks parts of the general Set contract, so its Javadoc caveats are worth reading before using it.
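A few of these helpers are easiest to understand from a small example. The sketch below covers emptyIfNull(), orderedSet(), newIdentityHashSet(), and iteration over a synchronizedSet(). The class OtherSetUtilsSketch and its method names are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.commons.collections4.SetUtils;

class OtherSetUtilsSketch {

    // emptyIfNull(): a null reference becomes an empty set, so no null check is needed
    static int sizeOrZero(Set<String> maybeNull) {
        return SetUtils.emptyIfNull(maybeNull).size();
    }

    // orderedSet(): iteration follows insertion order, whatever the backing set does
    static List<String> insertionOrder(String... values) {
        Set<String> ordered = SetUtils.orderedSet(new HashSet<String>());
        ordered.addAll(Arrays.asList(values));
        return new ArrayList<>(ordered);
    }

    // newIdentityHashSet(): two equal but distinct String objects are both kept
    static int identitySizeOfEqualStrings() {
        Set<String> identitySet = SetUtils.newIdentityHashSet();
        identitySet.add(new String("a"));
        identitySet.add(new String("a"));
        return identitySet.size();
    }

    // synchronizedSet(): iteration is guarded by a lock on the returned set
    static int countSynchronized(Set<String> values) {
        Set<String> safe = SetUtils.synchronizedSet(values);
        int count = 0;
        synchronized (safe) {
            for (String ignored : safe) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(sizeOrZero(null));
        System.out.println(insertionOrder("c", "a", "b"));
        System.out.println(identitySizeOfEqualStrings());
        System.out.println(countSynchronized(new HashSet<>(Arrays.asList("x", "y"))));
    }
}
```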

8. Conclusion

In this article, we’ve explored the main features of the SetUtils class. It offers static methods that simplify common operations on the Set data structure, such as validation, transformation, and set algebra.

As always, the code snippets are available over on GitHub. The official documentation for the SetUtils API is part of the Apache Commons Collections Javadoc.