Finding the Differences Between Two Lists in Java

1. Overview

Finding differences between collections of objects of the same data type is a common programming task. As an example, imagine we have a list of students who applied for an exam and another list of students who passed it. The difference between those two lists would give us the students who didn’t pass the exam.

In Java, the List API has no dedicated method for finding the differences between two lists, though some helper methods come close.

In this quick tutorial, we’ll look at how to find the differences between two lists. We’ll try a few different approaches, including plain Java (with and without Streams) and third-party libraries such as Guava and Apache Commons Collections.

2. Test Setup

Let’s start by defining two lists, which we’ll use to test out our examples:

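import java.util.Arrays;
import java.util.List;

public class FindDifferencesBetweenListsUnitTest {

    // Typed as List<String> (not the raw List type) so the examples compile without unchecked warnings
    private static final List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
    private static final List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

}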

3. Using the Java List API

We can create a copy of one list and then remove all the elements common with the other, using the List method removeAll():

List<String> differences = new ArrayList<>(listOne);
differences.removeAll(listTwo);
assertEquals(2, differences.size());
assertThat(differences).containsExactly("Tom", "John");

Let’s reverse this to find the differences the other way around:

List<String> differences = new ArrayList<>(listTwo);
differences.removeAll(listOne);
assertEquals(3, differences.size());
assertThat(differences).containsExactly("Daniel", "Alan", "George");

We should also note that if we want to find the common elements between the two lists instead, List also provides the retainAll() method.
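As a minimal sketch of retainAll(), the following uses the same copy-then-modify pattern as the removeAll() examples. The class and method names are hypothetical and only serve the illustration. The copy matters here because the lists returned by Arrays.asList() are fixed-size and can't have elements removed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommonElementsExample {

    // Returns the elements of 'first' that also occur in 'second',
    // keeping the order (and any duplicates) of 'first'.
    static List<String> commonElements(List<String> first, List<String> second) {
        List<String> common = new ArrayList<>(first); // work on a copy, the input stays untouched
        common.retainAll(second);
        return common;
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        System.out.println(commonElements(listOne, listTwo)); // prints [Jack, Sam, James, Jack]
    }
}
```

Note that both occurrences of "Jack" are kept, since retainAll() only checks whether each element is contained in the other list.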

4. Using the Streams API

A Java Stream can be used for performing sequential operations on data from collections, which includes filtering differences between lists:

List<String> differences = listOne.stream()
            .filter(element -> !listTwo.contains(element))
            .collect(Collectors.toList());
assertEquals(2, differences.size());
assertThat(differences).containsExactly("Tom", "John");

As in our first example, we can switch the order of lists to find the different elements from the second list:

List<String> differences = listTwo.stream()
            .filter(element -> !listOne.contains(element))
            .collect(Collectors.toList());
assertEquals(3, differences.size());
assertThat(differences).containsExactly("Daniel", "Alan", "George");

We should note that calling List.contains() repeatedly can be costly for larger lists: each call scans the list linearly, so the total work grows with the product of the two list sizes.
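One common way to reduce that cost, sketched here under the assumption that the elements implement equals() and hashCode() consistently, is to copy the second list into a HashSet once and filter against it. The class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class SetBackedDifferenceExample {

    // Returns the elements of 'source' that don't occur in 'toExclude',
    // preserving the order and duplicates of 'source'.
    static <T> List<T> difference(List<T> source, List<T> toExclude) {
        Set<T> excluded = new HashSet<>(toExclude); // built once; lookups are constant time on average
        return source.stream()
            .filter(element -> !excluded.contains(element))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        System.out.println(difference(listOne, listTwo)); // prints [Tom, John]
        System.out.println(difference(listTwo, listOne)); // prints [Daniel, Alan, George]
    }
}
```

Only the lookup side is converted to a set, so the result still follows the order of the source list.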

5. Using Third-Party Libraries

5.1. Using Google Guava

Guava contains a handy Sets.difference method, but to use it we need to first convert our List to a Set:

List<String> differences = new ArrayList<>(Sets.difference(Sets.newHashSet(listOne), Sets.newHashSet(listTwo)));
assertEquals(2, differences.size());
assertThat(differences).containsExactlyInAnyOrder("Tom", "John");

We should note that converting the List to a Set removes duplicates and doesn’t preserve the original order, which is why the assertion above uses containsExactlyInAnyOrder.
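If the order of the first list matters, one possible alternative is to use a LinkedHashSet, which keeps insertion order while still dropping duplicates. The sketch below uses JDK classes only rather than Guava, and its class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class OrderedSetDifferenceExample {

    // Set-style difference: duplicates are dropped, but the order in which
    // elements first appear in 'first' is kept.
    static List<String> orderedDistinctDifference(List<String> first, List<String> second) {
        Set<String> result = new LinkedHashSet<>(first); // deduplicates, keeps insertion order
        result.removeAll(new HashSet<>(second));
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        // "Tom" appears twice here to make the deduplication visible
        List<String> names = Arrays.asList("Tom", "Jack", "Tom", "John");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        System.out.println(orderedDistinctDifference(names, listTwo)); // prints [Tom, John]
    }
}
```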

5.2. Using Apache Commons Collections

The CollectionUtils class from Apache Commons Collections contains a removeAll method.

This method does the same as List.removeAll, while also creating a new collection for the result:

List<String> differences = new ArrayList<>(CollectionUtils.removeAll(listOne, listTwo));
assertEquals(2, differences.size());
assertThat(differences).containsExactly("Tom", "John");

6. Handling Duplicate Values

Let’s now look at finding differences when two lists contain duplicated values.

To achieve this, we need to remove each element from the first list exactly as many times as it occurs in the second list.

In our example, the value “Jack” appears twice in the first list and only once in the second list, so one “Jack” remains in the result:

List<String> differences = new ArrayList<>(listOne);
listTwo.forEach(differences::remove);
assertThat(differences).containsExactly("Tom", "John", "Jack");

We can also achieve this using the subtract method from Apache Commons Collections:

List<String> differences = new ArrayList<>(CollectionUtils.subtract(listOne, listTwo));
assertEquals(3, differences.size());
assertThat(differences).containsExactly("Tom", "John", "Jack");
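For larger lists, the same duplicate-aware difference can also be sketched in plain Java with a count map, which avoids scanning the list again for every removal. This is only an illustrative sketch, not how either approach above is implemented, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MultisetDifferenceExample {

    // Removes each element of 'second' from 'first' at most once per occurrence,
    // keeping the order of the elements that remain.
    static <T> List<T> subtract(List<T> first, List<T> second) {
        // How many occurrences of each element are still left to cancel out
        Map<T, Integer> remaining = new HashMap<>();
        for (T element : second) {
            remaining.merge(element, 1, Integer::sum);
        }

        List<T> result = new ArrayList<>();
        for (T element : first) {
            Integer count = remaining.get(element);
            if (count != null && count > 0) {
                remaining.put(element, count - 1); // cancel one occurrence
            } else {
                result.add(element); // nothing left to cancel, so keep it
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        System.out.println(subtract(listOne, listTwo)); // prints [Tom, John, Jack]
    }
}
```

The first "Jack" is cancelled by the single "Jack" in the second list, so only the second occurrence survives.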

7. Conclusion

In this article, we explored a few ways to find differences between lists.

In the examples, we covered a basic Java solution, a solution using the Streams API, and solutions using third-party libraries like Google Guava and Apache Commons Collections.

We also saw how to handle duplicate values.

As always, the complete source code is available over on GitHub.
