Removing all duplicates from a List in Java

In this quick tutorial, we’re going to show how to remove the duplicate elements from a List: first using plain Java, then Guava, and finally the Java 8 Stream API.

1. Remove Duplicates From a List Using Plain Java

Removing the duplicate elements from a List with the standard Java Collections Framework is easy: we copy the elements into a Set, which cannot hold duplicates, and then copy that Set into a new List:

@Test
public void
  givenListContainsDuplicates_whenRemovingDuplicatesWithPlainJava_thenCorrect() {
    List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
    List<Integer> listWithoutDuplicates = new ArrayList<>(
      new HashSet<>(listWithDuplicates));

    assertThat(listWithoutDuplicates, hasSize(5));
    assertThat(listWithoutDuplicates, containsInAnyOrder(5, 0, 3, 1, 2));
}

As we can see, the original list remains unchanged.

In the example above, we used the HashSet implementation, which is an unordered collection. As a result, the order of the cleaned-up listWithoutDuplicates might differ from the order of the original listWithDuplicates.

If we need to preserve the order, we can use a LinkedHashSet instead, since it iterates over its elements in insertion order:

@Test
public void
  givenListContainsDuplicates_whenRemovingDuplicatesPreservingOrderWithPlainJava_thenCorrect() {
    List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
    List<Integer> listWithoutDuplicates = new ArrayList<>(
      new LinkedHashSet<>(listWithDuplicates));

    assertThat(listWithoutDuplicates, hasSize(5));
    assertThat(listWithoutDuplicates, containsInRelativeOrder(5, 0, 3, 1, 2));
}
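
To try the same idea outside a test class, here is a minimal, self-contained sketch that uses only the JDK. The class name DedupSketch and the helper removeDuplicates are hypothetical names chosen for illustration, and Arrays.asList stands in for Guava's Lists.newArrayList:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class DedupSketch {

    // Hypothetical helper: copies the input into a LinkedHashSet (dropping
    // duplicates, keeping first-seen order) and then into a new ArrayList.
    // The input list itself is never modified.
    static <T> List<T> removeDuplicates(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<Integer> listWithDuplicates = Arrays.asList(5, 0, 3, 1, 2, 3, 0, 0);
        List<Integer> listWithoutDuplicates = removeDuplicates(listWithDuplicates);

        System.out.println(listWithoutDuplicates); // [5, 0, 3, 1, 2]
        System.out.println(listWithDuplicates);    // [5, 0, 3, 1, 2, 3, 0, 0]
    }
}
```

Swapping LinkedHashSet for HashSet in the helper would still remove the duplicates, but the order of the result would no longer be guaranteed.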

2. Remove Duplicates From a List Using Guava

We can do the same using Guava:

@Test
public void
  givenListContainsDuplicates_whenRemovingDuplicatesWithGuava_thenCorrect() {
    List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
    List<Integer> listWithoutDuplicates 
      = Lists.newArrayList(Sets.newHashSet(listWithDuplicates));

    assertThat(listWithoutDuplicates, hasSize(5));
    assertThat(listWithoutDuplicates, containsInAnyOrder(5, 0, 3, 1, 2));
}

Here, too, the original list remains unchanged.

And again, the order of the elements in the cleaned-up list is not guaranteed to match the original order.

If we use the LinkedHashSet implementation, we’ll preserve the initial order:

@Test
public void
  givenListContainsDuplicates_whenRemovingDuplicatesPreservingOrderWithGuava_thenCorrect() {
    List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
    List<Integer> listWithoutDuplicates 
      = Lists.newArrayList(Sets.newLinkedHashSet(listWithDuplicates));

    assertThat(listWithoutDuplicates, hasSize(5));
    assertThat(listWithoutDuplicates, containsInRelativeOrder(5, 0, 3, 1, 2));
}

3. Remove Duplicates From a List Using Java 8 Lambdas

Finally, let’s look at a newer solution that uses the Java 8 Stream API: the distinct() method returns a stream consisting of the distinct elements, with equality determined by the equals() method.

Additionally, for ordered streams, the selection of distinct elements is stable. This means that for duplicated elements, the element appearing first in the encounter order is preserved.

@Test
public void
  givenListContainsDuplicates_whenRemovingDuplicatesWithJava8_thenCorrect() {
    List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
    List<Integer> listWithoutDuplicates = listWithDuplicates.stream()
     .distinct()
     .collect(Collectors.toList());

    assertThat(listWithoutDuplicates, hasSize(5));
    assertThat(listWithoutDuplicates, containsInAnyOrder(5, 0, 3, 1, 2));
}
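
Both points about distinct(), its reliance on equals() and its stability on ordered streams, are easier to see with a custom type. The following is only an illustrative sketch: the DistinctSketch class, the Item type, and the distinctInOrder helper are hypothetical names, and Item deliberately compares by id alone so that we can tell which of two "equal" elements survives. Note that a class overriding equals() should override hashCode() consistently as well:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DistinctSketch {

    // Hypothetical element type: two items are equal when their ids match,
    // regardless of label.
    static final class Item {
        final int id;
        final String label;

        Item(int id, String label) {
            this.id = id;
            this.label = label;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Item && ((Item) other).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id); // consistent with equals()
        }

        @Override
        public String toString() {
            return id + ":" + label;
        }
    }

    // Hypothetical helper: a list-backed stream is ordered, so distinct()
    // keeps the first occurrence of each element.
    static <T> List<T> distinctInOrder(List<T> input) {
        return input.stream()
          .distinct()
          .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
          new Item(1, "first"), new Item(2, "other"), new Item(1, "second"));

        System.out.println(distinctInOrder(items)); // [1:first, 2:other]
    }
}
```

The item labelled "second" is dropped because an equal element appeared earlier in the encounter order.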

And there we have it – three quick ways to clean up all the duplicate items from a List.

4. Conclusion

In this article, we’ve demonstrated how easy it is to remove duplicates from a list using plain Java, Google Guava, and Java 8.

The implementation of all of these examples and snippets can be found in the GitHub project. This is a Maven-based project, so it should be easy to import and run.