Removing all duplicates from a List in Java

In this quick tutorial, we’ll show how to remove duplicate elements from a List: first using plain Java, then Guava, and finally the Java 8 Stream API.

1. Remove Duplicates From a List Using Plain Java

Removing the duplicate elements from a List with the standard Java Collections Framework is done easily through a Set:

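public void
  givenListContainsDuplicates_whenRemovingDuplicatesWithPlainJava_thenCorrect() {
    // Arrays.asList keeps this example free of third-party dependencies
    List<Integer> listWithDuplicates = Arrays.asList(5, 0, 3, 1, 2, 3, 0, 0);
    // a Set cannot hold duplicates, so copying through a HashSet drops them
    List<Integer> listWithoutDuplicates = new ArrayList<>(
      new HashSet<>(listWithDuplicates));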

    assertThat(listWithoutDuplicates, hasSize(5));
    assertThat(listWithoutDuplicates, containsInAnyOrder(5, 0, 3, 1, 2));
}

As we can see, the original list remains unchanged.

In the example above, we used the HashSet implementation, which makes no guarantees about iteration order. As a result, the order of the cleaned-up listWithoutDuplicates might be different from the order of the original listWithDuplicates.

If we need to preserve the order, we may use the LinkedHashSet instead:

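public void
  givenListContainsDuplicates_whenRemovingDuplicatesPreservingOrderWithPlainJava_thenCorrect() {
    // Arrays.asList keeps this example free of third-party dependencies
    List<Integer> listWithDuplicates = Arrays.asList(5, 0, 3, 1, 2, 3, 0, 0);
    // LinkedHashSet iterates in insertion order, so first occurrences keep their positions
    List<Integer> listWithoutDuplicates = new ArrayList<>(
      new LinkedHashSet<>(listWithDuplicates));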

    assertThat(listWithoutDuplicates, hasSize(5));
    assertThat(listWithoutDuplicates, containsInRelativeOrder(5, 0, 3, 1, 2));
}
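The two plain-Java variants above can be wrapped in a small reusable helper. What follows is only a minimal sketch, not code from the accompanying project; the class name DuplicateRemover and both method names are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
class DuplicateRemover {

    // Drops duplicates; the order of the returned list is unspecified.
    static <T> List<T> withoutDuplicates(List<T> source) {
        return new ArrayList<>(new HashSet<>(source));
    }

    // Drops duplicates and keeps the first occurrence of each element
    // in its original position relative to the others.
    static <T> List<T> withoutDuplicatesPreservingOrder(List<T> source) {
        return new ArrayList<>(new LinkedHashSet<>(source));
    }

    public static void main(String[] args) {
        List<Integer> listWithDuplicates = Arrays.asList(5, 0, 3, 1, 2, 3, 0, 0);

        System.out.println(withoutDuplicatesPreservingOrder(listWithDuplicates));
        // prints [5, 0, 3, 1, 2]

        // The source list is only read, never modified.
        System.out.println(listWithDuplicates);
        // prints [5, 0, 3, 1, 2, 3, 0, 0]
    }
}
```

Both methods return a new ArrayList, so the caller is free to modify the result without touching the source list.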

2. Remove Duplicates From a List Using Guava

We can do the same using Guava:

public void 
  givenListContainsDuplicates_whenRemovingDuplicatesWithGuava_thenCorrect() {
    List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
    List<Integer> listWithoutDuplicates 
      = Lists.newArrayList(Sets.newHashSet(listWithDuplicates));

    assertThat(listWithoutDuplicates, hasSize(5));
    assertThat(listWithoutDuplicates, containsInAnyOrder(5, 0, 3, 1, 2));
}

Here too, the original list remains unchanged.

And again, the order of elements in the cleaned-up list is not guaranteed, because the elements pass through a HashSet.

If we use the LinkedHashSet implementation, we’ll preserve the initial order:

public void 
  givenListContainsDuplicates_whenRemovingDuplicatesPreservingOrderWithGuava_thenCorrect() {
    List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
    List<Integer> listWithoutDuplicates 
      = Lists.newArrayList(Sets.newLinkedHashSet(listWithDuplicates));

    assertThat(listWithoutDuplicates, hasSize(5));
    assertThat(listWithoutDuplicates, containsInRelativeOrder(5, 0, 3, 1, 2));
}

3. Remove Duplicates From a List Using Java 8 Lambdas

Finally, let’s look at a newer solution, available since Java 8. We’re going to use the distinct() method from the Stream API, which returns a stream consisting of the distinct elements, where two elements count as duplicates if equals() returns true for them.

Additionally, for ordered streams, the selection of distinct elements is stable. This means that for duplicated elements, the element appearing first in the encounter order is preserved.

public void 
  givenListContainsDuplicates_whenRemovingDuplicatesWithJava8_thenCorrect() {
    List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
    List<Integer> listWithoutDuplicates = listWithDuplicates.stream()
     .distinct()
     .collect(Collectors.toList());

    assertThat(listWithoutDuplicates, hasSize(5));
    assertThat(listWithoutDuplicates, containsInRelativeOrder(5, 0, 3, 1, 2));
}
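Because distinct() compares elements with equals(), it only works as expected on our own types if they override equals() and hashCode() consistently. The sketch below illustrates this with a hypothetical Person class; the class, its fields, and the distinctPeople helper are assumptions made for this example and are not part of the original project.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical example class; all names are illustrative only.
class DistinctSketch {

    // Simple value class: two Person objects are equal if name and age match.
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = Objects.requireNonNull(name);
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return age == other.age && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, age);
        }

        @Override
        public String toString() {
            return name + "(" + age + ")";
        }
    }

    // For an ordered stream, distinct() keeps the first occurrence of each element.
    static List<Person> distinctPeople(List<Person> people) {
        return people.stream()
          .distinct()
          .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
          new Person("Ann", 30), new Person("Bob", 25), new Person("Ann", 30));

        System.out.println(distinctPeople(people));
        // prints [Ann(30), Bob(25)]
    }
}
```

Without the equals() and hashCode() overrides, the two "Ann" objects would be treated as different elements and both would remain in the result.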

And there we have it – three quick ways to clean up all the duplicate items from a List.

4. Conclusion

In this article, we’ve demonstrated how easy it is to remove duplicates from a list using plain Java, Google Guava, and Java 8.

The implementation of all of these examples and snippets can be found in the GitHub project. This is a Maven-based project so it should be easy to import and run.
