DistinctBy in the Java Stream API

1. Overview

Finding the distinct elements of a list is a common programming task. Since Java 8, the Stream API has given us a functional way to process data.

In this article, we'll look at several ways to filter a collection so that it keeps only one element per value of a particular attribute.

2. Using the Stream API

The Stream API provides the distinct() method, which returns a stream containing only distinct elements, as determined by each element's equals() method.

However, distinct() can't help when we want uniqueness by a specific attribute rather than by whole-object equality. One alternative is to write a filter that maintains state.
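As a minimal sketch of this behavior, the following example applies distinct() to a list of Strings, whose equals() compares content. The class name DistinctDemo and the helper method distinct are hypothetical names chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctDemo {

    // Returns the distinct elements of the list, compared with equals().
    // For an ordered stream, the first occurrence of each element is kept.
    static <T> List<T> distinct(List<T> items) {
        return items.stream()
          .distinct()
          .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "b");
        System.out.println(distinct(words)); // prints [a, b, c]
    }
}
```

For our own classes, the result depends on how equals() and hashCode() are implemented. A class that doesn't override them is compared by identity, so two separate instances with the same field values both survive distinct().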

2.1. Using a Stateful Filter

One possible solution is to implement a stateful Predicate that remembers the keys it has already seen:

public static <T> Predicate<T> distinctByKey(
    Function<? super T, ?> keyExtractor) {
  
    Map<Object, Boolean> seen = new ConcurrentHashMap<>(); 
    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null; 
}

To test it, we’ll use the following Person class, which has the attributes age, email, and name:

public class Person { 
    private int age; 
    private String name; 
    private String email; 
    // standard getters and setters 
}

And to get a new filtered collection by name, we can use:

List<Person> personListFiltered = personList.stream() 
  .filter(distinctByKey(p -> p.getName())) 
  .collect(Collectors.toList());
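To see the pieces working together, here is a self-contained sketch. It assumes a simplified Person with a constructor and read-only getters (the constructor isn't part of the class shown above), and the class name DistinctByKeyDemo, the method distinctEmailsByName, and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByKeyDemo {

    // Simplified stand-in for Person; the constructor is an assumption
    // added to keep the example short.
    static class Person {
        private final int age;
        private final String name;
        private final String email;

        Person(int age, String name, String email) {
            this.age = age;
            this.name = name;
            this.email = email;
        }

        int getAge() { return age; }
        String getName() { return name; }
        String getEmail() { return email; }
    }

    // Same stateful predicate as above: true only the first time a key is seen.
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    // Keeps one person per name and returns their emails, so we can see
    // which of the duplicates was kept.
    static List<String> distinctEmailsByName(List<Person> people) {
        return people.stream()
          .filter(distinctByKey(Person::getName))
          .map(Person::getEmail)
          .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> personList = Arrays.asList(
          new Person(30, "Ann", "ann@a.com"),
          new Person(25, "Bob", "bob@b.com"),
          new Person(41, "Ann", "ann@c.com"));

        System.out.println(distinctEmailsByName(personList)); // prints [ann@a.com, bob@b.com]
    }
}
```

With a sequential stream over a List, the first person with each name is kept. Two caveats follow from the implementation: ConcurrentHashMap rejects null keys, so a null name causes a NullPointerException, and with a parallel stream there is no guarantee about which of the duplicates survives.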

3. Using Eclipse Collections

Eclipse Collections is a library that provides additional methods for processing Streams and collections in Java.

3.1. Using ListIterate.distinct()

The ListIterate.distinct() method allows us to filter a List using various HashingStrategies. These strategies can be defined using lambda expressions or method references.

If we want to filter by the Person’s name:

List<Person> personListFiltered = ListIterate
  .distinct(personList, HashingStrategies.fromFunction(Person::getName));

Or, if the attribute we are going to use is primitive (int, long, double), we can use a specialized function like this:

List<Person> personListFiltered = ListIterate.distinct(
  personList, HashingStrategies.fromIntFunction(Person::getAge));

3.2. Maven Dependency

We need to add the following dependency to our pom.xml to use Eclipse Collections in our project:

<dependency> 
    <groupId>org.eclipse.collections</groupId> 
    <artifactId>eclipse-collections</artifactId> 
    <version>8.2.0</version> 
</dependency>

You can find the latest version of the Eclipse Collections library in the Maven Central repository.

To learn more about this library we can go to this article.

4. Using Vavr (Javaslang)

Vavr (formerly Javaslang) is a functional library for Java 8 that provides immutable data types and functional control structures.

4.1. Using List.distinctBy

To filter lists, the library provides its own List class, which has a distinctBy() method that lets us filter by an attribute of the objects it contains:

List<Person> personListFiltered = List.ofAll(personList)
  .distinctBy(Person::getName)
  .toJavaList();

4.2. Maven Dependency

We'll add the following dependency to our pom.xml to use Vavr in our project:

<dependency> 
    <groupId>io.vavr</groupId> 
    <artifactId>vavr</artifactId> 
    <version>0.9.0</version>  
</dependency>

You can find the latest version of the Vavr library in the Maven Central repository.

To learn more about this library we can go to this article.

5. Using StreamEx

StreamEx is a library that provides useful classes and methods for processing Java 8 streams.

5.1. Using StreamEx.distinct

One of the classes it provides is StreamEx, whose distinct() method accepts a function that extracts the attribute we want to use for uniqueness:

List<Person> personListFiltered = StreamEx.of(personList)
  .distinct(Person::getName)
  .toList();

5.2. Maven Dependency

We'll add the following dependency to our pom.xml to use StreamEx in our project:

<dependency> 
    <groupId>one.util</groupId> 
    <artifactId>streamex</artifactId> 
    <version>0.6.5</version> 
</dependency>

You can find the latest version of the StreamEx library in the Maven Central repository.

6. Conclusion

In this quick tutorial, we explored how to get the distinct elements of a Stream based on an attribute, first with the standard Java 8 API and then with alternatives from other libraries.

As always, the complete code is available over on GitHub.
