Introduction to PCollections

1. Overview

In this article, we will be looking at PCollections, a Java library providing persistent, immutable collections.

Persistent data structures (collections) cannot be modified in place: an update operation returns a new object that contains the result instead. They are not only immutable but also persistent, which means that after a modification is performed, the previous versions of the collection remain unchanged and still usable.
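To make the idea concrete, here is a minimal sketch of a persistent stack written with the JDK only. It is not how PCollections is implemented; the class PersistentStackSketch and its methods are hypothetical names used purely for illustration. Every "update" allocates one new node and reuses the old stack as its tail, so the earlier version stays valid:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; NOT part of PCollections.
final class PersistentStackSketch<E> {
    private static final PersistentStackSketch<?> EMPTY =
        new PersistentStackSketch<>(null, null, 0);

    private final E head;
    private final PersistentStackSketch<E> tail;
    private final int size;

    private PersistentStackSketch(E head, PersistentStackSketch<E> tail, int size) {
        this.head = head;
        this.tail = tail;
        this.size = size;
    }

    @SuppressWarnings("unchecked")
    static <E> PersistentStackSketch<E> empty() {
        return (PersistentStackSketch<E>) EMPTY;
    }

    // Returns a NEW stack with e on top; this stack is left untouched
    // and is shared as the tail of the new one.
    PersistentStackSketch<E> plus(E e) {
        return new PersistentStackSketch<>(e, this, size + 1);
    }

    int size() {
        return size;
    }

    // Copies the elements, top first, into a regular list for inspection.
    List<E> toList() {
        List<E> out = new ArrayList<>();
        for (PersistentStackSketch<E> s = this; s.size > 0; s = s.tail) {
            out.add(s.head);
        }
        return out;
    }

    public static void main(String[] args) {
        PersistentStackSketch<String> v1 = PersistentStackSketch.<String>empty().plus("a");
        PersistentStackSketch<String> v2 = v1.plus("b");
        System.out.println(v1.toList()); // [a]    (old version is still intact)
        System.out.println(v2.toList()); // [b, a]
    }
}
```

Because nodes are never mutated, sharing the old version between the two stacks is safe.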

PCollections is analogous to and compatible with the Java Collections framework.

2. Dependencies

Let’s add the following dependency to our pom.xml for us to use PCollections in our project:

<dependency>
    <groupId>org.pcollections</groupId>
    <artifactId>pcollections</artifactId>
    <version>2.1.2</version>
</dependency>

If our project is Gradle based, we can add the same artifact to our build.gradle file:

implementation 'org.pcollections:pcollections:2.1.2'

The latest version can be found on Maven Central.

3. Map Structure (HashPMap)

HashPMap is a persistent map data structure. It is the analog of java.util.HashMap and is used for storing non-null key-value data.

We can instantiate HashPMap by using convenient static methods in HashTreePMap. These static methods return a HashPMap instance that is backed by an IntTreePMap.

The static empty() method of the HashTreePMap class creates an empty HashPMap that has no elements – just like using the default constructor of java.util.HashMap:

HashPMap<String, String> pmap = HashTreePMap.empty();

There are two other static methods that we can use to create HashPMap. The singleton() method creates a HashPMap with only one entry:

HashPMap<String, String> pmap1 = HashTreePMap.singleton("key1", "value1");
assertEquals(1, pmap1.size());

The from() method creates a HashPMap from an existing java.util.HashMap instance (and other java.util.Map implementations):

Map<String, String> map = new HashMap<>();
map.put("mkey1", "mval1");
map.put("mkey2", "mval2");

HashPMap<String, String> pmap2 = HashTreePMap.from(map);
assertEquals(2, pmap2.size());

Although HashPMap inherits some of the methods from java.util.AbstractMap and java.util.Map, it has methods that are unique to it.

The minus() method removes a single entry from the map while the minusAll() method removes multiple entries. There’s also the plus() and plusAll() methods that add single and multiple entries respectively:

HashPMap<String, String> pmap = HashTreePMap.empty();
HashPMap<String, String> pmap0 = pmap.plus("key1", "value1");

Map<String, String> map = new HashMap<>();
map.put("key2", "val2");
map.put("key3", "val3");
HashPMap<String, String> pmap1 = pmap0.plusAll(map);

HashPMap<String, String> pmap2 = pmap1.minus("key1");

HashPMap<String, String> pmap3 = pmap2.minusAll(map.keySet());

assertEquals(1, pmap0.size());
assertEquals(3, pmap1.size());
assertFalse(pmap2.containsKey("key1"));
assertEquals(0, pmap3.size());

It’s important to note that calling put() on pmap will throw an UnsupportedOperationException. Since PCollections objects are persistent and immutable, every modifying operation returns a new instance of an object (HashPMap).
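The difference between the inherited put() and the persistent plus() can be sketched as follows. This assumes PCollections 2.1.2 on the classpath; the class PutVersusPlusSketch and its helper methods are hypothetical names introduced only for this illustration:

```java
import org.pcollections.HashTreePMap;
import org.pcollections.PMap;

// Hypothetical illustration class, assuming PCollections 2.1.2.
class PutVersusPlusSketch {

    // Returns true if the inherited mutator put() is rejected.
    static boolean putIsRejected() {
        PMap<String, String> map =
            HashTreePMap.<String, String>empty().plus("key1", "value1");
        try {
            map.put("key2", "value2"); // java.util.Map mutator, not supported
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    // Returns {size of the original map, size of the map returned by plus()}.
    static int[] sizesBeforeAndAfterPlus() {
        PMap<String, String> base =
            HashTreePMap.<String, String>empty().plus("key1", "value1");
        PMap<String, String> extended = base.plus("key2", "value2");
        return new int[] { base.size(), extended.size() };
    }

    public static void main(String[] args) {
        System.out.println(putIsRejected());
        int[] sizes = sizesBeforeAndAfterPlus();
        System.out.println(sizes[0] + " -> " + sizes[1]);
    }
}
```

The variables are declared with the PMap interface, so the sketch does not depend on the exact return types of HashPMap.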

Let’s move on to look at other data structures.

4. List Structure (TreePVector and ConsPStack)

TreePVector is a persistent analog of java.util.ArrayList, while ConsPStack is the analog of java.util.LinkedList. TreePVector and ConsPStack have convenient static methods for creating new instances, just like HashPMap.

The empty() method creates an empty TreePVector, while the singleton() method creates a TreePVector with only one element. There’s also the from() method that can be used to create an instance of TreePVector from any java.util.Collection.

ConsPStack has static methods with the same name that achieve the same goal.

TreePVector provides the minus() and minusAll() methods for removing elements, and the plus() and plusAll() methods for adding them.

The with() method replaces the element at a specified index, and the subList() method returns a range of elements from the collection.

These methods are available in ConsPStack as well.

Let’s consider the following code snippet that exemplifies the methods mentioned above:

TreePVector<String> pVector = TreePVector.empty();

TreePVector<String> pV1 = pVector.plus("e1");
TreePVector<String> pV2 = pV1.plusAll(Arrays.asList("e2", "e3", "e4"));
assertEquals(1, pV1.size());
assertEquals(4, pV2.size());

TreePVector<String> pV3 = pV2.minus("e1");
TreePVector<String> pV4 = pV3.minusAll(Arrays.asList("e2", "e3", "e4"));
assertEquals(3, pV3.size());
assertEquals(0, pV4.size());

TreePVector<String> pSub = pV2.subList(0, 2);
assertTrue(pSub.contains("e1") && pSub.contains("e2"));

TreePVector<String> pVW = (TreePVector<String>) pV2.with(0, "e10");
assertEquals("e10", pVW.get(0));

In the code snippet above, pSub is another TreePVector object and is independent of pV2. As can be observed, pV2 was not changed by the subList() operation; rather, a new TreePVector object was returned that contains the elements of pV2 from index 0 (inclusive) to index 2 (exclusive).

This is what is meant by immutability, and it is how all modifying methods of PCollections behave.
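The same operations can be tried on ConsPStack. The sketch below assumes PCollections 2.1.2; the class ConsPStackSketch and its helper methods are hypothetical names used only for illustration. One difference worth noting is that, according to the PStack Javadoc, plus() prepends the element, so it ends up at index 0:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.pcollections.ConsPStack;
import org.pcollections.PStack;

// Hypothetical illustration class, assuming PCollections 2.1.2.
class ConsPStackSketch {

    // Builds a stack from [e1, e2, e3] and pushes e0 onto the front.
    static List<String> pushFront() {
        PStack<String> stack = ConsPStack.from(Arrays.asList("e1", "e2", "e3"));
        PStack<String> pushed = stack.plus("e0"); // a new stack; 'stack' is unchanged
        return new ArrayList<>(pushed);
    }

    // Replaces the element at index 1, then takes the half-open range [0, 2).
    static List<String> replaceAndSlice() {
        PStack<String> stack = ConsPStack.from(Arrays.asList("e1", "e2", "e3"));
        PStack<String> replaced = stack.with(1, "x");
        return new ArrayList<>(replaced.subList(0, 2));
    }

    public static void main(String[] args) {
        System.out.println(pushFront());       // [e0, e1, e2, e3]
        System.out.println(replaceAndSlice()); // [e1, x]
    }
}
```

The variables are typed as PStack, the interface that ConsPStack implements, which keeps the sketch independent of the concrete return types.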

5. Set Structure (MapPSet)

MapPSet is a persistent, map-backed analog of java.util.HashSet. It can be conveniently instantiated by the static methods of HashTreePSet: empty(), from() and singleton(). They function in the same way as explained in previous examples.

MapPSet has plus(), plusAll(), minus() and minusAll() methods for manipulating set data. Furthermore, it inherits methods from java.util.Set, java.util.AbstractCollection and java.util.AbstractSet:

MapPSet<String> pSet = HashTreePSet.<String>empty()
  .plusAll(Arrays.asList("e1", "e2", "e3", "e4"));
assertEquals(4, pSet.size());

MapPSet<String> pSet1 = pSet.minus("e4");
assertFalse(pSet1.contains("e4"));

Finally, there’s also OrderedPSet – which maintains the insertion order of elements just like java.util.LinkedHashSet.
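A short sketch of the ordering behavior, assuming PCollections 2.1.2; the class OrderedPSetSketch and its helper methods are hypothetical names used only for illustration:

```java
import java.util.ArrayList;
import java.util.List;

import org.pcollections.OrderedPSet;
import org.pcollections.PSet;

// Hypothetical illustration class, assuming PCollections 2.1.2.
class OrderedPSetSketch {

    // Elements are added in the order c, a, b and iterated in that same order.
    static List<String> insertionOrder() {
        PSet<String> set = OrderedPSet.<String>empty().plus("c").plus("a").plus("b");
        return new ArrayList<>(set);
    }

    // Removing an element keeps the relative order of the remaining ones.
    static List<String> afterRemoval() {
        PSet<String> set = OrderedPSet.<String>empty().plus("c").plus("a").plus("b");
        return new ArrayList<>(set.minus("a"));
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder()); // [c, a, b]
        System.out.println(afterRemoval());   // [c, b]
    }
}
```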

6. Conclusion

In this quick tutorial, we explored PCollections, a library of persistent data structures that are analogous to the core collections available in Java. Of course, the PCollections Javadoc provides more insight into the intricacies of the library.

And, as always, the complete code can be found over on GitHub.
