Table of Contents
1. Overview
In this tutorial I will illustrate how to split a List into several sublists of a given size.
For a relatively simple operation, there is surprisingly no support in the standard Java collection APIs. Luckily, both Guava and the Apache Commons Collections have implemented the operation in a similar way.
2. Use Guava to Partition the List
Guava facilitates partitioning the List into sublists of a specified size – via the Lists.partition operation:
@Test public void givenList_whenParitioningIntoNSublists_thenCorrect() { List<Integer> intList = Lists.newArrayList(1, 2, 3, 4, 5, 6, 7, 8); List<List<Integer>> subSets = Lists.partition(intList, 3); List<Integer> lastPartition = subSets.get(2); List<Integer> expectedLastPartition = Lists.<Integer> newArrayList(7, 8); assertThat(subSets.size(), equalTo(3)); assertThat(lastPartition, equalTo(expectedLastPartition)); }
3. Use Guava to Partition a Collection
Partitioning a Collection is also possible with Guava:
@Test public void givenCollection_whenParitioningIntoNSublists_thenCorrect() { Collection<Integer> intCollection = Lists.newArrayList(1, 2, 3, 4, 5, 6, 7, 8); Iterable<List<Integer>> subSets = Iterables.partition(intCollection, 3); List<Integer> firstPartition = subSets.iterator().next(); List<Integer> expectedLastPartition = Lists.<Integer> newArrayList(1, 2, 3); assertThat(firstPartition, equalTo(expectedLastPartition)); }
Keep in mind that the partitions are sublist views of the original collection – which means that changes in the original collection will be reflected in the partitions:
@Test public void givenListPartitioned_whenOriginalListIsModified_thenPartitionsChangeAsWell() { // Given List<Integer> intList = Lists.newArrayList(1, 2, 3, 4, 5, 6, 7, 8); List<List<Integer>> subSets = Lists.partition(intList, 3); // When intList.add(9); // Then List<Integer> lastPartition = subSets.get(2); List<Integer> expectedLastPartition = Lists.<Integer> newArrayList(7, 8, 9); assertThat(lastPartition, equalTo(expectedLastPartition)); }
4. Use Apache Commons Collections to Partition the List
The latest releases of Apache Commons Collections have recently added support for partitioning a List as well:
@Test public void givenList_whenParitioningIntoNSublists_thenCorrect() { List<Integer> intList = Lists.newArrayList(1, 2, 3, 4, 5, 6, 7, 8); List<List<Integer>> subSets = ListUtils.partition(intList, 3); List<Integer> lastPartition = subSets.get(2); List<Integer> expectedLastPartition = Lists.<Integer> newArrayList(7, 8); assertThat(subSets.size(), equalTo(3)); assertThat(lastPartition, equalTo(expectedLastPartition)); }
There is no corresponding option to partition a raw Collection – similar to the Guava Iterables.partition in Commons Collections.
Finally, the same caveat applies here as well – the resulting partition are views of the original List.
5. Use Java8 to Partition the List
Now, let’s see how to use Java8 to partition our List.
5.1. Collectors partitioningBy
We can use Collectors.partitioningBy() to split the list into 2 sublists – as follows:
@Test public void givenList_whenParitioningIntoSublistsUsingPartitionBy_thenCorrect() { List<Integer> intList = Lists.newArrayList(1, 2, 3, 4, 5, 6, 7, 8); Map<Boolean, List<Integer>> groups = intList.stream().collect(Collectors.partitioningBy(s -> s > 6)); List<List<Integer>> subSets = new ArrayList<List<Integer>>(groups.values()); List<Integer> lastPartition = subSets.get(1); List<Integer> expectedLastPartition = Lists.<Integer> newArrayList(7, 8); assertThat(subSets.size(), equalTo(2)); assertThat(lastPartition, equalTo(expectedLastPartition)); }
Note: The resulting partitions are not a view of the main List, so any changes happen to the main List won’t affect the partitions.
5.2. Collectors groupingBy
We also can use Collectors.groupingBy() to split our list to multiple partitions:
@Test public final void givenList_whenParitioningIntoNSublistsUsingGroupingBy_thenCorrect() { List<Integer> intList = Lists.newArrayList(1, 2, 3, 4, 5, 6, 7, 8); Map<Integer, List<Integer>> groups = intList.stream().collect(Collectors.groupingBy(s -> (s - 1) / 3)); List<List<Integer>> subSets = new ArrayList<List<Integer>>(groups.values()); List<Integer> lastPartition = subSets.get(2); List<Integer> expectedLastPartition = Lists.<Integer> newArrayList(7, 8); assertThat(subSets.size(), equalTo(3)); assertThat(lastPartition, equalTo(expectedLastPartition)); }
Note: Just as Collectors.partitioningBy() – the resulting partitions won’t be affected by changes in main List.
5.3. Split the List by Separator
We can also use Java8 to split our List by separator:
@Test public void givenList_whenSplittingBySeparator_thenCorrect() { List<Integer> intList = Lists.newArrayList(1, 2, 3, 0, 4, 5, 6, 0, 7, 8); int[] indexes = Stream.of(IntStream.of(-1), IntStream.range(0, intList.size()) .filter(i -> intList.get(i) == 0), IntStream.of(intList.size())) .flatMapToInt(s -> s).toArray(); List<List<Integer>> subSets = IntStream.range(0, indexes.length - 1) .mapToObj(i -> intList.subList(indexes[i] + 1, indexes[i + 1])) .collect(Collectors.toList()); List<Integer> lastPartition = subSets.get(2); List<Integer> expectedLastPartition = Lists.<Integer> newArrayList(7, 8); assertThat(subSets.size(), equalTo(3)); assertThat(lastPartition, equalTo(expectedLastPartition)); }
Note: We used “0” as separator – we first obtained the indices of all “0” elements in the List, then we split the List on these indices.
6. Conclusion
The solutions presented here makes use of additional libraries – Guava or the Apache Commons Collections library. Both of these are very lightweight and extremely useful overall, so it makes perfect sense to have one of them on the classpath; however, if that is not an option – a Java only solution is shown here.
package com.maixuanviet.algorithms.partitioncollection; import java.util.AbstractList; import java.util.List; public class MyPartition { /** * Returns consecutive {@linkplain List#subList(int, int) sublists} of a list, * each of the same size (the final list may be smaller). For example, * partitioning a list containing {@code [a, b, c, d, e]} with a partition * size of 3 yields {@code [[a, b, c], [d, e]]} -- an outer list containing * two inner lists of three and two elements, all in the original order. * * <p>The outer list is unmodifiable, but reflects the latest state of the * source list. The inner lists are sublist views of the original list, * produced on demand using {@link List#subList(int, int)}, and are subject * to all the usual caveats about modification as explained in that API. * * * Adapted from http://code.google.com/p/google-collections/ * * @param list the list to return consecutive sublists of * @param size the desired size of each sublist (the last may be * smaller) * @return a list of consecutive sublists * @throws IllegalArgumentException if {@code partitionSize} is nonpositive * */ public static <T> List<List<T>> partition(List<T> list, int size) { if (list == null) throw new NullPointerException( "'list' must not be null"); if (!(size > 0)) throw new IllegalArgumentException( "'size' must be greater than 0"); return new Partition<T>(list, size); } private static class Partition<T> extends AbstractList<List<T>> { final List<T> list; final int size; Partition(List<T> list, int size) { this.list = list; this.size = size; } @Override public List<T> get(int index) { int listSize = size(); if (listSize < 0) throw new IllegalArgumentException("negative size: " + listSize); if (index < 0) throw new IndexOutOfBoundsException( "index " + index + " must not be negative"); if (index >= listSize) throw new IndexOutOfBoundsException( "index " + index + " must be less than size " + listSize); int start = index * size; int end = Math.min(start + size, list.size()); return list.subList(start, end); } @Override public int size() { return (list.size() + size - 1) / size; } @Override public boolean isEmpty() { return list.isEmpty(); } } }
package com.maixuanviet.algorithms.partitioncollection; import java.util.ArrayList; import java.util.List; import org.junit.Test; import static org.junit.Assert.assertTrue; public class MyPartitionTest { @Test public void partitiontest1() { List<String> list = new ArrayList<String>(); list.add("one"); list.add("two"); list.add("three"); list.add("four"); list.add("five"); list.add("six"); list.add("seven"); list.add("eight"); list.add("nine"); list.add("ten"); list.add("eleven"); List<List<String>> partition = MyPartition.partition(list, 1); System.out.println(partition.get(2).size()); assertTrue(partition.size()==11); assertTrue(partition.get(0).size()==1); partition = MyPartition.partition(list, 7); assertTrue(partition.size()==2); assertTrue(partition.get(0).size()==7); assertTrue(partition.get(1).size()==4); // now let assume you want to have x number of buckets // How many elements must placed in a list? // Take x as 3 int buckets = 3; int divide = list.size() / buckets; if (list.size() % buckets !=0){ divide ++; } System.out.println("Max. number of element in each bucket " + divide); partition = MyPartition.partition(list, divide ); assertTrue(partition.size()==buckets); } }
The implementation of all these examples and code snippets can be found over on GitHub– this is a Maven-based project, so it should be easy to import and run as it is.