Java 8 Grouping with Collectors | groupingBy method tutorial with examples
Introduction
This Java 8 Grouping with Collectors tutorial explains how to use the predefined Collector returned by the groupingBy() method of the java.util.stream.Collectors class, with examples.
The tutorial begins by explaining how grouping of stream elements works using a grouping collector. The concept of grouping is visually illustrated with a diagram. Next, the three overloaded groupingBy() methods in the Collectors class are explained using their method definitions, Java code examples showing the 3 methods in action, and explanations of the code examples. Lastly, a brief overview of the concurrent versions of the three groupingBy() methods is provided.
(Note - This tutorial assumes that its readers are familiar with the basics of Java 8 Collectors.)
Understanding the concept of 'grouping' using Collectors
Given a stream of objects, there are scenarios where these objects need to be grouped based on a certain distinguishing characteristic they possess. This concept of grouping is the same as the 'group by' clause in SQL, which takes an attribute, or a calculated value derived from attribute(s), to divide the retrieved records into distinct groups. Generally, in the imperative style of programming, such grouping of records (objects in OOP) involves iterating over each object, checking which group the object being examined falls in, and then adding that object to its correct group. The group itself is held together using a Collection instance. Java 8's functional features allow us to do the same grouping of objects in a declarative way, which is typical of the functional rather than the imperative style of programming, using Java 8's grouping collector.
Grouping collectors use a classification function, which is an instance of the Function<T,R> functional interface, which for every object of type T in a stream returns a classifier object of type R. The various values of R, finite in number, are the 'group names' or 'group keys'. As the grouping collector works on the stream of objects it is collecting from, it creates collections of stream objects corresponding to each of the 'group keys'. That is, for every value of R there is a collection of objects, all of which return that value of R when subjected to the classification function.
All these R-values and the corresponding collections of stream objects are stored by the grouping collector in a Map<R, Collection<T>>, i.e. each ‘key,value’ entry in the map consists of ‘R,Collection<T>’.
The process of grouping, starting from the application of the classification function on the stream elements, till the creation of the Map containing the grouped elements, is as shown in the diagram below -
In the above diagram, the elements of Stream<T> are grouped using a classification function returning 4 values of R - r1, r2, r3, r4. The grouped elements are stored in a Map<R,Collection<T>>, with the 4 values of R being used as 4 keys pointing to 4 corresponding collections stored in the Map. These Collection instances hold the individual grouped elements, which is the required output from the grouping collector.
Having understood the concept of grouping with collectors, let us now see how to implement grouping collectors in code using the 3 overloaded groupingBy() method variants provided in the Collectors class, starting from the simplest variant which creates a List of the grouped elements.
Variant #1 of Collectors.groupingBy() method - stores grouped elements in a List
The simplest of the Collectors.groupingBy() method variants is defined with the following signature -
public static <T, K> Collector<T, ?, Map<K, List<T>>> groupingBy(Function<? super T, ? extends K> classifier)
Where,
- input is classifier, which is an instance of the Function functional interface which converts from type T to type K.
- output is a Collector with finisher (return type) as a Map with entries having ‘key,value’ pairs as ‘K, List<T>’
The simplest variant of the groupingBy() method applies the classifier Function<T,K> to each individual element of type T collected from Stream<T>. It then groups elements into individual lists based on the value of K they return on application of the classifier function, and stores them in a Map<K,List<T>>, using the process described in the previous section on how a grouping collector operates (K here plays the role of R in that section).
Variant #1 of grouping collector - Java Example
Let's say we have a stream of Employee objects, belonging to a company, which need to be grouped by their departments, with the Department present as an attribute in the Employee object. As the end result of applying the grouping collector we want a Map with departments as keys and, as the corresponding values, Lists of the employees in each department. Diagrammatically, such an implementation would be represented as shown below -
In the above diagram, employees are grouped into 4 departments - HR, OPERATIONS, LEGAL and MARKETING. Let us now see the Java code for implementing the above 'Department - Employees' use case, followed by its explanation.
Java 8 code example for Variant #1 of Collectors.groupingBy()
//GroupingWithCollectors.java
package com.javabrahman.java8.collector;

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingWithCollectors {

    static List<Employee> employeeList = Arrays.asList(
        new Employee("Tom Jones", 45, 12000.00, Department.MARKETING),
        new Employee("Harry Major", 26, 20000.00, Department.LEGAL),
        new Employee("Ethan Hardy", 65, 30000.00, Department.LEGAL),
        new Employee("Nancy Smith", 22, 15000.00, Department.MARKETING),
        new Employee("Catherine Jones", 21, 18000.00, Department.HR),
        new Employee("James Elliot", 58, 24000.00, Department.OPERATIONS),
        new Employee("Frank Anthony", 55, 32000.00, Department.MARKETING),
        new Employee("Michael Reeves", 40, 45000.00, Department.OPERATIONS));

    public static void main(String[] args) {
        // Group the employees by department; each group is collected into a List
        Map<Department, List<Employee>> employeeMap
            = employeeList.stream().collect(Collectors.groupingBy(Employee::getDepartment));
        System.out.println("Employees grouped by department");
        employeeMap.forEach((Department key, List<Employee> empList) -> System.out.println(key + " -> " + empList));
    }
}

//Employee.java - POJO Class
package com.javabrahman.java8.collector;

public class Employee {
    private String name;
    private Integer age;
    private Double salary;
    private Department department;

    public Employee(String name, Integer age, Double salary, Department department) {
        this.name = name;
        this.age = age;
        this.salary = salary;
        this.department = department;
    }

    // Getter used as the classification function in the example
    public Department getDepartment() {
        return department;
    }

    // Remaining Setters/Getters for name,age,salary,department go here

    public String toString() {
        return "Employee Name:" + this.name;
    }

    //Standard equals and hashcode implementations go here
}

//Enum Department.java
package com.javabrahman.java8.collector;

public enum Department {
    HR, OPERATIONS, LEGAL, MARKETING
}
OUTPUT of the above code
Employees grouped by department
HR -> [Employee Name:Catherine Jones]
LEGAL -> [Employee Name:Harry Major, Employee Name:Ethan Hardy]
OPERATIONS -> [Employee Name:James Elliot, Employee Name:Michael Reeves]
MARKETING -> [Employee Name:Tom Jones, Employee Name:Nancy Smith, Employee Name:Frank Anthony]
Explanation of the code
- Employee is the POJO class in the above example of which we create a Stream. It has four attributes - name, age, department and salary.
- Department is an Enum with the following values - HR, OPERATIONS, LEGAL, MARKETING.
- employeeList is a static list of 8 Employees.
- In the main() method of the GroupingWithCollectors class we create a Stream of Employees using the stream() method of the List interface.
- On the stream of Employees we call the collect() method with the predefined Collector returned by the Collectors.groupingBy() method as the parameter.
- The classification function passed to the groupingBy() method is the method reference to the Employee.getDepartment() method, specified as "Employee::getDepartment".
- Lastly, the Map of employees grouped by department is printed using the Map.forEach() method. The output is as expected - the map contains ‘key,value’ entries in the form of ‘Department, List<Employee>’, with each entry having a Department as key and the List of Employees of that Department stored as value. Note that groupingBy() makes no guarantee about the type or ordering of the returned Map, so the departments may be printed in a different order than shown above.
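For comparison with the imperative approach described in the concept section, the following is a minimal sketch that groups the same elements both ways. The class name, the method names and the sample words are assumptions made for illustration; they are not part of the tutorial's Employee example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical class for illustration only
class ImperativeVsDeclarativeGrouping {

    // Imperative style: iterate, work out each element's group key,
    // then add the element to the list held for that key.
    static Map<Integer, List<String>> groupByLengthImperative(List<String> words) {
        Map<Integer, List<String>> groups = new HashMap<>();
        for (String word : words) {
            groups.computeIfAbsent(word.length(), key -> new ArrayList<>()).add(word);
        }
        return groups;
    }

    // Declarative style: only the classification function is spelled out.
    static Map<Integer, List<String>> groupByLengthDeclarative(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("HR", "LEGAL", "IT", "SALES", "OPERATIONS");
        System.out.println(groupByLengthImperative(words));
        System.out.println(groupByLengthDeclarative(words));
    }
}
```

Both methods should produce maps that are equal to each other; the grouping collector simply hides the loop and the bookkeeping of the per-key lists.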
Variant #2 of Collectors.groupingBy() method - uses a downstream Collector to collect grouped elements
Whereas the 1st variant always returns a List containing the elements of a group, the 2nd variant of grouping collector provides the flexibility to specify how the grouped elements need to be collected, using a second parameter which is a Collector. So, instead of just storing the groups in the resultant Map as Lists, we can store them in, say, Sets, or find the maximum value in each group and store it rather than storing all the elements of a group, or apply any other collector operation which is applicable to the stream elements.
The 2nd variant of grouping collector is defined with the following signature -
public static <T, K, A, D> Collector<T, ?, Map<K, D>> groupingBy(Function<? super T, ? extends K> classifier,
                                                                 Collector<? super T, A, D> downstream)
Where,
- 1st input parameter is classifier, which is an instance of the Function functional interface which converts from type T to type K.
- 2nd input parameter is the downstream collector which collects the grouped elements into type D, where D is the result type produced by the downstream collector's finisher.
- output is a Collector with finisher (return type) as a Map with entries having ‘key,value’ pairs as ‘K, D’
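To make the type parameters in the signature above concrete, here is a small sketch that assigns the returned collector to an explicitly typed variable. The class name, method name and sample names are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical class for illustration only
class GroupingTypeParameters {

    // T = String (stream element), K = Integer (group key), D = Long (per-group result).
    // The downstream collector's intermediate type A stays hidden behind the wildcard.
    static Collector<String, ?, Map<Integer, Long>> countByLength() {
        return Collectors.groupingBy(String::length, Collectors.counting());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Tom", "Harry", "Ethan", "Nancy", "Amy");
        Map<Integer, Long> counts = names.stream().collect(countByLength());
        System.out.println(counts);
    }
}
```

Here the downstream collector is Collectors.counting(), so each map value is the number of elements in the group rather than the elements themselves.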
How variant#1 and variant#2 of grouping collector are closely related
In the Collectors class' code, the first variant of grouping Collector, which accepts just the classification function as input, does not itself return the Collector which processes the Stream elements. Instead, internally it delegates the call forward to the second variant with the call - groupingBy(classifier, toList()). So, the first variant of grouping collector is just a convenient way of invoking the second variant with the downstream collector 'hardcoded' as toList().
Let us now see the 2nd variant of grouping collector in action with a Java code example.
Variant #2 of grouping collector - Java Example
This example for variant #2 uses the same use case of employees being grouped as per their department, but this time, instead of storing the grouped elements in a List, we will store them inside a Set in the resultant Map.
(Note - The Employee class and employeeList objects with their values remain the same as in the previous code example and hence are not shown below for brevity.)
Java 8 code example for VARIANT #2 of Collectors.groupingBy()
// Requires java.util.Set in addition to the imports of the previous example
public static void main(String[] args) {
    // Same grouping as before, but each group is collected into a Set
    Map<Department, Set<Employee>> employeeMap
        = employeeList.stream()
            .collect(Collectors.groupingBy(Employee::getDepartment, Collectors.toSet()));
    System.out.println("Employees grouped by department");
    employeeMap.forEach((Department key, Set<Employee> empSet) -> System.out.println(key + " -> " + empSet));
}
OUTPUT of the above code
Employees grouped by department
HR -> [Employee Name:Catherine Jones]
LEGAL -> [Employee Name:Harry Major, Employee Name:Ethan Hardy]
OPERATIONS -> [Employee Name:James Elliot, Employee Name:Michael Reeves]
MARKETING -> [Employee Name:Tom Jones, Employee Name:Nancy Smith, Employee Name:Frank Anthony]
Explanation of the code
- The code above is 'nearly' the same as the code for the 1st variant of grouping collector. The main difference is that the Collectors.groupingBy() method is now passed a second parameter - Collectors.toSet() - which tells the grouping collector to collect the grouped values in individual Sets.
- The output with employees grouped in Sets looks the same as the 1st variant's output, as individual set elements are enclosed between square brackets - '[]' - just like they were for Lists. But, if you look closely at the code, you will find that the employeeMap.forEach() method call now has a Set<Employee> specified as the type of value rather than a List, which was the case in the 1st variant. Since toSet() makes no guarantee about element order, the employees within a group may also be printed in a different order than shown above.
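The downstream collector is not limited to toSet(). As mentioned earlier, it can also keep just the maximum of each group, or transform the grouped elements before collecting them. The sketch below assumes a simplified Staff class (a hypothetical stand-in for the tutorial's Employee, with a String department) so that it is self-contained.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical class for illustration only
class DownstreamCollectorSketch {

    // Hypothetical stand-in for the tutorial's Employee class
    static class Staff {
        private final String name;
        private final String department;
        private final double salary;

        Staff(String name, String department, double salary) {
            this.name = name;
            this.department = department;
            this.salary = salary;
        }

        String getName() { return name; }
        String getDepartment() { return department; }
        double getSalary() { return salary; }
    }

    static final List<Staff> STAFF = Arrays.asList(
            new Staff("Tom Jones", "MARKETING", 12000.00),
            new Staff("Harry Major", "LEGAL", 20000.00),
            new Staff("Ethan Hardy", "LEGAL", 30000.00),
            new Staff("Frank Anthony", "MARKETING", 32000.00));

    // Keep only the highest-paid member of each group instead of the whole group.
    static Map<String, Optional<Staff>> topEarnerByDepartment() {
        return STAFF.stream().collect(Collectors.groupingBy(
                Staff::getDepartment,
                Collectors.maxBy(Comparator.comparingDouble(Staff::getSalary))));
    }

    // Map each grouped element to its name before collecting the group into a List.
    static Map<String, List<String>> namesByDepartment() {
        return STAFF.stream().collect(Collectors.groupingBy(
                Staff::getDepartment,
                Collectors.mapping(Staff::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        topEarnerByDepartment().forEach((dept, top) ->
                System.out.println(dept + " -> " + top.map(Staff::getName).orElse("none")));
        System.out.println(namesByDepartment());
    }
}
```

Note that maxBy() yields an Optional per group, because a collector must be able to describe an empty input even though groupingBy() never creates an empty group.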
Variant #3 of Collectors.groupingBy() method - uses a Map factory along with a downstream Collector
While the 1st variant always returns a List containing the elements of a group, and the 2nd variant provides the flexibility to specify how the grouped elements need to be collected, the 3rd variant adds the capability to specify how the Map which holds the result is created. So, using the 3rd variant of grouping Collector it can be specified whether the resultant Map containing the grouped values is a HashMap or a TreeMap, or some user specified type of Map.
The 3rd variant of grouping collector is defined with the following signature -
public static <T, K, D, A, M extends Map<K, D>> Collector<T, ?, M> groupingBy(Function<? super T, ? extends K> classifier, Supplier<M> mapFactory, Collector<? super T, A, D> downstream)
Where,
- 1st input parameter is classifier, which is an instance of the Function functional interface which converts from type T to type K.
- 2nd input parameter is Supplier<M>, which is a factory supplying Maps of type M.
- 3rd input parameter is the downstream collector which collects the grouped elements into type D, where D is the result type produced by the downstream collector's finisher.
- output is a Collector with finisher (return type) as a Map of type M with entries having ‘key,value’ pairs as ‘K, D’
How variant#2 and variant#3 of grouping collector are closely related
In the Collectors class' code, the second variant of grouping Collector, which accepts the classification function along with the downstream collector as input, does not itself return the collector which processes the stream elements. Instead, internally it delegates the call forward to the third variant with the call - groupingBy(classifier, HashMap::new, downstream). So, the second variant of grouping collector is just a convenient way of invoking the third variant with the Map factory Supplier 'hardcoded' as HashMap::new.
Going back a bit, we said something similar about the first and second groupingBy() variants as well. Thus, we actually have a transitive kind of relationship between the three variants. Variant #1 calls variant #2 with the downstream collector hardcoded, and variant #2 calls variant #3 with the Map Supplier factory hardcoded. Inferring transitively, we can now say that variant #1 actually calls variant #3 with both the downstream collector and the Map Supplier factory hardcoded.
Fortunately, the transitive delegation between variants ends at variant #3, which actually contains the entire collector logic for a grouping collector.
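The delegation chain described above can be checked with a small sketch: calling each variant with the 'hardcoded' arguments spelled out should give equal results. The class name, method names and sample words are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical class for illustration only
class GroupingDelegationSketch {

    static final List<String> WORDS = Arrays.asList("HR", "LEGAL", "IT", "SALES");

    // Variant #1: classifier only
    static Map<Integer, List<String>> viaVariant1() {
        return WORDS.stream().collect(Collectors.groupingBy(String::length));
    }

    // Variant #2: the downstream collector that variant #1 hardcodes, written out
    static Map<Integer, List<String>> viaVariant2() {
        return WORDS.stream().collect(
                Collectors.groupingBy(String::length, Collectors.toList()));
    }

    // Variant #3: the map factory that variant #2 hardcodes, written out as well
    static Map<Integer, List<String>> viaVariant3() {
        return WORDS.stream().collect(
                Collectors.groupingBy(String::length, HashMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println(viaVariant1().equals(viaVariant2()));
        System.out.println(viaVariant2().equals(viaVariant3()));
    }
}
```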
Let us now see a Java code example showing how to use the 3rd variant of grouping collector.
Variant #3 of grouping collector - Java Example
This example for variant #3 uses the same use case of employees being grouped as per their department. However, this time we will store the grouped elements in a Set and tell the grouping collector to store the grouped employees in a TreeMap instance instead of the default HashMap instance that was internally hardcoded in variant #2.
(Note - The Employee class and employeeList objects with their values remain the same as in the previous code example and hence are not shown below for brevity.)
Java 8 code example for VARIANT #3 of Collectors.groupingBy()
// Requires java.util.Set and java.util.TreeMap in addition to the imports of the first example
public static void main(String[] args) {
    // Groups are collected into Sets, and the result is held in a TreeMap
    Map<Department, Set<Employee>> employeeMap
        = employeeList.stream()
            .collect(Collectors.groupingBy(Employee::getDepartment, TreeMap::new, Collectors.toSet()));
    System.out.println("Employees grouped by department");
    employeeMap.forEach((Department key, Set<Employee> empSet) -> System.out.println(key + " -> " + empSet));
}
OUTPUT of the above code
Employees grouped by department
HR -> [Employee Name:Catherine Jones]
OPERATIONS -> [Employee Name:James Elliot, Employee Name:Michael Reeves]
LEGAL -> [Employee Name:Harry Major, Employee Name:Ethan Hardy]
MARKETING -> [Employee Name:Tom Jones, Employee Name:Nancy Smith, Employee Name:Frank Anthony]
Explanation of the code
- The code above is 'nearly' the same as the code for the 2nd variant of grouping collector. The main difference is that the Collectors.groupingBy() method is now passed the map factory as well - TreeMap::new - which tells the grouping collector to collect the grouped values in an instance of a TreeMap.
- The output with employees grouped in Sets corresponding to their departments is similar to what we saw in the Java examples for the 1st and 2nd variants. However, this time the order of the department names, which are the keys of the result Map, is guaranteed. A TreeMap sorts its entries based on the natural ordering of its keys, and the natural ordering of an Enum is the order in which its constants are declared. The keys therefore appear as HR, OPERATIONS, LEGAL, MARKETING, which matches the declaration order in Department rather than alphabetical order.
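The effect of the map factory on key order can be seen in the following sketch. It assumes a copy of the Department enum nested inside a hypothetical class, and contrasts TreeMap::new (natural ordering of the enum keys) with a TreeMap created using a comparator on the constant names, which is one way to get alphabetical order if that is what is wanted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical class for illustration only
class MapFactoryOrderingSketch {

    // Same constants, in the same order, as the tutorial's Department enum
    enum Department { HR, OPERATIONS, LEGAL, MARKETING }

    static final List<Department> ASSIGNMENTS = Arrays.asList(
            Department.MARKETING, Department.LEGAL, Department.LEGAL,
            Department.HR, Department.OPERATIONS, Department.MARKETING);

    static final Comparator<Department> BY_NAME = Comparator.comparing(Department::name);

    // TreeMap::new sorts by natural ordering; for an enum that is declaration order.
    static List<Department> keysInNaturalOrder() {
        Map<Department, Long> counts = ASSIGNMENTS.stream().collect(
                Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
        return new ArrayList<>(counts.keySet());
    }

    // A factory that supplies a TreeMap with a name comparator gives alphabetical keys.
    static List<Department> keysInAlphabeticalOrder() {
        Map<Department, Long> counts = ASSIGNMENTS.stream().collect(
                Collectors.groupingBy(Function.identity(),
                        () -> new TreeMap<Department, Long>(BY_NAME),
                        Collectors.counting()));
        return new ArrayList<>(counts.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysInNaturalOrder());      // [HR, OPERATIONS, LEGAL, MARKETING]
        System.out.println(keysInAlphabeticalOrder()); // [HR, LEGAL, MARKETING, OPERATIONS]
    }
}
```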
Concurrent versions of grouping collector
We have seen the three groupingBy() method variants above, which are good but not optimized for concurrent execution. In case you want to execute grouping collectors concurrently in a multi-threaded execution environment, you can use the three overloaded methods in the java.util.stream.Collectors class, all of which are named groupingByConcurrent(). These three concurrent methods take the same kinds of input parameters as their non-concurrent counterparts, with two differences: they return a Collector that produces a ConcurrentMap rather than a Map, and the map factory of the three-argument variant must supply a ConcurrentMap. Apart from being used in concurrent contexts, their usage is the same as described above.
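As a minimal sketch of the concurrent variant, the example below counts words by length on a parallel stream. The class name, method name and sample words are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

// Hypothetical class for illustration only
class ConcurrentGroupingSketch {

    // groupingByConcurrent() produces a ConcurrentMap, so the result type differs
    // from the Map produced by groupingBy().
    static ConcurrentMap<Integer, Long> countByLength(List<String> words) {
        return words.parallelStream()
                .collect(Collectors.groupingByConcurrent(String::length, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("HR", "LEGAL", "IT", "SALES", "OPERATIONS");
        System.out.println(countByLength(words));
    }
}
```

The concurrent collector is unordered, so it suits downstream operations such as counting where the encounter order of elements within a group does not matter.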
Conclusion
In the above tutorial we understood what the concept of grouping in the context of collectors entails, looked at the three grouping collector variants, understood their definitions and working in depth using diagrams and code examples, and then saw how the three variants of the groupingBy() method are closely interlinked. Lastly, we touched upon the concurrent grouping collectors as well.