Skip to content

Latest commit

 

History

History

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 

Kafka Streams Cogroup

This module demonstrates the following:

  • The usage of the Kafka Streams DSL, including cogroup(), groupBy(), aggregate(), toStream() and peek().
  • Unit testing using the Topology Test Driver.

In this module, records of type <String, KafkaPerson> are streamed from two topics named PERSON_TOPIC and PERSON_TOPIC_TWO. The streams are processed using the Kafka Streams DSL to perform cogrouping based on the last name of each KafkaPerson record. The following tasks are performed:

  1. Group the streams by last name using the groupBy() operation.
  2. Apply a cogroup operation to combine the records from both streams with the same last name.
  3. Apply an aggregator that combines each KafkaPerson record with the same last name into a KafkaPersonGroup object and aggregates the first names by last name.
  4. Write the resulting records to a new topic named PERSON_COGROUP_TOPIC.

The output records will be in the following format:

{"firstNameByLastName":{"Last name 1":{"First name 1", "First name 2", "First name 3"}}}
{"firstNameByLastName":{"Last name 2":{"First name 4", "First name 5", "First name 6"}}}
{"firstNameByLastName":{"Last name 3":{"First name 7", "First name 8", "First name 9"}}}

topology.png

Requirements

To compile and run this demo, you will need the following:

  • Java 21
  • Maven
  • Docker

Running the Application

To run the application manually, please follow the steps below:

  • Start a Confluent Platform in a Docker environment.
  • Produce records of type <String, KafkaPerson> to topics named PERSON_TOPIC and PERSON_TOPIC_TWO. You can use the producer person to do this.
  • Start the Kafka Streams.

To run the application in Docker, please use the following command:

docker-compose up -d

This command will start the following services in Docker:

  • 1 Zookeeper
  • 1 Kafka broker
  • 1 Schema registry
  • 1 Control Center
  • 1 producer person
  • 1 Kafka Streams cogroup