This module covers the basic operations of Kafka Streams—such as mapping and filtering—but begins with its primary building blocks: event streams and topologies.
An event represents data that corresponds to an action, such as a notification or state transfer. An event stream is an unbounded collection of event records.
Apache Kafka® works in key-value pairs, so it’s important to understand that in an event stream, records that have the same key don't have anything to do with one another. For example, the image below shows four independent records, even though two of the keys are identical:
To represent the flow of stream processing, Kafka Streams uses topologies, which are directed acyclic graphs ("DAGs").
Each Kafka Streams topology has a source processor, where records are read in from Kafka. Below that are any number of child processors that perform some sort of operation. A child node can have multiple parent nodes, and a parent node can have multiple child nodes. It all feeds down to a sink processor, which writes out to Kafka.
You define a stream with a StreamBuilder, for which you specify an input topic as well as SerDes configurations, via a Consumed configuration object. These settings—covered in detail in Module 4—tell Kafka Streams how to read the records, which have String-typed keys and values in this example:
StreamBuilder builder = new StreamBuilder(); KStream<String, String> firstStream = builder.stream(inputTopic, Consumed.with(Serdes.String(), Serdes.String()));
A KStream is part of the Kafka Streams DSL, and it’s one of the main constructs you'll be working with.
Once you've created a stream, you can perform basic operations on it, such as mapping and filtering.
With mapping, you take an input object of one type, apply a function to it, and then output it as a different object, potentially of another type.
For example, using mapValues, you could convert a string to another string that omits the first five characters:
mapValues(value -> value.substring(5))
Map, on the other hand, lets you change the key and the value:
map((key, value) -> ..)
With a filter, you send a key and value, and only the records that match a predicate make it through to the other side.
filter((key, value) -> Long.parseLong(value) > 1000)
We will only share developer content and updates, including notifications when new content is added. We will never send you sales emails. 🙂 By subscribing, you understand we will process your personal information in accordance with our Privacy Statement.