Course: Kafka Streams 101

Basic Operations

6 min
Sophie Blee-GoldmanSoftware Engineer II (Course Presenter)
Bill BejeckIntegration Architect (Course Author)

Basic Operations

This module covers the basic operations of Kafka Streams—such as mapping and filtering—but begins with its primary building blocks: event streams and topologies.

Event Streams

An event represents data that corresponds to an action, such as a notification or state transfer. An event stream is an unbounded collection of event records.

Key-Value Pairs

Apache Kafka® works in key-value pairs, so it’s important to understand that in an event stream, records that have the same key don't have anything to do with one another. For example, the image below shows four independent records, even though two of the keys are identical:

identical-keys

Topologies

To represent the flow of stream processing, Kafka Streams uses topologies, which are directed acyclic graphs ("DAGs").

topology

Each Kafka Streams topology has a source processor, where records are read in from Kafka. Below that are any number of child processors that perform some sort of operation. A child node can have multiple parent nodes, and a parent node can have multiple child nodes. It all feeds down to a sink processor, which writes out to Kafka.

Streams

You define a stream with a StreamBuilder, for which you specify an input topic as well as SerDes configurations, via a Consumed configuration object. These settings—covered in detail in Module 4—tell Kafka Streams how to read the records, which have String-typed keys and values in this example:

StreamBuilder builder = new StreamBuilder();
KStream<String, String> firstStream = 
builder.stream(inputTopic, Consumed.with(Serdes.String(), Serdes.String()));

A KStream is part of the Kafka Streams DSL, and it’s one of the main constructs you'll be working with.

Stream Operations

Once you've created a stream, you can perform basic operations on it, such as mapping and filtering.

Mapping

With mapping, you take an input object of one type, apply a function to it, and then output it as a different object, potentially of another type.

map

For example, using mapValues, you could convert a string to another string that omits the first five characters:

mapValues(value -> value.substring(5))

Map, on the other hand, lets you change the key and the value:

map((key, value) -> ..)

Filtering

With a filter, you send a key and value, and only the records that match a predicate make it through to the other side.

filter((key, value) -> Long.parseLong(value) > 1000)

filtering-operation

Use the promo code STREAMS101 to get $101 of free Confluent Cloud usage

Be the first to get updates and new content

We will only share developer content and updates, including notifications when new content is added. We will never send you sales emails. 🙂 By subscribing, you understand we will process your personal information in accordance with our Privacy Statement.