This module defines the KTable, explains how it differs from a KStream, and covers its basic operations, as well as its GlobalKTable variant.
The module Basic Operations defined event streams and mentioned that records in an event stream are completely independent of one another, even if their keys are identical.
Update streams are the exact opposite: if a new record comes in with the same key as an existing record, the existing record will be overwritten.
This means that when using a KTable, keys are required, although they aren't required when using a KStream. By overwriting records, a KTable creates a completely different data structure from a KStream, even given the same source records.
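The difference between the two views can be sketched in plain Java. This is only an analogy under stated assumptions: it uses no Kafka Streams classes, and the class name StreamVsTableSketch, its methods, and the sample keys are hypothetical. A list stands in for the event stream and a map stands in for the update stream.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java analogy only; no Kafka Streams classes are involved.
final class StreamVsTableSketch {

    // Event-stream view: every record is kept, even when keys repeat.
    static List<Map.Entry<String, String>> asEventStream(List<Map.Entry<String, String>> records) {
        return new ArrayList<>(records);
    }

    // Update-stream (table) view: a later record overwrites an earlier one with the same key.
    static Map<String, String> asUpdateStream(List<Map.Entry<String, String>> records) {
        Map<String, String> table = new LinkedHashMap<>();
        for (Map.Entry<String, String> record : records) {
            table.put(record.getKey(), record.getValue());
        }
        return table;
    }

    public static void main(String[] args) {
        // Hypothetical sample records: "alice" appears twice.
        List<Map.Entry<String, String>> records = List.of(
            Map.entry("alice", "1"),
            Map.entry("bob", "4"),
            Map.entry("alice", "2"));

        System.out.println(asEventStream(records).size()); // prints 3
        System.out.println(asUpdateStream(records));       // prints {alice=2, bob=4}
    }
}
```

The same three source records yield three entries in the stream view but only two in the table view, because the second "alice" record replaces the first.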
To define a KTable, you use a StreamsBuilder, as with a KStream, but you call builder.table instead of builder.stream. With the builder.table method, you provide an inputTopic, along with a Materialized configuration object specifying your SerDes (this replaces the Consumed object that you use with a KStream):
StreamsBuilder builder = new StreamsBuilder();
KTable<String, String> firstKTable = builder.table(inputTopic, Materialized.with(Serdes.String(), Serdes.String()));
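The KTable API has operations similar to those of the KStream API, including mapping and filtering.
As with KStream, mapValues transforms values. Unlike KStream, KTable has no map method; to transform both keys and values, convert the table to a stream with toStream() and call map on the result, which gives you a KStream rather than a KTable.
firstKTable.mapValues(value -> ..)
firstKTable.toStream().map((key, value) -> ..)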
As with KStream, the filter operation lets you supply a predicate, and only records that match the predicate are forwarded to the next node in the topology:
firstKTable.filter((key, value) -> ..)
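To make the effect of these table operations concrete, here is a rough plain-Java analogy. It is not the Kafka Streams API: the class TableOpsSketch and its mapValues and filter helpers are hypothetical stand-ins that operate on an ordinary Map holding the latest value per key.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

// Plain-Java analogy only; these helpers are not the Kafka Streams methods of the same name.
final class TableOpsSketch {

    // Transforms each value and leaves the keys unchanged.
    static <K, V, R> Map<K, R> mapValues(Map<K, V> table, Function<V, R> mapper) {
        Map<K, R> result = new LinkedHashMap<>();
        table.forEach((key, value) -> result.put(key, mapper.apply(value)));
        return result;
    }

    // Keeps only the entries whose key and value satisfy the predicate.
    static <K, V> Map<K, V> filter(Map<K, V> table, BiPredicate<K, V> predicate) {
        Map<K, V> result = new LinkedHashMap<>();
        table.forEach((key, value) -> {
            if (predicate.test(key, value)) {
                result.put(key, value);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical table contents: the latest value per key.
        Map<String, String> table = new LinkedHashMap<>();
        table.put("alice", "2");
        table.put("bob", "4");

        Map<String, Integer> counts = mapValues(table, Integer::parseInt);
        System.out.println(filter(counts, (key, value) -> value > 3)); // prints {bob=4}
    }
}
```

Because both helpers work on the latest value per key, the result is itself a table-like view: one entry per surviving key.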
A GlobalKTable is built using the globalTable method on the StreamsBuilder. As with a regular KTable, you pass in a Materialized configuration with the SerDes:
StreamsBuilder builder = new StreamsBuilder();
GlobalKTable<String, String> globalKTable = builder.globalTable(inputTopic, Materialized.with(Serdes.String(), Serdes.String()));
The main difference between a KTable and a GlobalKTable is that a KTable shards data between Kafka Streams instances, while a GlobalKTable provides a full copy of the data to each instance. You typically use a GlobalKTable with lookup data. There are also some idiosyncrasies regarding joins between a GlobalKTable and a KStream; we'll cover these later in the course.
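The sharded and fully replicated layouts can be pictured with a small plain-Java sketch. It rests on simplifying assumptions: the class ShardedVsGlobalSketch and its methods are hypothetical, and hashing the key modulo the instance count is only an illustrative stand-in for how data is actually distributed across instances.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java analogy only; no Kafka Streams classes are involved.
final class ShardedVsGlobalSketch {

    // KTable-like layout: each key is held by exactly one instance.
    // Hash-modulo placement is an assumption made for illustration.
    static List<Map<String, String>> shard(Map<String, String> data, int instances) {
        List<Map<String, String>> result = new ArrayList<>();
        for (int i = 0; i < instances; i++) {
            result.add(new LinkedHashMap<>());
        }
        data.forEach((key, value) ->
            result.get(Math.floorMod(key.hashCode(), instances)).put(key, value));
        return result;
    }

    // GlobalKTable-like layout: every instance holds a full copy of the data.
    static List<Map<String, String>> replicate(Map<String, String> data, int instances) {
        List<Map<String, String>> result = new ArrayList<>();
        for (int i = 0; i < instances; i++) {
            result.add(new LinkedHashMap<>(data));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical lookup data: country codes to names.
        Map<String, String> countries = Map.of("US", "United States", "DE", "Germany", "JP", "Japan");

        System.out.println(shard(countries, 2));     // each key appears on exactly one instance
        System.out.println(replicate(countries, 2)); // each instance holds all three keys
    }
}
```

In the replicated layout any instance can resolve any key locally, which is why it suits small lookup data sets; the cost is that every instance stores the whole data set.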