Frequently asked questions and answers about Kafka Streams, the client library for building real-time applications.
Kafka Streams is a Java library for building applications and microservices. It provides stream processing capabilities native to Apache Kafka.
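// Build the topology: read lines of text, split them into words, and count each word.
StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> textLines = builder.stream("TextLinesTopic");
KTable<String, Long> wordCounts = textLines
    // Split each line into lowercase words.
    .flatMapValues(textLine -> Arrays.asList(textLine.toLowerCase().split("\\W+")))
    // Re-key each record by the word so that identical words are grouped together.
    .groupBy((key, word) -> word)
    // Count occurrences per word, backed by a state store named "counts-store".
    .count(Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as("counts-store"));
// Write the running counts to the output topic.
wordCounts.toStream().to("WordsWithCountsTopic", Produced.with(Serdes.String(), Serdes.Long()));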
Here are some of the things you can do with Kafka Streams:
Applications using Kafka Streams can be stateful, provide exactly-once semantics, and can be scaled horizontally in exactly the same way you would deploy and scale any other Java application.
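As a rough illustration of the exactly-once and scaling points, the sketch below assembles the kind of configuration a Kafka Streams application might pass to its KafkaStreams instance. It assumes a Kafka Streams version that supports the exactly_once_v2 guarantee (3.0 or later). The class name, method name, application id, and broker address are hypothetical, and raw string keys stand in for the StreamsConfig constants so that the sketch compiles without the Kafka libraries.

```java
import java.util.Properties;

// Illustrative sketch only: the class and method names are hypothetical.
final class StreamsConfigSketch {

    // Builds the settings a Kafka Streams application would typically hand to
    // the KafkaStreams constructor. Raw string keys are used so this compiles
    // without the Kafka client libraries on the classpath.
    static Properties build(String applicationId, String bootstrapServers) {
        Properties props = new Properties();
        // Instances that share the same application.id form one logical
        // application and divide the input partitions among themselves.
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Opt in to exactly-once processing (the default is at_least_once).
        props.setProperty("processing.guarantee", "exactly_once_v2");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values for a local setup.
        Properties props = build("wordcount-app", "localhost:9092");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Starting a second instance with the same application.id is what scales the application out: the instances share the work, with no separate processing cluster involved.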
Learn more about Kafka Streams in this free course.
Because Kafka Streams is part of Apache Kafka, it integrates closely with Kafka itself. This makes features such as exactly-once processing semantics possible, and security is tightly integrated.
With Kafka Streams you use your existing development, testing, and deployment tools and processes. You don’t need to deploy and manage a separate stream processing cluster.
Stream processing, also known as event-stream processing (ESP), real-time data streaming, or complex event processing (CEP), is the continuous processing of real-time data as it is produced or received.
Both Kafka Streams and ksqlDB allow you to build applications that leverage stream processing.
Learn more about Kafka Streams in this free course or get started with ksqlDB by taking its free course.
Kafka Streams is a stream processing library that serves a similar purpose to distributed processing frameworks such as Apache Flink or Spark Streaming. But it offers some distinct advantages over these other stream-processing technologies:
To split a stream in Kafka Streams, use the KStream#split method, which returns a BranchedKStream.
The BranchedKStream allows you to create different branches based on predicates. For example:
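// Appearance is a placeholder name for the stream's value type, which exposes getGenre().
KStream<String, Appearance> myStream = builder.stream(inputTopic);
myStream.split()
    // Records whose genre is "drama" go to the drama topic.
    .branch(
        (key, appearance) -> "drama".equals(appearance.getGenre()),
        Branched.withConsumer(ks -> ks.to("drama-topic")))
    // Records whose genre is "fantasy" go to the fantasy topic.
    .branch(
        (key, appearance) -> "fantasy".equals(appearance.getGenre()),
        Branched.withConsumer(ks -> ks.to("fantasy-topic")))
    // Everything else matches the always-true predicate and goes to the default topic.
    .branch(
        (key, appearance) -> true,
        Branched.withConsumer(ks -> ks.to("default-topic")));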
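Branch order matters in the example above: a record is routed to the first branch whose predicate matches, which is why the always-true predicate comes last. The following plain-Java sketch models that first-match behavior without any Kafka dependency. FirstMatchRouter and its methods are hypothetical names invented for this illustration and are not part of the Kafka Streams API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// Plain-Java model of first-match routing; not part of the Kafka Streams API.
final class FirstMatchRouter {
    // LinkedHashMap preserves the order in which branches were registered.
    private final Map<String, Predicate<String>> branches = new LinkedHashMap<>();
    private final String defaultTopic;

    FirstMatchRouter(String defaultTopic) {
        this.defaultTopic = defaultTopic;
    }

    // Registers a branch; earlier branches take precedence over later ones.
    FirstMatchRouter branch(Predicate<String> predicate, String topic) {
        branches.put(topic, predicate);
        return this;
    }

    // Returns the topic of the first matching branch, or the default topic.
    String route(String genre) {
        for (Map.Entry<String, Predicate<String>> branch : branches.entrySet()) {
            if (branch.getValue().test(genre)) {
                return branch.getKey();
            }
        }
        return defaultTopic;
    }

    public static void main(String[] args) {
        FirstMatchRouter router = new FirstMatchRouter("default-topic")
            .branch("drama"::equals, "drama-topic")
            .branch("fantasy"::equals, "fantasy-topic");
        System.out.println(router.route("drama"));   // prints "drama-topic"
        System.out.println(router.route("fantasy")); // prints "fantasy-topic"
        System.out.println(router.route("comedy"));  // prints "default-topic"
    }
}
```

In this model the default topic plays the role of the final always-true branch, so a record that matches no predicate still has a destination.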
Here are some more resources to learn about splitting a stream:
No, Kafka Streams applications do not run inside the Kafka brokers.
Kafka Streams applications are normal Java applications that happen to use the Kafka Streams library. You would run these applications on client machines at the perimeter of a Kafka cluster. In other words, Kafka Streams applications do not run inside the Kafka brokers (servers) or the Kafka cluster—they are client-side applications.