
Kafka Streams FAQs

Frequently asked questions and answers about Kafka Streams, the client library for building real-time applications.

What is Kafka Streams?

Kafka Streams is a Java library for building applications and microservices. It provides stream processing capabilities native to Apache Kafka.

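// Build a topology that counts how often each word appears in the input topic.
StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> textLines = builder.stream("TextLinesTopic");
KTable<String, Long> wordCounts = textLines
    // Split each line into lowercase words.
    .flatMapValues(textLine -> Arrays.asList(textLine.toLowerCase().split("\\W+")))
    // Re-key each record by the word so identical words are grouped together.
    .groupBy((key, word) -> word)
    // Count occurrences per word, backed by a state store named "counts-store".
    .count(Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as("counts-store"));
// Write the continuously updated counts to the output topic.
wordCounts.toStream().to("WordsWithCountsTopic", Produced.with(Serdes.String(), Serdes.Long()));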
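To make the result of that topology concrete, here is a plain-Java model of the same per-word counting over a fixed list of lines. It is only a sketch: it uses no Kafka classes, the class and method names are made up for illustration, and a real Kafka Streams application keeps updating the counts as new records arrive instead of processing a finished list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Kafka Streams: models what the word-count
// topology computes, using an in-memory list in place of a topic.
class WordCountModel {

    // Lowercases each line, splits it on non-word characters, and counts each word.
    static Map<String, Long> countWords(List<String> lines) {
        Map<String, Long> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\W+")) {
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(Arrays.asList("Hello Kafka", "hello streams")));
    }
}
```

In the real application, the equivalent of this map lives in the "counts-store" state store, and every change to it is emitted to the output topic.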

Here are some of the things you can do with Kafka Streams:

  • Transformations
  • Filtering
  • Aggregations
  • Joining
  • Merging and splitting streams

Applications built with Kafka Streams can be stateful, can provide exactly-once semantics, and can be deployed and scaled horizontally like any other Java application.
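Exactly-once processing is something you opt in to through configuration. The following is a minimal sketch of the properties involved, written with plain string keys so it needs no Kafka dependency. The application id and broker address are placeholders, the class and method names are made up for illustration, and the "exactly_once_v2" value assumes a reasonably recent Kafka version (older releases use a different value).

```java
import java.util.Properties;

// Sketch only: builds the kind of configuration a Kafka Streams application
// is started with. The id and broker address used in main are placeholders.
class StreamsConfigSketch {

    static Properties exactlyOnceConfig(String applicationId, String bootstrapServers) {
        Properties props = new Properties();
        // Identifies the application; instances that share this id share the work.
        props.setProperty("application.id", applicationId);
        // Kafka brokers the application connects to.
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Opt in to exactly-once processing; the default is "at_least_once".
        props.setProperty("processing.guarantee", "exactly_once_v2");
        return props;
    }

    public static void main(String[] args) {
        Properties props = exactlyOnceConfig("wordcount-app", "localhost:9092");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In an application, properties like these are passed to the KafkaStreams constructor together with the topology.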

Learn more about Kafka Streams in this free course.

Why use Kafka Streams?

Because Kafka Streams is part of Apache Kafka, it integrates closely with Kafka itself. This makes features such as exactly-once processing semantics possible, and it means Kafka's security features are tightly integrated.

With Kafka Streams you use your existing development, testing, and deployment tools and processes. You don’t need to deploy and manage a separate stream processing cluster.

How does stream processing work?

Stream processing, also known as event-stream processing (ESP), real-time data streaming, or complex event processing (CEP), is the continuous processing of data as it is produced or received, rather than in periodic batches.

Both Kafka Streams and ksqlDB allow you to build applications that leverage stream processing.

Learn more about Kafka Streams in this free course or get started with ksqlDB by taking its free course.

How does Kafka Streams compare to other stream-processing frameworks?

Kafka Streams is a library for distributed stream processing, with a purpose similar to that of Apache Flink or Spark Streaming. But it offers some distinct advantages over these frameworks:

  • A Kafka Streams application is simply a Java application. You write it, build a JAR file, and start it. No dedicated processing cluster is needed.
  • Kafka Streams can scale dynamically. For more processing power, you start another instance of the application. To scale down, you stop one or more instances. In either case, Kafka Streams redistributes the processing work across the instances that are running and continues processing.
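The scaling behavior described above can be pictured with a deliberately simplified model. It assumes the work is divided into a fixed number of tasks (in Kafka Streams, this number is derived from the input topic partitions) and spreads them round-robin over the running instances. This is not the real Kafka Streams assignment algorithm, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model, NOT the real Kafka Streams assignor: it only illustrates
// that a fixed set of tasks is redistributed when the instance count changes.
class TaskDistributionSketch {

    // Spreads task ids 0..taskCount-1 round-robin over instanceCount instances.
    static List<List<Integer>> assign(int taskCount, int instanceCount) {
        if (instanceCount < 1) {
            throw new IllegalArgumentException("at least one instance is required");
        }
        List<List<Integer>> assignment = new ArrayList<>();
        for (int i = 0; i < instanceCount; i++) {
            assignment.add(new ArrayList<>());
        }
        for (int task = 0; task < taskCount; task++) {
            assignment.get(task % instanceCount).add(task);
        }
        return assignment;
    }

    public static void main(String[] args) {
        System.out.println("2 instances: " + assign(6, 2));
        System.out.println("3 instances: " + assign(6, 3));
        // With more instances than tasks, the extra instances get no work.
        System.out.println("8 instances: " + assign(6, 8));
    }
}
```

The last case shows the practical limit of scaling out: once there are more instances than tasks, the additional instances sit idle.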

How do you split a stream?

To split a stream using Kafka Streams, you use the KStream#split method, which returns a BranchedKStream.

The BranchedKStream allows you to create different branches based on predicates. Each record goes to the first branch whose predicate matches it. For example:

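// Appearance is a placeholder for the application's own value type (not shown here).
KStream<String, Appearance> myStream = builder.stream(inputTopic);
myStream.split()
    // Records whose genre is "drama" go to drama-topic.
    .branch((key, appearance) -> "drama".equals(appearance.getGenre()),
        Branched.withConsumer(ks -> ks.to("drama-topic")))
    // Records whose genre is "fantasy" go to fantasy-topic.
    .branch((key, appearance) -> "fantasy".equals(appearance.getGenre()),
        Branched.withConsumer(ks -> ks.to("fantasy-topic")))
    // Everything else falls through to default-topic.
    .branch((key, appearance) -> true,
        Branched.withConsumer(ks -> ks.to("default-topic")));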
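The order of the branches matters, because a record is routed to the first branch whose predicate accepts it. The following plain-Java sketch models that first-match rule with the same three branches. It uses no Kafka classes, it works on the genre string directly instead of a full record, and the class and method names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

// Plain-Java model of first-match routing; this is not Kafka Streams code.
class FirstMatchRouting {

    // Returns the name of the first branch whose predicate accepts the genre,
    // or an empty Optional when no branch matches.
    static Optional<String> route(Map<String, Predicate<String>> branches, String genre) {
        for (Map.Entry<String, Predicate<String>> branch : branches.entrySet()) {
            if (branch.getValue().test(genre)) {
                return Optional.of(branch.getKey());
            }
        }
        return Optional.empty();
    }

    // The same three branches as the example, in the same order.
    static Map<String, Predicate<String>> exampleBranches() {
        Map<String, Predicate<String>> branches = new LinkedHashMap<>();
        branches.put("drama-topic", "drama"::equals);
        branches.put("fantasy-topic", "fantasy"::equals);
        branches.put("default-topic", genre -> true); // catch-all, must come last
        return branches;
    }

    public static void main(String[] args) {
        for (String genre : new String[] {"drama", "fantasy", "comedy"}) {
            System.out.println(genre + " -> " + route(exampleBranches(), genre).orElse("(dropped)"));
        }
    }
}
```

If the catch-all branch were listed first, it would accept every record and the other two branches would never receive anything.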

Does Kafka Streams run on Apache Kafka brokers?

No, Kafka Streams applications do not run inside the Kafka brokers.

Kafka Streams applications are normal Java applications that use the Kafka Streams library. You run them on client machines at the perimeter of a Kafka cluster, and they connect to the brokers like any other Kafka client.

Learn more with these free training courses

Apache Kafka® 101

Learn how Kafka works, how to use it, and how to get started.

Spring Framework and Apache Kafka®

This hands-on course will show you how to build event-driven applications with Spring Boot and Kafka Streams.

Building Data Pipelines with Apache Kafka® and Confluent

Build a scalable, streaming data pipeline in under 20 minutes using Kafka and Confluent.

Confluent Cloud is a fully managed Apache Kafka service available on all three major clouds. Try it for free today.
