Frequently asked questions and answers about Kafka topics and partitions, and how records, logs, and data are stored in Kafka.
A Kafka topic describes how messages are organized and stored. Topics are defined by developers and often model entities and event types. You can store more than one event type in a topic if appropriate for the implementation.
Kafka topics can broadly be thought of in the same way as tables in a relational database, which are used to model and store data. Some examples of Kafka topics would be orders, payments, or clickstream events.
Topics can be partitioned, and partitions are spread across the available Kafka brokers.
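You can see how a topic's partitions are spread across the brokers with the --describe flag of the kafka-topics tool (here assuming a local broker and a topic named my-topic):
./bin/kafka-topics --bootstrap-server localhost:9092 --describe \
--topic my-topic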
To read data from a Kafka topic in your application, use the Consumer API provided by one of the client libraries (for example Java, C/C++, C#, Python, Go, Node.js, or Spring Boot).
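As a minimal sketch (not a production-ready implementation), here is what that looks like with the Python client, confluent-kafka; the topic name my-topic and group ID my-group are placeholders:
from confluent_kafka import Consumer

consumer = Consumer({
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'my-group',            # consumer group used for offset tracking
    'auto.offset.reset': 'earliest',   # read from the start if no committed offset exists
})
consumer.subscribe(['my-topic'])

try:
    while True:
        msg = consumer.poll(1.0)       # wait up to one second for a record
        if msg is None:
            continue
        if msg.error():
            print(f'Consumer error: {msg.error()}')
            continue
        print(msg.value().decode('utf-8'))
finally:
    consumer.close()                   # commit final offsets and leave the group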
You can also read data from a Kafka topic using a command-line interface (CLI) tool such as kcat (formerly known as kafkacat) or kafka-console-consumer.
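For example, with kafka-console-consumer running against a local broker (the topic name my-topic is a placeholder):
./bin/kafka-console-consumer --bootstrap-server localhost:9092 \
--topic my-topic --from-beginning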
Confluent also provides a web interface for browsing messages in a Kafka topic, available on-premises and on Confluent Cloud.
To list Kafka topics, use the kafka-topics command-line tool:
./bin/kafka-topics --bootstrap-server localhost:9092 --list
Using Confluent, you can also view a list of topics in your web browser.
Many Kafka users have settled on between 12 and 24 partitions per topic, but there really is no single answer that works for every situation.
There are a few key principles that can help guide this decision, but ultimately, performance testing with various numbers of partitions is the safest route.
Note that KRaft removes the metadata bottleneck for clusters with a large number of partitions; however, partition performance still depends on the brokers available in the cluster.
You can read more in this blog post by Jun Rao (one of the original creators of Apache Kafka®).
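Keep in mind that a topic's partition count can be increased later, but never decreased, and existing records are not redistributed when partitions are added. For example, to increase my-topic (a placeholder name) to 24 partitions:
./bin/kafka-topics --bootstrap-server localhost:9092 --alter \
--topic my-topic --partitions 24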
You can create a Kafka topic with the kafka-topics.sh command-line tool:
./bin/kafka-topics.sh --create --partitions 1 --replication-factor 1 \
--topic my-topic --bootstrap-server localhost:9092
You can also use the Confluent CLI to create a topic:
confluent kafka topic create <topic> [flags]
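For example, to create a topic named my-topic with six partitions (both values are placeholders):
confluent kafka topic create my-topic --partitions 6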
Another option is the Confluent Cloud Console, where you can simply click the Create topic button on the Topics page.
While there is no set limit to the number of topics that can exist in a Kafka cluster, currently Kafka can handle hundreds of thousands of topics, depending on the number of partitions in each.
With Kafka's KRaft mode, which replaces ZooKeeper for cluster metadata management, that number rises into the millions.
You can delete a Kafka topic with the kafka-topics.sh tool:
./bin/kafka-topics.sh --delete --topic my-topic \
--bootstrap-server localhost:9092
You can also use the Confluent CLI to delete a topic:
confluent kafka topic delete my-topic [flags]
Another option is the web-based Confluent Cloud Console, where you can click on the topic on the Topics page, then go to the Configuration tab and click Delete topic.
To count the number of messages in a Kafka topic, you should consume the messages from the beginning of the topic and increment a counter.
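For example, you can do this with kafka-console-consumer and count the output lines (a rough sketch: it assumes one line per message, and that 10 seconds of idle time, set with --timeout-ms, means the topic has been fully read):
./bin/kafka-console-consumer --bootstrap-server localhost:9092 \
--topic my-topic --from-beginning --timeout-ms 10000 | wc -l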
For further discussion see this blog post.
To delete the contents of a Kafka topic, do the following:
Change the retention time on the topic:
./bin/kafka-configs --bootstrap-server localhost:9092 --alter \
--entity-type topics --entity-name my-topic \
--add-config retention.ms=0
Wait for the broker's log manager to run. By default it checks for deletable segments every five minutes (controlled by log.retention.check.interval.ms).
If you inspect the broker logs, you'll see something like this:
INFO [Log partition=my-topic-0, dir=/tmp/kafka-logs]
Found deletable segments with base offsets [0] due to
retention time 0ms breach (kafka.log.Log)
Restore the retention time on the topic to what it was previously, or remove it as shown here:
./bin/kafka-configs --bootstrap-server localhost:9092 --alter \
--entity-type topics --entity-name my-topic \
--delete-config retention.ms
A few things to be aware of when clearing a topic this way: any messages produced while retention.ms is set to 0 are also eligible for deletion; data is only removed when the log manager next runs, so the topic is not emptied instantly; and consumers with committed offsets may see offset-out-of-range errors once the underlying data is gone.