How can you count the number of messages in a Kafka topic?
If your Kafka topic is in Confluent Cloud, consume the entire topic using kcat
and count how many messages are read.
docker run -it --network=host \
-v ${PWD}/configuration/ccloud.properties:/tmp/configuration/ccloud.properties \
edenhill/kcat:1.7.0 kcat \
-F /tmp/configuration/ccloud.properties \
-C -t test-topic \
-e -q \
| grep -v "Reading configuration from file" | wc -l
With the Confluent Cloud Metrics API, you could also sum up the values of the metric io.confluent.kafka.server/received_records, which is "The delta count of records received. Each sample is the number of records received since the previous data sample. The count is sampled every 60 seconds." See the documentation for details.
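Because each sample is a delta, the total is just the sum of the samples. The following is a minimal sketch of that arithmetic only, assuming the per-interval values have already been extracted from a Metrics API response into plain numbers, one per line. The function name sum_counts and the sample numbers are made up for illustration, and the query itself is not shown.

```shell
# Sum per-interval record counts read from stdin, one number per line.
# sum_counts is a hypothetical helper name, not part of any Confluent tool.
sum_counts() {
  awk '{ total += $1 } END { printf "%d\n", total }'
}

# Made-up sample: three 60-second samples of received_records.
printf '2\n0\n3\n' | sum_counts
```

The result covers only the time range that was queried, so pick a range that spans the whole period in which records were produced.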
This tutorial requires access to an Apache Kafka cluster, and the quickest way to get started free is on Confluent Cloud, which provides Kafka as a fully managed service.
After you log in to Confluent Cloud, click Environments in the lefthand navigation, click on Add cloud environment, and name the environment learn-kafka. Using a new environment keeps your learning resources separate from your other Confluent Cloud resources.
From the Billing & payment section in the menu, apply the promo code CC100KTS to receive an additional $100 free usage on Confluent Cloud (details). To avoid having to enter a credit card, add an additional promo code CONFLUENTDEV1. With this promo code, you will not have to enter a credit card for 30 days or until your credits run out.
Click on LEARN and follow the instructions to launch a Kafka cluster and enable Schema Registry.
Make a local directory anywhere you’d like for this project:
mkdir count-messages && cd count-messages
From the Confluent Cloud UI, navigate to your Kafka cluster. From the Clients view, get the connection information customized to your cluster (select C/C++).
Create new credentials for your Kafka cluster, writing in an appropriate description so that the key is easy to find and delete later. The Confluent Cloud Console will show a configuration similar to below with your new credentials automatically populated (make sure Show API keys is checked).
Copy and paste it into a configuration/ccloud.properties file on your machine, creating the configuration directory inside your project directory first if it does not exist.
# Kafka
bootstrap.servers={{ BOOTSTRAP_SERVERS }}
security.protocol=SASL_SSL
sasl.mechanisms=PLAIN
sasl.username={{ CLUSTER_API_KEY }}
sasl.password={{ CLUSTER_API_SECRET }}
Do not directly copy and paste the above configuration. You must copy it from the UI so that it includes your Confluent Cloud information and credentials.
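One quick way to catch a file that was copied from this page instead of the UI is to look for leftover {{ ... }} placeholders. The following is a minimal sketch, assuming the placeholders use the double-brace form shown above. The function name check_placeholders is made up for illustration, and the demo runs against a throwaway file rather than your real configuration.

```shell
# Return 0 if the file has no unreplaced {{ ... }} placeholders;
# otherwise print the offending lines and return 1.
# check_placeholders is a hypothetical helper name.
check_placeholders() {
  if grep -n -F '{{' "$1"; then
    echo "Placeholders found in $1: copy the real configuration from the UI" >&2
    return 1
  fi
  return 0
}

# Demo on a temporary file that still contains a placeholder (made-up content).
demo_file="$(mktemp)"
printf 'bootstrap.servers={{ BOOTSTRAP_SERVERS }}\nsecurity.protocol=SASL_SSL\n' > "$demo_file"
check_placeholders "$demo_file" || echo "not ready"
rm -f "$demo_file"
```

Against the real file, the call would be check_placeholders configuration/ccloud.properties, run from the project directory.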
This tutorial has some steps for Kafka topic management and producing and consuming events, for which you can use the Confluent Cloud Console or the Confluent CLI. Follow the instructions here to install the Confluent CLI, and then follow these steps to connect the CLI to your Confluent Cloud cluster.
In this step we’re going to create a topic for use during this tutorial. Use the following command to create the topic:
confluent kafka topic create test-topic
Produce some messages to the Kafka topic.
confluent kafka topic produce test-topic
Enter a few records and use Ctrl-D when finished.
Apache
Kafka
Is
The
Best
You can count the number of messages in a Kafka topic simply by consuming the entire topic and counting how many messages are read.
To do this from the command line, you can use the kcat tool, which is built around the Unix philosophy of pipelines. This means that you can pipe the output (messages) from kcat into another tool like wc to count the number of messages.
As input, pass in the configuration/ccloud.properties file that you created in an earlier step.
docker run -it --network=host \
-v ${PWD}/configuration/ccloud.properties:/tmp/configuration/ccloud.properties \
edenhill/kcat:1.7.0 kcat \
-F /tmp/configuration/ccloud.properties \
-C -t test-topic \
-e -q \
| grep -v "Reading configuration from file" | wc -l
Let’s take a close look at the commandline soup we’ve used here to count the messages.
docker run -it --network=host -v ... edenhill/kcat:1.7.0: starts a container from the edenhill/kcat:1.7.0 image, mounts the local configuration/ccloud.properties file into it, and runs the following command with its arguments inside that container
\: is a line continuation character
kcat: runs kcat itself, passing in arguments as follows:
-F: read the Kafka cluster connection information from the given configuration file
-C: act as a consumer
-t: read data from the test-topic topic
-e: exit once the end of the topic is reached
-q: run quietly
|: pipes the messages from kcat to the next command
grep -v "Reading configuration from file": skips the log message about reading the configuration file
wc -l: reads the piped messages and writes the total number of lines (one message per line) to the screen
Finally, the output of the command is the message count.
5
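The counting half of the pipeline can be tried without a cluster. The following is a minimal sketch that feeds simulated kcat output (one log line plus the five records produced earlier) through the same grep and wc steps. The function name count_messages and the exact wording of the simulated log line are assumptions for illustration.

```shell
# Count messages the same way the kcat pipeline does:
# drop the configuration log line, then count the remaining lines.
# count_messages is a hypothetical helper name; tr strips the padding
# that some wc implementations add around the number.
count_messages() {
  grep -v "Reading configuration from file" | wc -l | tr -d '[:space:]'
}

# Simulated kcat output: an assumed log line followed by five records.
printf '%s\n' \
  "% Reading configuration from file /tmp/configuration/ccloud.properties" \
  Apache Kafka Is The Best | count_messages
echo
```

Because wc -l counts lines rather than messages, a record whose value contains a newline would be counted more than once with this approach.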
You may try another tutorial, but if you don’t plan on doing other tutorials, use the Confluent Cloud Console or CLI to destroy all of the resources you created. Verify they are destroyed to avoid unexpected charges.