Course: Building Data Pipelines with Apache Kafka® and Confluent

Hands On: Creating a Data Generator with Kafka Connect

4 min
Tim Berglund, Sr. Director, Developer Advocacy (Course Presenter)
Robin Moffatt, Staff Developer Advocate (Course Author)

In this exercise, you will create a new topic to hold ratings event data, and set up a data generator to populate the topic.

Create the ratings Topic

From the Topics page of your Confluent Cloud cluster, click on Add topic.

Name the topic ratings and ensure that "Number of partitions" is set to 6.

Creating a new topic

Click on Create with defaults.

Create a Data Generator with Kafka Connect

In a real deployment, the ratings topic would probably be populated by an application that uses the producer API to write messages to it. Here we’re going to use a data generator instead, which is available as a connector for Kafka Connect.
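To make the idea of a data generator concrete, here is a minimal Python sketch of the behaviour: it emits mock rating events and pauses for a random time, capped by a maximum interval, between them. This is not the Datagen connector's code. The function name, the field names, and the channel values are all assumptions chosen for illustration, and the sketch prints events rather than sending them to Kafka.

```python
import itertools
import json
import random
import time
from typing import Callable, Dict, Iterator, Optional

# Hypothetical values for illustration; not the connector's actual data set.
CHANNELS = ["web", "ios", "android"]


def generate_ratings(
    max_interval_ms: int = 1000,
    rng: Optional[random.Random] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Dict[str, object]]:
    """Yield an endless stream of mock rating events.

    After each event, pause for a random time of at most max_interval_ms.
    The field names below are assumptions, not the connector's schema.
    """
    rng = rng or random.Random()
    rating_id = 0
    while True:
        rating_id += 1
        yield {
            "rating_id": rating_id,
            "user_id": rng.randint(1, 20),
            "stars": rng.randint(1, 5),
            "channel": rng.choice(CHANNELS),
            "rating_time": int(time.time() * 1000),
        }
        # A real application would hand the event to a Kafka producer
        # before waiting; this sketch only waits.
        sleep(rng.uniform(0, max_interval_ms) / 1000.0)


if __name__ == "__main__":
    # Print three mock events as JSON, one per line.
    for event in itertools.islice(generate_ratings(), 3):
        print(json.dumps(event))
```

The `sleep` and `rng` parameters are injectable so the generator can be exercised quickly and deterministically, without real delays.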

  1. On Confluent Cloud, go to your cluster’s Connectors page.

    In the search box, enter datagen.

    Finding the datagen connector

    Select the Datagen Source connector.

  2. Under "Kafka Cluster credentials," click on Generate Kafka API key & secret.

    Enter a "Description" for the API key, and make a note of the generated key and secret; you’ll need them in later exercises.

    Kafka API details

  3. Set the remainder of the options as shown below.

    Which topic do you want to send data to?
        Topic name: ratings (as created in the step above)

    Output messages
        Output message format: AVRO

    Datagen Details
        Quickstart: RATINGS
        Max interval between messages (ms): 1000

    Number of tasks for this connector
        Tasks: 1

    Click Next

  4. On the confirmation screen, the JSON should look like this:

    {
      "name": "DatagenSourceConnector_0",
      "config": {
        "connector.class": "DatagenSource",
        "name": "DatagenSourceConnector_0",
        "kafka.api.key": "****************",
        "kafka.api.secret": "***********************",
        "kafka.topic": "ratings",
        "output.data.format": "AVRO",
        "quickstart": "RATINGS",
        "max.interval": "1000",
        "tasks.max": "1"
      }
    }
    

    If it doesn’t, return to the previous screen and amend the values as needed.

    Click Launch to instantiate the connector. This will take a few moments.

  5. On the "Connectors" page of your cluster, you should see the new connector listed. After a moment or two, its status should change to Running.

    Connector list including datagen

  6. From the "Topics" page of your cluster, select the ratings topic and then Messages. You should see a steady stream of new messages arriving:

    New messages arriving on the ratings topic
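The comparison in step 4 can also be done mechanically. The following is a minimal sketch, assuming you have copied the JSON from the confirmation screen into a string; the helper name `find_mismatches` and the `EXPECTED_SETTINGS` dictionary are hypothetical, and the expected values are simply those listed in this exercise. The API key and secret are left out of the check because they differ for every user.

```python
import json
from typing import Dict, Optional, Tuple

# Settings this exercise expects (credentials deliberately excluded).
EXPECTED_SETTINGS = {
    "connector.class": "DatagenSource",
    "kafka.topic": "ratings",
    "output.data.format": "AVRO",
    "quickstart": "RATINGS",
    "max.interval": "1000",
    "tasks.max": "1",
}


def find_mismatches(
    connector_json: str,
    expected: Dict[str, str] = EXPECTED_SETTINGS,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {setting: (expected, actual)} for every setting that differs.

    A missing setting is reported with an actual value of None.
    """
    config = json.loads(connector_json).get("config", {})
    return {
        key: (wanted, config.get(key))
        for key, wanted in expected.items()
        if config.get(key) != wanted
    }


if __name__ == "__main__":
    # Example input with one deliberately wrong value.
    shown = """
    {
      "name": "DatagenSourceConnector_0",
      "config": {
        "connector.class": "DatagenSource",
        "name": "DatagenSourceConnector_0",
        "kafka.topic": "ratings",
        "output.data.format": "JSON",
        "quickstart": "RATINGS",
        "max.interval": "1000",
        "tasks.max": "1"
      }
    }
    """
    print(find_mismatches(shown))
```

An empty result means the configuration matches the values in this exercise. In the example above, only `output.data.format` is reported, because it is set to JSON instead of AVRO.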

Use the promo code PIPELINES101 to receive $101 of free Confluent Cloud usage
