Senior Developer Advocate (Presenter)
In this hands-on demonstration, we will create a fully managed Datagen source connector and configure it with two Single Message Transforms (SMTs): one that casts the values of selected fields and one that converts the format of a timestamp field.
This exercise will be nearly the same as the Hands On: Getting Started with Kafka Connect exercise other than the addition of the two SMTs.
Let’s get started.
First things first: we’re using the Datagen connector in this exercise, so let’s find and select it using the filter.
We will once again generate sample data using the Orders record schema, but since we want to transform individual messages using a Single Message Transform, we will need to use the advanced configuration options.
Let’s add our first SMT.
We could accept the default label for this transform, but the configuration is easier to read if we give it a name that corresponds to the SMT being used. We’re going to create an SMT to cast fields from each message, so we’ll call it “castValues”.
We also need to identify which SMT we want to use.
For this SMT, we need to enter a list of fields we want to transform and the type we want to cast each of them to. We do this by specifying the field name and the desired type, separated by a colon. We can transform any number of fields by providing these pairs in a comma-delimited list.
The configuration of our first SMT is complete.
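For reference, the completed transform shows up in the connector’s JSON configuration as a handful of `transforms.*` properties. The transform class below is the standard Kafka Connect Cast SMT; the field names and target types in `spec` are illustrative, assuming we cast `orderid` to a string and `orderunits` to a whole number:

```json
{
  "transforms": "castValues",
  "transforms.castValues.type": "org.apache.kafka.connect.transforms.Cast$Value",
  "transforms.castValues.spec": "orderid:string,orderunits:int32"
}
```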
Let’s now add a second SMT. This time, we’re creating an SMT to convert the timestamps of each message. Again, we’ll give it an appropriate name and fill out the required configuration parameters.
Click Add a single message transform.
Set the value of Transform name equal to convertTimestamp.
In the Transform type list, select org.apache.kafka.connect.transforms.TimestampConverter$Value.
Set the value of target.type equal to string.
This tells the SMT the resulting value should be type string.
Set the value of field equal to ordertime.
Set the value of format equal to yyyy-MM-dd.
This is the format the ordertime field will be changed to by the SMT. That completes the SMT configuration.
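Putting the values above together, the second transform appears in the connector’s JSON configuration roughly as follows. Note that the `transforms` list chains both SMTs, and they are applied in the order listed:

```json
{
  "transforms": "castValues,convertTimestamp",
  "transforms.convertTimestamp.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
  "transforms.convertTimestamp.target.type": "string",
  "transforms.convertTimestamp.field": "ordertime",
  "transforms.convertTimestamp.format": "yyyy-MM-dd"
}
```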
Let’s continue to the next step.
For this Datagen connector instance, we will once again write the output records to the orders topic.
In order for our connector to communicate with our cluster, we need to provide an API key for it. You can use an existing API key and secret, or create one here, as we’re doing.
We will also use the default sizing for this instance of the connector.
Before we launch the connector, let’s examine its JSON configuration and identify the SMT related settings.
Notice the configuration for the two transforms is included in the connector configuration.
You could also create the connector using this same JSON configuration with either the Confluent Cloud Connect API or the Confluent CLI. Other than having to provide the actual values for the API key and secret, the JSON is ready to use.
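As a rough sketch, a complete configuration for this connector might look like the following. The exact property names are best copied from the JSON shown in the Confluent Cloud console; the values below (other than the SMT settings configured earlier) are illustrative placeholders:

```json
{
  "name": "DatagenSourceConnector_orders",
  "connector.class": "DatagenSource",
  "kafka.api.key": "<API_KEY>",
  "kafka.api.secret": "<API_SECRET>",
  "kafka.topic": "orders",
  "quickstart": "ORDERS",
  "output.data.format": "AVRO",
  "tasks.max": "1",
  "transforms": "castValues,convertTimestamp",
  "transforms.castValues.type": "org.apache.kafka.connect.transforms.Cast$Value",
  "transforms.castValues.spec": "orderid:string,orderunits:int32",
  "transforms.convertTimestamp.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
  "transforms.convertTimestamp.target.type": "string",
  "transforms.convertTimestamp.field": "ordertime",
  "transforms.convertTimestamp.format": "yyyy-MM-dd"
}
```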
Let’s now launch the connector and observe the result of the SMTs.
Now that the connector is running, let’s view messages being written to the topic and compare messages produced without the SMTs to those produced using the SMTs.
Using the Jump to offset option, locate a range of records that includes the last few with the original ordertime format and the first few with the transformed ordertime format.
Note: You will need to click the pause button as soon as the display jumps to the target records to keep them in view.
As you can see in the current view, offsets 69 and 70 have the original ordertime format and the messages written after offset 70 have the updated ordertime format. Notice also the change in data type for the orderid and orderunits fields.
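To make the change concrete, here is what records might look like on either side of that cutover. The values are illustrative, assuming ordertime was originally an epoch-milliseconds timestamp and orderid was cast to a string:

```
Before (offset 70):
{"ordertime": 1497014222380, "orderid": 70, "itemid": "Item_184", "orderunits": 3.61}

After (offset 71):
{"ordertime": "2017-06-09", "orderid": "71", "itemid": "Item_185", "orderunits": 4}
```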
Before we end the exercise, let’s delete the connector so we don’t unnecessarily deplete any Confluent Cloud promotional credits.
Let’s also delete the orders topic.
We will not delete the kc-101 cluster at this time since we will be using it in other exercises that are part of this course.
This concludes this exercise.
Hey there, Danica Fine here. In this hands-on exercise we'll walk through the process of creating a fully-managed Datagen connector, and also configure it to use the cast value single message transform to cast a few fields from the data before it's written to Kafka. Note that this exercise will be nearly the same as the hands-on Getting Started with Kafka Connect exercise, other than the addition of the two SMTs. So let's get started. First things first. We're using the Datagen connector in this exercise, so let's find and select it using the filter. Just as in the last hands-on exercise, we'll once again generate sample data using the orders record schema, but since we want to transform individual messages using a single message transform, we'll need to use the advanced configuration options. Let's add our first SMT. Now, from here, we could assign the default name for this transform but it makes the configuration a lot easier to read if we give it a name that corresponds to the SMT that will be used. So let's go ahead and do that. We're going to create an SMT to cast fields from each message, so we'll be pretty original and call it Cast Values. At this stage, we also need to identify which SMT we want to use. This particular SMT accepts a list of fields to transform as well as the type we want to cast them to. We do this by specifying the field name and the desired type, separated by a colon. We can add any number of fields and new types using a comma-delimited list. All right, the configuration of our first SMT is complete. To make this more interesting, though, we'll go ahead and add a second SMT. It's pretty much the same process but this time we're creating an SMT to convert the timestamps of each message. Again, we'll give it an appropriate name and fill out the required configuration parameters. That completes the SMT configuration, but now we have to continue setting up the Datagen connector instance. 
Once again, we'll be writing the output records to the orders topic. As usual, in order for our connector to communicate with our cluster, we need to provide an API key for it to use. You can either use an existing API key and secret or create one here, as we are doing. We'll also be using the default sizing for this instance of the connector. Before we launch the connector, let's take a moment to examine its JSON configuration and identify the SMT related settings. Notice the configuration for the two transforms is included in the connector configuration. You could also create the connector using either the Confluent Cloud Connect API or Confluent CLI and this same JSON configuration. Other than having to provide the actual value for the API key and secret, the JSON is ready to use. All right, let's launch the connector and observe the result of the SMTs. With the connector running, we can view messages being written to the topic. Since we used this same Orders topic from the last hands-on module, we can compare the messages without the SMTs to those being produced now using the SMTs. We'll use the Jump to offset option to locate a range of records that includes the last few with the original order time format and the first few with the transformed order time format. Note that you will need to click the pause button as soon as the display jumps to the target records in order to keep them in view. As you can see in the current view, offset 69 and 70 have the original order time format and the messages written after offset 70 have the updated order time format. Notice also the change in the data type for the order ID and order units fields. Now, before we end this exercise, let's delete the connector so that we don't unnecessarily deplete any Confluent Cloud promotional credits. Let's also go ahead and delete the orders topic but we won't delete the KC101 cluster at this time since we'll be using it in other exercises that are a part of this course. 
And that wraps up our brief hands-on tour of SMTs. Now you know how to configure a connector to use them using the Confluent Cloud console.