Integration Architect (Presenter)
Principal Developer Advocate (Author)
There are different ways to serialize data written to Apache Kafka topics. Common options include Avro, Protobuf, and JSON.
You can use ksqlDB to create a new stream of data identical to the source but serialized differently. This can be useful, for example, when a legacy system needs the data in a format other than the one it was originally written in.
To write a stream of data from its CSV source to a stream using Protobuf, you would first declare the schema of the CSV data:
CREATE STREAM source_csv_stream (ITEM_ID INT,
                                 DESCRIPTION VARCHAR,
                                 UNIT_COST DOUBLE,
                                 COLOUR VARCHAR,
                                 HEIGHT_CM INT,
                                 WIDTH_CM INT,
                                 DEPTH_CM INT)
  WITH (KAFKA_TOPIC='source_topic',
        VALUE_FORMAT='DELIMITED');
Then you would use a continuous query to write all of these events to a new ksqlDB stream serialized as Protobuf:
CREATE STREAM target_proto_stream
  WITH (VALUE_FORMAT='PROTOBUF')
  AS SELECT * FROM source_csv_stream;
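One way to check the conversion is to write a test event to the source stream and then inspect the target. The statements below are a minimal sketch rather than part of the original lesson. The row values are made up, and the topic name assumes that ksqlDB used its default for the new stream (the stream name in upper case) because no KAFKA_TOPIC was given. The PROTOBUF format also assumes that ksqlDB is connected to Schema Registry.

```sql
-- Hypothetical test row; the values are made up for illustration.
INSERT INTO source_csv_stream
  (ITEM_ID, DESCRIPTION, UNIT_COST, COLOUR, HEIGHT_CM, WIDTH_CM, DEPTH_CM)
  VALUES (1, 'Desk lamp', 19.99, 'Blue', 40, 15, 15);

-- Show the target stream's columns together with its serialization formats.
DESCRIBE target_proto_stream EXTENDED;

-- Read the first few records from the backing topic.
-- Assumption: the topic has the default name, the upper-cased stream name.
PRINT 'TARGET_PROTO_STREAM' FROM BEGINNING LIMIT 3;
```

If the conversion worked, DESCRIBE should report PROTOBUF as the value format of the target stream, while the source stream still reports DELIMITED.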
Hi, I'm Allison Walther with Confluent. Let's talk about converting data formats with ksqlDB. Now, depending on your use case, you may want your data in different formats, and that's exactly what we'll cover in this lesson. ksqlDB provides a way for us to specify the data format for our event streams and to convert event data from one format to another. Many event streams flowing through Kafka are in Avro format, which is very efficient and useful, but we may have a legacy system or a particularly masochistic co-worker that needs this event data in a comma-delimited format. We can do that by creating a new stream and using the VALUE_FORMAT property in a WITH clause. Our new stream will mirror the existing Avro stream, but the data will be comma-separated. This feature enables ksqlDB to provide data for a wide variety of applications. That's it for this lesson, let's move into an exercise.
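The Avro-to-delimited conversion described in the transcript could look something like the sketch below. It is an illustration only: the stream names and the topic name source_avro_topic are hypothetical, and it assumes the topic's Avro schema is already registered in Schema Registry so that ksqlDB can infer the columns without an explicit column list.

```sql
-- Assumption: 'source_avro_topic' is a hypothetical topic whose Avro schema
-- is registered in Schema Registry, so no column list is declared here.
CREATE STREAM source_avro_stream
  WITH (KAFKA_TOPIC='source_avro_topic',
        VALUE_FORMAT='AVRO');

-- Mirror every event into a new stream whose values are comma-separated.
CREATE STREAM target_csv_stream
  WITH (VALUE_FORMAT='DELIMITED')
  AS SELECT * FROM source_avro_stream;
```

Note that the DELIMITED format does not support ARRAY, MAP, or STRUCT columns, so a source stream with nested fields would need those fields flattened or left out of the SELECT instead of using SELECT *.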