Effective schema management requires an understanding of the following concepts and processes:
In this section you will learn how to manage schema files. Schema management starts with registering schemas. It also includes updating schemas and viewing or downloading them. Testing a schema will also be covered when we get to the section on evolving a schema.
Once you’ve written a schema you will want to register it. There are multiple ways to register a schema, but let’s first talk about what happens during registration.
When you register a schema, you need to provide the subject name and the schema itself. The subject name is the namespace for the schema, almost like a key in a hash map. The standard naming convention is topic-name-key or topic-name-value. There are other strategies for subject names, which are covered in a later module.
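To make the naming convention concrete, here is a minimal sketch of a helper that derives a subject name from a topic name. The function name subject_name is made up for illustration and is not part of any Confluent library.

```python
def subject_name(topic: str, is_key: bool = False) -> str:
    """Build a subject name using the topic-name-key / topic-name-value convention."""
    suffix = "key" if is_key else "value"
    return f"{topic}-{suffix}"


print(subject_name("purchases"))               # purchases-value
print(subject_name("purchases", is_key=True))  # purchases-key
```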
Once Schema Registry receives the schema, it assigns it a unique ID number and a version number. The first time you register a schema for a given subject name, it is assigned version 1.
There are several methods available to register a schema. You can use the Confluent CLI, Schema Registry REST API, Confluent Cloud Console, or the Maven and Gradle plugins discussed earlier.
This example illustrates how to register a schema using the Confluent CLI. Take note of the subject purchases-value. This indicates that the schema represents the value part of the key-value records in the purchases topic. The default type for Schema Registry is AVRO, so if you are registering a schema of any other type, you must specify it, e.g., PROTOBUF in this example.
Note: If you have a schema for the key part of the record, the subject name would be purchases-key. While you can have a schema for the key, this course takes the opinionated approach that scalar values (string, integer, etc.) are better suited for the key, so we will only discuss using schemas for the value for the remainder of this course.
This next example uses the command-line JSON tool jq in conjunction with the Schema Registry REST API. Since Avro schemas are defined in JSON, you can use them as is in the command.
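As a rough Python counterpart to the curl and jq approach, the sketch below builds the registration request for an Avro schema. The registry address and the Purchase schema fields are assumptions made up for illustration. The endpoint path and content type follow the Schema Registry REST API. The request is only constructed here, not sent, because sending it requires a running Schema Registry.

```python
import json
import urllib.request

# Assumed local Schema Registry address; replace it with your own endpoint.
SCHEMA_REGISTRY_URL = "http://localhost:8081"

# Hypothetical Avro schema for illustration; the course's actual schema is not shown here.
PURCHASE_SCHEMA = {
    "type": "record",
    "name": "Purchase",
    "namespace": "io.example",
    "fields": [
        {"name": "item", "type": "string"},
        {"name": "total_cost", "type": "double"},
        {"name": "customer_id", "type": "string"},
    ],
}


def build_register_request(subject, schema, base_url=SCHEMA_REGISTRY_URL):
    """Build (but do not send) a POST request that registers an Avro schema."""
    # The REST API expects the schema as a JSON string nested inside the JSON body.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/subjects/{subject}/versions",
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


request = build_register_request("purchases-value", PURCHASE_SCHEMA)
print(request.get_method(), request.full_url)

# Sending needs a reachable Schema Registry, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))  # on success, a JSON object containing the schema ID
```

Because an Avro schema is already JSON, the only transformation needed is turning it into a string before nesting it in the request body.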
Since Protobuf schema definitions are not written in JSON, you first need to get them into JSON format before you can register them using the REST API. In this example, a helper script is used to parse and format the purchase.proto schema into JSON format. The curl command then uses this JSON-formatted output from the script.
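The course's helper script is not reproduced here, but the formatting step it performs could be sketched in Python as follows. The proto definition below is a made-up stand-in for purchase.proto, and the function name is hypothetical. The schemaType and schema fields are the ones the Schema Registry REST API accepts in a registration request.

```python
import json

# Hypothetical contents of purchase.proto; the course's real file is not reproduced here.
PURCHASE_PROTO = '''syntax = "proto3";

package io.example;

message Purchase {
  string item = 1;
  double total_cost = 2;
  string customer_id = 3;
}
'''


def protobuf_register_payload(proto_text: str) -> str:
    """Wrap a Protobuf definition in the JSON body used for registration."""
    # schemaType is set explicitly because Schema Registry assumes AVRO when it is omitted.
    # json.dumps escapes the quotes and newlines in the .proto text.
    return json.dumps({"schemaType": "PROTOBUF", "schema": proto_text})


payload = protobuf_register_payload(PURCHASE_PROTO)
print(payload)
```

The resulting string is what you would pass as the request body, for example as the data argument of a curl command.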
The Gradle and Maven plugins discussed earlier are also capable of registering a schema. This example illustrates registering two schemas using the Gradle Schema Registry plugin.
In the build.gradle script, you need to provide a subject entry in the register block which includes the path to the schema file and the schema type. The register block can include one or multiple subject entries, and each entry can be of the AVRO, PROTOBUF, or JSON_SR schema type.
Using a plugin with your development environment could potentially be the easiest approach to registering schemas. A single registerSchemasTask command registers all the entries in the register block.
When you are using a Kafka producer, you can enable it to “auto-register” a schema. In this case, if a producer is unable to retrieve the ID for its schema from Schema Registry because it has not been previously registered, it will respond by registering the schema. While this is very handy for development, it is not something you want to do in production.
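One way to keep auto-registration out of production is to derive the serializer setting from the deployment environment. The sketch below is an assumption-laden illustration: the function name and the environment labels are made up, while auto.register.schemas is the property name used by the Confluent serializers. Check the documentation for your client version before relying on it.

```python
def serializer_config(environment: str) -> dict:
    """Return serializer settings, allowing auto-registration only outside production."""
    is_production = environment == "production"  # hypothetical environment label
    return {
        # When this is False, schemas must already be registered before producing.
        "auto.register.schemas": not is_production,
    }


print(serializer_config("development"))
print(serializer_config("production"))
```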
At some point you will likely need to update previously registered schemas. To do so, you use the same methods that you used to first register the schema. Provided you are making compatible changes (compatibility is covered in an upcoming module), Schema Registry simply assigns the schema a new ID and a new version number. The new ID is not guaranteed to be sequential, but the version number is always incremented by one, so versions within a subject are sequential.
You will see shortly how the version number is more important than the ID when viewing the schema.
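The difference between IDs and versions can be illustrated with a small in-memory model. This is a toy sketch only, not how Schema Registry is implemented: it skips compatibility checks entirely, and the class and its behavior are assumptions chosen to show why versions within a subject are consecutive while IDs may not be.

```python
from itertools import count
from typing import Tuple


class ToyRegistry:
    """In-memory illustration only; not Schema Registry's actual implementation."""

    def __init__(self):
        self._next_id = count(1)  # one ID sequence shared by all subjects
        self._ids = {}            # schema text -> ID
        self._subjects = {}       # subject -> list of schema texts (index + 1 is the version)

    def register(self, subject: str, schema: str) -> Tuple[int, int]:
        """Register a schema and return its (ID, version) pair."""
        versions = self._subjects.setdefault(subject, [])
        if schema not in versions:
            versions.append(schema)  # a new version, one higher than the last
        if schema not in self._ids:
            self._ids[schema] = next(self._next_id)
        return self._ids[schema], versions.index(schema) + 1


registry = ToyRegistry()
first = registry.register("purchases-value", "purchase schema v1")
other = registry.register("customers-value", "customer schema v1")
second = registry.register("purchases-value", "purchase schema v2")
print(first, other, second)  # (1, 1) (2, 1) (3, 2)
```

In this model the purchases-value subject ends up with versions 1 and 2 but IDs 1 and 3, because another subject claimed an ID in between.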
Here are the options for viewing or downloading a schema. Notice that you provide the version number and the subject name when you are pulling the schema from Schema Registry. You can also leave out the version number and use the word latest, which gives you the latest version available for the schema.
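For the REST API option, the lookup comes down to a URL made of the subject name and either a version number or the word latest. The sketch below only builds that URL. The registry address and the function name are assumptions for illustration, while the path layout follows the Schema Registry REST API.

```python
# Assumed local Schema Registry address; replace it with your own endpoint.
SCHEMA_REGISTRY_URL = "http://localhost:8081"


def schema_version_url(subject, version="latest", base_url=SCHEMA_REGISTRY_URL):
    """Build the URL for fetching one version of a subject's schema."""
    # version may be a number such as 1 or the literal string "latest".
    return f"{base_url}/subjects/{subject}/versions/{version}"


print(schema_version_url("purchases-value", 1))
print(schema_version_url("purchases-value"))
```

A GET request to either URL needs a running Schema Registry, so it is not issued here.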
The Gradle plugin also has an option (not shown here) to view the schema, called downloadSchemasTask.