Senior Developer Advocate (Presenter)
In this final module, you will learn about Schema compatibility, the different settings you have at your disposal, and how to make use of them.
Producer and consumer clients have expectations of the structure of the object they’re working with. Take the Foo class, for example. It has four fields: a long, a double, and two strings.
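As a concrete illustration, here is a minimal sketch of an Avro schema that a class like Foo could be generated from. Apart from the UUID field referenced in the scenario below, the field names are hypothetical.

```json
{
  "type": "record",
  "name": "Foo",
  "doc": "Hypothetical illustration; only the uuid field name comes from the scenario described here",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "amount", "type": "double"},
    {"name": "description", "type": "string"},
    {"name": "uuid", "type": "string"}
  ]
}
```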
Imagine for a moment that the development team responsible for getting the data into Kafka feels that the UUID field is not useful anymore. They decide to update the object structure and remove the UUID field.
The changes are made and the producer application starts to send the updated objects into Kafka. The consumer application receives the updated objects, expects the UUID field to be present, and in response to its absence throws an exception, forcing the application to shut down.
Analysis quickly reveals that the object structure changed, making the new schema incompatible with the previous one. What is needed is a way to verify compatibility between different schema versions before they are deployed to production, so this type of exception can be prevented.
Schema Registry offers compatibility verification. The available compatibility modes and associated checks establish guardrails that guide you to safely update schemas and allow you to keep your clients operational as they are updated with the changes.
Schema Registry validates schema changes at the time they are registered, based upon the compatibility setting. The compatibility checks are done on a per-subject basis. As we saw earlier in the course, the subject name remains the same even when a schema evolves, so it’s a consistent value to use for compatibility checks.
In this example, a simple check would let you know ahead of time whether the new version of the schema being used by the producer is compatible with the previous version being used by the consumer. In this case, the check would not have passed, because the removed UUID field has no default value, and you would know to update the consumer to use the new schema before updating the producer.
The three primary compatibility modes are backward, which is the default setting, forward, and full.
With all three of these modes, the associated test verifies the compatibility of a schema being registered with its immediate predecessor.
Note that it is also possible to set the compatibility mode to none which results in no compatibility check being done when a new schema is registered.
In this example, if the compatibility mode is backward, forward, or full, when version 3 is registered, Schema Registry will only verify that it is compatible with version 2. Compatibility with other previous versions is not verified.
There are also three transitive modes that are extensions of the three primary modes. These modes extend the compatibility requirement to all previously registered schema versions.
The difference with transitive compatibility is that the latest version of a schema you just updated is verified to be compatible with all previous versions, not just the immediate previous one.
Now let’s review how each of the three compatibility modes works. The transitive variants are not discussed separately since they follow exactly the same rules; they simply apply the check to all previous versions rather than just the latest one.
With backward compatibility you can delete fields from a schema and add fields as long as they have a default value.
With this mode, the order of updating your clients matters: update the consumer clients first.
In this example the schema is evolved by removing field_A and adding NewField, which has a default value associated with it. By updating the consumer client first, if a producer using the previous schema sends records to Kafka, the updated consumer client will continue to work fine. It will ignore the deleted field_A on the record and use the default value provided for NewField.
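As a sketch of what this backward-compatible evolution could look like in Avro, assume a hypothetical version 1 with two string fields, field_A and field_B. Version 2 drops field_A and adds NewField with a default, so a consumer on version 2 can still read records written with version 1:

```json
{
  "type": "record",
  "name": "SampleRecord",
  "doc": "Hypothetical version 2: field_A removed, NewField added with a default value",
  "fields": [
    {"name": "field_B", "type": "string"},
    {"name": "NewField", "type": "string", "default": ""}
  ]
}
```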
Now let’s move on to forward compatibility.
With forward compatibility you can delete fields that have a default and add new fields.
In this case, you update the producer clients first.
Now if you have a consumer you haven’t updated yet, it will continue to operate and use the default value for the field that was deleted and simply ignore the newly added field.
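Continuing the hypothetical example, a forward-compatible change might look like the sketch below: the old schema’s field_A carried a default, so it can be deleted, and the newly added field does not need a default because consumers still on the old schema simply ignore it.

```json
{
  "type": "record",
  "name": "SampleRecord",
  "doc": "Hypothetical new producer schema: drops a field that had a default in the old schema and adds a field without one",
  "fields": [
    {"name": "field_B", "type": "string"},
    {"name": "added_field", "type": "long"}
  ]
}
```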
With full compatibility, any field you add or delete must have a default value, so the order of updating clients doesn’t matter; with either order, both old and new consumers will continue to work properly.
This table summarizes the compatibility matrix for Schema Registry.
Now that we’ve covered the different compatibility levels, let’s have a quick look at how you would set the level for a schema subject.
It’s important to note that you can set compatibility at the subject level, so it’s possible to have schemas with different compatibility levels.
This first example illustrates how to set the compatibility level with the Confluent CLI. Note that the security configs were left off for clarity.
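A sketch of such a command, assuming a hypothetical subject named orders-value (flag names may vary slightly between CLI versions):

```bash
confluent schema-registry subject update orders-value --compatibility BACKWARD
```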
This next example uses the Schema Registry REST command.
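For instance, the REST API exposes a per-subject config resource; the subject name and the localhost URL below are placeholders for your environment:

```bash
curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"compatibility": "BACKWARD"}' \
  http://localhost:8081/config/orders-value
```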
This last example sets the schema compatibility using the Gradle plugin.
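A minimal sketch, assuming the community kafka-schema-registry-gradle-plugin is applied in build.gradle; the subject name and URL are placeholders, and the exact DSL may differ between plugin versions:

```groovy
// build.gradle (assumes the com.github.imflog.kafka-schema-registry-gradle-plugin has been applied)
schemaRegistry {
    url = 'http://localhost:8081'
    config {
        // Set the compatibility level for this subject
        subject('orders-value', 'BACKWARD')
    }
}
```

With that configuration in place, running the plugin’s config task (configSubjectsTask in recent versions of the plugin) applies the setting.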
Finally, you can check whether the changes you made to a schema are compatible with the versions already registered under the subject, given its current compatibility setting.
This first example does this check using the Confluent CLI.
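A sketch of the CLI check, assuming the candidate schema has been saved to a local file named updated-schema.avsc and the subject is orders-value:

```bash
confluent schema-registry compatibility validate --subject orders-value \
  --schema updated-schema.avsc --type avro
```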
This second example does this check using the Schema Registry REST API.
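For example, the REST API provides a compatibility resource that tests a candidate schema against a registered version; the subject, URL, and schema below are placeholders:

```bash
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\":\"record\",\"name\":\"SampleRecord\",\"fields\":[{\"name\":\"field_B\",\"type\":\"string\"}]}"}' \
  http://localhost:8081/compatibility/subjects/orders-value/versions/latest
```

The response reports the result in an is_compatible field.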
This last example checks using the Schema Registry Gradle plugin configuration.
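Again assuming the community Gradle plugin, a compatibility block along these lines points the check at a local schema file; the path and subject are hypothetical:

```groovy
schemaRegistry {
    url = 'http://localhost:8081'
    compatibility {
        // Test this local schema file against the registered versions of the subject
        subject('orders-value', 'src/main/avro/sample_record_v2.avsc', 'AVRO')
    }
}
```

Running the plugin’s test task (testSchemasTask) then reports whether the change is compatible.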
With that, we’ve wrapped up the final Schema Registry course module, and you should now know the options that you have to validate your schemas and keep them compatible as they evolve over time.