
Event Standardizer

In most businesses, a variety of traditional applications and Event Processing Applications need to exchange Events across Event Streams. Downstream Event Processing Applications require standardized data formats in order to process these events correctly. However, with many different sources producing these events, such a standard is often missing, or the same standard is interpreted differently by different sources.

Problem

How can I process events that are semantically equivalent but arrive in different formats?

Solution

Source all of the input Event Streams into an Event Standardizer that passes events to a specialized Event Translator, which in turn converts the events into a common format understood by the downstream Event Processors.

Implementation

As an example, we can use the Kafka Streams client library of Apache Kafka® to build an Event Processing Application that reads from multiple input Event Streams and then maps the values to a new type. Specifically, we can use the mapValues function to translate each event type into the standard type expected on the output Event Stream.

SpecificAvroSerde<SpecificRecord> inputValueSerde = constructSerde();
StreamsBuilder builder = new StreamsBuilder();

builder
  // Source all of the input Event Streams into a single KStream
  .stream(List.of("inputStreamA", "inputStreamB", "inputStreamC"),
    Consumed.with(Serdes.String(), inputValueSerde))
  // Translate each concrete event type into the standard output type
  .mapValues((eventKey, eventValue) -> {
    if (eventValue.getClass() == TypeA.class) {
      return typeATranslator.normalize(eventValue);
    } else if (eventValue.getClass() == TypeB.class) {
      return typeBTranslator.normalize(eventValue);
    } else if (eventValue.getClass() == TypeC.class) {
      return typeCTranslator.normalize(eventValue);
    } else {
      // Unknown type: fail fast here, or route to a dead letter stream (see Considerations)
      throw new IllegalArgumentException("Unexpected event type: " + eventValue.getClass());
    }
  })
  .to("my-standardized-output-stream", Produced.with(Serdes.String(), outputSerdeType));

Considerations

  • When possible, diverging data formats should be normalized "at the source". This data governance approach is often called "Schema on Write" and can be implemented with the Schema Validator pattern. Enforcing schema validation before an event is written to the Event Stream lets consuming applications delegate their data format validation to the schema validation layer (a minimal producer-side sketch follows this list).
  • Error handling should be considered in the design of the standardizer. Categories of errors include serialization failures, unexpected or missing values, and unknown types (as in the example above). The Dead Letter Stream pattern is commonly used to handle such exceptional events in the Event Processing Application (a branching sketch follows this list).
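As a rough illustration of "Schema on Write", a producer can be configured with the Confluent Avro serializer so that every event is serialized against a schema managed by Schema Registry before it reaches the broker. This is only a minimal sketch, not this pattern's reference implementation: the broker address, Schema Registry URL, and the someTypeAEvent record are illustrative placeholders.

Properties producerConfig = new Properties();
producerConfig.put("bootstrap.servers", "localhost:9092");           // placeholder broker address
producerConfig.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
producerConfig.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
producerConfig.put("schema.registry.url", "http://localhost:8081");  // placeholder registry URL

// The Avro serializer registers or validates the event's schema with Schema Registry
// as part of the write path, moving format enforcement to the point of production.
try (Producer<String, SpecificRecord> producer = new KafkaProducer<>(producerConfig)) {
  producer.send(new ProducerRecord<>("inputStreamA", "some-key", someTypeAEvent));
}

For the unknown-type case, one option is to branch the stream before translation so that unrecognized events are routed to a Dead Letter Stream instead of failing the topology. The following sketch builds on the implementation above (reusing builder, inputValueSerde, and outputSerdeType); the "my-dead-letter-stream" topic name is an assumption for illustration.

Map<String, KStream<String, SpecificRecord>> branches = builder
  .stream(List.of("inputStreamA", "inputStreamB", "inputStreamC"),
    Consumed.with(Serdes.String(), inputValueSerde))
  .split(Named.as("events-"))
  // Events whose type has a translator continue toward standardization
  .branch((eventKey, eventValue) -> eventValue instanceof TypeA
                                 || eventValue instanceof TypeB
                                 || eventValue instanceof TypeC,
          Branched.as("known"))
  // Everything else lands in the default branch
  .defaultBranch(Branched.as("unknown"));

branches.get("events-known")
  .mapValues(eventValue -> {
    if (eventValue instanceof TypeA) return typeATranslator.normalize(eventValue);
    if (eventValue instanceof TypeB) return typeBTranslator.normalize(eventValue);
    return typeCTranslator.normalize(eventValue);
  })
  .to("my-standardized-output-stream", Produced.with(Serdes.String(), outputSerdeType));

// Unknown event types are preserved as-is for later inspection or reprocessing
branches.get("events-unknown")
  .to("my-dead-letter-stream", Produced.with(Serdes.String(), inputValueSerde));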
