Event Splitter

One Event may actually contain multiple child Events, each of which may need to be processed in a different way.

Problem

How can an Event be split into multiple Events for distinct processing?

Solution

Split the original Event into multiple child Events. Then publish one Event for each of the child Events.

Implementation

Many event processing technologies support this operation. Apache Flink® SQL supports expanding an array into multiple Events via the UNNEST function. The example below processes each input Event, unnesting its array and generating a new Event for each element.

-- Each order carries an array of tags.
CREATE TABLE orders (
    order_id INT NOT NULL,
    tags ARRAY<STRING>
);

-- Unnest the array, producing one output row (Event) per tag.
CREATE TABLE exploded_orders AS
  SELECT order_id, tag
  FROM orders
  CROSS JOIN UNNEST(tags) AS t (tag);
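
For example, an input row with order_id = 1 and tags = ['rush', 'gift'] produces two rows in exploded_orders: (1, 'rush') and (1, 'gift').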

Kafka Streams, the Apache Kafka® client library, provides an analogous method, flatMap(). The example below processes each input Event and generates new Events with new keys and values.

KStream<Long, String> myStream = ...;

// Split each input Event into two output Events, each with a new key and value.
KStream<String, Integer> splitStream = myStream.flatMap(
    (eventKey, eventValue) -> {
      List<KeyValue<String, Integer>> result = new LinkedList<>();
      result.add(KeyValue.pair(eventValue.toUpperCase(), 1000));
      result.add(KeyValue.pair(eventValue.toLowerCase(), 9000));
      return result;
    }
  );
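
Here each input Event yields two child Events with fixed values; in a real application, the lambda body would derive the child Events from the contents of the parent Event.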

Or, as my grandmother used to say:

There once was a man from Manhattan,
With Events that he needed to flatten.
He cooked up a scheme
To call flatMap on stream,
Then he wrote it all down as a pattern.

Considerations

  • If child Events must be routed to different Event Streams, see the Event Router pattern, which routes Events to different destinations.
  • For capacity planning and sizing, consider that splitting the original Event into N child Events leads to write amplification, increasing the volume of Events that must be managed by the Event Streaming Platform.
  • A use case may require that you track the lineage of parent and child Events. If so, ensure that each child Event includes a data field with a reference to the original parent Event (for example, a unique identifier); a minimal sketch follows this list.
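
As a rough illustration of the lineage consideration above, the Kafka Streams sketch below copies the parent Event's identifier into every child Event before publishing. The ParentEvent and ChildEvent types are hypothetical, introduced only for this example; a real application would use its own domain types and serdes.

// Hypothetical domain types, introduced only for this illustration.
record ParentEvent(String parentId, List<String> items) {}
record ChildEvent(String parentId, String item) {}

KStream<String, ParentEvent> parentStream = ...;

// Split each parent Event into one child Event per item, carrying the
// parent's identifier so that downstream consumers can trace lineage.
KStream<String, ChildEvent> childStream = parentStream.flatMapValues(
    parent -> parent.items().stream()
        .map(item -> new ChildEvent(parent.parentId(), item))
        .toList()
);

Keeping the parent's identifier on each child Event also makes it straightforward to correlate or re-aggregate the child Events downstream.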

References

  • This pattern is derived from Splitter in Enterprise Integration Patterns, by Gregor Hohpe and Bobby Woolf.
