
Event Processor

Once our data -- financial transactions, shipment tracking updates, IoT sensor measurements, and so on -- is set in motion as streams of events on an Event Streaming Platform, we want to put it to use and create value from it. How do we do this?

Problem

How can we process Events in an Event Streaming Platform?

Solution

We build an Event Processor, which is a component that reads Events and processes them, possibly writing new Events as the result of its processing. An Event Processor may act as an Event Source, an Event Sink, or both; in practice, acting as both is common. An Event Processor can be distributed, meaning that it has multiple instances running across different machines, in which case the processing of Events happens concurrently across those instances.

An important characteristic of an Event Processor is that it should allow for composition with other Event Processors. In practice, we rarely use a single Event Processor in isolation. Instead, we compose and connect one or more Event Processors (via Event Streams) inside an Event Processing Application that fully implements one particular use case end-to-end, or that implements a subset of the overall business logic limited to the bounded context of a particular domain (for example, in a microservices architecture).

An Event Processor performs a specific task within the Event Processing Application. Think of the Event Processor as one processing node, or processing step, of a larger processing topology. Examples are the mapping of an event type to a domain object, filtering only the important events out of an Event Stream, enriching an event stream with additional data by joining it to another stream or database table, triggering alerts, or creating new events for consumption by other applications.
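To make this concrete, here is a minimal sketch, in the Kafka Streams DSL used later in this article, of two Event Processors composed via Event Streams inside one application. The topic names sensor-events and alerts, the SensorEvent class, and the alerting threshold are illustrative assumptions, not part of the pattern itself.

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class AlertingTopology {
    static void build(StreamsBuilder builder) {
        // Source processor: create an Event Stream from an input topic.
        // SensorEvent is a hypothetical POJO with a public double "reading" field.
        KStream<String, SensorEvent> events = builder.stream("sensor-events");

        // Filtering processor: keep only the events that exceed a threshold.
        KStream<String, SensorEvent> critical =
            events.filter((key, event) -> event.reading > 100.0);

        // Sink processor: publish the filtered events for other applications to consume.
        critical.to("alerts");
    }
}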

Implementation

There are multiple ways to create an Event Processing Application from Event Processors. We will look at two: Flink SQL and Kafka Streams.

Flink SQL

Flink SQL provides a familiar, standard SQL syntax for creating Event Processing Applications. Flink SQL parses our SQL statements, then constructs and manages the Event Processors that make up the Event Processing Application.

In the following example, we create a Flink SQL query that reads data from the readings Event Stream and "cleans" the Event values. The query publishes the clean readings to a new table called clean_readings. Here, this query acts as an Event Processing Application comprising multiple interconnected Event Processors.

CREATE TABLE clean_readings AS
    SELECT sensor,
           reading,
           UPPER(location) AS location
    FROM readings;

With Flink SQL, we can view each section of the command as the construction of a different Event Processor:

  • CREATE TABLE defines the new output Event Stream to which this application will produce Events.
  • SELECT ... is a mapping function, taking each input Event and "cleaning" it as defined. In this example, cleaning simply means uppercasing the location field in each input reading.
  • FROM ... is a source Event Processor that defines the input Event Stream for the overall application.

Kafka Streams

The Kafka Streams DSL provides abstractions for Event Streams and Tables, as well as stateful and stateless transformation functions (map, filter, and others). Each of these functions can act as an Event Processor in the larger Event Processing Application that we build with the Kafka Streams library.

builder
  .stream("readings")        // source: create an Event Stream from the readings topic
  .mapValues((key, value) -> // transform: uppercase the location field of each Reading
    new Reading(value.sensor, value.reading, value.location.toUpperCase()))
  .to("clean");              // sink: write the cleaned Events to the clean topic

In the above example, we use the Kafka Streams StreamsBuilder to construct the stream processing topology.

  • First, we create an input stream, using the stream function. This creates an Event Stream from the designated Apache Kafka® topic.
  • Next, we transform the Events, using the mapValues function. This function receives each Event's key and value and returns a new value (here, a Reading with an uppercased location), while the key passes through unchanged.
  • Finally, we write the transformed Events to a destination Kafka topic, using the to function. This function terminates our stream processing topology; the sketch below shows how the topology might sit inside a complete application.
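For context, here is a minimal sketch of how this topology might sit inside a complete, runnable application. The application id, broker address, and ReadingSerde class are illustrative assumptions; Reading is assumed to be a POJO with public sensor, reading, and location fields.

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class CleanReadingsApp {
    public static void main(String[] args) {
        // Basic Kafka Streams configuration; the id and broker address are placeholders.
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "clean-readings-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // ReadingSerde is a hypothetical custom Serde for the Reading class.
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, ReadingSerde.class);

        // The same topology as above: source -> transformation -> sink.
        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, Reading>stream("readings")
               .mapValues(value ->
                   new Reading(value.sensor, value.reading, value.location.toUpperCase()))
               .to("clean");

        // Start the application and close it cleanly on shutdown.
        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}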

Considerations

  • While it can be tempting to build a "multi-purpose" Event Processor, it's important to design processors in a composable way. By building processors as discrete units, it is easier to reason about what each processor does, and by extension, what the Event Processing Application does; the sketch below illustrates this.
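As a sketch of that composability, each Event Processor can be written as a small, separately testable function over a KStream, with the application as their composition. The Reading class and the topic names carry over from the example above and remain assumptions.

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class ComposableProcessors {
    // Discrete processor #1: keep only readings that carry a location.
    static KStream<String, Reading> withLocation(KStream<String, Reading> readings) {
        return readings.filter((key, value) -> value.location != null);
    }

    // Discrete processor #2: normalize the location field.
    static KStream<String, Reading> normalized(KStream<String, Reading> readings) {
        return readings.mapValues(value ->
            new Reading(value.sensor, value.reading, value.location.toUpperCase()));
    }

    // The application composes the two processors into one topology.
    static void buildTopology(StreamsBuilder builder) {
        normalized(withLocation(builder.<String, Reading>stream("readings")))
            .to("clean");
    }
}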
