An Event Grouper is a specialized form of an Event Processor that groups events by a common field, such as a customer ID, and/or by event timestamps (often called windowing or time-based windowing).
How can we group individual but related events from the same Event Stream or Table, so that they can subsequently be processed as a whole?
For time-based grouping (a.k.a. time-based windowing), we use an Event Processor that groups the related events into windows based on their event timestamps. Most window types have a pre-defined window size, such as 10 minutes or 24 hours. An exception is session windows, where the size of each window varies depending on the time characteristics of the grouped events.
For field-based grouping, we use an Event Processor that groups events by one or more data fields, irrespective of the event timestamps.
The two grouping approaches are orthogonal and can be composed. For example, to compute 7-day averages for every customer in a stream of payments, we first group the events in the stream by customer ID and by 7-day windows, and then compute the respective averages for each customer+window grouping.
As an example, Apache Flink® SQL can group related events by a column and, at the same time, place them into time windows so that every event in a window has a timestamp within that window's bounds:
SELECT product_name, SUM(price) AS total
FROM TABLE(
  TUMBLE(TABLE purchases, DESCRIPTOR(ts), INTERVAL '1' MINUTES))
GROUP BY product_name, window_start, window_end;
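Applying the same approach to the earlier scenario of per-customer 7-day averages could look roughly like the following sketch. The payments table and its customer_id, amount, and event-time column ts are illustrative assumptions, not part of the example above:

-- Illustrative sketch: average payment amount per customer within 7-day tumbling windows
SELECT customer_id, window_start, window_end, AVG(amount) AS avg_amount
FROM TABLE(
  TUMBLE(TABLE payments, DESCRIPTOR(ts), INTERVAL '7' DAY))
GROUP BY customer_id, window_start, window_end;

Dropping the TUMBLE call and the window_start/window_end grouping columns would leave a purely field-based grouping, whose per-customer aggregate updates continuously as new payment events arrive.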
When grouping events into time windows, several window types are available, such as tumbling, hopping, and session windows.
See the Flink SQL windowing table-valued functions and the Kafka Streams supported window types for details and diagrams of each window type.
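As one further illustration, a hopping window variant of the Flink SQL query above assigns each event to every window it overlaps; the sketch below reuses the purchases table and ts column, and the 1-minute hop and 5-minute window size are chosen purely for illustration:

-- Sum prices per product within 5-minute windows that advance every 1 minute
SELECT product_name, window_start, window_end, SUM(price) AS total
FROM TABLE(
  HOP(TABLE purchases, DESCRIPTOR(ts), INTERVAL '1' MINUTES, INTERVAL '5' MINUTES))
GROUP BY product_name, window_start, window_end;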