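Consistent time semantics are of particular importance in stream processing. Many operations in an Event Processor depend on time, such as joins and aggregations computed over a window of time (e.g., five-minute averages), as well as the handling of out-of-order and "late" data. In many systems, developers can choose between different variants of time for an Event: event-time (when the Event was originally created by its Event Source), ingestion-time (when the Event was received on the Event Stream), and processing-time, also called wall-clock time (when an Event Processor happens to process the Event).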
Depending on the use case, developers need to pick one variant over the others.
How can Events from an Event Source be processed regardless of the timestamps assigned when the Event Source originally created them?
Depending on the use case, Event Processors may use the time when the Event was originally created by its Event Source, the time when it was received on the Event Stream in the Event Streaming Platform, or a time derived from one or more data fields provided by the Event itself (i.e., from the Event payload).
As an example, the streaming database ksqlDB maintains a system column called ROWTIME, which tracks the timestamp of an Event. By default, ROWTIME is inherited from the timestamp in the underlying Apache Kafka® record metadata, but it can also be pulled from a field in the Event. See Time semantics in the ksqlDB documentation for more information. The following statement creates a stream whose ROWTIME is taken from the eventTime field of each Event:
CREATE STREAM TEMPERATURE_READINGS_EVENTTIME
  WITH (KAFKA_TOPIC='deviceEvents',
        VALUE_FORMAT='avro',
        TIMESTAMP='eventTime');
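
To illustrate why the choice of timestamp matters, the sketch below computes the five-minute averages mentioned earlier over the stream defined above. It is only an illustration under assumptions: the column names deviceId and temperature are hypothetical (the original statement lets ksqlDB infer the schema from Avro, so the real field names are not shown here), and the one-minute grace period for late Events is an arbitrary value chosen for the example.

```sql
-- Sketch only: deviceId and temperature are assumed field names,
-- not taken from the original schema.
SELECT deviceId,
       AVG(temperature) AS avgTemperature,
       WINDOWSTART AS windowStart,
       WINDOWEND AS windowEnd
FROM TEMPERATURE_READINGS_EVENTTIME
-- Events are assigned to windows by ROWTIME, which this stream
-- takes from the eventTime field rather than the Kafka record timestamp.
-- The grace period (an assumed value) lets late Events still update a window.
WINDOW TUMBLING (SIZE 5 MINUTES, GRACE PERIOD 1 MINUTE)
GROUP BY deviceId
EMIT CHANGES;
```

Because the window boundaries follow ROWTIME, running the same query against a stream that keeps the default record timestamp could place an Event in a different window than the one its eventTime field indicates.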