Principal Product Manager (Presenter)
One of the powerful capabilities of ksqlDB is combining rows from one stream (or table) with those of another. This can be used to enrich events with reference information (denormalize), to join between event streams, or to join between tables.
A stream-table join keeps just one internal buffer, one for the table (the stream is of a theoretically infinite size, so it has no buffer). When a row from a table is read, that row goes into the internal buffer to be joined against a row from the stream, which continuously sends data. Rows are sent based on timestamp: the lowest timestamp in “stream-time” is sent first.
In this example, the join type is INNER
, which means that if there’s a match between the stream and the table, a new row is created. If there is no match, the row is dropped. Note that updates to the table don’t cause the join to fire; only updates to the stream do so.
CREATE STREAM enriched_readings AS
SELECT r.reading,
r.area,
b.brand_name
FROM readings r
INNER JOIN brands b,
on b.sensor = r.sensor
EMIT CHANGES;
Now consider the same join but with a left outer
clause, which will always fire a join on the stream side:
CREATE STREAM enriched_readings AS
SELECT r.reading,
r.area,
b.brand_name
FROM readings r
LEFT OUTER JOIN brands b,
on b.sensor = r.sensor
EMIT CHANGES;
So whenever a record appears in the stream, it will always get passed through. If there was no match in the table, the resulting row simply has a null column.
If you’re joining against a table of reference data, it’s a good practice to preload data, so that timestamps from your table are always smaller than timestamps from your stream. This makes ksqlDB load the largest amount of data right at the start, and it’s useful if the joined data is known ahead of time and won’t change very often.
An inner table-table join produces a table, unlike the stream-table joins you have seen so far. It has two buffers—one for each table—and it fires when there is an update on either side of the expression.
We will only share developer content and updates, including notifications when new content is added. We will never send you sales emails. 🙂 By subscribing, you understand we will process your personal information in accordance with our Privacy Statement.