Question 1

What is Flink SQL?

Accepted Answer

Flink SQL is a declarative API used for creating Flink jobs. It is well-suited for real-time ETL, data enrichment, and event-driven applications.

Question 2

What is Confluent Cloud for Apache Flink?

Accepted Answer

Confluent Cloud provides a cloud-native, serverless service for Flink that enables simple, scalable, and secure stream processing that integrates seamlessly with Apache Kafka®. Your Kafka topics appear automatically as queryable Flink tables, with schemas and metadata attached by Confluent Cloud.

Confluent’s fully managed Flink service enables you to:

Easily filter, join, and enrich your data streams with Flink
Enable high-performance and efficient stream processing at any scale, without the complexities of managing infrastructure
Experience Kafka and Flink as a unified platform, with fully integrated monitoring, security, and governance

Confluent Cloud for Apache Flink is engineered to be:

Cloud-native: Flink is fully managed on Confluent Cloud and autoscales up and down with your workloads.
Complete: Flink is integrated deeply with Confluent Cloud to provide an enterprise-ready experience.
Everywhere: Flink is available in AWS, Azure, and Google Cloud.

Question 3

How can I get started with Flink SQL?

Accepted Answer

The documentation includes how-to guides for several common use cases. On Confluent Developer you’ll find Flink SQL demos and tutorials.

Question 4

Why does Flink SQL have so many different joins, and which one should I use?

Accepted Answer

A regular join, as in

SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id;

is poorly suited for streaming. To execute this join, the Flink SQL runtime must keep every order and customer record in its state indefinitely, because any newly arriving order or customer record may match older records from the other table, and each such match must produce a join result. In most cases this is unreasonably expensive, and it’s rarely useful.

In most streaming applications, joins are used for event enrichment, e.g., augmenting each incoming order event with timely customer information. For stream enrichment use cases, you should use a temporal join instead of a regular join:

SELECT *
FROM orders
INNER JOIN customers FOR SYSTEM_TIME AS OF orders.order_time
ON orders.customer_id = customers.id;

To understand regular and temporal joins in more detail, see How To Use Streaming Joins with Apache Flink.
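
For the temporal join above to work, the enrichment side has to be a versioned table (a primary key plus an event-time attribute), and the probe side needs an event-time attribute of its own. The following is a minimal sketch of what such definitions might look like, assuming open source Apache Flink with its Kafka connectors. The column names other than id, customer_id, and order_time, the topic names, the broker address, the JSON format, and the 5-second watermark delay are all illustrative assumptions. On Confluent Cloud the tables are derived from your Kafka topics, so definitions like these may not be needed.

```sql
-- Versioned table: the primary key plus the watermark on update_time
-- let Flink look up the customer row that was valid at a given time.
CREATE TABLE customers (
  id BIGINT,
  name STRING,                 -- hypothetical column
  update_time TIMESTAMP(3),    -- hypothetical event-time column
  WATERMARK FOR update_time AS update_time - INTERVAL '5' SECOND,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'upsert-kafka',
  'topic' = 'customers',                                  -- assumed topic name
  'properties.bootstrap.servers' = 'localhost:9092',      -- assumed broker address
  'key.format' = 'json',
  'value.format' = 'json'
);

-- Append-only stream of orders; order_time is the event-time attribute
-- referenced by FOR SYSTEM_TIME AS OF in the temporal join.
CREATE TABLE orders (
  id BIGINT,
  customer_id BIGINT,
  order_time TIMESTAMP(3),
  WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',                                     -- assumed topic name
  'properties.bootstrap.servers' = 'localhost:9092',      -- assumed broker address
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
);
```

With definitions along these lines, each order is joined against the version of the customer row that was current at orders.order_time, and Flink can discard customer versions once the watermarks show they can no longer be matched.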

Another type of specialized, optimized join that is useful for streaming use cases is the interval join. This is what you could use, for example, to find orders that shipped within 4 hours of being received:

SELECT *
FROM orders o, shipments s
WHERE o.id = s.order_id
AND o.order_time BETWEEN s.ship_time - INTERVAL '4' HOUR AND s.ship_time;

Question 5

How can I apply windowing to the result of a join?

Accepted Answer

Windows rely on watermarks, and some operations, like regular joins, cause watermarks to be lost. Even if both inputs to a regular join have watermarks, the result cannot have watermarks, because there’s no a priori limit on how out-of-order the resulting stream/table might be.

If you need to apply windowing to the result of a join, you have two options. You can use a temporal or interval join, both of which preserve watermarks. Or, in cases where you know that the result will happen to be (at least roughly) in order, you can write the result of a regular join to an intermediate table, such as one backed by a Kafka topic, and then define suitable watermarking on that table.
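
The intermediate-table approach might look like the following sketch. It assumes open source Apache Flink with the Kafka connector, that orders and customers are both append-only tables here (so the regular join produces an append-only result), and that the join output is roughly ordered by order_time. The table name enriched_orders, its columns, the topic name, the broker address, the 1-minute watermark delay, and the 1-hour window size are all hypothetical choices made for illustration.

```sql
-- Intermediate table backed by a Kafka topic. The watermark is declared
-- here because the regular join cannot carry watermarks through.
CREATE TABLE enriched_orders (
  order_id BIGINT,
  customer_id BIGINT,
  customer_name STRING,
  order_time TIMESTAMP(3),
  WATERMARK FOR order_time AS order_time - INTERVAL '1' MINUTE
) WITH (
  'connector' = 'kafka',
  'topic' = 'enriched_orders',                            -- assumed topic name
  'properties.bootstrap.servers' = 'localhost:9092',      -- assumed broker address
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
);

-- Write out the result of the regular join. The CAST turns order_time
-- into an ordinary timestamp column in the join output.
INSERT INTO enriched_orders
SELECT
  o.id,
  o.customer_id,
  c.name,
  CAST(o.order_time AS TIMESTAMP(3))
FROM orders o
JOIN customers c ON o.customer_id = c.id;

-- Apply a tumbling window to the re-watermarked intermediate table.
SELECT window_start, window_end, COUNT(*) AS order_count
FROM TABLE(
  TUMBLE(TABLE enriched_orders, DESCRIPTOR(order_time), INTERVAL '1' HOUR))
GROUP BY window_start, window_end;
```

The watermark delay on the intermediate table should be at least as large as the out-of-orderness you expect in the join output. Rows that arrive later than that are treated as late and are dropped by the window aggregation.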
