Apache Flink® FAQs

Here are some of the questions that you may have about Apache Flink and its surrounding ecosystem.

If your question isn’t answered here, please ask the community.

Why is Flink fast?

Flink applications run in distributed compute clusters that can be scaled up to hundreds or thousands of compute nodes. Flink’s performance at scale can be attributed to the following design characteristics:

All state is local to the task that uses it, held either in memory or in an embedded RocksDB instance on local disk.
Fault tolerance is provided by checkpoints that are taken asynchronously, so processing continues while state is being snapshotted.
Flink streams data between compute nodes using a purpose-built, optimized network stack.
Backpressure is handled by the network stack itself. Thanks to a combination of fixed-size network buffers and credit-based flow control, a slow consumer causes Flink to throttle its sources rather than letting data pile up in ever-growing buffers.
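
To make the backpressure point more concrete, here is a minimal toy model of credit-based flow control in plain Python. It is not Flink's network stack, and the function name `simulate` and its parameters are hypothetical, chosen only for illustration. The receiver owns a fixed number of buffers and grants one credit per free buffer; the sender transmits only while it holds credit and otherwise pauses its source.

```python
from collections import deque


def simulate(total_records, num_buffers, consume_every):
    """Toy credit-based flow control between one sender and one receiver.

    Assumption: one record fits in one buffer, the sender can emit at most
    one record per tick, and the receiver frees one buffer every
    `consume_every` ticks.
    """
    receiver_buffers = deque()   # records waiting at the receiver
    credits = num_buffers        # credit currently held by the sender
    sent = 0
    throttled_ticks = 0          # ticks in which the source had to pause
    max_buffered = 0
    tick = 0

    while sent < total_records or receiver_buffers:
        tick += 1

        # Sender side: transmit only when credit is available.
        if sent < total_records:
            if credits > 0:
                receiver_buffers.append(sent)
                credits -= 1
                sent += 1
            else:
                throttled_ticks += 1  # source is paused, nothing piles up

        max_buffered = max(max_buffered, len(receiver_buffers))

        # Receiver side: consuming a record frees a buffer and
        # returns one credit to the sender.
        if tick % consume_every == 0 and receiver_buffers:
            receiver_buffers.popleft()
            credits += 1

    return {
        "ticks": tick,
        "throttled_ticks": throttled_ticks,
        "max_buffered": max_buffered,
    }


if __name__ == "__main__":
    print(simulate(total_records=10, num_buffers=2, consume_every=3))
    print(simulate(total_records=10, num_buffers=2, consume_every=1))
```

In this sketch, buffered records never exceed `num_buffers`, however slow the consumer is. A slow consumer shows up only as a higher `throttled_ticks` count at the source.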

What is exactly-once, and how does Flink achieve this?

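Exactly-once means that each event affects the state Flink manages once, and only once, even when failures occur. Flink achieves this by periodically checkpointing its state together with the corresponding positions in the input streams; after a failure, it restores the latest checkpoint and replays the input from those positions. You can expect correct results, without concern for data loss or duplication.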

To understand how this works, see Exactly-Once Processing in Apache Flink.
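
As a rough illustration of the idea, the following sketch models a single counting operator in plain Python. It is a simplification under stated assumptions, not Flink's checkpointing algorithm: the function `run` and its parameters are hypothetical, the "source" is an in-memory list that can be re-read from any offset, and a checkpoint is a snapshot of the read offset and the operator state taken together.

```python
import copy


def run(events, checkpoint_every, fail_at=None):
    """Count events per key, optionally crashing once.

    If `fail_at` is set, a single simulated failure happens after that many
    events have been processed. Recovery restores the last checkpoint and
    replays the source from the checkpointed offset.
    """
    state = {}               # operator state: key -> count
    offset = 0               # read position in the replayable source
    checkpoint = (0, {})     # (offset, state), always snapshotted together
    failed = False
    processed = 0            # processing attempts, including replayed events

    while offset < len(events):
        if fail_at is not None and not failed and processed == fail_at:
            # Simulated crash: work since the last checkpoint is discarded.
            failed = True
            offset, state = checkpoint[0], copy.deepcopy(checkpoint[1])
            continue

        key = events[offset]
        state[key] = state.get(key, 0) + 1
        offset += 1
        processed += 1

        if offset % checkpoint_every == 0:
            checkpoint = (offset, copy.deepcopy(state))

    return state, processed


if __name__ == "__main__":
    events = ["a", "b", "a", "c", "a", "b", "c"]
    print(run(events, checkpoint_every=3))             # no failure
    print(run(events, checkpoint_every=3, fail_at=5))  # one failure
```

With the failure, two events are processed a second time (9 processing attempts instead of 7), yet the final counts are identical to the failure-free run. Because the offset and the state roll back together, each event is reflected in the state exactly once.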

What are watermarks, and why does Flink need them?

Streaming data is processed as it becomes available, and often this means that streams are being processed out-of-order with respect to the timestamps in the events (which indicate when the events actually occurred). Time-related operations, such as windows, need to know how long to wait for out-of-order events before producing results, and how long to retain whatever state is required to handle these out-of-order events correctly.

Watermarks are timestamped stream records that Flink inserts into your data streams. Each watermark marks a position in the stream with the timestamp that it carries, and asserts that the stream is expected to be complete up to that timestamp. Time-based operations, like windows, rely on watermarks to know when to produce results, and when the state they are keeping can be safely garbage collected.

Watermark generation is typically based on an estimate of the maximum out-of-orderness expected for each data source. As a developer, you choose this bound, which determines how long Flink waits for out-of-order events, and thereby control the tradeoff between latency and completeness: a larger bound delays results but tolerates more disorder, while a smaller bound produces results sooner but treats more events as late.
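
The tradeoff can be seen in a small simulation. This is a sketch in plain Python rather than Flink code. The function `tumbling_counts` is hypothetical, the watermark is simply the largest timestamp seen so far minus the chosen bound, and a window is assumed to fire once the watermark reaches its end. Flink's own generators and window triggers differ in small details, such as off-by-one handling at window boundaries.

```python
def tumbling_counts(timestamps, window_size, max_out_of_orderness):
    """Count events per tumbling window, driven by a bounded-out-of-orderness
    watermark.

    Returns (results, late): `results` lists (window_start, count) in firing
    order, `late` lists events that arrived after their window had closed.
    """
    open_windows = {}   # window start -> count so far
    results = []
    late = []
    max_ts = None
    watermark = None

    for ts in timestamps:
        start = ts - ts % window_size

        # An event is late if the watermark already passed its window's end.
        if watermark is not None and start + window_size <= watermark:
            late.append(ts)
            continue

        open_windows[start] = open_windows.get(start, 0) + 1

        # The watermark trails the largest timestamp seen by the chosen bound.
        max_ts = ts if max_ts is None else max(max_ts, ts)
        watermark = max_ts - max_out_of_orderness

        # Fire every window whose end the watermark has reached.
        for s in sorted(open_windows):
            if s + window_size <= watermark:
                results.append((s, open_windows.pop(s)))

    # End of input: flush whatever is still open.
    for s in sorted(open_windows):
        results.append((s, open_windows.pop(s)))

    return results, late


if __name__ == "__main__":
    stream = [1, 4, 3, 11, 9, 12, 25, 8]  # event timestamps in arrival order
    for bound in (0, 2, 20):
        print(bound, tumbling_counts(stream, window_size=10,
                                     max_out_of_orderness=bound))
```

With a bound of 0, the first window fires as soon as timestamp 11 arrives, so the events with timestamps 9 and 8 are both late. With a bound of 2, the event at 9 is still counted and only 8 is late. With a bound of 20, nothing is late, but no window fires until the input ends.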

For more on watermarks, see Event Time and Watermarks.

