David Anderson

Principal Software Practice Lead

Apache Flink® 101

Apache Flink is a battle-hardened stream processor widely used for demanding real-time applications, such as data cleansing, deduplication, enrichment, order acknowledgement, financial settlement, and fraud detection.

Flink's performance and robustness have made it the de facto standard for stream processing. It owes its success to a handful of well-chosen design principles, including a shared-nothing architecture with managed state, event-time processing, and checkpointing.

What you’ll learn in this course

  • What Apache Flink is, and why you might use it
  • Stream processing, and how it differs from batch processing
  • Flink’s runtime architecture
  • How Flink and Kafka work together
  • An intro to Flink SQL
  • Why and how Flink manages state
  • How watermarks support event time operations
  • Fault tolerance, checkpointing, and exactly-once guarantees

Hands-on exercises

This course uses a mixture of videos and hands-on exercises to introduce Apache Flink, with an emphasis on the fundamentals. Some of the concepts presented here are likely to be unfamiliar, and working through the exercises will help you understand what's really going on.

For the exercises, you can choose between:

  • using open source Apache Flink and Apache Kafka, running in Docker
  • using Confluent Cloud for Apache Flink and Apache Kafka

Intended audience

Anyone wanting an introduction to Apache Flink.

Prerequisites

This course assumes some basic familiarity with Kafka and SQL. If you understand what Kafka producers and consumers are, and can explain what GROUP BY does, that’s enough.

To learn more about Kafka, start with Kafka 101.

To learn more about Flink, see these additional courses:

Length

  • Approximately 3 hours

Staff

David Anderson (Course Author)

David has been working as a data engineer since long before that job title was invented. He has worked on recommender systems, search engines, machine learning pipelines, and BI tools, and has been helping companies adopt stream processing and Apache Flink since 2016. David is an Apache Flink committer, and works at Confluent as a Principal Software Practice Lead.


Do you have questions or comments? Join us in the #confluent-developer community Slack channel to engage in discussions with the creators of this content.

Use the promo codes FLINK101 & CONFLUENTDEV1 to get $25 of free Confluent Cloud usage and skip credit card entry.

Introduction

Hi, I'm David Anderson with Confluent, here to tell you all about Apache Flink. We'll look together at why Flink is interesting, and how you can use it to build real-time data products. Along the way, I'll explain the big ideas on which Flink is based, and show you around under the hood so you'll understand how Flink works.

Apache Flink is a battle-hardened stream processor widely used for demanding real-time applications. Its performance and robustness are the result of a handful of core design principles, including a shared-nothing architecture featuring local state, event-time processing, and state snapshots for recovery. Through a combination of videos and hands-on exercises, this course brings these core principles to life.

The focus of this course is going to be on the four big ideas that form the foundation for Apache Flink: streaming, state, time, and the use of state snapshots for fault tolerance and failure recovery. Understanding how Flink's runtime is organized around these four concepts, and how they are interrelated, is the key that unlocks Apache Flink.

In the first four sections, we'll look together at streams and stream processing, and do so from several different perspectives. In the second half, we will look at each of the other three big ideas. By the end of the course you'll know enough to be able to implement some common use cases, and you'll understand what's going on inside of Flink when it is running your applications.

Most of the modules in this course have a hands-on exercise that reinforces and expands upon the information in these videos. You'll find these exercises and other materials for this course on developer.confluent.io. These hands-on exercises all use Flink SQL. The focus will always be on learning about the concepts and architecture of Flink, while taking advantage of the SQL that you already know. And don't worry, you're not expected to be a SQL expert for this course. We're not going to do anything more complicated than aggregating data with GROUP BY.

In several of the hands-on exercises, you will be using Flink SQL together with Apache Kafka to produce and consume data on Confluent Cloud. If you haven't already signed up for Confluent Cloud, sign up now so when you need it for the exercises, you'll be ready. And be sure to use the promo code in the description: it provides enough free credit to do all of the exercises for this course.
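To give a sense of the level of SQL involved, here is a minimal sketch of the kind of continuous aggregation the exercises build toward. The `orders` table and its columns are hypothetical stand-ins, not the actual exercise data:

```sql
-- Hypothetical example: count orders per customer.
-- In Flink SQL this query runs continuously, incrementally
-- updating its result as new rows stream into the orders table.
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;
```

Unlike a batch query, this aggregation never "finishes": Flink keeps per-customer counts in managed state and emits updated results as events arrive, which is exactly the streaming-with-state idea the course explores.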
