May 26, 2022 | Episode 217

Flink vs Kafka Streams/ksqlDB: Comparing Stream Processing Tools

  • Transcript
  • Notes

Stream processing can be hard or easy depending on the approach you take, and the tools you choose. This sentiment is at the heart of the discussion with Matthias J. Sax (Apache Kafka® PMC member; Software Engineer, ksqlDB and Kafka Streams, Confluent) and Jeff Bean (Sr. Technical Marketing Manager, Confluent). With immense collective experience in Kafka, ksqlDB, Kafka Streams, and Apache Flink®, they delve into the types of stream processing operations and explain the different ways of solving for their respective issues.

The best stream processing tools they consider are Flink along with the options from the Kafka ecosystem: Java-based Kafka Streams and its SQL-wrapped variant—ksqlDB. Flink and ksqlDB tend to be used by divergent types of teams, since they differ in terms of both design and philosophy.

Why Use Apache Flink?

The teams using Flink are often highly specialized, with deep expertise, and with an absolute focus on stream processing. They tend to be responsible for unusually large, industry-outlying amounts of both state and scale, and they usually require complex aggregations. Flink can excel in these use cases, which potentially makes the difficulty of its learning curve and implementation worthwhile.

Why use ksqlDB/Kafka Streams?

Conversely, teams employing ksqlDB/Kafka Streams require less expertise to get started and also less expertise and time to manage their solutions. Jeff notes that the skills of a developer may not even be needed in some cases—those of a data analyst may suffice. ksqlDB and Kafka Streams seamlessly integrate with Kafka itself, as well as with external systems through the use of Kafka Connect. In addition to being easy to adopt, ksqlDB is also deployed on production stream processing applications requiring large scale and state.

There are also other considerations beyond the strictly architectural. Local support availability, the administrative overhead of using a library versus a separate framework, and the availability of stream processing as a fully managed service all matter. 

Choosing a stream processing tool is a fraught decision partially because switching between them isn't trivial: the frameworks are different, the APIs are different, and the interfaces are different. In addition to the high-level discussion, Jeff and Matthias also share lots of details you can use to understand the options, covering employment models, transactions, batching, and parallelism, as well as a few interesting tangential topics along the way such as the tyranny of state and the Turing completeness of SQL.

Continue Listening

Episode 218June 2, 2022 | 48 min

Data Mesh Architecture: A Modern Distributed Data Model

Data mesh isn’t software you can download and install, so how do you build a data mesh? In this episode, Adam Bellemare (Staff Technologist, Office of the CTO, Confluent) discusses his data mesh proof of concept and how it can help you conceptualize the ways in which implementing a data mesh could benefit your organization.

Episode 219June 9, 2022 | 29 min

How I Became a Developer Advocate

What is a developer advocate and how do you become one? In this episode, we have seasoned developer advocates, Kris Jenkins (Senior Developer Advocate, Confluent) and Danica Fine (Senior Developer Advocate, Confluent) answer the question by diving into how they got into the world of developer relations, what they enjoyed the most about their roles, and how you can become one.

Episode 220June 16, 2022 | 48 min

Tips For Writing Abstracts and Speaking at Conferences

A well-written abstract is your ticket to conferences, but how do you write an excellent synopsis that will get accepted? As an experienced conference speaker, Robin Moffatt (Principal Developer Advocate, Confluent) often writes presentations that help the developer community to understand Apache Kafka and its ecosystem. He is also the Program Committee Chair for Kafka Summit and Current 2022: The Next Generation of Kafka Summit. Having seen hundreds of conference submissions, Robin shares best practices for crafting abstracts that stand out, as well as tips for speaking at conferences.

Got questions?

If there's something you want to know about Apache Kafka, Confluent or event streaming, please send us an email with your question and we'll hope to answer it on the next episode of Ask Confluent.

Email Us

Never miss an episode!

Confluent Cloud is a fully managed Apache Kafka service available on all three major clouds. Try it for free today.

Try it for free