Kafka Streams works very well as a Java-based stream processing API, both to build scalable, stand-alone stream processing applications and to enrich Java applications with stream processing functionality that complements their other functions. But what if you don’t have an existing commitment to Java? Or what if you find it advantageous from an architectural or operational perspective to deploy a pure stream processing job without its own web interface or API to expose results to the front end? This is where ksqlDB comes in.
ksqlDB is a highly specialized kind of database that is optimized for stream processing applications. It runs on a scalable, fault-tolerant cluster of its own, exposing a REST interface to applications, which can then submit new stream processing jobs to run and query the results. The language in which those stream processing jobs and queries are defined is SQL. With REST and command line interface options, it doesn’t matter what language you use to build your applications. And it’s easy to get started within development mode, either running in Docker or on a single node running natively on a development machine.
Here’s some example ksqlDB code that does substantially the same thing as the Kafka Streams code we looked at up above:
CREATE TABLE rated_movies AS SELECT title, release_year, sum(rating) / count(rating) AS avg_rating FROM ratings INNER JOIN movies ON ratings.movie_id = movies.movie_id GROUP BY title, release_year;
Overall, you can think of ksqlDB as a standalone, SQL-powered stream processing engine that performs continuous processing of event streams and exposes the results to applications in a database-like way. It aims to provide one mental model for most Kafka-based stream processing application workloads.
We will only share developer content and updates, including notifications when new content is added. We will never send you sales emails. 🙂 By subscribing, you understand we will process your personal information in accordance with our Privacy Statement.