Michael Drogalis

Principal Product Manager (Presenter)

ksqlDB’s Architecture

A typical streaming pipeline consists of source databases and apps that create events and feed them into Apache Kafka via connectors, a stream processor that filters and aggregates the events, and finally data stores that hold the results for access by analytics engines.


This is a standard system design, and it works well, but it has many moving parts. Each part can break or fail, and scaling, securing, monitoring, debugging, and operating all of them together as a single system is difficult.

An architecture with ksqlDB is much simpler, as many of those external parts have been consolidated into the tool itself: ksqlDB has primitives for connectors, and it performs stream processing. It also features materialized views, so that your data can be queried just like a database table directly in ksqlDB, without needing to be sent to an external data store.
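To make the idea of a materialized view concrete, here is a minimal Python sketch of the pattern: events are filtered, folded into a running aggregate, and the aggregate is then looked up by key like a table row. This is a toy model and not how ksqlDB is implemented. In ksqlDB you would express this declaratively in SQL rather than write it by hand. The event fields and the `OrderTotalsView` class are hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Toy model only: this is NOT ksqlDB's implementation. The event fields
# and the OrderTotalsView class are hypothetical.


class OrderTotalsView:
    """An incrementally maintained aggregate that can be looked up by key."""

    def __init__(self):
        self._totals = defaultdict(float)

    def apply(self, event):
        # Stream processing step 1: filter out events that should not count.
        if event["amount"] <= 0:
            return
        # Stream processing step 2: fold the event into the running total.
        self._totals[event["customer"]] += event["amount"]

    def lookup(self, customer):
        # Point lookup against the current state, like querying a table.
        return self._totals.get(customer)


events = [
    {"customer": "alice", "amount": 20.0},
    {"customer": "bob", "amount": 5.0},
    {"customer": "alice", "amount": -3.0},  # dropped by the filter
    {"customer": "alice", "amount": 12.5},
]

view = OrderTotalsView()
for event in events:
    view.apply(event)

print(view.lookup("alice"))  # 32.5
print(view.lookup("carol"))  # None
```

The point of the sketch is that the view is updated as each event arrives, so a lookup reads precomputed state instead of rescanning the event history.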


In fact, ksqlDB tucks away complexity in a manner similar to classic relational databases such as PostgreSQL, which hide query execution, indexing, concurrency control, and crash recovery behind an intuitive interface.

Because ksqlDB relies on Kafka to maintain state, you ultimately end up with a simple two-tier architecture: compute (ksqlDB) and storage (Kafka). Each tier can be scaled elastically, independently of the other.
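The separation of compute from storage can be illustrated with a small Python sketch. It assumes a simplified model in which every state update is appended to a durable log, so a compute node holds no state that cannot be rebuilt. A plain Python list stands in for a Kafka changelog topic, and the `ComputeNode` class stands in for a ksqlDB server. Both are hypothetical stand-ins, not real APIs.

```python
# Toy model only: a Python list stands in for a Kafka changelog topic and
# ComputeNode stands in for a ksqlDB server. Both names are hypothetical.


class ComputeNode:
    def __init__(self, changelog):
        self.changelog = changelog  # durable storage tier, shared
        self.counts = {}            # local state, safe to lose
        # Rebuild local state by replaying every update already in storage.
        for key, value in changelog:
            self.counts[key] = value

    def process(self, key):
        new_value = self.counts.get(key, 0) + 1
        self.counts[key] = new_value
        # Record the update in the storage tier so it survives this node.
        self.changelog.append((key, new_value))


changelog = []                   # outlives any single compute node
node_a = ComputeNode(changelog)
for key in ["click", "view", "click"]:
    node_a.process(key)

del node_a                       # simulate losing the compute node

node_b = ComputeNode(changelog)  # a replacement node replays the log
print(node_b.counts)             # {'click': 2, 'view': 1}
```

Because the replacement node recovers everything from the log, compute nodes in this model can be added or removed without touching the storage tier, which is the property that lets the two tiers scale independently.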

You can interact with a single ksqlDB server or with multiple servers through a command-line client, as you would with a relational database, or through a full-featured web UI, a REST API, or a client library for a programming language such as Java.
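As a rough sketch of the REST option, the Python snippet below builds a request for a ksqlDB server's `/ksql` endpoint using only the standard library. It constructs the request without sending it, so it runs without a server. The address `http://localhost:8088` is an assumption (8088 is the default listener port, but your deployment may differ), and `build_ksql_request` is a hypothetical helper name, not part of any ksqlDB client.

```python
import json
import urllib.request


def build_ksql_request(server_url, statement):
    """Build (but do not send) a POST to a ksqlDB server's /ksql endpoint."""
    body = json.dumps({"ksql": statement, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/ksql",
        data=body,
        headers={
            "Content-Type": "application/vnd.ksql.v1+json; charset=utf-8",
            "Accept": "application/vnd.ksql.v1+json",
        },
        method="POST",
    )


# Assumed server address; adjust for your own deployment.
request = build_ksql_request("http://localhost:8088", "SHOW STREAMS;")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To send it against a running server, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The CLI, the web UI, and language clients all talk to the server in a similar client-server fashion, so the same statement could be issued from any of them.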

