
FAQ - ksqlDB

Here are some of the questions that you may have about Apache Kafka and ksqlDB.

If your question isn't answered here, please ask the community.

What is ksqlDB?

ksqlDB is a way of interacting with a Kafka cluster using SQL. It lets you write high-level stream operations (CREATE STREAM ...), queries (SELECT ... FROM ...), and aggregations (GROUP BY ...) in a language that will be familiar to anyone with a background in relational databases.

Under the hood, it can be thought of as a declarative language sitting on top of Kafka Streams. Many tasks you might have wanted Kafka Streams for, such as joins and aggregations, can be written and deployed in minutes with ksqlDB.

(Note: ksqlDB was originally released under the name "KSQL." Older documentation may still refer to it as "KSQL," but it's the same thing.)

Is ksqlDB open source?

ksqlDB is licensed under the Confluent Community License, which is a source-available license, but not an open source license under the OSI definition.

You're free to download, modify and redistribute the source code for ksqlDB, save for a few excluded purposes that the license FAQ explains in detail.

What's an example of a KSQL statement?

ksqlDB (formerly "KSQL") is largely similar to SQL. An example statement might look like this:

SELECT TS, USER, LAT, LON
  FROM USER_LOCATION_STREAM
EMIT CHANGES;

ksqlDB strives to be compatible with the SQL standard wherever appropriate, and Confluent, the company behind ksqlDB, is an active member of the standards committee, which is working to extend SQL to cover event-streaming databases.

How do you install ksqlDB?

ksqlDB is available as a Docker image, as a Debian or RPM package, or as a tarball.

You can also get it standalone, as part of Confluent Platform, or on Confluent Cloud.

How does ksqlDB work?

Under the hood, ksqlDB is powered by Kafka Streams, which is in turn built on top of Kafka's consumer/producer architecture. ksqlDB provides the high-level language and easy deployment of new streams/tables, while behind the scenes Kafka Streams provides the processing, persistence and scaling engine.

For a deep dive, see Rick Spurgeon's blog post Sharpening your Stream Processing Skills with Kafka Tutorials.

Using ksqlDB with Java

ksqlDB has a dedicated Java client (JavaDoc) that lets you interact with your ksqlDB server directly from your Java code. Here's an example:

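// `client` is an io.confluent.ksql.api.client.Client connected to your ksqlDB server.
client.streamQuery("SELECT * FROM MY_STREAM EMIT CHANGES;")
    .thenAccept(result -> {
      // The future completes once the server has accepted the query.
      System.out.println("Query has started. Query ID: " + result.queryID());

      // RowSubscriber is your own Reactive Streams Subscriber<Row> implementation;
      // it receives each new row as the query produces it.
      RowSubscriber subscriber = new RowSubscriber();
      result.subscribe(subscriber);
    }).exceptionally(e -> {
      System.out.println("Request failed: " + e);
      return null;
    });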

The Java client allows you to create and manage streams, tables and persistent queries, insert new data, and run streaming and batch-style queries.
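
If you can't add the client library to a project, the ksqlDB server can also be reached over its HTTP API. The following is a minimal sketch using only the JDK's java.net.http package, not a replacement for the client. It assumes a server listening on http://localhost:8088 and builds (but does not send) a POST to the /ksql endpoint. The class name KsqlRestSketch and its helper methods are hypothetical names chosen for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class KsqlRestSketch {

    // Escapes quotes, backslashes and common whitespace control characters.
    // Other control characters are not handled in this sketch.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // JSON body for the /ksql endpoint: the statement text plus empty properties.
    static String statementBody(String sql) {
        return "{\"ksql\":\"" + escapeJson(sql) + "\",\"streamsProperties\":{}}";
    }

    // Builds the HTTP request; serverUrl is an assumption about your deployment.
    static HttpRequest buildRequest(String serverUrl, String sql) {
        return HttpRequest.newBuilder(URI.create(serverUrl + "/ksql"))
                .header("Content-Type", "application/vnd.ksql.v1+json; charset=utf-8")
                .header("Accept", "application/vnd.ksql.v1+json")
                .POST(HttpRequest.BodyPublishers.ofString(statementBody(sql)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("http://localhost:8088", "SHOW STREAMS;");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(statementBody("SHOW STREAMS;"));
        // To send it against a running server:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Sending the request needs a running ksqlDB server, so the sketch stops at building it. SELECT queries go to a separate query endpoint rather than /ksql, which is one reason the dedicated client is usually the more convenient choice.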

You can also write user-defined functions in Java for ksqlDB.

Using ksqlDB with Python, .NET, and Golang

ksqlDB has community-supported clients for .NET, Golang and Python. There is also the Confluent-supported Java client.

Each client lets you interact with the ksqlDB server directly from your code. To illustrate, here's a Golang example:

k := `SELECT
  TIMESTAMPTOSTRING(WINDOWSTART,'yyyy-MM-dd','Europe/London') AS WINDOW_START,
  DOG_SIZE,
  DOGS_CT
FROM DOGS_BY_SIZE
WHERE DOG_SIZE=?;`

stmnt, err := ksqldb.QueryBuilder(k, "middle")
if err != nil {
	log.Fatal(err)
}

fmt.Println(*stmnt)

The exact features vary by language, so check their documentation, but in general they allow you to create and manage streams, tables and persistent queries, insert new data, and run streaming and batch-style queries.

Should I use ksqlDB or Kafka Streams?

ksqlDB places a large subset of Kafka Streams functionality into an easier-to-use, easier-to-deploy package, so "use ksqlDB when you can" is a fair rule of thumb.

A more nuanced answer takes into account your specific use cases and the programming languages your team is comfortable with: Kafka Streams is only available in Java and Scala, whereas ksqlDB is open to anyone who can write a SQL statement.

This blog post goes into good detail about the tradeoffs.

How fast is ksqlDB?

It's fair to describe ksqlDB as an accessible, high-level way of using Kafka Streams. Streams is the engine under the hood, so you can expect ksqlDB to perform similarly.

That raises the question, "How fast is Kafka Streams?" The answer depends on a number of factors, including your topics' partition sizes, the size of the persistent datasets in your tables and joins, and the number of ksqlDB server instances you spread the load over. For a deep dive on that topic, see Guozhang Wang's Kafka Summit talk, Performance Analysis and Optimizations for Kafka Streams Applications.
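
As a rough way to reason about spreading load over server instances, the sketch below does the arithmetic under two simplifying assumptions: Kafka Streams creates one task per input partition, and tasks are spread evenly over all stream threads. Real task assignment depends on the query's topology, so treat the numbers as an upper bound on parallelism rather than a performance prediction. The class and method names are hypothetical.

```java
class StreamsParallelismSketch {

    // Total stream threads across all server instances.
    static int totalThreads(int instances, int threadsPerInstance) {
        if (instances <= 0 || threadsPerInstance <= 0) {
            throw new IllegalArgumentException("instances and threads must be positive");
        }
        return instances * threadsPerInstance;
    }

    // Largest number of tasks any one thread handles, assuming an even spread
    // and one task per input partition (ceiling division).
    static int maxTasksPerThread(int partitions, int instances, int threadsPerInstance) {
        int threads = totalThreads(instances, threadsPerInstance);
        return (partitions + threads - 1) / threads;
    }

    // Threads left without work because there are fewer tasks than threads.
    static int idleThreads(int partitions, int instances, int threadsPerInstance) {
        return Math.max(0, totalThreads(instances, threadsPerInstance) - partitions);
    }

    public static void main(String[] args) {
        // 12 partitions over 2 instances with 4 threads each: 8 threads in total.
        System.out.println(maxTasksPerThread(12, 2, 4)); // prints 2
        System.out.println(idleThreads(12, 2, 4));       // prints 0

        // Doubling to 4 instances gives 16 threads for only 12 tasks.
        System.out.println(maxTasksPerThread(12, 4, 4)); // prints 1
        System.out.println(idleThreads(12, 4, 4));       // prints 4
    }
}
```

Under these assumptions, adding instances only helps until every task has its own thread. Beyond that point the extra threads sit idle, and more partitions are needed to scale further.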

What’s the difference between ksqlDB and KSQL?

ksqlDB was originally released under the name KSQL. They're the same thing, but older documentation may still refer to it as KSQL.

KSQL is also still used to refer to the actual language used to program ksqlDB.

Learn more with these free training courses

Apache Kafka 101

Learn how Kafka works, how to use it, and how to get started.

Spring Framework and Apache Kafka®

This hands-on course will show you how to build event-driven applications with Spring Boot and Kafka Streams.

Building Data Pipelines with Apache Kafka® and Confluent

Build a scalable, streaming data pipeline in under 20 minutes using Kafka and Confluent.

Confluent Cloud is a fully managed Apache Kafka service available on all three major clouds. Try it for free today.
