Senior Software Engineer (Presenter)
Integration Architect (Author)
Stateful operations in Kafka Streams, which represent the present state of an event stream, include aggregate, reduce, and even KTable, since KTable uses an update stream.
With event streams, a typical pattern is reporting—something like a dashboard application. You can do analytics over the longer term, but you also want to get a window into what's happening at the moment, especially with aggregations and counts. This will let you do alerting, for example.
Reporting usually requires the streaming system to write out its contents to an external database, where it is then queried by a UI layer for live use.
Kafka Streams, however, enables you to directly query the existing state of a stateful operation or a table, without the need of a SQL layer. You do this using interactive queries. They are live, you can see them as they happen, and you don't need to write their intermediate results to an external database. Interactive queries provide a materialized view of your operations in real time, simplifying your architecture.
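The idea of a live, read-only view over processing state can be modeled in a few lines of plain Java. This is a conceptual sketch only, not the Kafka Streams API: the class `LiveCountView`, the interface `ReadOnlyCounts`, and the event keys are all hypothetical names chosen for illustration. The processing side updates a count per key as events arrive, while the query side can only read the latest value.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Conceptual model only: plain JDK classes, no Kafka Streams dependency.
final class LiveCountView {

    // Hypothetical read-only handle: callers can look values up but not change them.
    interface ReadOnlyCounts {
        Long get(String key);
    }

    // Stands in for a state store holding the latest aggregation result per key.
    private final Map<String, Long> counts = new ConcurrentHashMap<>();

    // Processing side: every incoming event updates the count for its key.
    void onEvent(String key) {
        counts.merge(key, 1L, Long::sum);
    }

    // Query side: exposes lookups only, backed by the same live map,
    // so reads always reflect the most recent update.
    ReadOnlyCounts view() {
        return counts::get;
    }

    public static void main(String[] args) {
        LiveCountView processor = new LiveCountView();
        ReadOnlyCounts view = processor.view();

        processor.onEvent("checkout");
        processor.onEvent("checkout");
        processor.onEvent("login");
        System.out.println("checkout=" + view.get("checkout"));

        // No export step is needed: the next read sees the new event.
        processor.onEvent("checkout");
        System.out.println("checkout=" + view.get("checkout"));
    }
}
```

Nothing is copied to a second system here, which is the architectural point: the store that the aggregation writes to is the same one the dashboard reads from.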
KTables and aggregations are the eligible targets for Interactive Queries. To enable them, you name the state store via a Materialized object or use the Stores factory class; the Stores class has several methods that you can use to create a state store. (Learn more in the Hands On: Aggregations and Hands On: Processor API exercises.) You also need to set the application.server configuration, specifying the host and port, and provide the serving layer yourself, usually as a REST API. Note that each instance shares metadata with all of the other instances that have the same application ID.
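A minimal sketch of the configuration and routing side, using only the JDK, might look like the following. The keys `application.id`, `bootstrap.servers`, and `application.server` are Kafka Streams configuration names; every value, the class `InteractiveQueryRouting`, and the `ownerByKey` map are hypothetical. In particular, the map is a hard-coded stand-in for the metadata that instances share with each other, so treat the routing logic as an illustration of the decision a serving layer has to make rather than as the library's mechanism.

```java
import java.util.Map;
import java.util.Properties;

// Sketch only: plain JDK classes, no Kafka Streams dependency.
final class InteractiveQueryRouting {

    // Builds the configuration for one application instance.
    // All instances of the same application share the application ID,
    // while application.server is unique to each instance.
    static Properties streamsProperties(String applicationId,
                                        String bootstrapServers,
                                        String hostAndPort) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("bootstrap.servers", bootstrapServers);
        props.put("application.server", hostAndPort);
        return props;
    }

    // Decides where a lookup for `key` should go.
    // `ownerByKey` is a hypothetical stand-in for shared instance metadata:
    // it maps a key to the host:port of the instance holding its store.
    static String route(String key, Map<String, String> ownerByKey, Properties props) {
        String self = props.getProperty("application.server");
        String owner = ownerByKey.get(key);
        if (owner == null) {
            return "unknown key";
        }
        // Answer from the local store when this instance owns the key,
        // otherwise hand the request to the owning instance.
        return owner.equals(self) ? "local" : "forward to " + owner;
    }

    public static void main(String[] args) {
        Properties props = streamsProperties("dashboard-app", "localhost:9092", "host-a:7070");
        Map<String, String> ownerByKey = Map.of(
                "alice", "host-a:7070",
                "bob", "host-b:7070");

        for (String key : new String[] {"alice", "bob", "carol"}) {
            System.out.println(key + " -> " + route(key, ownerByKey, props));
        }
    }
}
```

In a real deployment the forwarding step would be an HTTP call made by the REST layer you provide, and the ownership lookup would come from Kafka Streams metadata instead of a fixed map.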
Hi, I'm Sophie Blee-Goldman with Confluent. In this module, we're going to talk about Interactive Queries. So, Kafka Streams has stateful operations, as we've seen before, such as aggregations. And these often represent the present state of an event stream. So, in an aggregation, you typically would have a state store, which is holding key-value pairs that represent the latest value for a particular key or aggregation result.
Now, a typical pattern with event streaming is reporting. So, let's say you have a dashboard application and you have a streaming system, which is computing some results, and it generally has to write these results to a database somewhere. Then, the UI layer for this dashboard is going to be querying the database in order to get a live, continuously updated view when you're looking at this dashboard.
Now, Kafka Streams already provides the stream processing layer, but it also provides the storage layer. The Kafka Streams stateful operations, such as aggregations, will each have a state store, which holds the current or present state of the event stream, so it has the latest value of the aggregation, for example. Now, interactive queries are a tool that actually lets you query this state as well as query KTable instances, so you can have your stream processing and your state querying all in the same system.
Now, what exactly is eligible for interactive queries? Well, a query is against some kind of state. So, this will actually be a KTable or an aggregation, where an aggregation can be any of the aggregation operations, such as count or reduce, that we saw earlier. To enable interactive queries, you just need to name the state store that you are going to query, and this can be done using a Materialized object or using the Stores factory, if you are in the Processor API layer.
Doing so will make these state stores queryable, but to actually set up the application for interactive queries, you need to give it an application.server configuration. This just means passing in the host and port, so that Kafka Streams knows which instance of the application you're talking to. So, each application instance actually has the metadata for all of the other Streams instances with the same application ID, and this tells them the host and port for a particular store that you might wish to query. Not all instances will actually hold a copy of every single store. This just depends on the task assignment, so you might need to query a particular instance to get information from a particular store. Now, Kafka Streams can tell you which instance to query, but you, the developer, need to provide the actual serving layer to route your request to the correct Kafka Streams instance.
So, Kafka Streams does it all without the need for an external database by allowing this kind of direct read-only access to state stores and KTables. Now, it's important to note that interactive queries are a read-only kind of access. You can't use them to actually modify the underlying state store, but that's generally fine because you're only using this to query the results of an ongoing aggregation or something like the topic that is backing a KTable. The nice thing is that it's live, so you get these updates as they're happening and you don't need to do anything like write out to a database and read that back. It's all happening in the stream processing layer, which greatly simplifies the architecture. This gives you a materialized view of the operation in real time, and that's what you can do in Kafka Streams.
Now, there's also the option to use SQL with ksqlDB to create tables. You should take a look at our ksqlDB course to learn more. Thank you for your time.