October 3, 2022 | Episode 237

Apache Kafka 3.3 - KRaft, Kafka Core, Streams, & Connect Updates


Danica Fine: (00:00)

Welcome to Streaming Audio. I'm Danica Fine, Senior Developer Advocate at Confluent. You're listening to a special episode where I have the honor of announcing the Apache Kafka 3.3 release on behalf of the Kafka community. There are so many great KIPs and updates to highlight in this release, so let's get to it.

Danica Fine: (00:25)

Hi, I'm Danica Fine with Confluent, here to tell you what's new in Apache Kafka 3.3. We have a number of great KIPs in this release, so without further ado, let's jump right in.

Danica Fine: (00:35)

As usual, the release is broken up based on what each KIP pertains to. In this release, we'll cover updates from Kafka Core, Kafka Streams, and Kafka Connect. First up for Kafka Core, we have KIP-709, which affects OffsetFetch requests. Previously, applications had to submit a separate OffsetFetch request to the group coordinator for each group ID – as you can imagine, this can be tedious. With KIP-709, the process has been streamlined so that a single request can fetch offsets for multiple groups. Overall, this reduces request overhead and simplifies client-side code.
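As a sketch of what the batched lookup looks like from the client side (assuming kafka-clients 3.3+ on the classpath and a broker at localhost:9092; the group IDs here are placeholders):

```java
import java.util.Map;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListConsumerGroupOffsetsSpec;

public class MultiGroupOffsets {
    public static void main(String[] args) throws Exception {
        try (Admin admin = Admin.create(Map.of(
                AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"))) {
            // One call – and one OffsetFetch request per coordinator – covers
            // every group, instead of one round trip per group ID.
            admin.listConsumerGroupOffsets(Map.of(
                            "group-a", new ListConsumerGroupOffsetsSpec(),
                            "group-b", new ListConsumerGroupOffsetsSpec()))
                    .all().get()
                    .forEach((groupId, offsets) -> offsets.forEach((tp, meta) ->
                            System.out.printf("%s %s -> %d%n", groupId, tp, meta.offset())));
        }
    }
}
```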

Danica Fine: (01:15)

KIP-824 affects kafka-dump-log, the tool that allows operators to sample logs from topics. Now, unfortunately, this tool used to dump the entire log, which was inefficient. But, with KIP-824, the new max-bytes parameter allows operators to sample only a small slice of the log in a more efficient and scalable way.
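As an illustration, a sketch of the new flag (the segment path and byte limit are made up for the example):

```shell
# Dump only the first ~4 KB of a log segment instead of the whole file.
bin/kafka-dump-log.sh \
  --files /var/kafka-logs/my-topic-0/00000000000000000000.log \
  --max-bytes 4096 \
  --print-data-log
```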

Danica Fine: (01:37)

Also impacting operators is KIP-827, which exposes both the log directory total bytes and usable bytes metrics via the Kafka API. These can be used to programmatically check the state of various disk-related operations.
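Programmatically, the new values surface on `LogDirDescription`; a sketch (the broker ID and bootstrap address are illustrative):

```java
import java.util.Map;
import java.util.Set;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.LogDirDescription;

public class LogDirSpace {
    public static void main(String[] args) throws Exception {
        try (Admin admin = Admin.create(Map.of(
                AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"))) {
            // totalBytes()/usableBytes() were added by KIP-827; they are
            // OptionalLong because older brokers don't report them.
            Map<Integer, Map<String, LogDirDescription>> dirs =
                    admin.describeLogDirs(Set.of(0)).allDescriptions().get();
            dirs.forEach((broker, byPath) -> byPath.forEach((path, desc) ->
                    System.out.printf("broker %d %s: total=%s usable=%s%n",
                            broker, path, desc.totalBytes(), desc.usableBytes())));
        }
    }
}
```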

Danica Fine: (01:52)

Log recovery is an important process that is triggered whenever a broker starts up after an unclean shutdown. To better monitor this process, KIP-831 introduces two new metrics: RemainingLogsToRecover and RemainingSegmentsToRecover. Both metrics are offered on a per-thread basis.

Danica Fine: (02:12)

And finally, we have KIP-851. With the OffsetFetchRequest, there's a boolean requireStable flag to indicate whether to tolerate pending transactional offset commits in the group coordinator. The admin client uses OffsetFetchRequest in its listConsumerGroupOffsets method, where the requireStable flag was always set to false – which isn't very useful for getting committed offsets with exactly-once semantics. With this KIP, the AdminClient now has the ability to either include or exclude pending transactional offsets.
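A small sketch of the new knob (the group ID and connection details are placeholders):

```java
import java.util.Map;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListConsumerGroupOffsetsOptions;

public class StableOffsets {
    public static void main(String[] args) throws Exception {
        try (Admin admin = Admin.create(Map.of(
                AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"))) {
            // requireStable(true) asks the coordinator to fail the fetch rather
            // than return offsets shadowed by a pending transactional commit.
            admin.listConsumerGroupOffsets("my-group",
                            new ListConsumerGroupOffsetsOptions().requireStable(true))
                    .partitionsToOffsetAndMetadata().get()
                    .forEach((tp, meta) ->
                            System.out.printf("%s -> %d%n", tp, meta.offset()));
        }
    }
}
```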

Danica Fine: (02:47)

Next up we have a couple of Kafka Streams KIPs. KIP-820 extends Kafka Streams to use the latest processor API. With this update, it's now simple to chain results from processors and have more control over what's forwarded.
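For instance, here's a rough sketch of chaining a typed processor into a topology with the new API (topic names and the length-counting logic are invented for the example):

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.processor.api.ContextualProcessor;
import org.apache.kafka.streams.processor.api.Record;

public class TypedProcessorExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");

        // With KIP-820, process() is fully typed and returns a KStream, so
        // its results can be chained; dropping a record is as simple as not
        // forwarding it.
        KStream<String, Integer> lengths = input.process(() ->
                new ContextualProcessor<String, String, String, Integer>() {
                    @Override
                    public void process(Record<String, String> record) {
                        if (record.value() != null) {
                            context().forward(record.withValue(record.value().length()));
                        }
                    }
                });

        lengths.to("lengths-topic");
    }
}
```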

Danica Fine: (03:03)

The second and final update for Kafka Streams is KIP-834. This change adds pause and resume to Kafka Streams topologies, allowing users to pause the processing, punctuation, and standby tasks of the topology. Overall, this makes it easier to do a number of things, like reduce resource usage when processing is not required, modify the logic of Kafka Streams applications, and respond to operational issues.
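In code, the new lifecycle calls are just methods on the `KafkaStreams` client; a fragment (assuming `kafkaStreams` is an already-running instance):

```java
// Pauses processing, punctuation, and standby tasks; consumption
// positions are kept, so resume() picks up where it left off.
kafkaStreams.pause();

if (kafkaStreams.isPaused()) {
    // e.g., wait out a downstream outage, or save resources off-peak
    kafkaStreams.resume();
}
```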

Danica Fine: (03:29)

And finally, we have an exciting Kafka Connect KIP that should be music to everyone's ears. KIP-618 introduces exactly-once support for source connectors. To achieve this, a number of new connector- and worker-based configurations have been introduced. Now keep in mind that not all source connectors will support this, so I recommend that you read the details on the KIP itself to understand which of your connectors can take advantage of this new functionality.
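At a high level, the new settings look something like this (a sketch; whether `poll` is the right transaction boundary depends on the connector):

```properties
# Connect worker config (distributed mode): enable exactly-once support
# cluster-wide, via a two-step rolling upgrade ("preparing", then "enabled").
exactly.once.source.support=enabled

# Per-connector config: ask for (or insist on) exactly-once delivery and
# choose where transaction boundaries fall.
exactly.once.support=required
transaction.boundary=poll
```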

Danica Fine: (03:57)

All right, that's all that I have to share for now. Until the next release.

Danica Fine: (04:04)

What are you doing here?

Danica Fine: (04:15)

It's happening. That's right, folks. With KIP-833, KRaft has finally been marked production ready – for new clusters only. And just so you know, there are a few limitations. You won't want to use KRaft in production for new clusters if you're using SCRAM, JBOD, certain dynamic configurations, or delegation tokens. Check out the KIP for more details.
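For reference, a minimal single-node KRaft sketch of `server.properties` (IDs, ports, and paths are illustrative):

```properties
# The node acts as both broker and controller; no ZooKeeper required.
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=PLAINTEXT://localhost:9092
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
log.dirs=/tmp/kraft-logs
```

Before first start, the storage directory has to be formatted with a cluster ID, e.g. `bin/kafka-storage.sh format -t $(bin/kafka-storage.sh random-uuid) -c server.properties`.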

Danica Fine: (04:39)

A ton of work went into making this happen, and quite a lot was done to improve KRaft since Apache Kafka 3.2: ensuring that fenced replicas don't join the ISR when in KRaft mode, improving controller health checks, and adding new error metrics. It all adds up to production-ready KRaft for new clusters.

Danica Fine: (04:57)

Now, when can existing ZooKeeper-mode clusters migrate to KRaft? Great question – I'd love to tell you. While the timeline is subject to change, currently you can expect to be able to upgrade existing clusters from ZooKeeper mode to KRaft mode with Apache Kafka 3.5. The community is targeting early access with limited, unstable APIs as part of release version 3.4, and the current plan is to have version 3.5 be a bridge release. At the same time, you can expect ZooKeeper to be deprecated and then removed as part of Apache Kafka 4.0. And that's all I have to share for now.

Danica Fine: (05:34)

There are a ton of KIPs involved in this release though, so as usual, I encourage you to head on over to our Confluent blog or take a look at the release notes to check them out in more detail. And of course, I look forward to hearing about what you build.

Danica Fine: (05:47)

Those are the highlights from this latest Apache Kafka release. Thank you for listening to this very special episode. If you have any questions or would like to discuss, you can reach out to our community forum or Slack. Both are linked in the show notes. If you happen to be listening on Apple Podcast or other podcast platforms, please be sure to leave a review. We'd love to hear your feedback and if you're watching on YouTube, please subscribe so you'll be notified with updates you might be interested in. Thanks for your support, and see you next time.

Apache Kafka® 3.3 is released! With over two years of development, KIP-833 marks KRaft as production ready for new AK 3.3 clusters only. On behalf of the Kafka community, Danica Fine (Senior Developer Advocate, Confluent) shares highlights of this release, with KIPs from Kafka Core, Kafka Streams, and Kafka Connect. 

To reduce request overhead and simplify client-side code, KIP-709 extends the OffsetFetch API requests to accept multiple consumer group IDs. This update has three changes, including extending the wire protocol, response handling changes, and enhancing the AdminClient to use the new protocol. 

Log recovery is an important process that is triggered whenever a broker starts up after an unclean shutdown. Since there was previously no way to know the log recovery progress other than checking whether the broker log is busy, KIP-831 adds metrics for the log recovery progress with `RemainingLogsToRecover` and `RemainingSegmentsToRecover` for each recovery thread. These metrics allow admins to monitor the progress of log recovery.

Additionally, updates on Kafka Core also include:

  • KIP-841: Fenced replicas should not be allowed to join the ISR in KRaft
  • KIP-835: Monitor KRaft Controller Quorum health
  • KIP-859: Add metadata log processing error-related metrics

KIP-834 for Kafka Streams adds the ability to pause and resume topologies. This feature lets you reduce resource usage when processing is not required, modify the logic of Kafka Streams applications, or respond to operational issues. Meanwhile, KIP-820 extends `KStream.process` with the new processor API.

Previously, KIP-98 added support for exactly-once delivery guarantees with Kafka and its Java clients. In the AK 3.3 release, KIP-618 brings exactly-once semantics support to Kafka Connect source connectors. To accomplish this, a number of new connector- and worker-based configurations have been introduced, including `transaction.boundary` and more.


