Get Started Free
December 14, 2021 | Episode 190

Lessons Learned From Designing Serverless Apache Kafka ft. Prachetaa Raghavan

  • Transcript
  • Notes
undefined

You might call building and operating Apache Kafka® as a cloud-native data service synonymous with a serverless experience. Prachetaa Raghavan (Staff Software Developer I, Confluent) spends his days focused on this very thing. In this podcast, he shares his learnings from implementing a serverless architecture on Confluent Cloud using Kubernetes Operator. 

Serverless is a cloud execution model that abstracts away server management, letting you run code on a pay-per-use basis without infrastructure concerns. Confluent Cloud's major design goal was to create a serverless Kafka solution, including handling its distributed state, its performance requirements, and seamlessly operating and scaling the Kafka brokers and Zookeeper. The serverless offering is built on top of an event-driven microservices architecture that allows you to deploy services independently with your own release cadence and maintained at the team level.

There are 4 subjects that help create the serverless event streaming experience with Kafka:

  1. Confluent Cloud control plane: This Kafka-based control plane provisions resources required to run the application. It automatically scales resources for services, such as managed Kafka, managed ksqlDB, and managed connectors. The control plane and data plane are decoupled—if a single data plane has issues, it doesn’t affect the control plane or other data planes. 
  2. Kubernetes Operator: The operator is an application-specific controller that extends the functionality of the Kubernetes API to create, configure, and manage instances of complex applications on behalf of Kubernetes users. The operator looks at Kafka metrics before upgrading a broker at a time. It also updates the status on cluster rebalancing and on shrink to rebalance data onto the remaining brokers. 
  3. Self-Balancing Clusters: Cluster balance is measured on several dimensions, including replica counts, leader counts, disk usage, and network usage. In addition to storage rebalancing, Self-Balancing Clusters are essential to making sure that the amount of available disk and network capability is satisfied during any balancing decisions. 
  4. Infinite Storage: Enabled by Tiered Storage, Infinite Storage rebalances data fast and efficiently—the most recently written data is stored directly on Kafka brokers, while older segments are moved off into a separate storage tier.  This has the added bonus of reducing the shuffling of data due to regular broker operations, like partition rebalancing. 

Continue Listening

Episode 191December 21, 2021 | 31 min

Running Hundreds of Stream Processing Applications with Apache Kafka at Wise

What’s it like building a stream processing platform with around 300 stateful stream processing applications based on Kafka Streams? Levani Kokhreidze (Principal Engineer, Wise) shares his experience building such a platform that the business depends on for multi-currency movements across the globe. He explains how his team uses Kafka Streams for real-time money transfers at Wise, a fintech organization that facilitates international currency transfers for 11 million customers.

Episode 192December 28, 2021 | 34 min

Modernizing Banking Architectures with Apache Kafka ft. Fotios Filacouris

Financial services have been early Apache Kafka adopters. With strong delivery guarantees and scalability, Kafka is a streaming platform that solves architectural gaps for banks. Having experience working and designing architectural solutions for financial services, Fotios Filacouris (Senior Solutions Engineer, Enterprise Solutions Engineering, Confluent) joins Tim to discuss how Kafka and Confluent help banks build modern architectures, highlighting key emerging use cases from the sector.

Episode 193January 6, 2022 | 34 min

Real-Time Change Data Capture and Data Integration with Apache Kafka and Qlik

Getting data from a database management system (DBMS) into Apache Kafka in real time is a subject of ongoing innovation. John Neal (Principal Solution Architect, Qlik) and Adam Mayer (Senior Technical Producer Marketing Manager, Qlik) explain how leveraging change data capture (CDC) for data ingestion into Kafka enables real-time data-driven insights.

Got questions?

If there's something you want to know about Apache Kafka, Confluent or event streaming, please send us an email with your question and we'll hope to answer it on the next episode of Ask Confluent.

Email Us

Never miss an episode!

Confluent Cloud is a fully managed Apache Kafka service available on all three major clouds. Try it for free today.

Try it for free