April 14, 2021 | Episode 153

Connecting Azure Cosmos DB with Apache Kafka - Better Together ft. Ryan CrawCour

When building solutions in Microsoft Azure, it is common to meet customers who are deeply invested in the Apache Kafka® ecosystem and want to keep expanding within it. Figuring out how to connect Azure first-party services to that ecosystem is therefore essential.

Ryan CrawCour is a Microsoft engineer who has been working on all things data and analytics for the past 10+ years, including building out services like Azure Cosmos DB, which is used by millions of people around the globe. More recently, Ryan has taken a customer-facing role where he gets to help customers build the best solutions possible using Microsoft Azure’s cloud platform and development tools. 

In one case, Ryan helped a customer leverage their existing Kafka investments and persist event messages in a durable, managed database in Azure. They chose Azure Cosmos DB, a fully managed, distributed NoSQL database service. The open question was how to feed events from their Kafka infrastructure into Azure Cosmos DB, and how to get changes from the database back into their Kafka topics.

Although integration is in his blood, Ryan confesses that he is relatively new to the world of Kafka and has learned to adjust to what he finds in his customers’ environments. Oftentimes this is Kafka, and for many good reasons, customers don’t want to change this core part of their solution infrastructure. This has led him to embrace Kafka and the ecosystem around it, enabling him to better serve customers. 

He’s been closely tracking the development and progress of Kafka Connect. To him, it is the natural step from Kafka as a messaging infrastructure to Kafka as a key pillar in an integration scenario. Kafka Connect can be thought of as a piece of middleware that can be used to connect a variety of systems to Kafka in a bidirectional manner. This means getting data from Kafka into your downstream systems, often databases, and also taking changes that occur in these systems and publishing them back to Kafka where other systems can then react. 

One day, a customer asked him how to connect Azure Cosmos DB to Kafka. There wasn’t a connector at the time, so he helped build two with the Confluent team: a sink connector, where data flows from Kafka topics into Azure Cosmos DB, as well as a source connector, where Azure Cosmos DB is the source of data pushing changes that occur in the database into Kafka topics.
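To make the sink and source directions concrete, here is a minimal sketch of the JSON request bodies one might send to a Kafka Connect worker's REST API (POST /connectors) to register each connector. The episode notes do not give any configuration, so everything specific here is an assumption: the connector class names and the connect.cosmos.* property names are recalled from the connector's public documentation and should be checked against the version you install, and the endpoint, key, database, topic, and container names are hypothetical placeholders.

```python
import json


def build_connector_request(name, connector_class, endpoint, key,
                            database, topic_map, topics=None):
    """Build the JSON body for POST /connectors on a Kafka Connect worker.

    The connect.cosmos.* keys are assumptions based on the Cosmos DB
    connector's documentation; verify them for your connector version.
    """
    config = {
        "connector.class": connector_class,
        "tasks.max": "1",
        "connect.cosmos.connection.endpoint": endpoint,
        "connect.cosmos.master.key": key,
        "connect.cosmos.databasename": database,
        # Assumed format: "topic#container" pairs, comma separated.
        "connect.cosmos.containers.topicmap": topic_map,
    }
    # "topics" is a standard Kafka Connect setting for sink connectors only.
    if topics is not None:
        config["topics"] = ",".join(topics)
    return {"name": name, "config": config}


# Hypothetical placeholder values, not taken from the episode.
ENDPOINT = "https://example-account.documents.azure.com:443/"
KEY = "<cosmos-account-key>"

# Sink: data flows from the Kafka topic "orders" into a Cosmos DB container.
sink_request = build_connector_request(
    name="cosmosdb-sink-example",
    connector_class="com.azure.cosmos.kafka.connect.sink.CosmosDBSinkConnector",
    endpoint=ENDPOINT,
    key=KEY,
    database="exampledb",
    topic_map="orders#orders-container",
    topics=["orders"],
)

# Source: changes in the Cosmos DB container are published to a Kafka topic.
source_request = build_connector_request(
    name="cosmosdb-source-example",
    connector_class="com.azure.cosmos.kafka.connect.source.CosmosDBSourceConnector",
    endpoint=ENDPOINT,
    key=KEY,
    database="exampledb",
    topic_map="order-changes#orders-container",
)

if __name__ == "__main__":
    print(json.dumps(sink_request, indent=2))
    print(json.dumps(source_request, indent=2))
```

Either body could then be posted to the worker, for example at http://localhost:8083/connectors if the worker runs locally on the default port. In a real deployment the account key would come from a secret store rather than from source code.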

Continue Listening

Episode 154 | April 19, 2021 | 10 min

Apache Kafka 2.8 - ZooKeeper Removal Update (KIP-500) and Overview of Latest Features

Apache Kafka 2.8 is out! This release includes early access to the long-anticipated ZooKeeper removal encapsulated in KIP-500, as well as other key updates. Find out what’s new in this episode of Streaming Audio.

Episode 155 | April 22, 2021 | 31 min

Powering Microservices Using Apache Kafka on Node.js with KafkaJS at Klarna ft. Tommy Brunn

At Klarna, Lead Engineer Tommy Brunn is building a runtime platform for developers. But outside of his professional role, he is also one of the authors of the JavaScript client for Apache Kafka® called KafkaJS, which has grown from being a niche open source project to the most downloaded Kafka client for Node.js since 2018.

Episode 156 | April 29, 2021 | 28 min

Data Management and Digital Transformation with Apache Kafka at Van Oord

Imagine if you could create a better world for future generations simply by delivering marine ingenuity. Van Oord is a Dutch family-owned company that has served as an international marine contractor for over 150 years. It relies on an enterprise architecture model that must seamlessly integrate data lineage and data governance. The basis of their holistic reference architecture is a change data capture (CDC) layer and a persistent layer that makes Confluent the core component of their future-proof digital data management solution. In this episode, Marlon Hiralal (Enterprise/Data Management Architect, Van Oord) and Andreas Wombacher (Data Engineer, Van Oord) explain how it works.
