January 6, 2022 | Episode 193

Real-Time Change Data Capture and Data Integration with Apache Kafka and Qlik


Getting data from a database management system (DBMS) into Apache Kafka® in real time is a subject of ongoing innovation. John Neal (Principal Solution Architect, Qlik) and Adam Mayer (Senior Technical Product Marketing Manager, Qlik) explain how leveraging change data capture (CDC) for data ingestion into Kafka enables real-time data-driven insights.

It can be challenging to ingest data in real time, and even more so when the data comes from multiple sources: traditional databases, mainframes, and enterprise systems such as SAP and Oracle. Extracting data in batches for transfer and replication is slow and often incurs significant performance penalties on the source. Analytical queries are even more resource intensive and are prohibitively expensive to run against production transactional databases. CDC instead captures source operations as a sequence of incremental change events and writes those events to Kafka.
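To make the idea concrete, here is a minimal sketch of what a row-level change event might look like once it has been captured and published to Kafka. This is not Qlik's implementation; the topic name, event shape, and field names are illustrative assumptions, and the producer uses the confluent-kafka Python client.

```python
# Sketch only: one captured source operation represented as a change event
# and written to a Kafka topic. Event structure and topic are hypothetical.
import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

# A hypothetical change event for a single row-level UPDATE on a source table.
change_event = {
    "op": "UPDATE",                       # INSERT / UPDATE / DELETE
    "table": "ORDERS",
    "ts_ms": 1641484800000,               # commit time of the source operation
    "before": {"order_id": 42, "status": "PENDING"},
    "after":  {"order_id": 42, "status": "SHIPPED"},
}

# Keying by primary key keeps all changes for a row in the same partition,
# so downstream consumers see that row's changes in order.
producer.produce(
    "cdc.orders",
    key=str(change_event["after"]["order_id"]),
    value=json.dumps(change_event),
)
producer.flush()
```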

Once this data is available in Kafka topics, it can serve both analytical and operational use cases. Data can be consumed and modeled for analytics by individual groups across your organization. Meanwhile, the same topics can power microservice applications and support data governance without impacting your production data source. Kafka makes it easy to integrate your CDC data into data warehouses, data lakes, NoSQL databases, microservices, and any other system.
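As a rough illustration of the consuming side, the sketch below reads the same hypothetical CDC topic and maintains a simple live view of order statuses, the kind of thing an analytical or operational consumer might do. Topic, group ID, and event fields are the same assumptions as above, again using the confluent-kafka Python client.

```python
# Sketch only: consuming a CDC topic and folding the change stream into a
# simple in-memory view (order_id -> latest status). Names are illustrative.
import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "orders-analytics",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["cdc.orders"])

order_status = {}  # materialized from the stream of change events

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        event = json.loads(msg.value())
        if event["op"] == "DELETE":
            order_status.pop(event["before"]["order_id"], None)
        else:
            order_status[event["after"]["order_id"]] = event["after"]["status"]
finally:
    consumer.close()
```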

Adam and John highlight a few use cases where they see real-time Kafka data ingestion, processing, and analytics moving the needle, including real-time customer predictions, supply chain optimization, and operational reporting. Finally, they cap it off with a discussion of how capturing and tracking data changes is critical for enriching data quality and keeping machine learning models accurate.

Continue Listening

Episode 1 | June 20, 2018 | 22 min

Ask Confluent #1: Kubernetes, Confluent Operator, Kafka and KSQL

Tim Berglund and Gwen Shapira discuss Kubernetes, Confluent Operator, Kafka, KSQL and more.

Episode 2 | July 2, 2018 | 24 min

Ask Confluent #2: Consumers, Culture and Support

Gwen Shapira answers your questions and interviews Sam Hecht (Head of Support, Confluent).

Episode 194 | January 13, 2022 | 29 min

From Batch to Real-Time: Tips for Streaming Data Pipelines with Apache Kafka ft. Danica Fine

Implementing an event-driven data pipeline can be challenging, but doing so within the context of a legacy architecture is even more complex. Having spent three years building a streaming data infrastructure and being on the first team at a financial organization to implement Apache Kafka event-driven data pipelines, Danica Fine (Senior Developer Advocate, Confluent) shares about the development process and how ksqlDB and Kafka Connect became instrumental to the implementation.

