course: Apache Kafka® for Python Developers

Beyond Simple Python Apps

5 min
Dave Klein

Senior Developer Advocate

Overview

In this lecture, you will learn how you can use Python to satisfy the requirements of more complex Kafka event streaming use cases. Follow along as Dave Klein (Senior Developer Advocate, Confluent) covers what your next steps might be using the Python development skills you learned in this course.

Resources

https://www.confluent.io/blog/event-driven-microservices-with-python-and-kafka/
https://www.confluent.io/ko-kr/blog/kafka-python-asyncio-integration/
https://www.confluent.io/ko-kr/blog/machine-learning-with-python-jupyter-ksql-tensorflow/
https://www.confluent.io/blog/real-time-syslog-processing-with-apache-kafka-and-ksql-part-2-event-driven-alerting-with-slack/

Use the promo code PYTHONKAFKA101 & CONFLUENTDEV1 to get $25 of free Confluent Cloud usage and skip credit card entry.

Beyond Simple Python Apps

Hi, Dave Klein here again. In this last module of the Apache Kafka for Python Developers course, we'll look at some options for event streaming as well as some helpful resources as you continue your Kafka journey. Let's get started.

If you've come this far in the course, then you have all the tools you'll need to build basic Kafka systems, with one or more producer applications sending events to Kafka topics, and one or more consumer applications reading and processing those events in near real time. There is a plethora of problems that can be solved with some variation of this theme.

For example, event-driven microservices. A growing number of developers are realizing that event-driven microservices provide many advantages over the traditional request-response architecture. Rather than having application one make an API call to application two, it can produce an event to a Kafka topic. Application two can then consume that event, do its work, and produce a new event to a different topic. Application three can be listening to that topic and respond as soon as an event arrives. Not only does this lead to a lower degree of coupling between the microservice applications, it also provides opportunities for expansion, as new applications can respond to those same events without affecting the current architecture.

Events in Kafka topics are also replayable. If, for example, we find a bug in one of our applications, we can fix that bug, reset the offsets on the consumers involved, and replay past events with the corrected code in place. All this can be done in our Python applications using basic producers and consumers.

However, there are times when we need to do more complex processing. For this, we can use something like ksqlDB. ksqlDB is a stream-processing engine based on Kafka Streams which allows us to build complex and powerful event-streaming applications using SQL.
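
As a brief aside before going further with ksqlDB, the consume, process, and produce pattern and the replay idea described above might look roughly like the following for "application two". This is only a sketch under stated assumptions: the topic names, the group id, the broker address, and the shape of the order event are all hypothetical, and plain JSON is used instead of a Schema Registry serializer to keep it short. Replay is done by rewinding each assigned partition to `OFFSET_BEGINNING` in the `on_assign` callback of `subscribe`.

```python
import json

ORDERS_TOPIC = "orders"                # hypothetical input topic
VALIDATED_TOPIC = "orders-validated"   # hypothetical output topic


def handle_order(event: dict) -> dict:
    """The 'work' of application two: derive a new event from an incoming one."""
    total = sum(item["price"] * item["quantity"] for item in event["items"])
    return {
        "order_id": event["order_id"],
        "total": round(total, 2),
        "status": "validated",
    }


def run(replay: bool = False) -> None:
    """Consume from one topic, process each event, and produce to another topic."""
    # Imported here so handle_order can be exercised without a Kafka client installed.
    from confluent_kafka import OFFSET_BEGINNING, Consumer, Producer

    config = {"bootstrap.servers": "localhost:9092"}  # assumed broker address
    consumer = Consumer(
        {**config, "group.id": "order-validator", "auto.offset.reset": "earliest"}
    )
    producer = Producer(config)

    def on_assign(consumer, partitions):
        # To replay past events (for example after a bug fix), rewind every
        # assigned partition to its beginning before consumption starts.
        if replay:
            for partition in partitions:
                partition.offset = OFFSET_BEGINNING
            consumer.assign(partitions)

    consumer.subscribe([ORDERS_TOPIC], on_assign=on_assign)
    try:
        while True:
            msg = consumer.poll(1.0)
            if msg is None:
                continue  # nothing arrived within the timeout
            if msg.error():
                print(f"Consumer error: {msg.error()}")
                continue
            result = handle_order(json.loads(msg.value()))
            producer.produce(
                VALIDATED_TOPIC,
                key=str(result["order_id"]),
                value=json.dumps(result),
            )
            producer.poll(0)  # serve delivery callbacks
    except KeyboardInterrupt:
        pass
    finally:
        consumer.close()
        producer.flush()
```

Calling `run()` starts the service normally, while `run(replay=True)` reprocesses the topic from the beginning with the corrected code in place. Keeping the processing logic in `handle_order`, separate from the Kafka wiring, also makes it easy to unit test.
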
ksqlDB is a separate application that we would run in its own cluster, or as a managed service on Confluent Cloud. It works with the same Kafka topics as our applications, so it's easy to integrate with, and it can do a lot of heavy lifting for us with a few lines of SQL code.

There are a couple of ways to integrate our applications with ksqlDB. As we mentioned, ksqlDB shares the same topics as our applications, so, for example, we might have a Django app that is producing data to topic A. ksqlDB can consume that data, perform whatever processing or data transformations we need, and write its output to topic B. Then we might have a Python consumer application that will subscribe to topic B and use that transformed data to complete some process. Another way to integrate with ksqlDB is to use its HTTP API. We can execute queries that will perform the necessary processing and return results, either as a single response or as a continuously updated stream. This can be a great way to build real-time dashboards.

However we integrate with ksqlDB, the actual processing is defined using SQL. Let's take a look at an example. Let's say we have an application that is producing events to the orders topic using JSON Schema. We can create a ksqlDB stream from that topic using CREATE STREAM, and with the VALUE_FORMAT property, we can tell ksqlDB that we are using JSON Schema, and it will infer all of our fields and data types from the schema. This is just one more in a long list of benefits of using schemas. Next, say we want to filter order events from California and write them to another topic. We can create a new stream based on the results of a SELECT statement, using the WHERE clause to get the events we want. Then, if we want all those California orders fed to another application, for example to show up on a live dashboard, we can execute a push query by adding EMIT CHANGES to a SELECT statement. This will return a continuous stream of our California order events in real time.
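
The three statements described above, and the HTTP API integration, could be sketched from Python along these lines. Treat this as an illustration under assumptions rather than the exact code from the lecture: the server address, the `ca_orders` stream name, and the `state` column are hypothetical, and a self-managed ksqlDB server without authentication is assumed (Confluent Cloud would also need API credentials). The `/ksql` endpoint accepts statements such as CREATE STREAM, and the `/query` endpoint streams back the results of a push query.

```python
import json
import urllib.request

KSQLDB_URL = "http://localhost:8088"  # assumed address of a local ksqlDB server

# JSON_SR tells ksqlDB the values use JSON Schema via Schema Registry,
# so the columns are inferred from the registered schema.
CREATE_ORDERS = (
    "CREATE STREAM orders WITH (KAFKA_TOPIC='orders', VALUE_FORMAT='JSON_SR');"
)

# Persistent query writing California orders to a new stream and topic.
# The 'state' column is a hypothetical field of the order schema.
CREATE_CA_ORDERS = (
    "CREATE STREAM ca_orders AS "
    "SELECT * FROM orders WHERE state = 'CA' EMIT CHANGES;"
)

# Push query: EMIT CHANGES keeps the response open and streams new rows.
PUSH_QUERY = "SELECT * FROM ca_orders EMIT CHANGES;"


def build_request(endpoint: str, sql: str) -> urllib.request.Request:
    """Build a POST request for a ksqlDB REST endpoint without sending it."""
    body = json.dumps({"ksql": sql, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        f"{KSQLDB_URL}{endpoint}",
        data=body,
        headers={"Content-Type": "application/vnd.ksql.v1+json"},
        method="POST",
    )


def run_statement(sql: str):
    """Send a statement (for example CREATE STREAM) and return the parsed reply."""
    with urllib.request.urlopen(build_request("/ksql", sql)) as response:
        return json.load(response)


def stream_query(sql: str):
    """Yield the raw, non-empty lines of a push query as the server sends them."""
    with urllib.request.urlopen(build_request("/query", sql)) as response:
        for line in response:
            line = line.strip()
            if line:
                yield line.decode("utf-8")
```

With a server running, calling `run_statement(CREATE_ORDERS)` and `run_statement(CREATE_CA_ORDERS)` would set up the streams, and iterating over `stream_query(PUSH_QUERY)` would feed rows to something like a dashboard until the connection is closed. The lines are yielded unparsed because the exact framing of the streamed response depends on the endpoint and server version.
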
This was just a brief introduction to ksqlDB, since that's not the purpose of this course. However, if you're interested in learning more about what you can do with this powerful stream-processing tool, I encourage you to check out the ksqlDB 101 course on Confluent Developer.

While ksqlDB is an excellent tool for event-streaming use cases, it's not the only one. If you'd prefer to handle this work with Python, there are a couple of open-source projects you can take a look at. These libraries will require you to write more code than you'd need with ksqlDB, but since they're pure Python, you can include them directly in your applications just as you would with the producer and consumer. Both of these are available on PyPI, and you can learn more about them from their respective GitHub repositories.

Now you have the tools you need to get started with Python and Apache Kafka, but there's so much more to learn in the world of Kafka and event streaming. I would like to encourage you to continue your learning journey at Confluent Developer. There are over a dozen in-depth courses as well as tutorials, blog posts, audio podcasts, and so much more. There are also some great books on the subject. These four are available for free download from Confluent, but don't let the price fool you. These are excellent books by some of the leading thinkers and practitioners in the field of streaming data.

And finally, learning a new technology is a journey, and like any journey, it's way more enjoyable and productive when you don't go it alone. There are thousands of developers, data engineers, operators, and enthusiasts in this community. You can find many of them at Kafka user groups around the world. The Confluent Community site can help you find the group nearest you. Many of them are meeting online, so you can join in from wherever you are.
Also, be sure to join the Confluent Community Forum and Slack group to introduce yourself, ask questions, and, as you continue to learn, answer a few. I'll be hanging out there and would love to hear more about what you are building with Python and Apache Kafka.

Thank you for spending this time with me. If you are not already on Confluent Developer, head there now using the link in the video description to access other courses, hands-on exercises, and many other resources for continuing your learning journey.
