
Gilles Philippart

Software Practice Lead


About this course

How do you move from a simple proof of concept to a bulletproof data streaming system that is ready for production deployment? In this course, you will learn how to avoid pitfalls when scaling your data streaming platform. Additionally, you'll delve deep into the GitOps framework, allowing you to deliver changes swiftly and securely, not only to your platform but also to the streaming applications built on it.

What You’ll Learn in This Course

  • Gather the data requirements
  • Design the Data Streaming platform
  • Plan for business continuity
  • Automate the road to production with GitOps
  • Build a staging and production data streaming platform with Terraform
  • Operate the platform
  • Deploy streaming applications the GitOps way with FluxCD
  • Productionize streaming applications
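To give a flavor of the Terraform-based platform build-out covered later, here is a minimal sketch using the Confluent Terraform provider. The environment names, cloud, and region are illustrative assumptions, not values from the course:

```hcl
terraform {
  required_providers {
    confluent = {
      source = "confluentinc/confluent"
    }
  }
}

# One environment per stage keeps staging and production isolated.
resource "confluent_environment" "staging" {
  display_name = "staging"
}

resource "confluent_environment" "production" {
  display_name = "production"
}

# A basic single-zone cluster is fine for staging; production would
# typically use a standard or dedicated multi-zone cluster instead.
resource "confluent_kafka_cluster" "staging" {
  display_name = "staging-cluster"
  availability = "SINGLE_ZONE"
  cloud        = "AWS"
  region       = "us-east-2"
  basic {}

  environment {
    id = confluent_environment.staging.id
  }
}
```

Keeping this configuration in version control is what makes the GitOps workflow described in the course possible: every change to the platform becomes a reviewable commit.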

Intended Audience

Anyone who has a basic understanding of Apache Kafka.


Duration

Approximately 3 hours.

Cheat Sheet

Here's the cheat sheet PDF. You can print it out and take it into the field with you to assess whether you're ready for production!

Course Author

Gilles Philippart

Gilles works at Confluent as a Software Practice Lead and has over twenty years of experience in the financial sector, primarily developing systems for investment banks. In recent times, he served as a Principal Engineer at a FinTech company, where he built a platform that offers loans to small and medium businesses in the blink of an eye. Through this work, he became a proponent of event-driven architectures with Apache Kafka and is now passionate about guiding others to leverage its advantages. Outside of work, Gilles enjoys reading books, whipping up culinary delights, and playing games with his daughter and wife.

Use the promo code KAFKAPROD101 to get $25 of free Confluent Cloud usage



Hi, I’m Gilles from Confluent. Welcome to the “Mastering Production Data Streaming Systems with Kafka” course. If you’re just starting with Kafka, you might have built a proof-of-concept data streaming application on a laptop or in a test environment

and you’re probably wondering what you should do to take it to production

and how to assess that you’re ready for day 2 after the go-live.

At Confluent, we’ve heard those questions many times. If you’re not careful, what seemed like a stroll in the park could become a risky undertaking. In this course, I will go through what you need to know to build a production-ready system. I will talk about infrastructure, architecture, and governance.

In particular, we will see how data, the platform, and the applications influence each other. You will learn how to create a data streaming platform and make it easy to operate and evolve, quickly and safely. To help, I’ve compiled a cheat sheet PDF for this course.

If you’re watching us on YouTube, check out the link in the video description.

If you’re watching us on Confluent Developer, it is in the additional resources below. Alright, let’s get started.

This is what a typical Data in Motion journey looks like. The arrow marks where you currently stand: experimenting with a POC at level 1.

Level 2 is usually when you develop a non-mission-critical workload, maybe an analytics use case, and deliver it to production.

If the platform has performance problems or the infrastructure fails, it’s not a big deal because it’s not a core business application.

But for this course, our goal is to create an event streaming platform

and deliver a mission-critical event streaming application in production. That’s level 3. Mission-critical means that the application is essential to the core operations of the company. For instance, a trading application in a bank or a customer order application in an online retail store would be considered mission-critical. This course will take you on this journey, and we'll cover everything you need to know along the way. But before we move on, let’s take a quick look at what the future holds.

You might be wondering what lies beyond level 3.

The really cool thing about data streaming starts at level 4, when the business value skyrockets as you extend the platform to other departments. Eventually, in the last level, it truly becomes a central nervous system that makes data easily and instantly accessible across the enterprise. Levels 4 and 5 deal with that further expansion and will be covered in the follow-up course. In the meantime, let’s carry on with the rest of this course

and see exactly what it means to be “production ready”.

You can look at five quality attributes to gauge whether your system is production-worthy. First, it needs to be secure.

The last thing you want is for hackers to steal your data or for your customers' personal information to leak onto the internet. Second, your system must also be reliable.

It should be designed to prevent and mitigate failure. Third, performance is often critical too, as customers have plenty of options available nowadays.

If you cannot serve them quickly enough, they will go to your competitors. Next, operational excellence.

Like most businesses today, developer speed and agility are key, so your streaming system must be easy to support and evolve.

After all, from day 1 in production, you will have to run, monitor and update your system. Last but not least, you want to ensure that your costs don’t explode, in particular when you need to scale the platform up.

But how do we achieve those qualities? The first foundation to build is architecture. In the event streaming world, the architecture has 3 main pillars:

Data, Platform, and Applications.
These 3 pillars influence each other: In the Kafka world, the data is stored in the platform in topics.

The platform provides APIs for applications to produce, consume and process the data. Just to be clear, when we say “Platform”, we don’t mean just components like Apache Kafka, Kafka Connect, or the Schema Registry. This “Platform” also encompasses the services and the processes around this streaming infrastructure.
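As a rough mental model of those produce and consume APIs, here is a toy sketch in Python. It is an illustration only, not a real client: actual applications would use a library such as confluent-kafka to talk to brokers over the network, and Kafka's default partitioner hashes keys with murmur2 rather than the MD5 used here for determinism:

```python
import hashlib

class Topic:
    """A toy model of a Kafka topic: an ordered log per partition."""

    def __init__(self, name, num_partitions=3):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        # Keyed records are hashed to a partition, so records with the
        # same key always land in the same partition, in order.
        idx = int(hashlib.md5(key.encode()).hexdigest(), 16) % len(self.partitions)
        self.partitions[idx].append((key, value))
        return idx

    def consume(self, partition, offset=0):
        # Consumers read a partition sequentially, starting at an offset.
        return self.partitions[partition][offset:]

orders = Topic("orders")
p = orders.produce("customer-42", "order created")
orders.produce("customer-42", "order shipped")
print(orders.consume(p))
# -> [('customer-42', 'order created'), ('customer-42', 'order shipped')]
```

The point of the sketch is the contract, not the implementation: producers append keyed records, the platform decides placement, and consumers read each partition in order from an offset they control.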

Think of it as the glue that holds everything together. When companies want to be successful, they usually put together a team to take care of their event streaming platform.

This team looks after the platform, helps people onboard, and troubleshoots issues. But if you don’t have a dedicated platform team in your company right now, don’t worry!

At the start of this journey, it’s probably OK for your team to own the platform and support it for a while. But think for a moment about what would happen as more data and more applications are introduced.

If you’re not careful, it will get harder to keep everything running as smoothly as in the early days. These new applications are run by other teams, and now you need to make sure that everyone abides by the rules and that security and access policies are set up correctly. The new data and applications will put more strain on the streaming platform infrastructure. They could even make your own application fall apart as latency rises and timeouts become more frequent. This is what we call the Accidental Shared Service: it happens when you share your platform with multiple tenants without really having the time or budget to plan for it.

Your team will soon be struggling with support and onboarding,

and you run the risk of slowing everyone down in your company

if you’re not reactive enough when others have problems with your platform. What started as a tidy and coherent place will gradually become cluttered and chaotic without proper guidance. How do we solve this problem?

Sharing your platform with others requires two additional foundations:

Governance and Organization. Governance and Organization are advanced topics and will not be covered in this course. However, we will discuss them in more detail in the follow-up course, as they become crucial when extending the platform to other areas of the enterprise.

In the meantime, “a little planning can go a long way”. Over the next modules, I’m going to highlight a few best practices we’ve seen that will save you a lot of time and trouble later on. OK, now let’s focus back on the architecture and take a look at its first pillar: Data.

Throughout this course, we’ll introduce you to the topic with hands-on exercises for producing data to and consuming data from Confluent Cloud. If you haven’t already signed up for Confluent Cloud, sign up now so that you’re ready when your first exercise asks you to log in. And be sure to use the promo code in the description: it provides enough free credits to do all of the exercises in this course.