Tim Berglund

VP Developer Relations

Gilles Philippart

Software Practice Lead

Introduction to Apache Kafka® 101

Whether you're building data pipelines, connecting microservices, or moving data across systems, Apache Kafka® is foundational. In fact, it's become the backbone of modern data systems.

When you think about data, you probably think of tables representing objects, like inventory items or user accounts. But Kafka shifts the focus from things to events: moments in time when something happens, like a product sale, a car signaling a turn, or a user clicking on your site. Kafka processes these events in real time, not in delayed batches. You don't store events to process later; you act on them as they happen.

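That doesn't mean Kafka forgets events. It remembers them, and we'll explore how. This shift from things to events is powerful, and Kafka supports both ways of thinking: managing the schema of the events you store, for example, is very much a thing way of thinking. We'll cover how to write events to Kafka and read them back, how events are structured, and how Kafka manages storage and replication, and we'll compare Apache Kafka with cloud-native Kafka services like Confluent Cloud.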
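
To make the thing/event distinction concrete, here is a minimal Python sketch. It does not use Kafka or any Kafka client library; the `ItemSold` class, the `apply_sale` function, and the sample data are all hypothetical names invented for illustration. It only shows the general idea of keeping a record of events and updating a table-like view of "things" as each event arrives, rather than in a later batch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemSold:
    """A hypothetical event: something that happened at a point in time."""
    sku: str
    quantity: int
    timestamp: float  # seconds since the epoch


def apply_sale(inventory: dict, event: ItemSold) -> None:
    """Update the 'thing' view (current stock) for a single event."""
    inventory[event.sku] = inventory.get(event.sku, 0) - event.quantity


# "Thing" view: a table-like snapshot of current state.
inventory = {"widget": 10, "gadget": 5}

# "Event" view: an ordered record of what happened and when.
events = [
    ItemSold("widget", 2, 1700000000.0),
    ItemSold("gadget", 1, 1700000005.0),
    ItemSold("widget", 3, 1700000009.0),
]

# Handle each event in order, one at a time, instead of saving them
# up for a later batch job.
for event in events:
    apply_sale(inventory, event)
```

Note that the `events` list is still intact after processing: acting on events as they occur and remembering them are not mutually exclusive.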

Kafka powers massive-scale applications, handling millions of events per second. On top of it, a whole data streaming platform is emerging—enabling real-time computation, governance, and seamless integration with non-Kafka systems.

Do you have questions or comments? Join us in the #confluent-developer community Slack channel to engage in discussions with the creators of this content.

Use the promo codes KAFKA101 & CONFLUENTDEV1 to get $25 of free Confluent Cloud storage and skip credit card entry.

Introduction

Hi, I'm Tim Berglund with Confluent, and I want to tell you all about Apache Kafka. Kafka has become the universal foundation on which data systems are built. Whether you care about data pipelines and analytics, or you write applications and you've got microservices to connect, or you're just trying to get data from some other system to where you are, Kafka is going to be a foundational part of your world. If you're like me, when you think about data, you're probably inclined to think about data as tables, right? And tables are very good at storing representations of things: things in the world that our software deals with, like items in inventory, internet-connected smart cars, users who have signed up for a site, all kinds of different things.

You're going to see that Kafka encourages you not to think first of things, but of events, things that happen in the world at a specific point in time. Like an item getting sold, that's an event. A driver in a connected smart car using their turn signal. A user clicking somewhere on your site. Those are all events, and events form the backbone of contemporary data systems, and Kafka is the infrastructure the world uses to manage them. And because an event is, by its very nature, a thing that happens at a particular time, Kafka is very focused on processing events in real time. This means that systems built on Kafka tend to do their computation on events as soon as they occur. They don't store them up in tables, or files, or some other structure to be processed later on in a batch. You never put the data over there promising to get back to it later. No, an event happens, you do the work on the event right now.

That's not to say that Kafka can't remember events that happen. It can, and we'll look at how. And this tension between thing and event, well, that's not absolute either. We'll see how Kafka can help you manage the schema of events that you're storing in it. That's very much a thing way of thinking. And we'll see how to put things into Kafka, how to read them out, how events are structured, how Kafka manages storage and replication, all of those mechanics. And we'll look at the differences between Apache Kafka and modern cloud-native Kafka services like Confluent Cloud. And of course, this is just the beginning. I said Kafka is the foundation of modern data systems, and I meant it. After this, there is so much more for you to learn. You're really just getting started.

Many of the world's largest companies use Kafka. Some handle millions of events per second, billions per hour, trillions per day. There are just crazy numbers of companies, like these logos that you see here, using Kafka in the wild. And an entire platform is emerging on top of it. For example, we'll want to perform real-time computation on all these events or do what we call stream processing. We'll need to impose governance on all this event data. We'll need to connect Kafka with other non-Kafka systems. There's an entire ecosystem of tools and components emerging on top of Kafka, one we like to call a data streaming platform. It's taken shape in a real way since Kafka got its start, and it's continuing to grow as we move forward.

And to give you an idea of what you have ahead of you in this course, if you're taking it on Confluent Developer, you should see the list of modules off to the side. But if not, after this, there are 10 more modules covering important topics that get you introduced to this space. Let's get started.
