Lucia Cerchie explains what an Apache Kafka® cluster is, and why it's unique. Learn how Kafka supports speed, scalability, and durability through its cluster structure.
Lucia Cerchie explains what an Apache Kafka® cluster is, and why it's unique. Learn how Kafka supports speed, scalability, and durability through its cluster structure.
Read more about Kafka clusters
Intro
Hi, I'm Lucia Cerchie from Confluent, and in this video we'll be covering what a Kafka cluster is and what makes it special.
What is an Apache Kafka® Cluster?
Now, what's a cluster in the first place? Well, it's a group of computers working together to achieve a common goal. These computers, or servers, are called nodes. Now, within the context of Kafka, a cluster is a group of servers working together to store and move your data for processing, maintaining speed, or low latency, durability and scalability.
What gives Kafka its low latency?
Now, what is it about servers in a Kafka cluster that makes Kafka so fast? In other words, what makes it able to serve up so much data so quickly? Well, each topic or a category of messages that is stored in Kafka can be configured to be split into app and (indistinct) logs called partitions, and each of those partitions can live on a different node in a Kafka cluster. When data is written to a partition, it's quickly and efficiently appended to the end of the log structure. You can think of this like a store having multiple checkout lanes open. The more lanes that are open, the faster people can check out.
What makes Kafka durable?
Now, what makes Kafka so durable? Well, Kafka copies the data in each lead partition to some number of follow-up partitions across the cluster. So if a node fails, whether it's a leader or a replicate, it's backed up. Next, let's talk about scalability.
What makes Kafka scalable?
We've mentioned that each partition can live on a different node in the cluster. This balances the work of writing to and reading from the partition across the cluster. This is also what makes Kafka uniquely scalable, because in any distributed system, those topics could be handled by different brokers, but the partitions add into their layer of scalability. You can think of this as a highway with lots of cars. The more lanes you add, the more cars you can get to your destination.
Kafka resources to check out
If you enjoyed this video and want to learn more, visit Confluent Developer for the Kafka 101 course. There's also a blog post that treats this topic in more depth. Feel free to like and subscribe for more Kafka-related content.