Here are some of the questions that you may have about Apache Kafka and its surrounding ecosystem.
If you’ve got a question that isn’t answered here then please do ask the community.
Apache Kafka® is an open-source event streaming platform. It provides the ability to durably write and store streams of events and to process them in real time or retrospectively. Kafka is a distributed system of servers and clients that provides reliable, scalable performance.
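Kafka's core abstraction is an append-only log of events that consumers read by offset. The sketch below is an illustration of that idea in Python, not the Kafka API: the `Log` class and its methods are hypothetical names used only to show how the same stored events can be replayed retrospectively or read as they arrive.

```python
# A minimal sketch of Kafka's core abstraction: an append-only log of
# events that consumers read by offset, either live or retrospectively.
# Illustration only -- this is not the Kafka client API.

class Log:
    def __init__(self):
        self.events = []  # in real Kafka this storage is durable and replicated

    def append(self, event):
        self.events.append(event)
        return len(self.events) - 1  # the event's offset in the log

    def read(self, offset):
        """Return all events from a given offset onward."""
        return self.events[offset:]

log = Log()
log.append({"type": "order_placed", "id": 1})
log.append({"type": "order_shipped", "id": 1})

# A new consumer can replay history from offset 0 (retrospective reads)...
assert len(log.read(0)) == 2
# ...while a caught-up consumer sees only new events (real-time reads).
assert log.read(2) == []
```

Because events are stored rather than deleted on delivery, any number of consumers can read the same log at their own pace.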
Learn more about what Kafka is in this free Kafka 101 training course.
To get started with Kafka check out the free Kafka 101 training course, join the community, try the quick start, and attend a meetup.
After that, explore all of the other free Apache Kafka training courses and resources on Confluent Developer, and check out the documentation.
Kafka's performance at scale can be attributed to several design characteristics: sequential, append-only disk I/O; heavy use of the operating system's page cache; zero-copy data transfer; batching and compression of messages; and horizontal scaling through partitioned topics.
You can learn more about some of the benchmark tests for Kafka performance, as well as the design principles that make Kafka so fast.
Kafka is used widely for many purposes, including messaging, website activity tracking, metrics collection, log aggregation, stream processing, and event sourcing.
You can see a few of the many thousands of companies that use Kafka in this list.
As an event streaming platform, Kafka is a great fit when you want to build event-driven systems.
Kafka naturally provides an architecture in which components are decoupled using asynchronous messaging. This design reduces the amount of point-to-point connectivity between applications, which can be important in avoiding the infamous "big ball of mud" architecture that many companies end up with.
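The decoupling described above can be shown with a toy publish/subscribe broker. This is an illustration only, not Kafka's API: the `Broker` class and its methods are hypothetical. The point is that the producer writes to a named topic without knowing who, if anyone, is consuming, so new consumers can be added without touching the producer.

```python
# A toy publish/subscribe broker showing the decoupling Kafka provides:
# producers write to a named topic without knowing who consumes it, and
# consumers subscribe independently. (Illustration only, not Kafka's API.)

from collections import defaultdict

class Broker:
    def __init__(self):
        self.topics = defaultdict(list)       # topic name -> stored events
        self.subscribers = defaultdict(list)  # topic name -> callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, event):
        self.topics[topic].append(event)
        for callback in self.subscribers[topic]:
            callback(event)

broker = Broker()
billing, shipping = [], []

# Two independent consumers; the producer knows nothing about either.
broker.subscribe("orders", billing.append)
broker.subscribe("orders", shipping.append)
broker.publish("orders", {"order_id": 42})

assert billing == [{"order_id": 42}]
assert shipping == [{"order_id": 42}]
```

Contrast this with point-to-point integration, where the producer would call each downstream service directly and every new consumer would require a producer change.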
Kafka can scale to handle large volumes of data, and its broad ecosystem supports integration with many existing technologies. This makes Kafka a good foundation for analytical systems that need to provide low-latency, accurate information.
Event streaming has been applied to a wide variety of use cases, enabling software components to reliably work together in real time.
Kafka has become a popular event streaming platform for several reasons: it is durable, fast, and horizontally scalable; it supports real-time event-stream processing; and it is surrounded by a large ecosystem of clients, connectors, and stream-processing tools.
Kafka is an event streaming system that may be a good fit anywhere you use a message bus, queuing system, or database. It excels at the real-time processing of data, so it may be an especially good match if all your data matters, but the latest data is particularly important. For example, instant messaging, order processing, warehouse notifications, and transportation all need to store and process large amounts of data, and handling the latest data swiftly is essential. (See more example use cases.)
You can use Kafka with nearly any programming language, and there are step-by-step getting started guides for the most popular languages, as well as quick examples on this page.
For more on how Kafka works, see our Kafka 101 course, and to understand how event-driven systems work (and why they work so well), see our guide to Thinking in Events.
The Kafka broker is written in Java and Scala.
Client libraries are available in many languages and frameworks, including Java, Python, .NET, Go, Node.js, C/C++, and Spring Boot. There is also a REST API for interacting with Kafka, and many other languages are supported through community-built libraries.
All technologies have tradeoffs, but in recent years Kafka has seen tremendous adoption as people move away from traditional message queues such as IBM MQ, RabbitMQ, and ActiveMQ. Common reasons for moving to Kafka include its durability, speed, scalability, large ecosystem, and event-stream-processing integrations.
You can read more about Kafka’s benefits over traditional messaging middleware in this white paper, and this article that benchmarks Kafka against other systems.
Kafka has five core APIs for JVM-based languages: the Producer API, the Consumer API, the Streams API, the Connect API, and the Admin API.
Learn how Kafka works, how to use it, and how to get started.
This hands-on course will show you how to build event-driven applications with Spring Boot and Kafka Streams.
Build a scalable, streaming data pipeline in under 20 minutes using Kafka and Confluent.