VP Developer Relations
If all you had were brokers managing partitioned, replicated topics with an ever-growing collection of producers and consumers writing and reading events, you would actually have a pretty useful system. However, the experience of the Apache Kafka community is that certain patterns will emerge that will encourage you and your fellow developers to build the same bits of functionality over and over again around core Kafka.
You will end up building common layers of application functionality to repeat certain undifferentiated tasks. This is code that does important work but is not tied in any way to the business you’re actually in. It doesn’t contribute value directly to your customers. It’s infrastructure, and it should be provided by the community or by an infrastructure vendor.
It can be tempting to write this code yourself, but you should not. Kafka Connect, the Confluent Schema Registry, and Kafka Streams are examples of this kind of infrastructure code. We’ll take a look at each of them in turn.
Hey, I'm Tim Berglund with Confluent, with a few thoughts on the broader Kafka ecosystem. Now, if all you had were brokers managing partitioned, replicated topics, and you had an ever-growing collection of producers and consumers writing and reading events, you'd actually have a pretty useful system. That would be pretty cool. However, the experience of the Kafka community is that certain patterns are going to emerge in what happens in those producers and consumers and in what kind of adjunct needs arise around them. And these patterns are gonna kind of nudge you and other developers like you to build the same bits of functionality over and over again around core Kafka.
This is the classical temptation of the framework, and it really can be a daunting temptation. Application code is usually driven by the business. There's some kind of business stakeholder or product owner who says crazy things like, you know, "Well, make sure that you only ever send one email per day." And so you design the system so that, you know, email instance and day have a one-to-one relationship. And then they say, "Well, you know, what about Two for Tuesday? It has to happen twice on Tuesday." So you have this special-case code and it just makes things ugly. And you don't like that, right? There's always that kind of real-world influence on application code.
But when it comes to framework code, you are the boss. You get to make it beautiful and perfect and do exactly what you want. But often you have to kinda sneak that in, right? It's usually not a story card that you admit to. You sorta have to work that framework effort in some time when maybe nobody's looking, 'cause it's necessary architecture work or something like that. And then you get caught, and you never really seem to be able to give your framework, worthy though it is (I'm not kidding about that), the attention it needs. It's always missing features. It's always got bugs you can't fix, unless it really is a strategic thing within the organization and there are other internal clients and it's like an internal product that you create.
So when you see these things emerging in your use of Kafka, if all you ever did was produce and consume, the temptation to build your own framework is one you ought to resist. You probably always oughta resist this. This is just a good architectural lesson in general. Because what you end up doing is building common layers of application functionality to repeat what I'll call undifferentiated tasks. Okay, this is important code, it's code that does things that you need, but it's not tied to the business you're actually in. Like, it doesn't serve the customers that your business actually has in a way that's unique to you with respect to your competition. And if it doesn't contribute value directly to customers, it's infrastructure, and infrastructure in the best case ought to be provided by the community or an infrastructure vendor. It can be tempting to write it yourself, but you should not.
And things like Kafka Connect, the Confluent Schema Registry, and Kafka Streams are all examples of this kind of infrastructure code. And I wanna take a look at each one of those, just with this kinda framing in mind. These are things that have emerged from the Kafka community that have solved common problems. And if you look at early Kafka adopters, you'll often see that they may have rolled their own version of one or several of these in some way, but all of these have emerged as the standard things: vetted by the community, worked on by lots of people, broadly deployed. And they're solving problems that are gonna come up.
So once you've got the basics of Kafka's internal architecture and of producing and consuming down, you really need to learn about Connect, Schema Registry, Streams, and any other future architecture component that emerges from within the Kafka community. It's a vibrant and growing thing. These needs are always growing and we need to keep tabs on these things. So with that, let's dig in.