November 15, 2022 | Episode 243

Decoupling with Event-Driven Architecture

Kris Jenkins: (00:00)

Perhaps the most interesting principle in data mesh to me is this idea of treating your data as a product. That each team has data that it's going to make available to the rest of the organization. And that data has to be well structured, has to be high quality, and it has to be usable by other teams in ways you expect, sure, but also in ways you can't predict. If you do it right, then data as a product frees up other teams to build their systems without having to explicitly coordinate with you. They can just take the data and process it for purposes that you don't have to foresee or explicitly help with. Done right, data is the thing that can decouple teams. But doing it right's the trick, isn't it? And that's the subject of today's podcast. Joining me today on Streaming Audio is Florian Albrecht, who's the lead architect of the German parcel delivery company, Hermes.

Kris Jenkins: (00:58)

He's been finding some practical ways to help their teams open up their data to maximize the amount they can cooperate without having to coordinate. They've been using a mixture of training, event storming sessions, and a new software tool they've been working on called Galapagos. Plus, a little bit of benevolent monitoring. Before we begin, Streaming Audio is brought to you by our education site, Confluent Developer, and our cloud service for Apache Kafka, Confluent Cloud. More about both of those at the end, but for now I'm your host, Kris Jenkins. This is Streaming Audio. Let's get into it.

Kris Jenkins: (01:40)

Joining me today is Florian Albrecht. Florian, welcome to the show.

Florian Albrecht: (01:43)

Thank you.

Kris Jenkins: (01:45)

So you are a Solutions Architect at Hermes in Germany, right?

Florian Albrecht: (01:50)


Kris Jenkins: (01:50)

And they're the big logistics package delivery company?

Florian Albrecht: (01:55)


Kris Jenkins: (01:55)

Are you just based in Germany or is it more... wider across Europe?

Florian Albrecht: (02:00)

We are mainly based in Germany. We have some associate companies in Europe, but our main focus is Germany.

Kris Jenkins: (02:08)

Okay. And you've been there... so I read your bio. You've been writing Java since 1999, which...

Florian Albrecht: (02:17)

Almost from the beginning.

Kris Jenkins: (02:18)

That's something any recruiter would aim for on a CV, right?

Florian Albrecht: (02:23)

Yeah, possibly.

Kris Jenkins: (02:24)

But you're relatively newer to event driven architectures. You've been doing that... this has been part of a move in Hermes, right?

Florian Albrecht: (02:32)

Exactly, exactly. I only encountered it when... well, starting at Hermes, in 2019, and then I got to know the idea of event driven architecture.

Kris Jenkins: (02:43)

Okay. And over the past three years you've been developing something called Galapagos.

Florian Albrecht: (02:49)

Right. Exactly.

Kris Jenkins: (02:50)

So tell me what that is. What's it trying to solve?

Florian Albrecht: (02:54)

Yeah, when we started with event driven architecture at Hermes, well, we relatively quickly discovered that Kafka would be a good technical solution for implementing a company-wide event driven architecture. But after some tests, we saw that, yeah, we would need some rules and guidelines to have really event driven architecture. So Kafka itself was really good for what it's doing, but it's... Has a really technical focus. And we, as architects, had a more application focus with all the enterprise architecture in mind and so on, and who communicates with whom, and why. And so... yeah, we created some rules and guidelines and said, "Okay, that's the way how we should use Kafka to really make the most out of it in terms of event driven architecture." And as we are a really agile company with really many agile teams, autonomous teams, we saw that it's not a good idea to just write some rules and guidelines on the wall and hope that everyone adheres to it.

Florian Albrecht: (04:06)

So we had the idea of creating software to enforce- First of all, to enforce these rules and guidelines, but also to help the teams... using Kafka. So really basic, starting with getting your API keys and something to access Kafka, to really get the connection data configuration stuff. But then also to create topics and... subscribe to topics, browse topics, see what's there. And in the background, enforce all these rules and guidelines. So we are happy as architects and the teams are happy, so they don't have to deal with all these rules and guidelines and still get what they need for their daily work.

Kris Jenkins: (04:52)

So you're encoding these rules as software to make it easy for people to stay on the rails?

Florian Albrecht: (04:58)


Kris Jenkins: (04:58)

Yeah. So give me an example. What's a rule that you are trying to guide people towards?

Florian Albrecht: (05:05)

One of the central rules is that we want to talk about events, not data. So previously many applications at Hermes just copied data from one point to the other point, perhaps changed the data a little bit, then to the next application and so on. In the end, nobody knew... When did the data change and why? And why does it look like it's looking now? And that's especially bad if you have some complaints from customers who want to know, "Why did my parcel get there and not to me?" And then we have to check every application, where did the address change and whatever. And so the first rule is use events. Events are immutable. If you create an event and say the address changed, that's true, that's true in itself. So the address of a parcel changed is an event which you can publish, you can copy it, it has a timestamp, it has some other basic information, and of course also the address.

Florian Albrecht: (06:11)

So all the data is also in there, but it's enclosed in this event envelope if you want. So you can copy it and you are safe. Immutable objects are great, every developer knows this, because you can safely copy them. And this is the first rule we want to enforce. So the teams- And it's really not easy for the teams to switch their minds from data to events. So most of the time they need a little guidance from the software, but sometimes also from ourselves. Let's talk about events. What are the business events in your domain? And yeah, that's one of the things which Galapagos enforces and encourages.
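The "event envelope" idea described here can be illustrated with a minimal sketch. This is not Galapagos code; the class name `ParcelAddressChanged`, its fields, and the sample values are assumptions made up for illustration. It only shows why an immutable event is safe to copy between applications.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone


@dataclass(frozen=True)
class ParcelAddressChanged:
    """Hypothetical business event: the delivery address of a parcel changed."""
    parcel_id: str
    new_address: str
    occurred_at: datetime  # when the change happened, part of the envelope


event = ParcelAddressChanged(
    parcel_id="P-1001",
    new_address="Example Street 1, 20095 Hamburg",
    occurred_at=datetime(2022, 11, 15, 9, 30, tzinfo=timezone.utc),
)

# A frozen dataclass rejects attribute assignment, so every copy handed to
# another application stays identical to what was originally published.
try:
    event.new_address = "somewhere else"
    mutated = True
except FrozenInstanceError:
    mutated = False
```

A later address change would be published as a second event with its own timestamp, so the history of "when did the address change and why" stays reconstructable.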

Kris Jenkins: (06:54)

Okay. So how do you... what else do you do to get those ideas across? I mean, is it just in software? Or are you trying to educate the developers as well?

Florian Albrecht: (07:05)

Yeah, we have to. Definitely. So the software itself is nice to have and many teams have a good start using the software, especially if they want just to read events, and consume them. But when it really comes to creating events and so on, thinking of them, we often make workshops with the teams to get them known to the concepts. And we make so called event storming workshops and really explain our concepts and... So we have one to two hour workshops with the teams, with each team really, to answer all their questions and get them known to the concepts. And that's really important. Communication is everything for an architect.

Kris Jenkins: (07:52)

Yeah, it's your main tool for actually getting these ideas across. Right?

Florian Albrecht: (07:55)


Kris Jenkins: (07:56)

Yeah. Because I sometimes think the thing with event driven architectures is that the technology is kind of simple, it's quite graspable. But thinking about how you solve problems in a different way, thinking of the world as not a collection of mutable objects, but something else is sometimes the hardest part.

Florian Albrecht: (08:16)

Exactly, yes.

Kris Jenkins: (08:17)

How have you found it's gone? Like teaching people these ideas? What's been the sticking points?

Florian Albrecht: (08:25)

Depends. It really- It depends from team to team. We have some teams which directly get the idea and say, "Yeah, that's- Why didn't we work all the time this way?" And we have some teams, mainly in the- Where it's really technical data that they are shifting now, where it's really hard to get the idea of getting these two to three levels up to think in business events, "What's really the business thing, what's happening here?" Because they have some scan from a scanner on... some kind of belt where the parcels are shipped and getting that to the business behind what's really happening here, that's the hard part. But it's also, on the other hand, it's really cool because our business now also gets into the discussion. Now we discuss about business events. So it's not just the developers but also business has now an opinion about this, what's happening there. We can now talk to them, "Is this really a parcel revolution?" Or some other things with mainly German names for these events, and so on. And they can answer this. They cannot answer this if this field has a correct format or something like that, but what's happening on the business side, they can really give answers to this. And this is really helpful. It really gets our teams to the next level.

Kris Jenkins: (09:48)

Okay. So you're also using it as a way to get the different parts of the business to communicate?

Florian Albrecht: (09:53)

Yeah, that's also really... yeah, positive outcome of this. Yeah. Exactly.

Kris Jenkins: (09:57)

Yeah. Yeah, I'll bet. But okay. So back to the technology. How do you actually enforce an idea like that using software? Or how do you guide?

Florian Albrecht: (10:08)

We guide with Galapagos, if you create a topic, then you ask some questions, first of all about the topic. You notice that it's not good to just go with the data. So it's asking for example about your business capability you're trying to map here. And most developers think- And that's a second thing, "Okay, wait a second, I have no idea what this means." And at least for the first topic they publish and then "Okay," they get a workshop. And then, "Okay, that's my business capability. Okay, understood." And then, "Okay, what type of event did happen?" And some other things. We also have some different types of topics. So we don't only have events, we have also master data topics, but that's really complicated. So topics where we publish master data, which does rarely change over time. For example, the list of all our logistics centers or something like that.

Kris Jenkins: (11:09)

Okay, so like a standing data topic?

Florian Albrecht: (11:11)

Yeah, exactly. Exactly. And... Which is really great to map with Kafka as we have the log compaction there.

Kris Jenkins: (11:17)

Of course.

Florian Albrecht: (11:20)

And also with this distinction, they have the idea, "Okay, do I have an event or master data? Standing data?" Commands are also a possibility, and this stuff really gets them onto the tracks in the direction of events. But still, to be honest, of course it happens that a team creates a topic which sounds like an event, has a fitting name because Galapagos also checks the name, but still is not really an event at the end of the day. And for that we still have some kind of notification mechanism. We, as architects, get notified when new topics are created.
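The distinction between topic types, and why log compaction suits master data, can be sketched as follows. `cleanup.policy` with the values `compact` and `delete` is a real Kafka topic setting; everything else here (the `TopicType` enum, the mapping from type to policy, and the sample logistics-center records) is an assumption for illustration and not taken from Galapagos.

```python
from enum import Enum


class TopicType(Enum):
    EVENTS = "events"
    MASTER_DATA = "master_data"
    COMMANDS = "commands"


def topic_config(topic_type):
    """Pick a Kafka cleanup policy per topic type (assumed mapping)."""
    policy = "compact" if topic_type is TopicType.MASTER_DATA else "delete"
    return {"cleanup.policy": policy}


def compact(records):
    """Simulate the end state of log compaction: the latest value per key wins.

    A record whose value is None acts as a tombstone and removes the key.
    """
    latest = {}
    for key, value in records:
        if value is None:
            latest.pop(key, None)
        else:
            latest[key] = value
    return latest


# Hypothetical master data: the list of logistics centers, keyed by an ID.
logistics_centers = [
    ("LC-01", {"city": "Hamburg"}),
    ("LC-02", {"city": "Berlin"}),
    ("LC-01", {"city": "Hamburg", "open": False}),  # newer version of LC-01
    ("LC-02", None),                                # tombstone: LC-02 removed
]
snapshot = compact(logistics_centers)
```

A new consumer reading a compacted topic from the beginning ends up with the current state of each key, which is what a rarely changing list needs.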

Kris Jenkins: (11:59)

Oh, okay.

Florian Albrecht: (12:00)

We don't have to... We don't have to do anything. So we are not a quality gate. We can just let it go. But we also can check, "Okay, there that's a new team. We didn't have a workshop with them. Let's check. Okay, yeah, okay. Let's talk to them." Because it's now on the development environment, and then we have the time to adjust things, and then we get to talk to them, and then usually they get the idea, and then it gets a new topic with better quality.

Kris Jenkins: (12:26)

So you're also monitoring things as they evolve?

Florian Albrecht: (12:30)

Yeah, exactly. Really asynchronous, just a Teams... Microsoft Teams notification, which is really great and lightweight.

Kris Jenkins: (12:39)

Okay. Is that... Is that dispatched by a topic? I have to ask.

Florian Albrecht: (12:47)

Not directly, no. It's the event log of Galapagos, which is assigned to a Teams channel. Yeah.

Kris Jenkins: (12:53)

Okay. So how many teams are we talking about, by the way? How many developers?

Florian Albrecht: (12:58)

Developers are... Currently, we have 50 developers being registered in Galapagos. We have 12 different application teams of 14 at Hermes, dealing with software. So most of them already use Kafka and Galapagos, at Hermes. At Otto, I just learned, our mother company, which also uses Galapagos, they plan to scale it up to 100 teams. They...

Kris Jenkins: (13:23)

Oh, crikey. Okay. You have... You've got to refine and perfect this software before you're scaling up to that size, right?

Florian Albrecht: (13:30)

No, I'm confident that it'll work fine for them.

Kris Jenkins: (13:33)

Okay. Okay. Good. Well give me another example then. What's another thing it works on?

Florian Albrecht: (13:37)

I'm sorry?

Kris Jenkins: (13:39)

What's another thing that Galapagos does to keep people on the rails?

Florian Albrecht: (13:45)

Well, okay, it enforces, as I said, topic type, the naming schema. And we have a schema registry also. I know Confluent also have a schema registry. Where we keep track of the JSON Schemas for the data on the topic. So if you want to change it, you have to do it in a way which is consumer compatible. So for example, that's... one idea, to get a step back. One idea of event driven architecture is also to decouple the teams. Not only technically decouple them, but also to have them, in the end, not need to talk to each other.

Florian Albrecht: (14:25)

So if I as a provider of data or events want to change something, I should do it in a way which does not kill the production applications, which already read my data. So this is also enforced by Galapagos. So if I publish a new schema, Galapagos checks that it's consumer compatible. For example, usually I can add new fields and the existing applications just ignore these fields. But I cannot remove a previously required field because an application could be relying on this field to exist and then immediately stop working if I stopped publishing this field. So this is also enforced by Galapagos.
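The rule described here (adding fields is fine, removing a previously required field is not) can be sketched as a small check on two JSON Schemas. This is a deliberately simplified assumption of what "consumer compatible" means, not the check Galapagos actually performs; the function name and the sample schemas are hypothetical.

```python
def removed_required_fields(old_schema, new_schema):
    """Return fields that consumers could rely on but the new schema no longer guarantees.

    Simplified rule: every field listed under "required" in the old schema
    must still be listed under "required" in the new schema.
    """
    old_required = set(old_schema.get("required", []))
    new_required = set(new_schema.get("required", []))
    return sorted(old_required - new_required)


old = {
    "type": "object",
    "properties": {"parcelId": {"type": "string"}, "address": {"type": "string"}},
    "required": ["parcelId", "address"],
}

# Adding an optional field: existing consumers simply ignore it.
added_field = {
    "type": "object",
    "properties": {
        "parcelId": {"type": "string"},
        "address": {"type": "string"},
        "photoUrl": {"type": "string"},
    },
    "required": ["parcelId", "address"],
}

# Dropping a required field: a consumer reading "address" would break.
dropped_field = {
    "type": "object",
    "properties": {"parcelId": {"type": "string"}},
    "required": ["parcelId"],
}

ok = removed_required_fields(old, added_field)
broken = removed_required_fields(old, dropped_field)
```

An empty result means the change passes this simplified rule; a non-empty result names the fields that would make it a breaking change.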

Florian Albrecht: (15:08)

And also... So we covered the full topic life cycle because if I really, at the end of the day, have some changes which are so big I cannot do them in a consumer compatible way, I can also get rid of a topic by marking it as deprecated. But then I have to give it an end of life date, until when I will provide... Continue to provide data on this topic, which must be at least three months in the future. And all my consumers get notification that it's now deprecated, so I have at least three months time to change to, for example, a successor of this topic.
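The deprecation rule (an end-of-life date at least three months in the future) can be sketched like this. The function names and the exact handling of month ends are assumptions; only the three-month minimum comes from the conversation.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the date `months` calendar months after `d`, clamped to month end."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def validate_end_of_life(today, end_of_life, min_months=3):
    """Reject an end-of-life date that gives consumers less than `min_months` notice."""
    earliest = add_months(today, min_months)
    if end_of_life < earliest:
        raise ValueError(
            f"end of life {end_of_life} is too early; earliest allowed is {earliest}"
        )
    return end_of_life


accepted = validate_end_of_life(date(2022, 11, 15), date(2023, 2, 15))

try:
    validate_end_of_life(date(2022, 11, 15), date(2023, 1, 1))
    rejected = False
except ValueError:
    rejected = True
```

Consumers would then have the whole period up to the end-of-life date to move to a successor topic.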

Florian Albrecht: (15:52)

So all of this gets them in the idea of working decoupled from each other. So previously, I have to go to all my consumers and talk to them, "Hey, I want to change something. Are you ready for this? Let's do it on that date." Everyone has to change something on that date. And now it's completely decoupled. I can change at any time my topic, my consumers can adjust when they have time for this, and I can also get rid of my topic and my consumers can then switch to a successor topic, for example.

Kris Jenkins: (16:25)

And you're also introducing this idea of an SLA for the data that people are publishing?

Florian Albrecht: (16:30)

Yeah. Some kind of SLA, exactly. So you have to publish your data. And that's also an idea we have to get in the minds of the teams. If you are a publisher, you are really responsible for a crucial point of our infrastructure. So don't just stop sending your data because many critical systems rely on your data.

Kris Jenkins: (16:51)

The idea of data as a product.

Florian Albrecht: (16:54)


Kris Jenkins: (16:54)

Yeah. Is this... Is this something you've picked up from the ideas around data mesh? Or is it something you've discovered independently?

Florian Albrecht: (17:04)

To be honest, we discovered it- Discovered, yeah. We had the idea independently. Exactly, yes. So yeah. Came from the other corner, so to speak. From the event driven architecture. And then what does it have for effects on the teams and on the data? But yeah, I also learned the idea of data mesh, and the ideas from Confluent are really interesting and promising, and we'll keep an eye on this. And I think there can be, one day perhaps, some synergy effects there.

Kris Jenkins: (17:32)

I think it's one of those ideas, once you start thinking about immutable data, you're almost on an inevitable path to discover it or invent it. Whether you take the fast route, stealing the idea from someone else, or you discover it on your own, right? Okay. That's interesting. So... And you've generally found the uptake on this has been good?

Florian Albrecht: (17:52)


Kris Jenkins: (17:52)

For different departments?

Florian Albrecht: (17:56)

Yeah, yeah.

Kris Jenkins: (17:56)

What kind of feedback have you been getting?

Florian Albrecht: (17:58)

Well, one of the best feedbacks we got, and we got it twice, was, "Where did you buy Galapagos? Which company created it?" That was really nice feedback. And yeah, the teams are really happy because they get really quick access to Kafka. It's really low level, but it's still a first positive impression. And they see the data that they need, they really can easily find the data they need, and it's really easy to create new topics and so on. So I'm really happy because that's our overall goal, to have it as a self service where the teams can really quickly get what they need. So that's... Also we don't have... As I said, we have no quality gates there. We don't have to approve anything, either. And they really get quickly what they need without asking someone, and they really like this because this is exactly what they need in their sprints to be able to just work on their goals.

Kris Jenkins: (19:01)

Free them up to make things and free them up from coordinating with other departments.

Florian Albrecht: (19:05)


Kris Jenkins: (19:06)

Yeah. Yeah, absolutely. That's part of the ideal promise of using data to decouple things, right?

Florian Albrecht: (19:12)


Kris Jenkins: (19:13)

So what's been the business effect on Hermes moving to a kind of data driven, event driven architecture?

Florian Albrecht: (19:22)

Well, we get... The best thing is that we can really quickly implement new business ideas. So as we have the data available, every team which has a new idea just can take the related events from Kafka via Galapagos and just attach to it and implement the new business idea. And that's great because previously, yeah, okay, we make a project and then okay, we have to involve this team and that team and the other team, and perhaps in two or three years we have a new feature available. And now we get things really quickly to the market. The time to market of new ideas has dramatically reduced by this.

Florian Albrecht: (19:59)

We had some ideas, for example, during the COVID-19 thing. We had the idea of having the drivers taking a photo of the signature of the customer when receiving a parcel because previously they had to use the device of the driver of the...

Kris Jenkins: (20:17)

Oh yeah.

Florian Albrecht: (20:20)

And then they had the idea, "Hey, just... Let's just take a photo because it's contactless." And then we attach the photo as a proof of... That the customer got the parcel. And they also used Kafka for this, transporting this new information, and they had this from the idea to time to production in two weeks. And also because we had this infrastructure available, which was really helpful.

Kris Jenkins: (20:47)

That's fast. That is one sprint to actually put in production.

Florian Albrecht: (20:49)

Yeah, exactly.

Kris Jenkins: (20:50)

That's nice. Because it's an easy idea to have during a pandemic, but actually implementing it when everything else is going on. That's impressive. Yeah. How large is Galapagos now? I mean, how many things does it cover?

Florian Albrecht: (21:07)

In terms of?

Kris Jenkins: (21:08)

I mean the software, you've got this, it's open source, right? It's an open source software project.

Florian Albrecht: (21:12)

Exactly, it's open source.

Kris Jenkins: (21:15)

Is it... Is it still... Is it just maintained by you or is it a team of architects developing it?

Florian Albrecht: (21:20)

We have a team of two to three developers. So, no more. And we have contributors, okay, but not that many. We are always happy to have new contributors. So it's not that big. It's a... It's a Spring Boot application and has an Angular front end and it's still okay in size, let's say. So it's not a small tool but also not really a huge tool where we need multiple servers to operate it. Also, it's just a Docker image which you can put in your Kubernetes cluster and it runs really fine. So it's really still easy to operate, I think.

Kris Jenkins: (22:00)

And it's open source? So other people could check this out and use it for their own systems?

Florian Albrecht: (22:04)

Exactly. Yeah. It's at GitHub available, and we have also a demo available, and... Yeah, we really have some users already. And yeah, we're happy to-

Kris Jenkins: (22:15)

Okay, so other companies are using it?

Florian Albrecht: (22:18)

We only know from Otto. We had many questions from other companies, but we don't know yet if they really are using it in production already. So we haven't got feedback from them in the last months or so. But from Otto we know that they're really using it and planning to use it on a big scale. And we have discussions in GitHub, so we're always happy to get feedback there. So discussions is a relatively new feature of GitHub where people can really interact with us and post questions, give feedback, and so on, which is really cool.

Kris Jenkins: (22:55)

Okay, nice. We should send people there if they're interested in joining the project. We'll put a link in the show notes. But where do you see it going in the future?

Florian Albrecht: (23:04)

Yeah, we plan to really have Galapagos even stronger integrated with Confluent Cloud. We are, for example, planning to use the Confluent Cloud schema registry instead of our own. And also want to offer more of the infrastructure of Confluent Cloud. For example, managed connectors and something like that, to have also available in Galapagos that you can give- Get it via one click.

Kris Jenkins: (23:30)

Guiding people into using connectors as part of the managed system?

Florian Albrecht: (23:35)


Kris Jenkins: (23:36)

Yeah, yeah.

Florian Albrecht: (23:36)

Because this is where Hermes as a company wants to go. We want to use software as a service wherever possible. So instead of creating our own Kafka clients, wherever we can do it, we want to do it with managed software. For example, with a managed connector, with ksqlDB, something like that. And this is all infrastructure which we could perhaps also provide in Galapagos. This is one of the main ideas we have, to go forward there, to integrate it even more there.

Kris Jenkins: (24:06)

Do you use connectors much, at the moment, self managed?

Florian Albrecht: (24:11)

No, we don't have any self-managed connectors at the moment. We only have managed connectors from Confluent. And we're still seeking ways to better integrate it, even with the Google Cloud. Because we, as Hermes, are in the Google Cloud and we have already- Just today shared some ideas with our Technical Key account at Confluent, how to get Confluent Cloud even better integrated with Google Cloud, so we get it even more seamless there. Because still currently the connectors are managed by Confluent and are in the Confluent Cloud. Visible in the UI but not in the Google Cloud. So we really have to create adapters, for example, for monitoring things and things like that.

Kris Jenkins: (25:02)

Oh, I see. Yeah. Yeah.

Florian Albrecht: (25:03)

And this is where we can think of even better ways to integrate it so it's more seamless for teams. Not really... Not directly Galapagos related but our overall way forward.

Kris Jenkins: (25:15)

So is Galapagos also going to take care of monitoring? Or is it more just about setup?

Florian Albrecht: (25:21)

No, that's not on the scope of Galapagos. Galapagos provides everything which you need for good monitoring, but monitoring itself is done outside of Galapagos.

Kris Jenkins: (25:30)


Florian Albrecht: (25:30)

Currently, at least.

Kris Jenkins: (25:32)

I see. So jumping back to the kind of social organization stuff, because it seems like the thrust of the technology here is about enabling people. So to jump back, you've been doing this for three years right? Have you found... What's been hard about getting people to move over to this kind of architecture?

Florian Albrecht: (25:58)

Yeah. Depends. Sometimes it's not hard at all. And yeah, sometimes it's really... If they're really used to some special technology then it's really hard to get them convinced that Kafka is now the better thing and that they can do better stuff with it. But it's mostly the two things mixed up there. It's a new technology and a new idea. The event driven architecture and a new technology. So this is... Usually we start with telling them about Kafka, and why Kafka is great, and all the ideas of Kafka with consumer groups, and log compaction, and so on. And they really get that point quickly. If they see, "Okay, that's a good idea. That really has some advantages over, for example, JMS." Then we get to the next step and tell them about event driven architecture. And then it's not that hard, it's possible. So we have... 12 of 14 teams which are using this happily. So I think we managed to convince them.

Kris Jenkins: (27:04)

I wonder if the other two are listening to this.

Florian Albrecht: (27:09)

The one team is really in mobile applications, so they... They're not needing Kafka now, apparently. But let's see... It's... Perhaps it's also coming up there. Let's see.

Kris Jenkins: (27:23)

It's interesting that you... Because I often wonder how you teach people this, that you start from the technology side and go up to the abstract ideas.

Florian Albrecht: (27:33)

Yeah, it's maybe... Not... A little bit unusual, but it's the best for our teams as they're really... Mainly developers and mainly focused on technology. So it's, yeah... catch them where they are.

Kris Jenkins: (27:49)

That's true, that's true.

Florian Albrecht: (27:51)

Yeah. Because as architect you always are suspicious that you have some high ideas in your tower and there's nothing to do with the daily work. And so it's a good idea to tell them that we have an idea about the technology and only then get to the higher ideas, so.

Kris Jenkins: (28:09)

Right. Yeah.

Florian Albrecht: (28:09)

It's proven effective.

Kris Jenkins: (28:13)

Try and solve their problems first and then build up to the larger picture. Yeah. Okay. I can see that. I can see that. Did you find it got easier the more teams you got on board?

Florian Albrecht: (28:23)

Yes. Yes, definitely. We got used to it and they could ask the other teams. So they are more trustworthy than we are, so they asked the other teams, "Hey, what are your ideas about this?" And so on. And yes, that worked good.

Kris Jenkins: (28:39)

And what about the business side? Have the business people... As well as coming in on these event storming teams, how have they seen it evolve?

Florian Albrecht: (28:47)

That is different from team to team. So we have some teams which we call, yeah, "competence teams" or something like this. It's more than DevOps. So it's DevOps with the business also inside. And in these teams, this was really an easy way forward, so the business was really happy to finally understand what's going on. So that was really good. And we have teams where business is really far away. And there we have... We had sometimes the business attending the workshops, but it's still a hard way forward there because they're really far away from the IT, and we still have to get them more involved. But we're working on this.

Kris Jenkins: (29:29)

Gradually drawing them into your web.

Florian Albrecht: (29:32)


Kris Jenkins: (29:34)

Awesome. Well it sounds like you've got your work cut out for you, if you're growing into other branches of the parent company.

Florian Albrecht: (29:43)


Kris Jenkins: (29:44)

If you're growing into other branches of the parent company, Otto.

Florian Albrecht: (29:47)

Yeah, yeah.

Kris Jenkins: (29:48)

You're going to be busy.

Florian Albrecht: (29:50)

Yeah, currently they created a [inaudible 00:29:54] of Galapagos and made some changes. So we don't have to provide them with new features currently, but they will give them back to us, the features, and we will have to integrate them. So that's really work for us, but I'm looking forward to this. But yeah, that's the cool thing about open source. Everyone can use it and we can provide them with support, but we don't have to. This is why it's still manageable.

Kris Jenkins: (30:20)

If someone wants to get started with Galapagos, or at least kick the tires on it, where's the best place to start?

Florian Albrecht: (30:26)

Definitely the GitHub page of Galapagos. We have also the theory behind as some documents there. So the principles of event driven architecture and our derived Kafka rules and guidelines. So that's just ideas, and you can implement them even without having to use Galapagos. That's just ideas and we intentionally formulated them in a way that you could do it without software, but you quickly will see, it's better to do it with software and...

Kris Jenkins: (30:59)

Especially if someone's written it for you.

Florian Albrecht: (31:04)

Especially... Exactly. It's available for free. And then the next step would be to check out Galapagos and start the demo. There's a demo attached and you can try it out locally and see what it can do. But I really think it's not the kind of software which you just download and start and see, "Oh great, the kind of thing it can do." Because it's... You have to have an idea at least about what it's doing. So you should be familiar with the concepts of event driven architecture because otherwise... The same questions like a developer, "What is this business capability kind of stuff?" And something like that. So better... Have a look into the principles and then start the demo.

Kris Jenkins: (31:48)

Right. Are there any principles we haven't covered in this conversation that you want to bring up?

Florian Albrecht: (31:53)

Yeah. There are many.

Kris Jenkins: (31:57)

Oh, gosh. Okay.

Florian Albrecht: (31:58)

But too many perhaps to cover here. It's not that many. But one of the principles perhaps, which we created for us, is that a topic always has one owner assigned. So one application is the owner of one topic and there can be multiple producers for a topic, that's fine. But we said there must always be one team, one application, which is responsible for the topic itself, for the format of the data. And there can be multiple producers, but they have to coordinate then with an owner if they want to change something. This is a little bit controversial sometimes for other companies. But that's, we found, is the best way to organize it, to have always one team responsible for a topic. Because otherwise, yeah, if you have multiple responsible, then no one is responsible at all.
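The ownership principle can be pictured as a small data structure: several applications may produce to a topic, but only the single owner decides on the data format. This is a minimal sketch under assumed names (`Topic`, `may_produce`, `may_change_schema`, and the sample application and topic names are all hypothetical), not how Galapagos models it.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical topic record with exactly one owning application."""
    name: str
    owner: str
    producers: set = field(default_factory=set)  # additional producers, if any

    def may_produce(self, application):
        # The owner and any registered producer may write to the topic.
        return application == self.owner or application in self.producers

    def may_change_schema(self, application):
        # Only the owner is responsible for the format of the data.
        return application == self.owner


topic = Topic(
    name="parcel.address-changed",  # made-up name, not a Galapagos convention
    owner="address-service",
    producers={"customer-portal"},
)
```

With this shape, `customer-portal` can publish events but would have to coordinate with `address-service` before anything about the schema changes.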

Kris Jenkins: (32:55)

Yeah. And again, you've just cracked back open that coordination problem.

Florian Albrecht: (32:59)

Yeah, exactly.

Kris Jenkins: (32:59)

Yeah, yeah.

Florian Albrecht: (33:02)

And one other principle is that we have one business event type per topic. So we don't have topics where you have dozens of different event types, but always only one business event type for one topic. This is also a little bit controversial. We had a company which wanted it the other way around, with multiple event types per topic. And I think in that case it could be okay to do it this way. So Galapagos does not really enforce this. Galapagos only says you have to provide one JSON Schema, and if you get multiple event types covered with your one JSON Schema, then it's technically fine. But we have this principle for us, on the logical layer, to say there's only one business event type per topic because this makes many things much easier, much more transparent, and easier to visualize, for example, in our architecture software.
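To make the difference between "technically fine" and the logical rule concrete, here is a small Python sketch. It assumes a convention that is not taken from Galapagos: each schema names its event types in an `eventType` enum property. The schemas, field names, and helper functions are hypothetical.

```python
# Hypothetical JSON Schema for a topic carrying one business event type.
PARCEL_REDIRECTED_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "eventType": {"enum": ["ParcelRedirected"]},
        "parcelId": {"type": "string"},
        "newAddress": {"type": "string"},
    },
    "required": ["eventType", "parcelId", "newAddress"],
}

# Still a single schema, but it admits two different business event types.
MIXED_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "eventType": {"enum": ["ParcelRedirected", "ParcelDelivered"]},
        "parcelId": {"type": "string"},
    },
    "required": ["eventType", "parcelId"],
}


def declared_event_types(schema: dict) -> list:
    """Return the event types listed in the (assumed) eventType enum."""
    return list(schema.get("properties", {}).get("eventType", {}).get("enum", []))


def has_single_event_type(schema: dict) -> bool:
    """Check the logical rule: exactly one business event type per topic."""
    return len(declared_event_types(schema)) == 1
```

Both dictionaries are one schema each, so both would satisfy a "one schema per topic" requirement, but only the first follows the one-event-type principle.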

Kris Jenkins: (34:00)


Florian Albrecht: (34:00)

Because otherwise you cannot easily split. For example, if you say the responsibility for this business event changes from one team to the other team, then yeah. I just said we have just one owner for the topic, but now we have two different responsibilities for the contents and that's a really bad idea. So it's easier to have it, yeah, distinct. So one business event type per topic.

Kris Jenkins: (34:25)

Yeah, I think I'd agree with that. Because if you've got two events on the same topic and different teams are responsible for them, they're really different events, aren't they? They're really different topics, different sources of data.

Florian Albrecht: (34:36)


Kris Jenkins: (34:37)

Yeah. Yeah. Any other principles?

Florian Albrecht: (34:41)

No, I think those are the most important principles. There are some minor things, but yeah, I think it's best to read them. Yeah.

Kris Jenkins: (34:50)

And do you have this idea, just to hammer it home, the idea of the person... The team that owns a topic, are they also responsible for fixing the data, and quality levels, and uptime on that topic?

Florian Albrecht: (35:04)

They are responsible. Exactly. Of course, they really... For example, for the data quality, they sometimes have to talk to business or somebody else, but they are really responsible in the end. So if a consuming team has a problem with the data, be it data missing or data... Yeah, bad quality or whatever, it'll go to the responsible team, which it can see via Galapagos, "Okay, this is the responsible team. I will contact them and report whatever problem there is." And they have to fix it. Exactly.

Kris Jenkins: (35:35)

Do you know what this is making me think? There are some things, there are some ideas that seem almost inevitable when you hit... When you settle on certain principles, the ideas that follow from them are kind of inevitable. So I think if you start with immutable data structures, you're going to eventually discover event driven architectures, and you're going to eventually, in a large enough organization, do something like data mesh. Is the choice of principles causing evolution related to the choice of the name Galapagos? Is that deliberate or is it a nice coincidence?

Florian Albrecht: (36:11)

Oh, that's a nice idea. Really, it's related to evolution. But yeah, the name derives from the first core of the application, which was the schema registry, and the idea of having JSON Schemas which are compatible with the previous version. And this idea is not new, but it's called schema evolution. So it's...

Kris Jenkins: (36:30)

Ah, right.

Florian Albrecht: (36:33)

And that's what we call... And why we called it Galapagos. And because we like island names for software. But yeah, to be honest, it should always be only a temporary name, but then, yeah, it somehow got final.

Kris Jenkins: (36:47)

That often happens with project names. It's a very interesting project. I am going to go and check out the repo and I hope some other people do too.

Florian Albrecht: (36:55)


Kris Jenkins: (36:56)

Thanks for telling us about it, Florian.

Florian Albrecht: (36:57)

You're welcome. Thank you for your time.

Kris Jenkins: (37:00)

Well, thank you Florian. I'm quite gratified that every time we've had an architect on this podcast, they've shunned that classic stereotype about being the kind of person that lives in an ivory tower with their ideas. And they've actually got stuck in with the kind of tools, and training, and conversations that make these ideas work in the real world. So Florian, thank you for being one of the good ones.

Kris Jenkins: (37:25)

Before we go, Streaming Audio is brought to you by Confluent Developer, which is our site that teaches you everything you need to know about Apache Kafka and real-time event systems in general. We've got tutorials, we've got architectural guides, we've got courses that will teach you everything you need to know. So take a look. And if you want to get your own Kafka cluster up and running to build your very own data mesh, take a look at our cloud service at Confluent Cloud. You can sign up and have Kafka running reliably in minutes. And if you add the code PODCAST100 to your account, you'll get some extra free credit to run with a little longer.

Kris Jenkins: (38:04)

And with that, it remains for me to thank Florian Albrecht for joining us and you for listening. I've been your host, Kris Jenkins, and I will catch you next time.

In principle, data mesh architecture should liberate teams to build their systems and gather data in a distributed way, without having to explicitly coordinate. Data is the thing that can and should decouple teams, but proper implementation has its challenges.

In this episode, Kris talks to Florian Albrecht (Solution Architect, Hermes Germany) about Galapagos, an open-source DevOps software tool for Apache Kafka® that Albrecht created with his team at Hermes, a German parcel delivery company. 

After Hermes chose Kafka to implement company-wide event-driven architecture, Albrecht’s team created rules and guidelines on how to use and really make the most out of Kafka. But the hands-off approach wasn’t leading to greater independence, so Albrecht’s team tried something different from documentation: they encoded the rules as software.

This method pushed the teams to stop thinking in terms of data and to start thinking in terms of events. Previously, applications copied data from one point to another, with slight changes each time. In the end, teams with conflicting data were left asking when the data changed and why, with a real impact on customers who might be left wondering when their parcel was redirected and how. Every application would then have to be checked to find out when exactly the data was changed. Event architecture terminates this cycle. 

Events are immutable and changes are registered as new domain-specific events. Packaged together as event envelopes, they can be safely copied to other applications, and can provide significant insights. No need to check each application to find out when manually entered or imported data was changed—the complete history exists in the event envelope. More importantly, no more time-consuming collaborations where teams help each other to interpret the data. 
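A minimal Python sketch of this idea follows. The envelope fields, event names, and application names are assumptions made for illustration; the episode does not describe the actual envelope format used at Hermes.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone


@dataclass(frozen=True)
class EventEnvelope:
    event_type: str        # domain-specific event name
    parcel_id: str
    occurred_at: datetime
    source: str            # application that emitted the event
    new_address: str = ""  # only filled for redirect events in this sketch


def redirects_for(events, parcel_id):
    """Find when a parcel was redirected, and by which application."""
    return [
        (e.occurred_at, e.source, e.new_address)
        for e in events
        if e.parcel_id == parcel_id and e.event_type == "ParcelRedirected"
    ]


history = [
    EventEnvelope("ParcelAccepted", "P-1",
                  datetime(2022, 11, 1, 9, 0, tzinfo=timezone.utc), "shop-intake"),
    EventEnvelope("ParcelRedirected", "P-1",
                  datetime(2022, 11, 2, 14, 30, tzinfo=timezone.utc),
                  "customer-portal", "Parcel shop 12"),
]

# A change is recorded by appending a new event, never by editing an old one.
try:
    history[0].source = "someone-else"
except FrozenInstanceError:
    pass  # frozen dataclasses reject in-place modification
```

Because the history is a list of unchangeable records, the question "when was the parcel redirected, and how?" is answered by filtering events rather than by inspecting every application.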

Using Galapagos helped the teams at Hermes to switch their thought process from raw data to events. Galapagos also empowers business teams to take charge of their own data needs by providing a protective buffer. When specific teams, the providers of data or events, want to change something, Galapagos enforces a method which will not kill the production applications already reading the data. Teams can add new fields which existing applications can ignore, but a previously required field that an application could be relying on won’t be changeable.
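The compatibility rule in the paragraph above can be approximated with a short Python check. This is a simplified sketch and not the algorithm Galapagos implements: it only compares top-level `properties` and `required` entries of two JSON Schema dictionaries, and the function name and example schemas are hypothetical.

```python
def breaking_changes(old: dict, new: dict) -> list:
    """List changes that could break consumers relying on required fields."""
    problems = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    new_required = set(new.get("required", []))

    for name in old.get("required", []):
        if name not in new_props:
            problems.append(f"required field '{name}' was removed")
        elif name not in new_required:
            problems.append(f"field '{name}' is no longer required")
        elif new_props[name] != old_props.get(name):
            problems.append(f"definition of required field '{name}' changed")
    return problems


# Hypothetical schema versions for illustration.
v1 = {
    "type": "object",
    "properties": {"parcelId": {"type": "string"}, "newAddress": {"type": "string"}},
    "required": ["parcelId", "newAddress"],
}

# Adding an optional field: existing consumers can simply ignore it.
v2 = {
    "type": "object",
    "properties": {**v1["properties"], "reason": {"type": "string"}},
    "required": ["parcelId", "newAddress"],
}

# Dropping a required field: consumers relying on it would break.
v3 = {
    "type": "object",
    "properties": {"parcelId": {"type": "string"}},
    "required": ["parcelId"],
}
```

A tool enforcing the rule would accept the change from `v1` to `v2` and reject the change from `v1` to `v3`.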

Business partners using Galapagos found they were better prepared to give answers to their developer colleagues, allowing different parts of the business to communicate in ways they hadn’t before. Through Galapagos, Hermes saw better success decoupling teams.

