March 17, 2021 | Episode 148

Event-Driven Architecture - Common Mistakes and Valuable Lessons ft. Simon Aubury


Tim Berglund:

Hey. I got to talk to my friend, Simon Aubury, today about event-driven architecture. He's been around the Kafka space for about five years, which is kind of a long time, and he wanted to take some time to talk to that younger version of himself and offer some advice. What would Simon say today to younger Simon if he had it to do all over again? Well, you get to listen in on today's episode of Streaming Audio, a podcast about Kafka, Confluent, and the cloud.

Tim Berglund:

Hello, and welcome to another episode of Streaming Audio. I am, as always, your host, Tim Berglund, and I'm joined in the virtual studio, which is a virtual studio spanning literally the whole planet, this is from the other side of the planet relative to where I am right now, by Simon Aubury. Simon, welcome to the show.

Simon Aubury:

Why, thank you very much, Tim. I'm very excited to join you today, and thanks for inviting me.

Tim Berglund:

You got it. I should say welcome back to the show. This is a returning guest. As we say, the triumph of hope over experience, so great to have you back. If you haven't heard Simon's previous appearance on Streaming Audio, of course, it's linked in the show notes. Simon has a background as an Oracle DBA. We're still friends, but he's been an Oracle DBA and then a micro narrative developer to data engineer, and you are now principal data engineer at ThoughtWorks. Is that right?

Simon Aubury:

That's correct, that's correct. Yeah, so I think we all make a bit of a weaving path into IT, but yes, I did have to start life a good couple of decades ago as both a Java developer and an Oracle DBA, and [crosstalk 00:01:44] yeah, and-

Tim Berglund:

It's okay-

Simon Aubury:

... I'm at ThoughtWorks today.

Tim Berglund:

... firmware before I was a Java developer, so-

Simon Aubury:

Oh, yeah, I think we all make a bit of a meandering path, but definitely very excited to work in the data industry. As a career choice, it's been pretty exciting because it's given the opportunity to work both in Australia, through Asia-Pacific, and throughout Europe, so definitely working in the data space and working in those sort of challenging big data areas. It's been super exciting to get exposed to a variety of problems.

Simon Aubury:

As a ThoughtWorks consultant, I get the opportunity to work with companies who have super ambitious goals, and they're essentially clients who have like really, really tough IT problems. It's a real pleasure to be out there, sort of stay ahead with that disruptive thinking and bringing software consultancy and pioneering tools to come and solve some really ambitious challenges.

Simon Aubury:

On a personal level, working as a ThoughtWorks consultant gives me the opportunity to work with an awesome team of really passionate and diverse and leading-edge software engineers. I'm conscious that this is a Streaming Audio podcast, and it's probably worth just giving a tiny flavor of some of my exposure to Kafka, an awesome streaming platform. I had my first exposure to Kafka, call it about five or six years ago.

Simon Aubury:

That was very much in that sort of customer experience area, website analytics, website traffic analysis, and that was fantastic. It was my first exposure to the fantastic streaming platform that is Kafka, and I've very much moved into other projects that were highly available streaming solutions, everything from single-customer platforms to vehicle monitoring systems.

Simon Aubury:

For the last few years, I've worked as a consultant working on a variety of use cases, everything from transport, Fintech, energy sector trading, so it's been quite a diverse collection of Kafka and event-driven applications.

Tim Berglund:

Love it, love it. I want to talk to you today about event-driven architectures, and that's a thing that if... I feel like we're, number one, standardizing on that phrase, by the way. It seems like that's emerging as the label that we use to talk about this family of approaches to building systems.

Tim Berglund:

It's a thing we definitely talk about on the show, but we don't always... If you just look back through the catalog, it's not that often that we have an episode where you say, "Hey, we're going to talk about event-driven architecture." Let's do that. First of all, why is that of interest to you? I mean, you work with Kafka and you build systems, maybe it's obvious, but specifically?

Simon Aubury:

Yeah. I find event-driven architecture... it's a really interesting way of thinking about how to solve problems in very large, complicated ecosystems, so typically enterprises that have different teams, different business units, and they need to be able to build solutions together. It's probably worth pointing out that event-driven architecture isn't particularly new.

Simon Aubury:

Apparently, it's been around for a good 20, 25 years or so, but what is new is essentially people's expectations when they're dealing with their bank or their energy company. I don't know about you, Tim, but my expectations of dealing with my banker are different than they were five years ago and definitely 10 years ago. I have-

Tim Berglund:

Mm-hmm (affirmative).

Simon Aubury:

... expectations when I buy something that it's going to be fast and immediate and quick, and I don't think anyone's got tolerance for slow systems that are sort of batch-orientated. I think event-driven architecture has really come into its own.

Tim Berglund:

I sometimes joke about the days of allow four to six weeks for delivery are behind us, and young-ish people might only know that if they watch vintage TV or something. That phrase, it doesn't even mean anything anymore, but more than that, as I say, you've got a phone in your pocket and you actually expect it to wiggle when you need to know about something. If something has changed, you talked about banking, if there is some sort of asynchronous event with respect to your accounts at the bank, you expect a notification.

Tim Berglund:

That's just how it is, so the systems behind all of that need to be able to deliver real-time results, and the developers and the architects who are building these systems, that pressure is on you. You need to make a system that can interact with the consumer now, and, well, we need new tools to do that because that wasn't a requirement 30 years ago when the received paradigms were being developed, when they were an innovative new paradigm. In [crosstalk 00:06:37] other words, a lengthy way of saying, "Yes, real-time matters."

Simon Aubury:

Yes, and those expectations are continually growing. I was interested when my children the other day bought something online and then my daughter two hours later was hanging out by the post box expecting it to have arrived, so those expectations are growing all of the time.

Tim Berglund:

Yeah. That's just the reality that children are growing up in now. Let's talk about what event-driven means to you or what... Not to you. The significance, how is event-driven architecture significant to you? What are the different things that people mean by event-driven?

Simon Aubury:

Yeah. At the end of the day, event-driven is very much about building solutions so they're both independent and reactive. It's almost worth starting off by giving a bit of an example, and I wanted to use an example from one of my colleagues, Sarah Taraporewalla, who uses the example of a kitchen at sort of a restaurant and how the different areas of a restaurant need to coordinate with one another. You could use the restaurant example as a proxy for thinking about some of the complexity that a typical enterprise might see, but a local example because it's both easy to conceptualize, and I like going to restaurants, so it motivates me to think about business domains and business problems that way.

Simon Aubury:

If you're ever talking to a product owner or a business stakeholder, quite often they're describing their business along the lines of when something happens, something else needs to happen. It might be a customer clicks on a website and we want to send them a welcoming email. If you think about the complexity that builds up across an organization or across a restaurant, this sort of point-to-point solution gets very, very out of hand quite quickly.

Simon Aubury:

If we were thinking about a restaurant example, you can think about all of the independent areas of a restaurant and how they need to coordinate against each other. You might have the maitre d' who welcomes people to the restaurant. You might have the wait staff who need to interact with the customers, and the kitchen, who are preparing food orders. Then, there's a stockroom who needs to worry about inventory and the engineer who needs to itemize bills.

Simon Aubury:

Now, we could design our restaurant so each of these teams need to talk directly to each other, but it's going to become chaotic quite quickly. It's going to be very, very difficult for the kitchen to optimize the way that they work or the wait staff how they might interact and worry about the number of wait staff they have on the floor if each of these teams need to talk directly to one another.

Simon Aubury:

If we think about event-driven architecture, we can think about how these teams might be able to optimize internally, but be able to build a really, really highly effective restaurant by just talking about things that have happened in their own domain. If you can think about someone has sat down at a restaurant, they might sit and interact with the wait staff and they might order fish. That might just be an event that has happened on a message bus, and the kitchen can react to that event and naturally start thinking, "Well, someone's ordered fish. Maybe I need to start cooking fish on the fryer?"

Simon Aubury:

What's really good about this paradigm is that you can start thinking about this from an enterprise perspective. You can start thinking of what's necessary when we're talking about communicating across the enterprise: essentially, events that have happened. Then, you can also start seeing ways that each team can be quite efficient. The kitchen staff might be able to locally optimize how they organize their kitchen, the wait staff might be able to organize themselves, and each team can be essentially provided just an interface, or what we might in DDD call a bounded context, which is essentially the interface to each of those teams.

Simon Aubury:

You as a consumer, you don't need to worry about how each of those teams have optimized themselves. You just have to worry about the amazing experience that you're going to have at that restaurant.

Tim Berglund:

That makes sense, and I love the restaurant analogy because you can just play with it a little bit and think it is extremely likely that everyone who is employed by that restaurant will at least have facility in a common language. They might not all be native speakers of the same language. Just in the typical cosmopolitan city, that's not going to work out that way, but everybody's going to be able to speak, if it's in Sydney, English, if it's in Denver, English.

Tim Berglund:

Each of those groups is going to have a way of communicating that's optimized for their work, so like everybody could be using Protobuf, but the domains are different and the messages are different and we need to define some standard interfaces. We're all speaking English, but wait staff have a jargon, and certainly the kitchen, there's a very specialized jargon or like argot going on back there.

Tim Berglund:

Those are good. Those emerge within those groups for reasons. They're actually an optimization for the way those people communicate, but they're a suboptimization for communicating between groups, so to your point, all of the point-to-point interfaces are going to become expensive if everybody needs to be able to locally optimize to their jargon and maintain everybody else. It'd be like Star Wars where everybody speaks their own language, and somehow everybody manages to understand it, so you can produce one and you can receive all of them. It's not how any group of languages really works, so I like the restaurant analogy.

Simon Aubury:

Yeah, and there's a natural sort of carry-on to this level of thinking when we're talking about microservices, and in an enterprise, I think one of the mistakes I've definitely made in the past is spending far too much time talking about the right size of the microservice or the right size of the team to support a microservice. Should it be a particular sized repo? Should it be optimized to a particular language? Should you be able to feed that team with two pizzas?

Tim Berglund:

We can bikeshed that, Simon, I'm confident we can bikeshed this question hard.

Simon Aubury:

Oh, absolutely, absolutely, and I think like many things, if you ask 10 developers, you'll get 12 answers, but maybe the better question to ask is, how big is the interface to that microservice? If we're thinking about event-driven architecture, it's essentially, what is the service of that bounded context? To your point earlier, Tim, it doesn't really matter to me as wait staff how the kitchen is optimized as long as they get their job done and they meet particular service-level agreements on being able to turn around orders and they are in turn supported by the stockroom, who need to bring supplies to the kitchen. We can all play like adults, we can all work together, but we don't need to get too coupled to the inner workings of each of those teams across a restaurant.

Tim Berglund:

Yeah, yeah. I like it. Now, we were talking before we started recording. I saw a meme, we'll call it a meme. I guess it was just an image, but there was a caption in the tweet, so it was like it was a meme of adult Kermit the Frog talking to Baby Kermit, and I forget if Baby Kermit actually has a proper name. I think they're contemporaneous characters, like there was Kermit and then there was Baby Kermit, so Baby Kermit had to have some unique name, but whatever. Grown Kermit talking to this little frog and the caption I gave it was, "When you first read a Wikipedia article about internal family systems." Might link to that in the show notes, might not. It's a little obscure, like five people are going to get that and you're going to cry.

Tim Berglund:

You, Simon, thinking of you, speaking to younger Simon, and we don't get to be much younger because this has been five years for you that you've been in this... I guess in Kafka, as you pointed out, event-driven architectures are older than that. They were, I think it's safe to say, a bit player, at least in the enterprise prior to kind of now, the last few years. As 2021 Simon looked back to maybe 2017 Simon, what are some lessons you'd want to... Speaking empathetically and openly to that younger version of yourself who was doing the best he could at the time making decisions, trying to survive, what would you say to that younger Simon?

Simon Aubury:

Yeah, and that's a terrific question, Tim. I would love to be able to post messages in a bottle back to my younger self, but definitely there's things that over time I've learned, and both these are either mistakes or learnings that I have made, but I've also seen these characteristics pop up in other areas. I think one of the first things I would love to emphasize to a younger form of myself is that there's a really important distinction between messages and events. This might be-

Tim Berglund:

Ooh [crosstalk 00:16:39]-

Simon Aubury:

... from my background of... I'm old enough to remember three-tier architectures where an application was heavily integrated at a database level, and then we would try and solve some of the coupling problems with exciting terms such as SOA and RPC and enterprise service buses. That might have somewhat influenced my early thinking when I got involved in event-driven systems, but back to the point. Messages and events, they're fundamentally different. A message is essentially a command to do something, and typically it's a message that's a point-to-point payload that's specifically targeted to a service where there's the expectation that a system's going to respond.

Simon Aubury:

If I was sending a message to Tim, I might ask Tim to send an email or scan a document or perform a service, and that's very much a command-orientated system. That puts an obligation on me as the sender to wait until Tim has confirmed that that action has happened. That provides a [crosstalk 00:17:54]-

Tim Berglund:

Got to make sure [crosstalk 00:17:55] it's happened.

Simon Aubury:

... oh, absolutely. I really enjoy talking to you, Tim, but if we're doing this frequently and often, it might be a bit of a burden for both yourself and myself to send direct messages to one another.

Tim Berglund:

I'd get tired of it, yes.

Simon Aubury:

Oh, and absolutely. Whereas, if we talk about event-driven thinking, we're talking about not so much this obligation to put a command into a system. We're talking about a statement of fact. Events get persisted in a replayable stream history, such as something incredible like Kafka, and then downstream consumers can respond to those events. If we used this restaurant analogy again, I might put an event onto a Kafka topic that says, "Customer has walked in the door," in past tense. It's a statement of fact. It's nondisputable. It's an immutable event that's definitely happened.

Simon Aubury:

A consumer can look at that event and then react appropriately. Perhaps we'll want to sign that customer up to an email list, or perhaps we might want to sit them in a restaurant or give them a wine menu, but we're very much talking about an event which is a concrete statement of fact. It's happened, an event's happened in the past, and as a producer of that event, I'm not particularly concerned about the downstream consumers who need to react to it. It really sort of simplifies the architecture that's going to come off the back of this.

Simon Aubury:

This sounds very, very subtle, but it really wasn't clear to a younger Simon that a message is essentially a command and an expectation that's going to happen. An event is just a statement of fact, and getting that concept clear from the get-go could have actually solved a lot of design problems and a lot of sort of implementation friction.
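[Editor's sketch] The distinction Simon draws here can be illustrated with a few lines of Python. This is a minimal sketch, not anything from the episode: the class names (`SendWelcomeEmail`, `CustomerWalkedIn`, `EventLog`) and the `email_service` function are hypothetical, and a plain in-memory list stands in for a Kafka topic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SendWelcomeEmail:
    """Command (a "message" in Simon's terms): imperative, aimed at one service."""
    customer_id: str


@dataclass(frozen=True)
class CustomerWalkedIn:
    """Event: named in the past tense, an immutable statement of fact."""
    customer_id: str


class EventLog:
    """Append-only, replayable list standing in for a Kafka topic (assumption)."""

    def __init__(self) -> None:
        self._events: list = []

    def append(self, event: object) -> None:
        self._events.append(event)

    def replay(self, from_offset: int = 0) -> list:
        # Any consumer can read the history from any offset, at any time.
        return list(self._events[from_offset:])


def email_service(command: SendWelcomeEmail) -> str:
    # The sender calls this directly and waits for the reply.
    return f"welcome email sent to {command.customer_id}"


# Command style: the sender is coupled to the receiver and to its answer.
reply = email_service(SendWelcomeEmail("c-42"))

# Event style: the producer records a fact and moves on.
log = EventLog()
log.append(CustomerWalkedIn("c-42"))

# Two independent consumers replay the same history; the producer knows neither.
mailing_list = [e.customer_id for e in log.replay() if isinstance(e, CustomerWalkedIn)]
seating_queue = [e.customer_id for e in log.replay() if isinstance(e, CustomerWalkedIn)]
```

In the command case the caller cannot proceed without `reply`; in the event case new consumers can be added later without touching the producer.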

Tim Berglund:

Gotcha, gotcha. I like the... and this is a point I've tried to make about asking, waiting to ask, trying to find out whether a thing happened. People new to event-driven architecture will have a lot of command and response topics in Kafka. They'll actually build that into Kafka, like, "Here is my topic telling you a thing to do, you go do it asynchronously, except I'm waiting to make sure you did it," which if you go back to people and a restaurant, you don't treat adults that way. They will resent you. It will not last. They will make that stop by leaving the organization or checking out or something. Some bad thing is going to happen to that relationship between the people, and with event-driven architectures, you don't treat services that way. You produce an event and you expect everyone else to be an adult.

Tim Berglund:

Now, just like in a restaurant, somebody's looking over this. There's usually the manager walking around saying, "Are you enjoying your meal? Is everything going well?" Just kind of making sure things are going okay. You could have metrics for how long it takes to run food, or between order and food delivery and all of that. You still measure, you still observe the system, but from one service to another you don't wait to make sure they've done the right thing. That's just not how it works. That's a nonsustainable way of getting the job done.

Simon Aubury:

I mean, absolutely, and perhaps a real world implementation of this is when we're talking about really, really complicated financial services systems, which over time sort of flourish and have more and more bells and whistles. This is a fantastic use case for an event-driven architecture. One EDA platform that I worked on had quite an important step in the flow, which was essentially address validation. Now, this sounds like a very, very straightforward task. Someone puts in an address into a free text field, but as you can imagine, addresses are quite unique and that has quite a material impact.

Tim Berglund:

They're [crosstalk 00:22:13] actually terrible.

Simon Aubury:

Yes, absolutely, so being able to turn a free text field into an address, I suppose, so you can actually point to it on a map, but also infer other things such as, what's a property worth? What's the risk of fire or flood or burglary? I don't know about other jurisdictions, but in Australia, if you can actually localize an address down to a very, very fine area, you can actually stamp your outgoing envelopes with a particular barcode and that makes postage much cheaper, so there are impacts both large and small when you're doing address verification.

Simon Aubury:

Now, one of the systems I worked on had this quite important but somewhat convoluted process to do this address validation, and when we initially modeled this system, we probably overindexed on all of the responses back from that address validation service. There might be a confidence score returned or a timeout or an error or some sort of back pressure measure, and our instinct was to go and add all of these messages back into a Kafka topic because, of course, someone might want to know the HTTP response code or the number of messages or the amount of back pressure applied to this service. That very much comes back to the, what's an important event to model? What is a command that you're expecting someone to consume?

Simon Aubury:

One of the discoveries when we started building out this system was we ended up putting what we might term passive-aggressive events into our queue. They were events, but there was almost like this implicit expectation that someone was going to react to those events, so it wasn't just a nicety that we're getting an error something something something from the service. It was almost like this implicit expectation that someone was going to reroute that message and a human was going to get involved in the loop.

Simon Aubury:

In terminology, I love this, this passive-aggressive event. It's a statement of fact, but there was a strong expectation that someone was going to react to it, and that's sort of a bad smell when building out [crosstalk 00:24:35]-

Tim Berglund:

Someone [crosstalk 00:24:35] in particular [crosstalk 00:24:36] like you're waiting almost for that response?

Simon Aubury:

Absolutely, absolutely.

Tim Berglund:

I think I'm going to use that. I really like that. I was going to say, talk about the difference between choreography and orchestration. That's something I've talked about recently on the show, but always good to get another angle on that. How do you think about those two things?

Simon Aubury:

This is a terrific question and, again, something I wish I had clarified to myself earlier in the EDA journey. Orchestration is all about a single entity coordinating, as a controller, all of the communication, so in an IT sense, this might be a controller module that's responsible for calling all of the microservices in turn. Sometimes when we're talking about a more general example, we're talking about a conductor in front of a music pit. Each of the musicians, highly talented, highly skilled individual players, but it is the conductor in front of the orchestra who is actually setting the time, the tempo, and keeping everyone running together.

Simon Aubury:

This is fantastic in some circumstance, but you can imagine that if you've got 20 musicians, this would work. If you've got hundreds of thousands of musicians scattered throughout the world, putting a conductor in front of everyone might become infeasible quickly. That's the space of orchestration. It's where you've got tolerance of an individual controller for all of the activity, which in some cases and some problem spaces is the right answer, but obviously that induces a level of friction and a level of coupling across individual teams.

Simon Aubury:

The converse of that is choreography, or essentially event-driven response architecture, and that's where systems take independent action based on events that they might find in a streaming platform. When we're talking about event-driven architecture, we're really implying that this is a choreographed sequence of events. Now, this has a number of advantages because as in our restaurant example before, we don't need to directly couple systems to one another. We can have our kitchen natively reacting to orders as they come in. We can have the stockroom reacting to events that have happened in a service history. Event-driven choreography gives us a level of independence across those services and a level of flexibility when we're talking about uplift, changes, efficiency.

Simon Aubury:

One of the advantages of choreography is simply that you're treating all of your services as adults and they can mature and evolve independently from one another. It also has a great operational uplift in that if a service goes down, it only affects that one service generally and they can correct and think about bringing their service back to life and it doesn't necessarily impact other services. Both from a resilience sense and a team independence sense, choreography for nontrivial systems is an easier way to evolve enterprise systems.
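[Editor's sketch] To make the contrast concrete, here is a small Python sketch of the same restaurant flow written both ways. Everything in it is an illustrative assumption: the function names, the event type strings ("OrderPlaced", "DishCooked"), and the in-memory `EventBus`, which only stands in for a streaming platform.

```python
from collections import defaultdict
from typing import Callable


def kitchen_cook(dish: str) -> str:
    return f"kitchen cooked {dish}"


def stockroom_deduct(dish: str) -> str:
    return f"stockroom deducted {dish}"


# Orchestration: one controller knows every service and calls each in turn.
def orchestrate_order(dish: str) -> list:
    return [kitchen_cook(dish), stockroom_deduct(dish)]


# Choreography: services subscribe to event types and react on their own.
class EventBus:
    def __init__(self) -> None:
        self._subscribers = defaultdict(list)
        self.history: list = []  # every published event, in order

    def subscribe(self, event_type: str, handler: Callable[[str], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: str) -> None:
        self.history.append((event_type, payload))
        for handler in list(self._subscribers[event_type]):
            handler(payload)


bus = EventBus()
reactions: list = []


def kitchen_on_order(dish: str) -> None:
    # The kitchen reacts to an order, then announces its own fact.
    reactions.append(kitchen_cook(dish))
    bus.publish("DishCooked", dish)


bus.subscribe("OrderPlaced", kitchen_on_order)
bus.subscribe("OrderPlaced", lambda dish: reactions.append(stockroom_deduct(dish)))

# The wait staff only publish what happened; they call nobody directly.
bus.publish("OrderPlaced", "fish")
```

Adding a new reaction in the orchestrated version means editing `orchestrate_order`; in the choreographed version it is one more `subscribe` call that the publisher never sees.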

Tim Berglund:

You, being a ThoughtWorker, have probably read or heard or thought a little bit about evolutionary architecture. Do you see a connection between those two? I mean, cards on the table, I do and I think it's an intimate connection. Is that an overstated case? Do you think event-driven architecture is the other side of the coin for evolutionary architecture? What do I need to get from one to the other?

Simon Aubury:

This is a really interesting question, and I need to confess that I only made it three chapters into the Neal Ford and the Rebecca-

Tim Berglund:

Neal [crosstalk 00:28:56]-

Simon Aubury:

... Parsons tome.

Tim Berglund:

Yeah [crosstalk 00:28:58] and Rebecca, that's right, yeah.

Simon Aubury:

Yeah. It's a really good challenge to think through of how systems evolve over time. We are constantly looking at applications that have sort of stayed true to their course and, obviously are battle-tested, but we need to be able to evolve them. To me, event-driven architecture really gives that sort of independence for teams to optimize locally and talk about how you can actually do uplift and efficient deployments where necessary. Yeah, I don't think I've got a good answer for you, Tim.

Tim Berglund:

Okay [crosstalk 00:29:44]. Hey, good enough, good enough.

Simon Aubury:

Yeah.

Tim Berglund:

You had said something about the decoupling that EDA affords you, allows services to mature and evolve independently, and I think that's the key because that's like the story that we've been telling ourselves about microservices for almost a decade now that we were going to be able to. The initial experience was, "Well, no, we're not able to." You have to release them all at the same time and in a particular order and it was this bad thing, so I think now what we get is something like the ability of services to truly evolve with only the necessary coupling, which would be schema.

Tim Berglund:

You have to know what an event is. You've got to have some kind of common language, how the event is formatted. Not only can individual services evolve on their own, but new services can grow up. They can be innovated upon logs of events that are preexisting. I think there's a connection there and that's just something worth exploring, I guess, more as we move forward.

Simon Aubury:

Yeah, and you're spot on and maybe this is, again, a lesson I could teach my former self when it comes to who is in charge of schema evolution. Coming from a traditional DBA background, there was almost like this innate need to control everything across an organization and maybe comes from the bad old days of needing to build an entity relationship diagram to talk about how to model everything.

Simon Aubury:

When I first got exposed to schemas and schema registries and schema evolution, my first feel was this would be something that would be best centralized in an organization. I can only say that five years of history has shown me that is the worst analogy, so don't think about your schema registry as you do a centralized database.

Simon Aubury:

A much more enlightened Simon would say that the right approach is to treat all of your services as independent and mature adults and just set some basic ground rules about schema ownership and schema evolution. The basic one would be allow producers to be run in a forward compatible mode and let them evolve gracefully forward, conscious of the fact that there will be a downstream consumer who won't appreciate a mandatory field disappearing. Obviously, things like putting your schema registry in forward compatible mode just sets some basic ground rules so teams can mature and evolve at an appropriate rate without causing too much friction.
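[Editor's sketch] The ground rule about a mandatory field disappearing can be shown with a toy check. This is a deliberately simplified model, assuming a schema is just a mapping from field name to whether the field has a default; it is not the Schema Registry API, and real Avro or Protobuf compatibility rules cover many more cases (type changes, aliases, and so on). The function and schema names are hypothetical.

```python
# Toy schema: field name -> True if the field has a default (optional),
# False if it is mandatory. An assumption for illustration only.
Schema = dict


def forward_breaks(old_reader: Schema, new_writer: Schema) -> list:
    """Mandatory fields an old consumer expects but the new producer dropped."""
    return sorted(
        name
        for name, has_default in old_reader.items()
        if not has_default and name not in new_writer
    )


def is_forward_compatible(old_reader: Schema, new_writer: Schema) -> bool:
    # Forward compatible: data written with the new schema can still be
    # read by consumers that only know the old schema.
    return not forward_breaks(old_reader, new_writer)


order_v1 = {"order_id": False, "dish": False, "notes": True}

adds_field = {**order_v1, "table_number": False}       # old readers ignore it
drops_optional = {"order_id": False, "dish": False}    # old readers use the default
drops_mandatory = {"order_id": False, "notes": True}   # old readers cannot fill "dish"
```

Under this model `adds_field` and `drops_optional` pass the check, while `drops_mandatory` fails because an existing consumer has no way to supply the missing `dish` field.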

Tim Berglund:

Love it. That sounds like the accidental message bus antipattern that you came close to describing there, and I've got a slide... I don't use it too much these days, but I've used it in the past. It was two slides. "This is the garbage point-to-point spaghetti of all of this integration," and, "Oh, now it's okay. Everything just talks to Kafka." From A to B, architects of a certain age will say, "That looks like an enterprise service bus." You said message bus, but the accidental ESB antipattern. Why is this not an ESB? There are answers. I think they're related to what you just said, but talk to me about younger Simon accidentally building a message bus.

Simon Aubury:

Yes. I love one of my colleagues who once described this as, "No problem in the world can't be made worse by the introduction of an ESB," so... ESBs-

Tim Berglund:

Not wrong.

Simon Aubury:

... yes, ESBs have had their time, but it very much perpetuates this concept of messages need to be delivered and we need to hang around until those messages are consumed. It's very command-orientated and directive, and it introduces all sorts of interesting failure modes if you're talking about multiple consumers on the same message. It adds to all sorts of fun and joy when you're talking about failure modes. More importantly, beyond the high-level architecture and the best intentions, there really are real-world considerations when you're talking about failure points across services.

Simon Aubury:

One system that I have... It was actually a delightful pleasure to build this because it was a really interesting problem space, but when we're talking about doing OCR on a document, we're talking about sending physical papers off to an area. They're scanned, you're pulling out the metadata, you're describing the texts, and you then have to tag all of that information. One sort of bad slippery slope on this project was we ended up putting a lot of commands back into our messages and, again, we fell down this slippery slope of almost wrapping command... putting events in, but they were implicit commands. We fell down this slippery slope again of passive-aggressive events and before long had ended up building essentially the sins of the past, an ESB on top of Kafka.

Simon Aubury:

That started off with very, very short-term reactionary, "How are we going to get around this sort of OCR failure? This is important. We need to rescan these documents. It's time-specific." Before we knew it, we were putting all sorts of commands back into Kafka topics and that was just a recipe for disaster, but the right answer is be good to the event-driven paradigm and just really, really think about, what is appropriate for a local state store? What is a process you're just prepared to rerun if it's aborted halfway through? What are the true significant messages that you actually want to... those true significant events that you actually want to persist?

Simon Aubury:

In the case of an OCR document, you can really talk about, "My intent is I've received a physical document. I want a downstream system to successfully scan it and itemize all of the things on that page." That's very much where we want to go with event-driven architecture, separating the event that has happened versus a downstream system that needs to do something important.
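One way to picture this separation is the minimal Python sketch below. The producer records only the fact that a document arrived, and the downstream service decides for itself how to react. The names `DocumentReceived`, `DocumentScanned`, and `ScanningService` are hypothetical and not from the conversation, and an in-memory list stands in for a Kafka topic.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DocumentReceived:
    """A fact: a physical document has arrived. It tells nobody what to do."""
    document_id: str


@dataclass(frozen=True)
class DocumentScanned:
    """A fact published by the downstream service once its work is done."""
    document_id: str
    items: tuple


class ScanningService:
    """Hypothetical downstream consumer that chooses how to react to events."""

    def __init__(self, topic: List[object]) -> None:
        self.topic = topic  # in-memory stand-in for a Kafka topic

    def on_event(self, event: object) -> None:
        # The consumer owns the decision to scan; the event carries no command.
        if isinstance(event, DocumentReceived):
            items = ("page-1",)  # placeholder for real OCR output
            self.topic.append(DocumentScanned(event.document_id, items))


topic: List[object] = []
service = ScanningService(topic)
service.on_event(DocumentReceived(document_id="doc-42"))
```

Because the event is named for what happened rather than what should happen next, a second consumer could react to the same `DocumentReceived` without the producer changing.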

Tim Berglund:

So, having advised Simon 2017, coming back to Simon 2021, having been through all of those things, what would you now say if you could summarize it in a list of a few things that you advocate for as some fundamentals of event-driven architecture?

Simon Aubury:

Well, I'd probably break this down into advocating for particular technical patterns and particular sort of training and uplift patterns. From a technical perspective, obviously, I love and have confidence with Kafka and the Kafka ecosystem, so if I'm given an opportunity to use Kafka and the schema registry and all of the great things on top of that platform, I'd definitely advocate for building an EDA on top of Kafka just because you've got a really, really great way of doing stream processing. It's obviously got all of the observability, flexibility, and scalability you would want from an event-driven platform.

Simon Aubury:

When we're actually talking about the high-level implementation, again, the ground rules that you actually want to put in place for a large enterprise, I'd consider it's kind of important to just set some really, really basic ground rules. Events always in the past tense. Something has happened, a customer has arrived, an order has been taken. All of the events should be named in the past tense just to really solidify the fact that these are events and not messages.

Simon Aubury:

I'd also advocate very much for stamping an event with a correlation ID the moment that an event enters a system. You want to put a [inaudible 00:38:26] on it or just something unique so as it traverses the system, you can always help with your observability platform later on by having a tracking [inaudible 00:38:37]. Dates, dates and times, again, really, really important. Disks are cheap, so the more just header information you can add, the better: things like event production time and event creation time, which may potentially be different, and the originating system. Stamping all of these into the payload just as a good citizen helps with being able to track and build reliable and observable systems down the track.
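The stamping described here might look like the following minimal Python sketch. The `stamp_event` helper and every field name in the envelope are hypothetical. So is the reading of "creation time" as when the thing happened and "production time" as when the event was published. A UUID is used as one possible unique identifier.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def stamp_event(event_type: str, payload: dict, source: str,
                created_at: datetime,
                correlation_id: Optional[str] = None) -> dict:
    """Wrap a payload in a hypothetical envelope carrying tracking metadata."""
    return {
        "event_type": event_type,  # past tense, e.g. "OrderTaken"
        # Assign the ID once, when the event enters the system, then reuse it.
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "originating_system": source,
        # Assumed meaning: when the thing happened in the source system.
        "event_created_at": created_at.isoformat(),
        # Assumed meaning: when this record was published.
        "event_produced_at": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    }


created = datetime(2021, 3, 17, 9, 30, tzinfo=timezone.utc)
order_taken = stamp_event("OrderTaken", {"order_id": "o-1"}, "web-shop", created)

# A downstream event caused by the first one keeps the same correlation ID.
order_shipped = stamp_event("OrderShipped", {"order_id": "o-1"}, "warehouse",
                            datetime.now(timezone.utc),
                            correlation_id=order_taken["correlation_id"])
```

Carrying the same `correlation_id` on every event derived from the original is what lets an observability tool stitch the journey back together later.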

Simon Aubury:

Don't be tempted to break up these events too early. Having coarse-grained events that later on you choose to break down is a much easier path than trying to optimize too early in a process and getting too micro with the size of the events. It's much easier to break events down into their components or discard events that you're not particularly interested in rather than trying to recreate a whole bunch of very, very low-level events.
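To illustrate why the coarse-grained direction is the easier one, here is a minimal Python sketch in which a downstream consumer derives finer events from a single coarse event. The `OrderTaken` shape, the `OrderLineTaken` name, and the `split_order_taken` helper are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterator


def split_order_taken(event: Dict) -> Iterator[Dict]:
    """Derive one hypothetical 'OrderLineTaken' event per line of a coarse event."""
    for line in event["lines"]:
        yield {
            "event_type": "OrderLineTaken",
            "order_id": event["order_id"],
            "sku": line["sku"],
            "quantity": line["quantity"],
        }


# One coarse-grained event that carries the whole order.
order_taken = {
    "event_type": "OrderTaken",
    "order_id": "o-1",
    "lines": [{"sku": "A", "quantity": 2}, {"sku": "B", "quantity": 1}],
}

# Splitting is a simple, stateless projection of the coarse event.
line_events = list(split_order_taken(order_taken))
```

Going the other way, rebuilding `OrderTaken` from a stream of line-level events, would need the consumer to buffer lines and know when an order is complete, which is the harder problem the advice steers away from.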

Simon Aubury:

From the people side, I also learned that it's just important to anticipate that there's sometimes going to be a bit of an uplift when we're talking about introducing event-driven architecture and streaming platforms into an organization. Most people are really, really enthusiastic to embrace technology, but do need to be guided and do essentially just need to know where there's watchpoints in both learning and discovery.

Simon Aubury:

If I look at myself, I've made a number of mistakes and I've seen those mistakes come out in other areas, so just the distinction between event-driven versus messaging, size of events, bounded context, all of these things are lessons that we can share with one another, so hopefully my learnings and my mistakes aren't relearnt by another generation. That training and uplift and just having a bit of a willingness to just understand the context or the important architecture principles before you get too involved in an EDA deployment.

Tim Berglund:

My guest today has been Simon Aubury. Simon, thanks for being a part of Streaming Audio.

Simon Aubury:

Tim, it's been a pleasure. Thanks very much for your time.

Tim Berglund:

Hey, you know what you get for listening to the end? Some free Confluent cloud. Use the promo code 60PDCAST, that's 6-0-P-D-C-A-S-T, to get an additional $60 of free Confluent cloud usage. Be sure to activate it by December 31st, 2021, and use it within 90 days after activation. Any unused promo value after the expiration date is forfeit. There are a limited number of codes available, so don't miss out.

Tim Berglund:

Anyway, as always, I hope this podcast was useful to you. If you want to discuss it or ask a question, you can always reach out to me on Twitter at @tlberglund. That's T-L-B-E-R-G-L-U-N-D. Or, you can leave a comment on a YouTube video or reach out on Community Slack or on the community forum. There are sign-up links for those things in the show notes if you'd like to sign up.

Tim Berglund:

While you're at it, please subscribe to our YouTube channel and to this podcast, wherever fine podcasts are sold. If you subscribe through Apple Podcasts, be sure to leave us a review there. That helps other people discover it, especially if it's a five-star review, and we think that's a good thing. Thanks for your support, and we'll see you next time.

Event-driven architecture has taken on numerous meanings over the years—from event notification to event-carried state transfer, to event sourcing, and CQRS. Why has event-driven programming become so popular, and why is it such a topic of interest?

For the first time, Simon Aubury (Principal Data Engineer, ThoughtWorks) joins Tim Berglund on the Streaming Audio podcast to tell all, including his own experiences adopting event-driven technologies and common blunders when working in this area.

Simon admits that he’s made some mistakes and learned some valuable lessons that can benefit others. Among these are accidentally building a message bus, the idea that messages are not events, realizing that getting too fixated on the size of a microservice is the wrong problem, the importance of understanding events and boundaries, defining choreography vs. orchestration, and dealing with passive-aggressive events.

This brings Simon to where he is today, as he advocates for Apache Kafka® as a foundation for building a scalable, event-driven architecture and data-intensive applications.

Continue Listening

Episode 149 | March 24, 2021 | 50 min

Smooth Scaling and Uninterrupted Processing with Apache Kafka ft. Sophie Blee-Goldman

Availability in Kafka Streams is hard, especially in the face of any changes. Apache Kafka Committer and Kafka Streams developer Sophie Blee-Goldman shares about how to solve the stop-the-world rebalance and scaling out problem in Kafka Streams using probing rebalances.

Episode 150 | March 31, 2021 | 30 min

Building Real-Time Data Pipelines with Microsoft Azure, Databricks, and Confluent

Processing data in real time is a process, as some might say. Angela Chu (Solution Architect, Databricks) and Caio Moreno (Senior Cloud Solution Architect, Microsoft) explain how to integrate Azure, Databricks, and Confluent to build real-time data pipelines that enable you to ingest data, perform analytics, and extract insights from data at hand.

Episode 151 | April 7, 2021 | 24 min

Resurrecting In-Sync Replicas with Automatic Observer Promotion ft. Anna McDonald

As most developers and architects know, data always needs to be accessible no matter what happens outside of the system. This week, Tim Berglund virtually sits down with Anna McDonald (Principal Customer Success Technical Architect, Confluent) to discuss how Automatic Observer Promotion (AOP) can help solve the Apache Kafka 2.5 datacenter dilemma, a feature now available in Confluent Platform 6.1 and above.
