course: Event Sourcing and Event Storage with Apache Kafka®

Why Store Events?

4 min

Anna McDonald

Principal Customer Success Technical Architect (Presenter)

Ben Stopford

Lead Technologist, Office of the CTO (Author)

Why Store Events

In the previous module, you learned how event sourcing differs from traditional forms of data management since the source of truth is an immutable event log rather than a mutable table. But why would you consider implementing event sourcing in your system? There are three primary benefits of event sourcing: it's evidentiary, it's recoverable, and it's insightful.

Event Sourcing is Evidentiary

Unlike a database table where rows are updated with new values, events simply accumulate in an event log, providing the perfect evidentiary basis for a system. This is similar to the way that accountants perform double-entry bookkeeping, a method where no numbers are changed, ever. Instead, entries are always appended to the ledger. (You may have heard the old adage, "Accountants don't use erasers.") Accountants work this way because it's evidentiary: If a calculation goes wrong for whatever reason, they can always go back and figure out why.

Since it's append only, event sourcing is similar. You can look back in time at the event log and figure out what really happened or why things went wrong. This is a huge advantage when trying to figure out why a problem occurred or why a result has incorrect figures.
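To make the ledger analogy concrete, here is a minimal Python sketch of an append-only log. The names LedgerEvent, EventLog, and balance, and the "deposited" and "withdrawn" event types, are hypothetical and chosen only for illustration. They are not part of Kafka or of any library used in this course.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)  # frozen: an event cannot be modified once created
class LedgerEvent:
    kind: str    # "deposited" or "withdrawn" (hypothetical event types)
    amount: int  # amount in cents


class EventLog:
    """Append-only log: entries can be added and read, never changed."""

    def __init__(self) -> None:
        self._events: List[LedgerEvent] = []

    def append(self, event: LedgerEvent) -> None:
        self._events.append(event)

    def read(self, upto: Optional[int] = None) -> Tuple[LedgerEvent, ...]:
        # Return an immutable copy of the first `upto` events (all if None).
        return tuple(self._events[:upto])


def balance(events) -> int:
    """Derive the balance by folding over the events in order."""
    total = 0
    for e in events:
        total += e.amount if e.kind == "deposited" else -e.amount
    return total


log = EventLog()
log.append(LedgerEvent("deposited", 10_000))
log.append(LedgerEvent("withdrawn", 2_500))
log.append(LedgerEvent("withdrawn", 9_000))  # the entry that overdraws the account

current = balance(log.read())       # balance after all three events
before_last = balance(log.read(2))  # balance as it stood after two events
```

Because nothing is overwritten, the balance at any earlier point can be recomputed from a prefix of the log. That is what lets you trace a surprising figure back to the entry that caused it.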

Events are Recoverable

The second advantage of event sourcing is recovery through replayability, which is particularly important for data systems. Implementing a fix for a standard bug, like a formatting failure on a web application form, is generally a straightforward process: You change the code to fix the bug, and you ship. But data-related problems are often not so easy. If your service performs a computation such as calculating interest on an account and there is a bug in the computation, fixing the software likely isn't enough. There will be a significant number of accounts whose data has been corrupted as a result of the bug. Fortunately, with an event-based model, the problem is simple to fix: First fix the bug, then rewind back to a point before the bug surfaced, and replay the old events. Both the software and its resulting data are repaired in one go.
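The fix-then-replay sequence can be sketched in a few lines of Python. This is an illustrative model only: the AccountEvent type, the "interest_due" event, the 1% rate, and the in-memory list standing in for the event log are all assumptions, and a real system would rewind a consumer position in the log rather than loop over a list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class AccountEvent:
    account: str
    kind: str        # "deposited" or "interest_due" (hypothetical event types)
    amount: int = 0  # cents; unused for "interest_due"


def buggy_interest(balance: int) -> int:
    return balance * 10 // 100  # bug: applies 10% instead of the intended 1%


def fixed_interest(balance: int) -> int:
    return balance * 1 // 100   # intended 1% rate (assumed for this sketch)


def replay(events: Iterable[AccountEvent],
           interest: Callable[[int], int]) -> Dict[str, int]:
    """Rebuild every account balance from scratch by replaying the log."""
    balances: Dict[str, int] = {}
    for e in events:
        current = balances.get(e.account, 0)
        if e.kind == "deposited":
            balances[e.account] = current + e.amount
        elif e.kind == "interest_due":
            balances[e.account] = current + interest(current)
    return balances


log = [
    AccountEvent("alice", "deposited", 100_000),
    AccountEvent("bob", "deposited", 50_000),
    AccountEvent("alice", "interest_due"),
    AccountEvent("bob", "interest_due"),
]

corrupted = replay(log, buggy_interest)  # state produced by the buggy release
repaired = replay(log, fixed_interest)   # same events, corrected code
```

The log records inputs ("interest is due") rather than the computed result. That is why replaying the same events through the corrected function repairs the derived balances without any manual data patching.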

Event Data is Insightful

The final advantage of event sourcing comes from the detailed, event-level data it collects: This data can be put to great use in analytics systems, whether for machine learning or for other types of analysis. Returning to the shopping cart example from the previous module, using events to represent the cart gives you an accurate, truthful record of the user's entire journey: exactly what they added to the cart and what they removed from it. This lets you solve useful problems that would be difficult to address otherwise. For example, you can use the data to figure out why people aren't buying much in your shop at a particular time or within a given category. This is in stark contrast to what a CRUD data model is capable of: simply representing the end state.
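The difference between the end state and the full journey can be shown with a small Python sketch. The event tuples, item names, and helper functions below are hypothetical and exist only to illustrate the contrast.

```python
from collections import Counter

# Hypothetical cart events for one shopping session, in the order they occurred.
cart_events = [
    ("add", "shoes"),
    ("add", "socks"),
    ("add", "hat"),
    ("remove", "shoes"),
    ("remove", "hat"),
    ("add", "hat"),
]


def final_cart(events):
    """What a CRUD model would store: only the items left at the end."""
    cart = Counter()
    for action, item in events:
        cart[item] += 1 if action == "add" else -1
    return sorted(item for item, qty in cart.items() if qty > 0)


def removal_counts(events):
    """An analysis that needs the events: how often each item was removed."""
    return Counter(item for action, item in events if action == "remove")


end_state = final_cart(cart_events)
removed = removal_counts(cart_events)
```

Here end_state contains only the hat and the socks, so a table holding just the final cart could not reveal that the shoes were considered and then abandoned. The removal counts, derived from the same events, can.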


Why Store Events?

In module two, we discussed the basics of event sourcing. But you might wonder, "We've been building systems with traditional databases for decades. Why should we change?" Storing data as events rather than mutable tables has three main advantages, which we're gonna cover in this module.

The first is that events are evidentiary. So what does that mean exactly? One reason is that events are immutable. They never change. So unlike a database table where we update different rows with new values, events simply accumulate in an event log. So we can feel safe in the knowledge that they can never, ever change. This provides the perfect evidentiary basis for a system, one that allows you to look back at what happened at a previous time. This approach is quite similar to the way accountants do double-entry bookkeeping, a method where no numbers are changed, ever. Instead, entries are always appended to the ledger. This is where the old adage, "Accountants don't use erasers," comes from. Accountants do this because it is evidentiary. If a calculation goes wrong for whatever reason, they can always go back and figure out why, because all the data for the previous steps remains present. Being append only, event sourcing is similar. We can look back in time at the event log and figure out what really happened or why things went wrong. This is a huge advantage when trying to figure out why a problem occurred or why some result has incorrect figures.

The second advantage of event sourcing is replayability. And this is of particular importance for data systems. To understand this, first consider the typical bug, say, a formatting failure in one of the input forms of a web application. This kind of bug is usually pretty easy to fix. You change the code, fix the bug, and ship it. But data-related problems are often not so easy to fix. If your service does a computation, for example, calculating interest on an account, and there is a bug in the computation, fixing and releasing the software likely isn't enough. There will be a whole number of accounts whose data has been corrupted as a result of the bug. But with an event-based model, the problem is simple to fix. First, fix the bug. Then rewind back to a point before the bug surfaced and replay the old events. Thus both the software and the resulting data are fixed in one go.

The final advantage comes from collecting such detailed event-level data that it can be fed into an analytics system, whether for machine learning or for other types of analysis. To return to the shopping cart we used as an example in module two, using events to represent the cart gives an accurate, truthful record of a user's behavior. Just like the chess game from module one, we're tracking the whole game, not just the end state. This lets us solve useful problems that would be really hard to solve otherwise. For example, figuring out why people aren't buying that much in our shop at a particular time or within a given category. This sort of analysis is possible with the event-based model because the user's shopping behavior has been captured, i.e., exactly what they did: add to the shopping cart, remove from the shopping cart, et cetera. This is in stark contrast to what a CRUD data model is capable of, simply representing the end state.

So, storing data as events comes with significant benefits. But the context thus far has been simple, monolithic applications. What happens when our architectures grow larger? What happens when we incorporate event streaming as a storage medium? We'll find all this out in module four when we discuss event sourcing with Kafka.
