course: Event Sourcing and Event Storage with Apache Kafka®

Storing Data as Events

5 min

Anna McDonald

Principal Customer Success Technical Architect (Presenter)

Ben Stopford

Lead Technologist, Office of the CTO (Author)

This module introduces event sourcing and shows how it shapes the events you are collecting in your application.

Classic CRUD: Create, Read, Update, and Delete

Imagine an e-commerce application. A user, Sanjana, has added a t-shirt, pants, and a hat to her shopping cart. The cart is stored in a database table, which looks exactly like the cart at this particular moment: it has rows and columns, and it holds exactly 1 pair of pants, 1 t-shirt, and 1 hat:

[Figure: pair of pants]

Adding a completely new item would add a new row to the screen and a new row to the database table. But if Sanjana adds more than one of an already-listed item, the item count simply increases on the screen and is saved to the database. This is a classic CRUD approach: you can create, read, update, or delete any item in your table. (You will find this quite familiar if you've used databases in the past.)
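As a minimal sketch of the CRUD approach described above, the following uses Python's built-in sqlite3 module with an in-memory database. The table name `cart`, its columns, and the helper functions are hypothetical, chosen only for illustration; the course does not prescribe a schema.

```python
import sqlite3

# Hypothetical schema: one row per distinct item, mutated in place.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cart (item TEXT PRIMARY KEY, quantity INTEGER NOT NULL)")


def add_item(conn, item):
    """Update the count if the item is already listed, otherwise create a row."""
    cur = conn.execute(
        "UPDATE cart SET quantity = quantity + 1 WHERE item = ?", (item,)
    )
    if cur.rowcount == 0:  # no existing row was updated, so this is a new item
        conn.execute("INSERT INTO cart (item, quantity) VALUES (?, 1)", (item,))


def delete_item(conn, item):
    """Delete the row entirely; the table keeps no trace that it existed."""
    conn.execute("DELETE FROM cart WHERE item = ?", (item,))


def read_cart(conn):
    """Read the current contents of the cart as {item: quantity}."""
    return dict(conn.execute("SELECT item, quantity FROM cart").fetchall())


for item in ["t-shirt", "pants", "hat", "pants"]:
    add_item(conn, item)

cart = read_cart(conn)
print(cart)
```

The second "pants" overwrites the previous quantity. The table holds only the latest value, with no record of when or how it changed.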

Event Sourcing: Create and Read Only

When event sourcing is applied, the same goal is achieved—the shopping cart is safely stored to disk—but the process is quite different. As with CRUD, you can create and read values, but unlike CRUD, you never update a value and you never delete a value; these two destructive operations are simply not allowed with event sourcing.

[Figure: Event Sourcing]

Every action a user makes is preserved forever. So adding a t-shirt to the shopping cart is one event, adding another is a second event, and removing one is a third event. Checking out is also an event. As the events accumulate, they create a timeline of the user's activity, a kind of customer journey showing exactly what the customer did. In fact, this lack of destructive operations makes event sourcing systems more like the version control systems in which you store code.
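One way to picture this is an append-only log of immutable records. The sketch below is illustrative only: the `CartEvent` and `EventLog` names, the field names, and the action strings are assumptions rather than anything defined by the course.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)  # frozen: an event cannot be changed once created
class CartEvent:
    customer_id: str
    action: str  # hypothetical action names: "add", "remove", "checkout"
    item: Optional[str]
    occurred_at: datetime


class EventLog:
    """Append-only log: it offers create and read, but no update or delete."""

    def __init__(self):
        self._events = []

    def record(self, customer_id, action, item=None):
        event = CartEvent(customer_id, action, item, datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def read(self):
        # Return a tuple so callers cannot reorder or drop stored events.
        return tuple(self._events)


log = EventLog()
log.record("sanjana", "add", "t-shirt")
log.record("sanjana", "add", "t-shirt")
log.record("sanjana", "remove", "t-shirt")
log.record("sanjana", "checkout")

# Attempting to rewrite history fails because the dataclass is frozen.
try:
    log.read()[0].action = "remove"
except FrozenInstanceError:
    mutation_rejected = True
else:
    mutation_rejected = False
```

Here the log holds four events in the order they happened, which is the timeline the paragraph above describes.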

Getting Current State

A stream of events by itself, however, doesn't reflect current state. Say you need to determine how many pairs of pants are in the shopping cart right now. As you can see in the diagram below, that information is spread over multiple events—two "add" events and one "remove" event, each created at different times. Event sourcing systems solve this problem by reading all of the events into the client and then performing a computation that derives the current state. So getting the current pants count would involve a chronological reduce, i.e., the image on the left would be transformed into the one on the right, letting you display the current contents of the cart.

[Figure: Getting Current State]
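The chronological reduce can be sketched with `functools.reduce`. The event tuples, their ordering, and the `apply_event` and `current_state` names are assumptions made for illustration; the only details taken from the text are that the pants count is spread over two "add" events and one "remove" event.

```python
from functools import reduce

# Hypothetical events as (sequence number, action, item).
events = [
    (1, "add", "t-shirt"),
    (2, "add", "pants"),
    (3, "add", "pants"),
    (4, "add", "hat"),
    (5, "remove", "pants"),
]


def apply_event(cart, event):
    """Fold one event into the cart and return the new cart state."""
    _, action, item = event
    updated = dict(cart)  # copy, so earlier states are never mutated
    delta = 1 if action == "add" else -1
    updated[item] = updated.get(item, 0) + delta
    if updated[item] <= 0:
        del updated[item]  # an item with no remaining quantity has no row
    return updated


def current_state(events):
    """Chronological reduce: order by sequence number, then fold left to right."""
    ordered = sorted(events, key=lambda e: e[0])
    return reduce(apply_event, ordered, {})


cart = current_state(events)
print(cart["pants"])
```

Under these assumed events, five events collapse into three rows, and the two adds and one remove for pants net out to a count of one.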

Event Sourcing Preserves Everything

Notice how the event-based view on the left side of the diagram above has quite a bit more information than the table view on the right. The transformation is unidirectional: you can go from the events view to the table view, but not vice versa, because information is lost as you move from left to right.

This is exactly why event sourcing is so powerful: it retains extra data about what has happened in the world, data that the CRUD approach simply throws away.
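To see why the transformation is one-way, consider two different histories that reduce to the same table. This is a small sketch with made-up event tuples and a hypothetical `reduce_to_table` helper.

```python
def reduce_to_table(events):
    """Collapse (action, item) events, given in order, into {item: quantity}."""
    table = {}
    for action, item in events:
        delta = 1 if action == "add" else -1
        table[item] = table.get(item, 0) + delta
        if table[item] <= 0:
            del table[item]
    return table


# Two different customer journeys...
history_a = [("add", "pants"), ("add", "pants"), ("remove", "pants")]
history_b = [("add", "pants")]

# ...that produce an identical table view.
table_a = reduce_to_table(history_a)
table_b = reduce_to_table(history_b)
print(table_a == table_b)
```

Given only the resulting table, there is no way to tell which history produced it. The fact that the customer added a second pair of pants and then removed it survives only in the event view.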

Event Sourcing in Practice

In practice, event sourcing for our cart example works as follows:

1. Events are stored in a table in a database, or alternatively in Apache Kafka; basically, you create a table (a topic, in Kafka's case) and append events in the order that they occur.
2. When you need the latest state of the cart, you run a query that returns the relevant events, most likely grouped by customer ID or session ID.
3. You perform a chronological reduce over the events that are relevant for the view you'd like to serve, so a stream of five events is turned into the three rows of a more traditional database table:

[Figure: Event Sourcing in Practice]
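The three steps above can be sketched end to end using the database option, again with Python's built-in sqlite3 module. The `cart_events` table, its columns, the customer IDs, and the helper names are all hypothetical. A Kafka topic could play the role of the table, but that variant is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Step 1: an append-only table; seq records the order in which events occur.
conn.execute(
    """
    CREATE TABLE cart_events (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id TEXT NOT NULL,
        action TEXT NOT NULL,
        item TEXT NOT NULL
    )
    """
)


def append_event(conn, customer_id, action, item):
    """Only INSERT is ever issued against this table; never UPDATE or DELETE."""
    conn.execute(
        "INSERT INTO cart_events (customer_id, action, item) VALUES (?, ?, ?)",
        (customer_id, action, item),
    )


def load_cart(conn, customer_id):
    # Step 2: query the events for one customer, in the order they occurred.
    rows = conn.execute(
        "SELECT action, item FROM cart_events WHERE customer_id = ? ORDER BY seq",
        (customer_id,),
    ).fetchall()
    # Step 3: chronological reduce into a table-like view {item: quantity}.
    cart = {}
    for action, item in rows:
        delta = 1 if action == "add" else -1
        cart[item] = cart.get(item, 0) + delta
        if cart[item] <= 0:
            del cart[item]
    return cart


append_event(conn, "sanjana", "add", "t-shirt")
append_event(conn, "sanjana", "add", "pants")
append_event(conn, "other-customer", "add", "socks")
append_event(conn, "sanjana", "add", "pants")
append_event(conn, "sanjana", "add", "hat")
append_event(conn, "sanjana", "remove", "pants")

cart = load_cart(conn, "sanjana")
print(cart)
```

In this sketch, the five events recorded for one customer reduce to three rows, while the event belonging to the other customer is filtered out by the query.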

Storing Data as Events

Welcome to module two of the Event Sourcing course. In this module, we're gonna cover what event sourcing is.

Let's start with a simple example. Imagine we have an e-commerce application. A user, we'll call her Sanjana, has added a few items to her shopping cart: a T-shirt, some pants, and a hat. The shopping cart is stored in a database table, and that table looks just like the shopping cart does. The table has rows and columns stored in the database that exactly match the rows displayed on the screen when a user clicks their cart. Adding a new item adds a new row to the screen and a new row to the database table. If you add more than one of an item, the item count goes up on the screen and is saved to the database. This approach is known as CRUD, as you can create, read, update, or delete any row in the table you wish. You should find this very familiar if you've used databases before.

When we apply event sourcing, we achieve the same goal, storing the shopping cart safely to disk, but in a different way. Like the CRUD method, we can create new entries and read them back, but we never update a value and we never delete a value. The two destructive operations, the methods that delete or change data in the table, are not allowed. The lack of any destructive operations makes event sourcing systems behave more like the version control systems you store code in.

The lack of destructive methods means event sourcing records every action a user makes as it happens and stores that action forever. Each action is an event, and the collection of actions that builds up over time is called an event stream. So adding a new T-shirt to the shopping cart is an event, adding another is another event, checking out is an event. As the events build up, they create a timeline of the user's activity. We can see the user adding items, removing items, and finally, checking out at different points in time in the event stream, a kind of customer journey tracking exactly what the customer did.

But the stream of events isn't a great resource for reading the current state from. Say we wanna know how many pants are in the shopping cart. That information is spread over multiple events: two add events and one remove event, all created at different times. Event sourcing systems solve this problem by reading all the events into the client and performing a computation that derives the current state. In this case, it's a chronological reduce. By transforming the event view on the left into the table view on the right, we can then display our shopping cart easily.

Finally, note how the event-based view on the left-hand side of the image has more information than the table view on the right. The transformation is unidirectional. So you can go from the events view to the table view, but not back again, as information is lost as we move left to right. This is why event sourcing is so powerful. It retains extra data about what really happened in the world that would traditionally, if we're using the CRUD approach, be thrown away.

Let's walk through how we actually do this in practice. First, we store the events into a table in a database, or alternatively, you can use Kafka. Basically, you create a table and append events in the order that they occur. When you need to query for the latest shopping cart, you run a query that returns events, most likely aggregated by customer ID or session ID. And then you perform a chronological reduce to filter the events that are relevant for the view you'd like to serve. So the stream of five events is turned into the three rows of the more traditional database table.

So we got to see how event sourcing is different from traditional forms of data management, because the source of truth is an event log rather than a table you can mutate. This may seem a little odd, though. Why go to all that trouble? In the next module, we'll explain why storing data in an event-centric form like this is so valuable.
