Staff Technologist, Office of the CTO (Presenter)
A fact event details the entire scope of what happened at a specific point in time. It forms a complete recording of the occurrence, much like a photograph. It contains all of the fields and values necessary to completely describe the fact in the context of your business. You can also think of a fact event similarly to how you may think of a row in a database: a complete set of data pertaining to the row at that point in time.
Aside from the unique cart_id, the example fact event also contains an item_map, which is a collection of item IDs mapped to the item quantity. Inside this item_map is 1 item of ID 521 and 3 items of ID 923. Finally, the fact also contains the shipping method currently chosen for the shopping cart.
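As a rough sketch, the cart fact described above could be written as follows in Python. The `CartFact` class name, the `shipping` field name, the `"cart-1234"` ID, and the `"standard"` value are assumptions for illustration. The text only names `cart_id`, `item_map`, and a shipping method.

```python
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class CartFact:
    """The complete cart state at one point in time."""
    cart_id: str              # unique cart identifier
    item_map: Dict[int, int]  # item ID -> quantity
    shipping: str             # field name is an assumption


# Item quantities mirror the example in the text; cart_id and "standard" are made up.
cart_fact = CartFact(cart_id="cart-1234", item_map={521: 1, 923: 3}, shipping="standard")
print(asdict(cart_fact))
```

Every field is present in every fact, whether or not it changed since the previous one.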
A delta event details the changes between one state and another. The contents of a delta event can vary quite widely, but often include information about the fields that have changed, the new values for those fields, and possibly the reason for the change. Delta events do not include information about data that hasn’t changed.
The name of the item_added_to_cart delta event implies the reason for the event: an item was added. The contents of the event include the cart_id, the item_id to be added, and the quantity of items added during that action.
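A minimal sketch of such a delta event, assuming a Python dataclass named `ItemAddedToCart` (the class name and the sample values are illustrative, not taken from the source):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemAddedToCart:
    """Describes one change to a cart, not the cart itself."""
    cart_id: str   # which cart changed (hypothetical ID below)
    item_id: int   # which item was added
    quantity: int  # how many were added in this action


delta = ItemAddedToCart(cart_id="cart-1234", item_id=923, quantity=3)
print(delta)
```

Note that nothing in this event says what else is in the cart or which shipping method is selected. A consumer that needs those has to get them from elsewhere.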
You may be wondering, why send all the data in a fact event, even if it hasn’t changed? Why not simply send the delta?
One of the biggest reasons for using fact events is that you do not have to compute any of the state of that fact yourself. You simply subscribe to the topic that contains those facts and receive fully computed state as the data comes in. As a consumer, you do not have to build up the state yourself from multiple delta event types, which can be risky and error-prone, especially as data schemas evolve and change over time. Instead, you rely on the team that owns that section of the business to compute and produce a fully detailed fact event.
Fact events enable a pattern known as “event-carried state transfer,” which is one of the best ways to asynchronously distribute immutable state to all consumers who need it.
This module covers the properties of fact, delta, and composite event types, and discusses the scenarios for which they’re best and worst suited. Additionally, we provide you with some examples of each, as well as some usage recommendations and some tips for avoiding pitfalls.
While the shopping cart facts seem relatively simple, and it is plausible to compose the current state from delta events, let’s consider a more complex scenario where you want to build up an employee’s annual tax return based on a number of forms, documents, and capital gains.
If the tax team emits delta events for every tax change, you’d end up with a vast number of events detailing complex changes. It would be very hard for you to piece these all back together, in the correct order, with the correct business logic, to obtain the total tax results. In contrast, if the team simply emits the completed tax results as a single tax fact, you can easily consume it into your system without having to compute anything on your end.
Measurements from sensors and other Internet of Things (IoT) devices can also be represented as a stream of facts. Each measurement provides a complete picture of the state at the specific point in time: a weather sensor indicates the temperature, humidity, pressure, wind speed, and solar radiation readings. You can use these events to derive temperature trends, track humidity percentages, and detect changing weather patterns.
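To illustrate, here is a small sketch of deriving a temperature trend from a stream of weather facts. The field names (`temperature_c`, `humidity_pct`), the readings, and the windowing approach are all assumptions made for this example.

```python
from statistics import mean

# Each reading is a self-contained fact; field names and values are hypothetical.
readings = [
    {"sensor_id": "ws-1", "temperature_c": 18.0, "humidity_pct": 61},
    {"sensor_id": "ws-1", "temperature_c": 19.0, "humidity_pct": 60},
    {"sensor_id": "ws-1", "temperature_c": 21.0, "humidity_pct": 57},
    {"sensor_id": "ws-1", "temperature_c": 22.0, "humidity_pct": 55},
]


def temperature_trend(facts, window=2):
    """Average of the newest `window` readings minus the average of the oldest."""
    temps = [fact["temperature_c"] for fact in facts]
    return mean(temps[-window:]) - mean(temps[:window])


trend = temperature_trend(readings)
print("warming" if trend > 0 else "cooling or steady", trend)
```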
User behavior events, such as those collected by using Google Analytics, can also be represented as facts. Signups, searches, clicks, views, and other engagements can provide a complete view of what happened at the moment of the behavior.
Fact events can be modeled to include both a full copy of the state before a change and a full copy of the state after the change has occurred. This model is commonly found with change-data-capture services, but can be made part of any fact-based event model. In this example, the cart fact has been augmented to contain both the state of the cart before the change, and the state of the cart after the change. Note that three new items were added to the item_map, and that the shipping selection was also updated.
Before and after fields provide you with a complete picture of what data changed, giving you the ability to react to simple changes without having to store any local state. However, a major downside of this approach is that it effectively doubles the amount of data transmitted over the wire, and incurs extra network, storage, and processing costs as a trade-off.
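The following sketch shows how a consumer might react to a before/after fact without keeping any local state. The `before` and `after` key names follow the wording above, while the `shipping` field name, the cart ID, and the sample values are assumptions.

```python
# A single fact carrying both states (values are illustrative).
cart_change = {
    "before": {"cart_id": "cart-1234", "item_map": {521: 1}, "shipping": "standard"},
    "after": {"cart_id": "cart-1234", "item_map": {521: 1, 923: 3}, "shipping": "express"},
}


def changed_fields(event):
    """Return {field: (old, new)} for every field that differs between the two states."""
    before, after = event["before"], event["after"]
    return {
        field: (before.get(field), after.get(field))
        for field in sorted(before.keys() | after.keys())
        if before.get(field) != after.get(field)
    }


changes = changed_fields(cart_change)
print(changes)
```

The event is roughly twice the size of a plain fact, which is the trade-off described above.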
We can also use fact events to infer a change, though it does require that we have two states to compare.
In this case, the first fact contains just one item in the shopping cart. The second fact shows the addition of 3 more items, as well as changing the shipping to “express”.
If the consumer wants to infer what changed from Fact 1 to Fact 2, they will need to maintain a copy of local state and infer the changes between Fact 1 and Fact 2. While this does incur state maintenance costs, the consumer service has a full copy of each state and can infer changes to any field, providing exceptional choice and flexibility.
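One way a consumer could maintain that local state is sketched below. The `CartChangeDetector` class, the in-memory dictionary used as a cache, and the sample facts are assumptions. A real consumer would likely use a durable state store.

```python
class CartChangeDetector:
    """Keeps the last fact seen per cart and reports what changed in each new fact."""

    def __init__(self):
        self._last_seen = {}  # cart_id -> most recent fact (the local state)

    def on_fact(self, fact):
        previous = self._last_seen.get(fact["cart_id"])
        self._last_seen[fact["cart_id"]] = fact
        if previous is None:
            return {}  # first fact for this cart, nothing to compare against
        return {
            field: (previous.get(field), fact.get(field))
            for field in fact
            if previous.get(field) != fact.get(field)
        }


detector = CartChangeDetector()
fact_1 = {"cart_id": "cart-1234", "item_map": {521: 1}, "shipping": "standard"}
fact_2 = {"cart_id": "cart-1234", "item_map": {521: 1, 923: 3}, "shipping": "express"}

first = detector.on_fact(fact_1)
second = detector.on_fact(fact_2)
print(second)
```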
There are a few things to consider when creating fact events.
First, the size of the event. Fact events could be an order of magnitude larger than delta events, especially if you decide to include the before and after components. For certain datasets, the sheer size of data may make it untenable to contain within a single fact.
Second, the frequency of change needs to be accounted for. A fact event that is updated very frequently will have many instances created. If the event size is large, it could have a substantial impact on the producer, event broker, and consumer. If you decide to pack multiple changes into a single fact event, you need to consider if it remains timely enough for the business to make use of. Finding the right balance can be challenging.
Third, fact events can lose the intent. When you emit a fact you’re basically saying, “Here is the state of the object right now,” but you do not communicate the business reason why the change occurred. While you can still infer what fields changed by comparing a new fact event with a previous fact, the business “why” is not explicitly communicated.
Fact events remain a very useful way to communicate the current state of affairs to consumers, but ensure you have a good understanding of these trade-offs and determine if they’re acceptable to your use cases.
Delta events are defined as the changes that occur within a system, exposed as an event on the outside.
The left side of this illustration shows a sequence of moves in a game of chess—delta events. Each move indicates the piece that moved, along with the source and destination location. In contrast, the total state of the board is shown on the right with each subsequent move—synonymous with fact events.
We have already introduced the basic add and remove events for our shopping cart example. But what about some of the other operations that we might do in the cart domain?
This is a discount code event, produced whenever a user applies a discount code to their shopping cart. This event would be useful for other applications that care about when a discount code is applied to listen and react to codes being applied—but not when a code is removed. That would be yet another event.
Delta events are good at capturing the intent behind the creation of the event, and provide context for the event’s consumers.
A cart fact models the portion of the cart state that you want to publicly expose to other consumer services in our system.
With delta modeling, you can instead select specific state transitions and build a model that describes the change—not the state.
You can also use fact events to infer a change—though it does require that you have two states to compare.
In this case, the first fact contains the state before the discount is applied, where both the discount_code and discount_cost fields are null. The service then applies the discount code to produce a fully updated cart fact with the current state. Note that discount_cost is now 42.39, and the discount_code has also been populated.
Unless the fact events contain both a before and after, any consumer that wants to react to changes based on facts will need to maintain a copy of local state and infer the changes between fact 1 and fact 2. While this does incur state maintenance costs, the consumer service has a full copy of each state and can infer changes to any field, providing exceptional choice and flexibility.
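As an illustration, a consumer could derive its own "discount applied" signal from two consecutive cart facts. The function name, the shape of the returned dictionary, and the `"SAVE10"` code are hypothetical. Only `discount_code`, `discount_cost`, and the value 42.39 come from the example above.

```python
from typing import Optional


def infer_discount_applied(previous: dict, current: dict) -> Optional[dict]:
    """Return a consumer-side delta if a discount code appeared between two facts."""
    if previous.get("discount_code") is None and current.get("discount_code") is not None:
        return {
            "cart_id": current["cart_id"],
            "discount_code": current["discount_code"],
            "discount_cost": current["discount_cost"],
        }
    return None  # no discount was newly applied


fact_1 = {"cart_id": "cart-1234", "discount_code": None, "discount_cost": None}
fact_2 = {"cart_id": "cart-1234", "discount_code": "SAVE10", "discount_cost": 42.39}

applied = infer_discount_applied(fact_1, fact_2)
print(applied)
```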
Delta events are very well suited for exposing specific transitions in a system to downstream subscribers. Delta events provide a partial set of information that is usable for the consumers to act upon with their own business logic.
While there are many different ways to model your delta events, there are several risks to using them, and purposes for which they are not well suited at all. Let’s take a look at these.
Let’s start by looking at how delta events are commonly used. Event sourcing is one of the main use cases for delta events—instead of modifying state directly, you issue a “change” to an event stream. Then, you take all of those changes and apply them in the same order they occurred, to build up your current state.
Let’s look at an event sourcing shopping cart example.
On the left is a sequence of cart events detailing additions, removals, and eventually user checkout, shown in descending order. The producer system creates the events and writes them to a Kafka topic. These events are then read and applied in the same order they were written to build up the current state. The right-hand side shows the current consumer state, built up by applying each of the delta events in the precise order that they were written.
You should avoid sharing event sourcing events outside of your service as they are heavily intertwined with the service’s implementation details.
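The replay described above can be sketched as a fold over the event list. This is a simplified illustration that reads from an in-memory list instead of a Kafka topic. The `item_removed_from_cart` and `cart_checked_out` event names and the state shape are assumptions.

```python
from functools import reduce

EMPTY_CART = {"items": {}, "checked_out": False}


def apply_event(cart, event):
    """Apply one delta event to the cart state and return the new state."""
    items = dict(cart["items"])  # copy so earlier states are not mutated
    checked_out = cart["checked_out"]
    kind = event["type"]
    if kind == "item_added_to_cart":
        items[event["item_id"]] = items.get(event["item_id"], 0) + event["quantity"]
    elif kind == "item_removed_from_cart":
        remaining = items.get(event["item_id"], 0) - event["quantity"]
        if remaining > 0:
            items[event["item_id"]] = remaining
        else:
            items.pop(event["item_id"], None)
    elif kind == "cart_checked_out":
        checked_out = True
    else:
        # An unhandled event type silently corrupts state, so fail loudly instead.
        raise ValueError(f"unknown event type: {kind}")
    return {"items": items, "checked_out": checked_out}


events = [
    {"type": "item_added_to_cart", "item_id": 521, "quantity": 1},
    {"type": "item_added_to_cart", "item_id": 923, "quantity": 3},
    {"type": "item_removed_from_cart", "item_id": 923, "quantity": 1},
    {"type": "cart_checked_out"},
]

# Order matters: events are applied exactly as they were written.
current_state = reduce(apply_event, events, EMPTY_CART)
print(current_state)
```

Every consumer that wants the cart state from these deltas would need its own copy of `apply_event`, kept in sync with the producer's.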
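Multiple independent consumers who want to obtain a current copy of the cart state instead have to compute it all on their own, duplicating cart building logic across multiple systems. Doing so is very risky for several reasons: the cart composition logic may change over time, requiring synchronized updates to multiple consumers; a consumer may compute the cart state incorrectly, which can go unnoticed and cause errors that are difficult to track down; and a consumer may fail to include an event type in its computations, whether through human error or because a new event type was created to meet changing business requirements.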
While you can use delta events to build a subset or customized view of state unique to the downstream service, many consumers can simply rely on fact-based event-carried state transfer instead. Delta types still remain great for communicating when a change has occurred, but they’re overall a poor choice for communicating the current state of a system.
With fact events, the cart building logic is kept entirely within the producer service. It owns all of the logic for composing the cart, as well as the schema and the definitions of each field and value. If a consumer wants to know what’s going on in a shopping cart, they simply listen to the fact events detailing the current state of the cart.
All of the cart building logic lives inside of the single producer service. It may be using event sourcing internally, but that is an implementation detail about what’s going on inside its boundary: its data on the inside. In contrast, the cart fact stream is published specifically as data on the outside. Consumers can infer their own deltas from the stream of facts by maintaining their own cache of the fields they care about.
The first risk is event model sprawl.
This issue tends to arise more commonly with fact events, since they’re supposed to detail the entire state of an entity at a point in time. Your consumers may request that you add more information to the fact, especially in the context of denormalizing data and joining it with data from other domains.
In this sample dialogue we have a consumer requesting extra data about the total cost of items in the cart. It may be reasonably trivial to include the pricing information, especially since it is readily available when the item is added to the cart. However, it may be much more difficult to include other data, such as item-specific properties, inventory levels, and estimated shipping time, as these pieces of data are likely the responsibility of another service.
In this case, our producer chose to expand the cart fact to include the cost_of_items because that data is readily available and the request is easily fulfilled. Meanwhile, UPC codes, inventory levels, and estimated shipping times are not sources of data that this service deals with or owns. To obtain the data, we would need to query another service or ingest another event stream, simply to do the work on behalf of the consumer. Instead, we reject this request and push the work back down to the consumer.
As a general rule for event design, feel free to add any data that your service owns to the event, but be careful about adding in data that your service doesn’t own or doesn’t use.
Another risk involves custom-triggered events, in most cases pertaining to delta events. While fact events are triggered whenever a field in the event is updated, delta events are triggered whenever specific business logic is met. For example, item_added_to_cart is triggered when an item is added to the cart.
However, you may run into consumers requesting custom notifications for when the conditions of increasingly complex scenarios are met. In this case, the requesting consumer wants to know when a sale item was added to the cart, but only if it’s clothing and only if it’s worth more than $50.
There’s a problem with this.
On the left is the producer application, in the middle is an event stream of deltas, and on the right is the consumer. The dotted lines represent the consumer-specific business logic.
One option is to put the business logic into the producer application and use it to generate an event. However, this logic is entirely specific to the consumer’s needs, yet it does not live inside of the consumer’s codebase. It lives in the producer’s code! Meanwhile, reactive business work lives entirely inside the consumer’s code.
If the consumer wants to update the trigger logic at a later time (say, clothing must only cost $100 or more now), they must file a ticket to have the logic changed at the producer instead of simply changing the logic in their application. This results in extremely tight coupling, where business logic is distributed amongst many systems and changes become very hard to make.
Instead, put the consumer business logic where it belongs, inside the consumer. The consumer is solely responsible for building up their own state to evaluate the business logic, using either facts or deltas as they see fit. The consumer retains full control of their business logic, and the producer is not on the hook for producing any fine-grained events.
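A minimal sketch of keeping that trigger logic in the consumer is shown below. It assumes the consumer holds its own view of item data (here a hypothetical `item_catalog` dictionary, which might be built from an item fact stream) and reacts to the generic item_added_to_cart delta. All names and values are illustrative.

```python
# Consumer-local view of item data; contents are hypothetical.
item_catalog = {
    923: {"category": "clothing", "on_sale": True, "price": 79.99},
    521: {"category": "electronics", "on_sale": True, "price": 120.00},
}


def should_notify(delta, catalog, min_price=50.00):
    """Consumer-owned rule: a sale item, in clothing, worth more than min_price."""
    item = catalog.get(delta["item_id"])
    if item is None:
        return False  # unknown item, nothing to evaluate
    return item["on_sale"] and item["category"] == "clothing" and item["price"] > min_price


delta = {"cart_id": "cart-1234", "item_id": 923, "quantity": 3}
print(should_notify(delta, item_catalog))
```

Raising the threshold to $100 is then a one-argument change in the consumer (`min_price=100.00`), with no ticket filed against the producer.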
People commonly ask if it’s possible to build a fact event with the reason behind why it was created. The answer to this is yes, and the event type is known as a composite event. It’s helpful to think of these as a combination of both fact and delta events.
In this example, our cart fact includes an item_added_to_cart reason.
You can also create composites that have before and after fields. This composite is exactly the same event as the previous one, but it shows the state both before the change occurred, and after the change occurred.
As an alternative, you can build your composite events as a delta with some state.
This example shows an item_added_to_cart delta that also contains the current cart state, computed after the results of the event have been applied.
This requires the producer to maintain a current model of the state, but reduces the need for consumers to build up their own state based on deltas.
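The two composite shapes described above might look like the following sketch. The `reason` and `current_cart` field names are assumptions here, as are the cart ID, the shipping value, and the `describe` helper.

```python
# Composite as a fact plus a reason.
cart_fact_with_reason = {
    "cart_id": "cart-1234",
    "item_map": {521: 1, 923: 3},
    "shipping": "express",
    "reason": "item_added_to_cart",
}

# Composite as a delta plus the state computed after the change was applied.
item_added_with_state = {
    "cart_id": "cart-1234",
    "item_id": 923,
    "quantity": 3,
    "current_cart": {"item_map": {521: 1, 923: 3}, "shipping": "express"},
}


def describe(event):
    """React differently depending on the reason carried by the fact."""
    if event.get("reason") == "item_added_to_cart":
        total = sum(event["item_map"].values())
        return f"cart now holds {total} items after an addition"
    return "cart updated"


print(describe(cart_fact_with_reason))
```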
In practice, composite events aren’t particularly common, but they can be useful on occasion and are a good option to keep in your toolkit.
So we’ve looked at fact, delta, and composite events. But when does each work best, and when should you use one over the other?
Overall, facts provide the best choice for sharing state between systems. The producer retains all of the complexity of building up any current state, while the consumer benefits from the simplicity of up-to-date state transfer without any computation on its own part.
In contrast, simple delta events can provide “hooks” for consumers to react to certain conditions. Consumers remain responsible for their own business logic for more complex scenarios, and should rely on composing their own state using fact events. Keep deltas simple so that they can be reused and built upon by other applications.
Hi, I'm Adam from Confluent. In this module, we're gonna be talking about modeling as facts versus delta event types. A fact event details the entire scope of what happened at a specific point in time. It contains all of the fields and values necessary to completely describe the fact in the context of your business. You can also think of a fact event similar to how you may think of a row in a database, a complete set of data pertaining to that row at that point in time. The cart event on one side contains a unique cart_id as well as an item_map, which is a collection of item IDs mapped to the item quantity. So inside this item map, we have one item of ID 521, and three items of ID 923. Finally, the fact also contains the shipping method currently chosen for the shopping cart. On the other side, we have a delta event, item_added_to_cart. Delta events detail the changes between one state and another. The contents of a delta event can vary quite widely, but typically include information about the state change, including the reasons for the change. Delta events do not include information about data that hasn't changed. The name of the item_added_to_cart event gives a pretty clear reason for its existence. The contents of the event include the cart_id, the item_id to be added, and the quantity of items added during that specific action. Now, you may be wondering, "Why send all the data in a fact event if it hasn't changed? Why not simply send the delta?" One of the biggest reasons for using fact events is their simplicity and effectiveness in transferring state. This pattern is known as event-carried state transfer, which is one of the best ways to asynchronously distribute immutable state to all consumers who need it. As a consumer, you do not have to build up the state yourself from multiple delta event types. This can be risky and error-prone, especially as data schemas evolve and change over time.
Instead, you rely on the team that owns that section of that business to compute and produce a fully detailed fact event that acts as a data-transfer object. While the shopping cart facts seem relatively simple and it's plausible to compose the current state via delta events, let's consider a more complex scenario where we want to build up the employee's annual tax return based on a number of forms, documents, and capital gains. If the tax team emits delta events for every single tax change, you'd end up with a vast number of events detailing very complex changes. It would be very hard for you to piece all of these back together in the correct order with the correct business logic to obtain the total tax results. In contrast, if the team simply emits the completed tax results as a single tax fact, you can easily consume it into your system without having to compute anything on your end. Measurements from sensors and other Internet of Things devices can also be represented as a stream of facts. Each measurement provides a complete picture of the state at that specific point in time. A weather sensor may indicate the temperature and pressure at a given point in time. Consumers use these events to derive temperature trends and to detect changing weather patterns. Fact events can be modeled to include both a full copy of the state before a change, and a full copy of the state after a change has occurred. This model is commonly found with change data capture services, but can be made part of any fact-based event model. In this example, the cart fact shows what adding three new items to the item_map may look like. You may also notice that the shipping selection was updated, all in a single event. Before and after fields provide a complete picture of what data changed, providing consumers the ability to react to simple changes without having to store any local state.
However, a major trade-off of this approach is that it effectively doubles the amount of data transmitted over the wire, and incurs extra network, storage, and processing costs. We can also use fact events to infer a change, though it does require that we have two states to compare. In this case, the first fact contains just one item in the shopping cart. The second fact shows the addition of three more items, as well as changing the shipping to express. If the consumer wants to infer what changed from fact one to fact two, they will need to maintain a copy of local state, and infer the changes between these two facts themselves. While this does incur state maintenance costs, the consumer service has a full copy of each state, and it can infer any changes to any fields, providing them exceptional flexibility. There are a few things to consider though when creating fact events. First, the size of the event. Fact events could be an order of magnitude larger than delta events, especially if you decide to include both the before and after components. For certain data sets, the sheer size of data may make it untenable to contain within a single fact. Secondly, the frequency of change needs to be accounted for. A fact event that is updated very frequently will have very many instances created. If the event size is large, it could have a substantial impact on the producer, the event broker, and the consumers. If you decide to pack multiple changes into a single fact event, you'll need to consider if it remains timely enough for the business and the consumers to make use of. Finding the right balance can be challenging. Third, fact events can lose the intent. When we emit a fact, we're basically saying "Here's the state of the object right now," but we do not communicate the business reason why that event occurred. While you can still infer what fields changed by comparing a new fact with a previous fact, the business "why" is not explicitly communicated.
Fact events remain a very useful way to communicate the current state of affairs to your consumers, but you must ensure that you have a good understanding of these trade-offs, and determine if they're acceptable to your use cases. And now we're going to look at the other main event type, the delta event. Delta events are defined as the changes that occur within a system, exposed as an event on the outside. One side shows a sequence of moves in a game of chess. These are delta events. Each move indicates the piece that moved, along with the source and destination location. In contrast, the total state of the board is shown on the other side with each subsequent move. This is synonymous with fact events. We have already introduced the item_added_to_cart event, but what about another operation that we might do in the cart domain? This is a discount code event, produced whenever a user applies a discount code. This event is useful for other applications that want to react when a discount code is applied. But you'd be correct to guess that we'd also need a discount code removed event to indicate when that code is no longer in effect. Delta events are good at capturing the intent behind the creation of the event, and provide context for the event's consumers. A cart fact models the portion of the cart state that we want to publicly expose to other consumer services in our organization. With delta modeling, we instead select specific state transitions, and build a model that describes the change, not the state. Delta events are very well-suited for exposing specific transitions in a system to downstream subscribers. Delta events provide a partial set of information that is usable for the consumers to act upon with their own business logic. Let's start by looking at how delta events are commonly used. Event sourcing is one of the main use cases for delta events, typically used for data on the inside.
Instead of modifying state directly, you issue a change to an event stream. Then, you take all of those changes, and apply them in the same order they occurred to build up your current state, just like the chessboard example we just saw. Let's look at an event sourcing shopping cart example. On one side, we have a sequence of cart events detailing additions, removals, and, eventually, user checkouts shown in a descending order. The producer system creates the events and writes them to a Kafka topic. These events are then read and applied in the same order that they were written to build up the current state. The other side shows the current consumer state, built up by applying each of the delta events in the precise order that they were written. You should avoid sharing event sourcing events outside of your service, as they are heavily intertwined with the service's implementation details. Multiple independent consumers who want to obtain a current copy of the cart state would have to then compute it all on their own, duplicating cart-building logic across multiple systems. Doing this is very risky for several reasons. The cart composition logic may change over time, requiring synchronized updates to multiple consumers. The consumer may fail to compute the cart state correctly, which may go unnoticed and cause difficult-to-track-down errors. And the consumer application may fail to include an event type in their computations. This may be due to a human error, but may also be due to the creation of a new event type due to changing business requirements. While you can use delta events to build a subset or a customized view of state unique to the downstream service, many consumers can simply rely on fact-based event-carried state transfer instead. Delta types still remain great for communicating when a change has occurred, but they're overall a poor choice for communicating the current state of a system.
With fact events, the cart-building logic is kept entirely within the producer service. It owns all of the logic for composing the cart, as well as the schema, and the definitions of each field and value. If a consumer wants to know what's going on in a shopping cart, they simply listen to the fact events detailing the current state of the cart. All of the cart-building logic lives inside of the single producer service. Now, it may be using event sourcing internally, but that is an implementation detail about what's going on inside its boundary. This is data on the inside. In contrast, the cart fact stream is published specifically as data on the outside. Consumers can infer their own deltas from the stream of facts by maintaining their own cache about the fields that they care about. Both facts and deltas can provide data to event stream consumers, but it's important that you explicitly think about the data leaving your system boundaries as data on the outside, separating it from the data on the inside. Your events act as data-transfer objects that are purpose-built for transferring information to other teams and systems. While it's important to consider your consumers' use cases when creating facts and deltas, you should remain cautious about overly specific events, and overly broad data models. Let's take a look at these two risks a bit more closely. The first risk is event model sprawl. This issue tends to arise more commonly with fact events, since they're supposed to detail the entire state of an entity at a point in time. Your consumers may request that you add more information to the fact, especially in the context of denormalizing data, and joining it with data from other domains. In this sample dialogue, we have a consumer requesting extra data about the total cost of items in the cart. It may be reasonably trivial to include the pricing information, especially since it's readily available when the item is added to the cart.
However, it may be much more difficult to include other data such as item-specific properties, inventory levels, and estimated shipping time, as these pieces of data are likely the responsibility of another service. In this case, our producer chose to expand the cart fact to include the cost of items, because that data is readily available, and the request is easily fulfilled. Meanwhile, UPC codes, inventory levels, and estimated shipping times are not sources of data that the service deals with or owns. To obtain the data, we would need to query another service, or ingest another event stream, simply to do the work on behalf of the consumer. Instead, we reject this request, and push the work back down to the consumer. As a general rule for event design, feel free to add any data that your service owns to the event, but be careful about adding in data that your service doesn't own or does not use. Another risk involves custom-triggered events, in most cases pertaining to delta events. While fact events may be triggered whenever a field in the event is updated, delta events are triggered whenever specific business logic conditions are met. For example, item_added_to_cart is triggered when an item is added to the cart. However, you may run into consumers requesting custom notifications for when the conditions of increasingly complex scenarios are met. In this case, the requesting consumer wants to know when a sale item was added to the cart, but only if it's clothing, and only if it's worth more than $50. There's a problem with this. On one side is the producer application, in the middle is an event stream of deltas, and on the other side is the consumer. These dotted lines represent the consumer-specific business logic. One option is to put the business logic into the producer application, and use it to generate an event.
However, this logic is entirely specific to the consumer's needs, but it does not live inside the consumer's code base, it lives in the producer's code. Meanwhile, reactive business work lives entirely inside of the consumer's code. If the consumer wants to update the trigger logic at a later time, say that clothing must now cost a hundred dollars or more, then they must file a ticket to have the logic changed at the producer, instead of simply changing the logic in their own application. This results in extremely tight coupling, where business logic is distributed amongst many systems, and changes become very hard to make. Instead, put the consumer business logic where it belongs, inside the consumer. The consumer is solely responsible for building up their own state to evaluate the business logic, using either facts or deltas as they see fit. The consumer retains full control of their business logic, and the producer is not on the hook for producing any fine-grained events. One other question that we commonly get is "Is it possible to build a fact event with some sort of reason as to why it was created?" The answer to this is yes, and the event type is known as a composite event. It's helpful to think of these as both a combination of fact and delta. In this example, our cart fact includes an item_added_to_cart reason. You can also create composites that have before and after fields. This composite is the same before and after event we saw earlier in this module, but now it contains a reason field as well. Your consumers can choose to react in different ways depending on the reason for the change. As an alternative, you can build your composite event as a delta with some state. This example shows an item_added_to_cart delta that also contains the current_cart state, computed after the results of the event have been applied. 
This requires the producer to maintain a current model of the state, but reduces the need for consumers to build up their own state based on deltas. In practice, composite events aren't particularly common, but they can be useful on occasion, and are a good option to keep in your toolkit. So far, we've looked at facts, deltas, and composites, but when does each work best, and when should we use one over the other? Overall, facts provide the best choice for sharing state between systems. The producer retains all of the complexity of building up any current state, while the consumer benefits from the simplicity of up-to-date state transfer without any computation on its own part. In contrast, delta events can provide hooks for consumers to react to certain conditions. Consumers remain responsible for inferring more complex deltas, though. While it can be possible to communicate state via deltas, it generally remains a better idea to use fact events for this purpose.