Course: Designing Event-Driven Microservices

The Transactional Outbox Pattern

6 min
Wade Waldron

Staff Software Practice Lead

Overview

The transactional outbox pattern leverages database transactions to update a microservice's state and an outbox table atomically. Events in the outbox are then sent to an external messaging platform such as Apache Kafka. This technique overcomes the dual-write problem, which occurs when you have to write data to two separate systems, such as a database and Apache Kafka. Because the state and the outbox live in the same database, a single transaction can ensure atomic writes across both tables. From there, a separate process can consume the outbox and update the external system as required. This process can be written manually, or we can leverage tools such as Change Data Capture (CDC) or Kafka connectors.

Topics:

  • What is the dual-write problem?
  • What is the transactional outbox pattern?
  • How can we emit events to Apache Kafka?
  • What tools can be used to process an outbox?
  • What delivery guarantees does the outbox pattern provide?
  • What are the problems with the outbox pattern?


The Transactional Outbox Pattern

Hi, I'm Wade from Confluent.

When we emit events from a microservice, we have to solve the problem of how to update the database and emit the events in a transactional fashion.

One of the easiest ways to do this is using the Transactional Outbox Pattern.

Let's take a moment and see how this works.

The dual-write problem occurs whenever we need to update two separate systems.

For example, we might want to update the database and also write events to a separate system such as Apache Kafka.

Because these two systems aren't linked, we have no way to update both in a transactional fashion.

We need to find another way to ensure that either both get updated, or neither does.

This is where the Transactional Outbox pattern can help.

If we have a database that supports transactional updates, then we can use it to overcome the dual-write problem.

Rather than trying to update the database and Kafka at the same time, we push the transactional logic into the database.

Whenever we do an update in our database, we can also update an outbox table in the same transaction.
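
To make this concrete, here is a minimal sketch using Python's built-in sqlite3 module. The orders and outbox tables, the column names, and the place_order function are illustrative assumptions rather than a prescribed schema; the point is that both inserts happen inside a single transaction.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect("app.db")
conn.execute("CREATE TABLE IF NOT EXISTS orders (id TEXT PRIMARY KEY, status TEXT)")
conn.execute("CREATE TABLE IF NOT EXISTS outbox (id TEXT PRIMARY KEY, topic TEXT, payload TEXT)")

def place_order(order_id: str) -> None:
    # One transaction updates the service's state AND the outbox,
    # so either both writes succeed or neither does.
    with conn:  # sqlite3 commits on success, rolls back on exception
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, ?)",
            (order_id, "PLACED"),
        )
        conn.execute(
            "INSERT INTO outbox (id, topic, payload) VALUES (?, ?, ?)",
            (str(uuid.uuid4()), "orders",
             json.dumps({"orderId": order_id, "status": "PLACED"})),
        )

place_order("order-123")
```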

Think of the outbox as a mailbox.

We fill it with letters that need to be delivered.

We then wait for a mail carrier to pick up the letters and deliver them to the post office.

In our case, the letters are the events we want to send to Kafka.

Kafka acts as the post office.

But we still need something to fill the role of the mail carrier.

So what can we use to emit the events from the outbox table into Apache Kafka?

In this case, we can use a separate process that asynchronously monitors the table.

Whenever it sees a new event, it can deliver it to Kafka.

Once the event has successfully been delivered, it can be removed from the outbox table.

The process is often written as another thread within the original microservice.

However, it could run as a completely separate application.
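
As a rough sketch of that mail-carrier process, the loop below polls the outbox, delivers each event, and deletes it only after delivery succeeds. Here, deliver_to_kafka is a hypothetical stand-in for a real producer call (for example, confluent_kafka's Producer.produce followed by flush), and the polling interval and query are assumptions.

```python
import time

def deliver_to_kafka(topic: str, payload: str) -> None:
    # Hypothetical stand-in for a real producer call; assume it
    # raises an exception if delivery to Kafka fails.
    print(f"delivered to {topic}: {payload}")

def relay_outbox(conn, poll_interval: float = 1.0) -> None:
    # The "mail carrier": poll the outbox, deliver, then delete.
    while True:
        rows = conn.execute(
            "SELECT id, topic, payload FROM outbox ORDER BY rowid"
        ).fetchall()
        for event_id, topic, payload in rows:
            deliver_to_kafka(topic, payload)
            # Delete only AFTER successful delivery. A crash between
            # delivery and delete means the event is sent again on the
            # next poll, which is what makes the guarantee at-least-once.
            with conn:
                conn.execute("DELETE FROM outbox WHERE id = ?", (event_id,))
        time.sleep(poll_interval)
```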

Depending on the database you are using, you might be able to use a Kafka Connector or Change Data Capture system to monitor the table and emit the events.

The advantage of the transactional outbox pattern is it avoids the dual-write problem.

The state, and the outbox table, will always be updated in a transactional fashion.

If, for some reason, the state fails to update, then the event won't be written to the outbox.

This means we can guarantee that the data in the outbox is completely in sync with the data in the database.

From there, the separate process that delivers the events to Kafka ensures that the outbox table and Kafka remain in sync.

This allows us to guarantee that every database operation will have a corresponding event in Kafka, although there will be a bit of a delay.

However, when the process is emitting events to Kafka, it's possible we will experience a failure or timeout.

In these situations, to guarantee that Kafka receives the data, we'll have to try again.

These retries can result in duplicate messages.

Our delivery guarantee to Kafka is therefore at-least-once.

We guarantee every message in the outbox will eventually arrive in Kafka, but it may arrive more than once.

As a result, we need to make sure that downstream systems are prepared to handle any duplicates.

At-least-once guarantees are common in distributed systems and therefore, it is a good practice to implement deduplication logic to handle them, even when the dual-write problem isn't involved.

For example, the receiver might fail while processing a Kafka message, and when it restarts, it might receive the same message again.

We have to be prepared to handle those situations.
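
A minimal way to handle those duplicates is an idempotent consumer that records the IDs of events it has already processed. The sketch below keeps the IDs in memory for brevity; a real implementation would persist them, ideally in the same transaction as the consumer's own state change. All names here are assumptions.

```python
processed_ids = set()  # demo only; a real system would persist these,
                       # ideally alongside its own state updates

def handle_event(event_id: str, payload: str) -> None:
    if event_id in processed_ids:
        return  # duplicate delivery; already handled, safe to skip
    # ... apply the event to the local state here ...
    processed_ids.add(event_id)
```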

Now, we have to be cautious when using the transactional outbox pattern.

The outbox will potentially receive an insert for every database write.

This can result in a lot of traffic.

The frequent updates mean the database might keep the table in memory at all times, occupying a lot of resources.

Meanwhile, some databases aren't very efficient at handling deletes.

They might use tombstones under the hood, and with so many inserts and deletes happening all the time, those tombstones can build up.

This can lead to heavy resource usage and contention in our table.

If the database isn't equipped for this kind of traffic, it might slow down our application because, remember, every write is going to be touching that outbox table.

To resolve these issues, we might need to make adjustments, such as marking records rather than deleting them, or adjusting how the database manages tombstones.

There can be long-term benefits to keeping the events around, so a delete may not be strictly necessary.
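
One way to sketch that adjustment is to flag outbox rows as published instead of deleting them. This assumes the sqlite3 connection and outbox table from the earlier sketch; the published column is a hypothetical addition.

```python
# Illustrative schema change: add a 'published' flag to the outbox
# table from the earlier sketch (the column name is an assumption).
conn.execute("ALTER TABLE outbox ADD COLUMN published INTEGER DEFAULT 0")

def mark_published(conn, event_id: str) -> None:
    # Flag the row instead of deleting it: no delete tombstones,
    # and the event history stays available for audit or replay.
    with conn:
        conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (event_id,))

# The relay would then poll only unpublished rows:
#   SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY rowid
```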

Some databases may even have special tables that are specifically designed for this kind of traffic.

If your system meets the requirements for the transactional outbox pattern, it can be a straightforward and effective way of overcoming the dual-write problem.

It is often easier to manage than other options such as event sourcing or the listen-to-yourself pattern.

However, if you aren't using a transactional database, or if you have other reasons to avoid using an outbox, then these other patterns might be good options to consider.

When I first discovered the transactional outbox pattern, it seemed like a good solution to the dual-write problem.

What do you think?

Can you see yourself using this pattern in your applications? Is there an alternative pattern you prefer?

Let me know in the comments.

If you want a deeper dive into the dual-write problem, or other microservice concepts, check out our YouTube channel and the courses on Confluent Developer.

Remember to like, share, and subscribe.

And stay tuned for more.