Infinite Retention Event Stream

Many use cases require that the Events in an Event Stream be stored forever, so that the dataset is available in its entirety.

Problem

How can we ensure that events in a stream are retained forever?

Solution

The solution for infinite retention depends on the specific Event Streaming Platform. Some platforms support infinite retention "out of the box", requiring no action from end users. If an Event Streaming Platform does not support infinite storage, infinite retention can be partially achieved with the Event Sink Connector pattern, which offloads Events into permanent external storage.
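As a rough illustration of the Event Sink Connector idea, the sketch below appends events to a JSON Lines file that stands in for permanent external storage. The function names (`offload_events`, `load_archive`), the event shape, and the file-based archive are all assumptions made for illustration; a real deployment would more likely use a connector framework writing to an object store or database.

```python
import json
import os
import tempfile


def offload_events(events, archive_path):
    """Append each event to the archive as one JSON document per line.

    Hypothetical helper: the archive file stands in for external storage.
    """
    with open(archive_path, "a", encoding="utf-8") as archive:
        for event in events:
            archive.write(json.dumps(event) + "\n")


def load_archive(archive_path):
    """Read the full archived history back, oldest event first."""
    with open(archive_path, encoding="utf-8") as archive:
        return [json.loads(line) for line in archive if line.strip()]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "events.jsonl")
    # Two batches offloaded at different times end up in one history.
    offload_events([{"key": "c1", "value": "created"}], path)
    offload_events([{"key": "c1", "value": "updated"}], path)
    print(load_archive(path))
```

Because the archive lives outside the stream, consumers that need the full history have to read it from the external store rather than from the stream itself, which is one reason this approach only partially achieves infinite retention.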

Implementation

When using Confluent Cloud, infinite retention is built into the Event Streaming Platform (availability may be limited based on cluster type and cloud provider). Users of the platform can benefit from infinite storage without any changes to their client applications or operations.
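On Apache Kafka and Kafka-based platforms, retention is governed per topic by the `retention.ms` and `retention.bytes` settings, where a value of `-1` means no limit. The sketch below builds such a topic configuration as a plain dictionary and checks whether a given configuration amounts to infinite retention. The helper names are hypothetical, and whether a particular cluster accepts these values depends on the platform and cluster type.

```python
def infinite_retention_config():
    """Topic-level settings that disable both time- and size-based deletion."""
    return {
        "cleanup.policy": "delete",  # plain retention, no compaction
        "retention.ms": "-1",        # -1 = no time limit
        "retention.bytes": "-1",     # -1 = no size limit
    }


def is_infinite_retention(config):
    """Return True if the topic config never deletes events.

    Missing keys fall back to Kafka's defaults: cleanup.policy=delete and
    retention.bytes=-1. retention.ms defaults to a finite value, so it
    must be set explicitly to -1.
    """
    return (
        config.get("cleanup.policy", "delete") == "delete"
        and config.get("retention.ms") == "-1"
        and config.get("retention.bytes", "-1") == "-1"
    )


if __name__ == "__main__":
    print(is_infinite_retention(infinite_retention_config()))
```

A dictionary like this can be supplied as the topic-level configuration when creating or altering a topic through an admin client or command-line tool.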

For on-premises deployments, Confluent Platform adds support for infinite retention by extending Apache Kafka with Tiered Storage. Tiered Storage separates the compute and storage layers, allowing the operator to scale either of them independently as needed. Newly arrived Events are considered "hot", but as time passes they become "colder" and migrate to more cost-effective external storage, such as an AWS S3 bucket. Because cloud-native object stores can effectively scale to infinite size, the Kafka cluster can act as the system of record for infinite Event Streams.
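The hot-to-cold migration described above can be pictured with a small in-memory model. This is a conceptual sketch only, not how Tiered Storage is implemented: the `TieredLog` class, its method names, and the `hot_retention_ms` parameter are invented for illustration, and the "cold" list merely stands in for an object store.

```python
class TieredLog:
    """Toy model of a log whose older events migrate to cheaper storage."""

    def __init__(self, hot_retention_ms):
        self.hot_retention_ms = hot_retention_ms
        self.hot = []   # recent events, kept on (simulated) local storage
        self.cold = []  # older events, kept in a (simulated) object store

    def append(self, timestamp_ms, event):
        # Assumes events arrive in non-decreasing timestamp order.
        self.hot.append((timestamp_ms, event))

    def migrate(self, now_ms):
        """Move events older than the hot window into cold storage."""
        cutoff = now_ms - self.hot_retention_ms
        still_hot = []
        for record in self.hot:
            if record[0] < cutoff:
                self.cold.append(record)
            else:
                still_hot.append(record)
        self.hot = still_hot

    def read_all(self):
        """Readers see one continuous stream, whichever tier holds the data."""
        return [event for _, event in self.cold + self.hot]


if __name__ == "__main__":
    log = TieredLog(hot_retention_ms=1000)
    log.append(0, "a")
    log.append(500, "b")
    log.append(2000, "c")
    log.migrate(now_ms=2000)
    print(log.cold, log.hot, log.read_all())
```

The point of the model is that migration changes where an event is stored, not whether it is retained: nothing is ever deleted, and reads still return the complete history in order.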

Considerations

Infinite Retention Event Streams are typically used to store entire datasets that will be used by many subscribers. For example, storing the canonical customer dataset in an Infinite Retention Event Stream makes it available to any other system, regardless of that system's database technology. The customer dataset can be easily imported or reimported as a whole.
Compacted Event Streams are often used as a form of Infinite Retention Event Stream. However, compacted streams are not infinite. Instead, they retain only the most recent Event for each key, meaning their contents match the dataset held in an equivalent CRUD database table.
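The difference between a compacted stream and a true infinite-retention stream can be shown with a few lines of code. The sketch below is a simplified model of compaction, assuming events are `(key, value)` pairs in log order; the `compact` function and the sample customer data are hypothetical.

```python
def compact(events):
    """Keep only the most recent event per key, preserving log order."""
    last_index = {}
    for index, (key, _value) in enumerate(events):
        last_index[key] = index  # later events overwrite earlier ones
    return [
        event
        for index, event in enumerate(events)
        if last_index[event[0]] == index
    ]


if __name__ == "__main__":
    full_history = [
        ("c1", "alice@old.example"),
        ("c2", "bob@example"),
        ("c1", "alice@new.example"),
    ]
    # Infinite retention keeps all three events; compaction keeps two.
    print(compact(full_history))
    # The compacted stream matches the current state of a CRUD table.
    print(dict(compact(full_history)))
```

The compacted result holds the current state of each customer, but the earlier value for `c1` is gone, so the history of changes cannot be replayed from it.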

References

The blog post Infinite Storage in Confluent describes the tiered storage approach in more detail.
An Event Sink Connector can be used to implement an Infinite Retention Event Stream by loading Events into permanent external storage.
