Event Chunking

Sometimes compression can reduce message size, but there are various use cases that entail large message payloads where compression may not be enough. Often these use cases are related to image, video, or audio processing: image recognition, video analytics, audio analytics, etc.

Problem

How do I handle use cases where the event payload is too large to move through the event streaming platform as a single event?

Solution

chunking

Instead of storing the entire event as a single event in the event streaming platform, break it into chunks (an approach called "chunking") so that the large event is sent across as multiple smaller events. The producer can do the chunking when writing events into the event streaming platform. Downstream clients consume the chunks and, when all the smaller chunks have been received, recombine ("unchunk") them to restore the original event.

Implementation

Use metadata to track each chunk so that they can be associated to their respective parent event:

Association between any given chunk and its parent event
The chunk’s position in the parent event
The total number of chunks of the parent event

Considerations

Chunking places additional burden on client applications. First, implementing the chunking and unchunking logic requires more application development. Second, the consumer application needs to be able to cache the chunks as it waits to receive all the smaller chunks that comprise the original event. This, in turn, can have implications on memory fragmentation and longer garbage collection (GC). Mitigating this depends on the programming language: in Java, for example, the JVM heap size and GC can be tuned.

Client applications that are not aware of the protocol used for chunking events may not be able to reconstruct the original event accurately.

References

To handle large events, an alternative approach that may be preferred is Claim Check

Confluent Cloud is a fully managed Apache Kafka service available on all three major clouds. Try it for free today.

Try it for free

NEWApache Flink® SQL

NEWApache Flink® Table API: Processing Data Streams in Java

NEWDesigning Event-Driven Microservices

NEWApache Flink® 101

NEWBuilding Flink® Apps in Java

NEWMastering Production Data Streaming Systems with Apache Kafka®

Kafka® 101

Kafka® Connect 101

Kafka Streams 101

Schema Registry 101

ksqlDB 101

Data Mesh 101

NEWApache Flink® SQL

NEWApache Flink® Table API: Processing Data Streams in Java

NEWDesigning Event-Driven Microservices

NEWApache Flink® 101

NEWBuilding Flink® Apps in Java

NEWMastering Production Data Streaming Systems with Apache Kafka®

Kafka® 101

Kafka® Connect 101

Kafka Streams 101

Schema Registry 101

ksqlDB 101

Data Mesh 101

Articles

Patterns

FAQs

Blog

NEWLearn More

Articles

Patterns

FAQs

Blog

NEWLearn More

Language Guides

Tutorials

Demos

Language Guides

Tutorials

Demos

Meetups & Events

Ask the Community

Community Catalysts

NEWCommunity Use Cases

Confluent Developer Newsletter

Data Streaming Awards

NEWCurrent 2024

Kafka Summit 2024 - Bangalore

Kafka Summit 2024 - London

Current 2023

Kafka Summit 2023

Meetups & Events

Ask the Community

Community Catalysts

NEWCommunity Use Cases

Confluent Developer Newsletter

Data Streaming Awards

NEWCurrent 2024

Kafka Summit 2024 - Bangalore

Kafka Summit 2024 - London

Current 2023

Kafka Summit 2023

NEWApache Flink® SQL

NEWApache Flink® Table API: Processing Data Streams in Java

NEWDesigning Event-Driven Microservices

NEWApache Flink® 101

NEWBuilding Flink® Apps in Java

NEWMastering Production Data Streaming Systems with Apache Kafka®

Kafka® 101

Kafka® Connect 101

Kafka Streams 101

Schema Registry 101

ksqlDB 101

Data Mesh 101

Articles

Patterns

FAQs

Blog

NEWLearn More

Language Guides