course: Apache Kafka® Internal Architecture

Configuring Durability, Availability, and Ordering Guarantees

9 min
Jun Rao

Jun Rao

Co-Founder, Confluent (Presenter)

Data Durability and Availability Guarantees

data-durability-availability-guarantees

One of the key components of Kafka’s durability and availability guarantees is replication. All data written to a partition will be replicated to N other replicas. A common value for N is 3, but it can be set to a higher value if needed. With N replicas, we can tolerate N-1 failures while maintaining durability and availability.

As mentioned in earlier modules, consumers only see committed records. Producers on the other hand have some choices as to when they receive acknowledgment for the success or failure of a produce request from the broker.

Producer acks = 0

producer-acks-0

The producer configuration, acks, directly affects the durability guarantees. And it also provides one of several points of trade-off between durability and latency. Setting acks=0, also known as the “fire and forget” mode, provides lower latency since the producer doesn’t wait for a response from the broker. But this setting provides no strong durability guarantee since the partition leader might never receive the data due to a transient connectivity issue or we could be going through a leader election.

Producer acks = 1

producer-acks-1

With acks=1, we have a little bit better durability, since we know the data was written to the leader replica, but we have a little higher latency since we are waiting for all the steps in the send request process which we saw in the Inside the Apache Kafka Broker module. We are also not taking full advantage of replication because we’re not waiting for the data to land in the follower replicas.

Producer acks = all

producer-acks-all

The highest level of durability comes with acks=all (or acks=-1), which is also the default. With this setting, the send request is not acknowledged until the data has been written to the leader replica and all of the follower replicas in the ISR (in-sync replica) list. Now we’re back in the situation where we could lose N-1 nodes and not lose any data. However, this will have higher latency as we are waiting for the replication process to complete.

Topic min.insync.replicas

topic-min-insync-replicas

The topic level configuration, min.insync.replicas, works along with the acks configuration to more effectively enforce durability guarantees. This setting tells the broker to not allow an event to be written to a topic unless there are N replicas in the ISR. Combined with acks=all, this ensures that any events that are received onto the topic will be stored in N replicas before the event send is acknowledged.

As an example, if we have a replication factor of 3 and min.insync.replicas set to 2 then we can tolerate one failure and still receive new events. If we lose two nodes, then the producer send requests would receive an exception informing the producer that there were not enough replicas. The producer could retry until there are enough replicas, or bubble the exception up. In either case, no data is lost.

Producer Idempotence

producer-idempotency

Kafka also has ordering guarantees which are handled mainly by Kafka’s partitioning and the fact that partitions are append-only immutable logs. Events are written to a particular partition in the order they were sent, and consumers read those events in the same order. However, failures can cause duplicate events to be written to the partition which will throw off the ordering.

To prevent this, we can use the Idempotent Producer, which guarantees that duplicate events will not be sent in the case of a failure. To enable idempotence, we set enable.idempotence = true on the producer which is the default value as of Kafka 3.0. With this set, the producer tags each event with a producer ID and a sequence number. These values will be sent with the events and stored in the log. If the events are sent again because of a failure, those same identifiers will be included. If duplicate events are sent, the broker will see that the producer ID and sequence number already exist and will reject those events and return a DUP response to the client.

End-to-End Ordering Guarantee

end-to-end-ordering

Combining acks=all, producer idempotence, and keyed events results in a powerful end-to-end ordering guarantee. Events with a specific key will always land in a specific partition in the order they are sent, and consumers will always read them from that specific partition in that exact order.

In the next module we will look at transactions which take this to the next level.

Use the promo code INTERNALS101 to get $101 of free Confluent Cloud usage

Disagree? If you believe that any of these rules do not necessarily support our goal of serving the Apache Kafka community, feel free to reach out to your direct community contact in the group or community@confluent.io

Be the first to get updates and new content

We will only share developer content and updates, including notifications when new content is added. We will never send you sales emails. 🙂 By subscribing, you understand we will process your personal information in accordance with our Privacy Statement.