7

Suppose I am using the Kafka async producer and there are X messages in the buffer. When the client actually sends them, and the broker or a specific partition is down for some time, the Kafka client will retry. If a message ultimately fails, does the client mark that specific message as failed and move on to the next one (which could lead to out-of-order messages)? Or does it fail the remaining messages in the batch in order to preserve order?

I need to maintain ordering, so ideally I would want Kafka to fail the batch from the point where it failed, so that I can retry from the failure point. How would I achieve that?

Guruprasad GV
  • Kafka will only retry if you have changed the default setting of retries. From the Kafka docs: "Allowing retries will potentially change the ordering of records because if two records are sent to a single partition, and the first fails and is retried but the second succeeds, then the second record may appear first." – Hector Sep 03 '16 at 06:53

2 Answers

1

As the Kafka documentation says about retries:

Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Allowing retries will potentially change the ordering of records because if two records are sent to a single partition, and the first fails and is retried but the second succeeds, then the second record may appear first.

So, to answer the question in your title: no, Kafka does not guarantee ordering for async sends once retries are enabled.
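
To make the quoted scenario concrete, here is a toy model in plain Java. It is not Kafka code: the `deliver` method, the batch names, and the "window" of in-flight batches are all invented for illustration. It only mimics the documented effect, where a batch that fails and is retried lands behind a later batch that succeeded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RetryReorderDemo {
    /**
     * Toy model of one partition log (hypothetical, not the Kafka client).
     * Batches are sent in windows of maxInFlight. The batch named failsOnce
     * fails its first attempt and is retried after the rest of its window.
     */
    public static List<String> deliver(List<String> batches, String failsOnce, int maxInFlight) {
        List<String> log = new ArrayList<>();
        for (int start = 0; start < batches.size(); start += maxInFlight) {
            int end = Math.min(start + maxInFlight, batches.size());
            List<String> retries = new ArrayList<>();
            for (String batch : batches.subList(start, end)) {
                if (batch.equals(failsOnce)) {
                    retries.add(batch); // first attempt fails with a transient error
                } else {
                    log.add(batch);     // acknowledged on the first attempt
                }
            }
            log.addAll(retries);        // the retry is appended after the rest of the window
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> batches = Arrays.asList("A", "B", "C");
        System.out.println(deliver(batches, "A", 2)); // prints [B, A, C]
        System.out.println(deliver(batches, "A", 1)); // prints [A, B, C]
    }
}
```

With two batches in flight, the retried batch A ends up behind B. With a single batch in flight, the retry completes before B is sent, so the order is kept.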


I am updating the answer based on Peter Davis's question.

I think that if you want to send in batch mode, the only way to guarantee ordering is to set max.in.flight.requests.per.connection=1. As the documentation says:

Note that if this setting is set to be greater than 1 and there are failed sends, there is a risk of message re-ordering due to retries (i.e., if retries are enabled).
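
A minimal sketch of that configuration, written with plain `java.util.Properties` so it runs without the Kafka client on the classpath. The class name, the `build` helper, the broker address, and the `retries` value of 3 are assumptions for illustration; only the property names come from the producer configuration.

```java
import java.util.Properties;

public class StrictOrderingConfig {
    /** Builds producer properties that allow only one in-flight request per connection. */
    public static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // One request at a time: a retried batch cannot be overtaken by a later one.
        props.setProperty("max.in.flight.requests.per.connection", "1");
        // Illustrative value; retries are safe for ordering only because of the line above.
        props.setProperty("retries", "3");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        Properties props = build("localhost:9092");
        System.out.println(props.getProperty("max.in.flight.requests.per.connection"));
    }
}
```

These properties could then be passed to `new KafkaProducer<>(props)` from the kafka-clients library. Expect lower throughput, since the producer waits for each request to be acknowledged before sending the next one on that connection.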

Nautilus
  • This doesn't address the behavior when retries=0. If you do require ordering, but want to send in batch without calling Future.get() after each send, can I set retries=0 and expect that if the producer fails to send one message, then it will fail all subsequent messages to the partition to preserve ordering? – Peter Davis Feb 29 '16 at 17:04
  • You can rely on Kafka ordering as long as you set retries to zero and each of your topics has only one partition. You do not have to set max.in.flight.requests.per.connection=1, as you only have a single partition, so Kafka will only ever have 1 in-flight request; partitions are Kafka's mechanism for parallelism. – Hector Sep 03 '16 at 06:49
0

Starting with Kafka 0.11.0, there is the enable.idempotence setting. From the documentation:

enable.idempotence: When set to true, the producer will ensure that exactly one copy of each message is written in the stream. If false, producer retries due to broker failures, etc., may write duplicates of the retried message in the stream. Note that enabling idempotence requires max.in.flight.requests.per.connection to be less than or equal to 5, retries to be greater than 0 and acks must be all. If these values are not explicitly set by the user, suitable values will be chosen. If incompatible values are set, a ConfigException will be thrown.

Type: boolean Default: false

This will guarantee that messages are ordered and that no loss occurs for the duration of the producer session. Unfortunately, the producer cannot set the sequence id, so Kafka can make these guarantees only per producer session.

If you need to set the sequence id yourself, have a look at Apache Pulsar. It lets you use an external sequence id, which would guarantee ordered, exactly-once messaging across both broker and producer failovers.
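
Coming back to the Kafka setting, here is a minimal sketch of an idempotent producer configuration that satisfies the constraints quoted above (max.in.flight.requests.per.connection at most 5, retries greater than 0, acks set to all). It uses plain `java.util.Properties`; the class name, the `build` helper, and the broker address are assumptions for illustration.

```java
import java.util.Properties;

public class IdempotentProducerConfig {
    /** Builds producer properties with idempotence enabled and compatible settings. */
    public static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("enable.idempotence", "true");
        // The three settings below are the ones the quoted documentation requires.
        props.setProperty("acks", "all");
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        props.setProperty("max.in.flight.requests.per.connection", "5");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        Properties props = build("localhost:9092");
        System.out.println(props.getProperty("enable.idempotence"));
    }
}
```

According to the quoted documentation, leaving acks, retries, and max.in.flight.requests.per.connection unset also works, since suitable values are then chosen. Setting them explicitly mainly documents the intent and surfaces a ConfigException early if someone later changes one to an incompatible value.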

Evgeniy Berezovsky