Many articles tell me that Kafka writes data to the PageCache first, which improves write performance.

However, I have a doubt. With acks=-1 and a replication factor of 2, the data already exists in the PageCache of both nodes when the write is acknowledged.

If Kafka sends the ack at that point and both nodes then lose power or crash at the same time, neither node has persisted the data to disk yet.

Can data loss still occur in this extreme case?

Smokeriu

1 Answer

Data loss can occur in the situation outlined.

Related reading:

  • this other answer
  • Confluent blog post: "Since the log data is not flushed from the page cache to disk synchronously, Kafka relies on replication to multiple broker nodes, in order to provide durability. By default, the broker will not acknowledge the produce request until it has been replicated to other brokers."
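
To make the page cache point concrete, here is a minimal Python sketch of the difference between a write that only reaches the OS page cache and one that is forced to disk with fsync. It is not Kafka code: `append_record` and the file name `segment.log` are hypothetical names chosen for illustration, and the sketch assumes a POSIX-like OS where a plain write returns once the data is in the page cache.

```python
import os
import tempfile


def append_record(path, payload, sync):
    """Append payload to the file at path and return the number of bytes written.

    With sync=False the call returns once the OS has accepted the bytes,
    which normally means they sit in the page cache and would be lost if
    the machine lost power before the kernel wrote them back.
    With sync=True, os.fsync() asks the OS to push the data to the
    storage device before returning.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        written = os.write(fd, payload)
        if sync:
            os.fsync(fd)
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical stand-in for a log segment file.
    log_path = os.path.join(tempfile.mkdtemp(), "segment.log")
    append_record(log_path, b"record-1\n", sync=False)  # fast, not yet durable
    append_record(log_path, b"record-2\n", sync=True)   # slower, flushed to disk
```

An acknowledged Kafka write in the scenario from the question corresponds to the `sync=False` case on both replicas. Broker settings such as `log.flush.interval.messages` and `log.flush.interval.ms` can force more frequent flushes, at a cost in throughput, which is why replication is the usual way to get durability.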
davidhwang