What conditions trigger a flush on an Elasticsearch node, index, or shard?

There are good explanations of how Elasticsearch refreshes every second and flushes less frequently to avoid intensive CPU usage, but which component is responsible for performing the flush?

I tried to look through the source code but didn't manage to find the right place.

Ofek Hod

1 Answer

There is no fixed interval. Elasticsearch uses heuristics to decide when to flush, as described in the official documentation:

Elasticsearch automatically triggers flushes as needed, using heuristics that trade off the size of the unflushed transaction log against the cost of performing each flush.

Also, as explained at the end of this SO answer from an Elastic team member, the timing of a flush varies

depending on how many operations get added to the transaction log, how big they are, and when the last flush happened.

Note: You can also tweak the flush settings, but doing so is not recommended.
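
For illustration only, here is a minimal sketch of what such a tweak could look like. It assumes the setting in question is the dynamic index setting `index.translog.flush_threshold_size`, changed through the update index settings API (`PUT /<index>/_settings`). The class name `FlushThresholdRequest`, the host `http://localhost:9200`, the index name `my-index`, and the value `1gb` are all hypothetical. The code only builds the request and does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper, not part of any Elasticsearch client library. */
final class FlushThresholdRequest {

    /** JSON body that sets the translog flush threshold (assumed setting name). */
    static String settingsBody(String size) {
        return "{\"index.translog.flush_threshold_size\":\"" + size + "\"}";
    }

    /** Builds, but does not send, a PUT request to the index settings endpoint. */
    static HttpRequest build(String baseUrl, String index, String size) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(size)))
                .build();
    }
}
```

Calling `FlushThresholdRequest.build("http://localhost:9200", "my-index", "1gb")` yields a request that could be sent with `java.net.http.HttpClient`. As noted above, leaving the default in place is usually the safer choice.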

Edit: In the source code, a flush operation is implemented in https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/action/bulk/BulkProcessor.java#L48, which flushes a bulk request based on the number of actions, their size, or a time interval.
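
The idea of flushing on whichever of three thresholds is hit first (number of operations, accumulated size, time since the last flush) can be sketched as follows. This is not Elasticsearch's code: the class `FlushPolicySketch`, its method names, and its thresholds are hypothetical and only illustrate the kind of check described above.

```java
import java.time.Duration;

/** Hypothetical illustration of a three-trigger flush check; not an Elasticsearch class. */
final class FlushPolicySketch {
    private final int maxActions;   // flush after this many buffered operations
    private final long maxBytes;    // flush after this many buffered bytes
    private final Duration maxAge;  // flush after this much time since the last flush

    private int pendingActions = 0;
    private long pendingBytes = 0;
    private long lastFlushMillis;

    FlushPolicySketch(int maxActions, long maxBytes, Duration maxAge, long nowMillis) {
        this.maxActions = maxActions;
        this.maxBytes = maxBytes;
        this.maxAge = maxAge;
        this.lastFlushMillis = nowMillis;
    }

    /** Records one buffered operation of the given size. */
    void add(long sizeInBytes) {
        pendingActions++;
        pendingBytes += sizeInBytes;
    }

    /** Returns true when something is buffered and any threshold has been reached. */
    boolean shouldFlush(long nowMillis) {
        if (pendingActions == 0) {
            return false; // nothing to flush
        }
        return pendingActions >= maxActions
                || pendingBytes >= maxBytes
                || nowMillis - lastFlushMillis >= maxAge.toMillis();
    }

    /** Resets the counters after the caller has performed the flush. */
    void markFlushed(long nowMillis) {
        pendingActions = 0;
        pendingBytes = 0;
        lastFlushMillis = nowMillis;
    }
}
```

A caller would invoke `add(...)` for each buffered operation, check `shouldFlush(System.currentTimeMillis())` after each write or from a scheduled task, and call `markFlushed(...)` once the flush has completed.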

Amit
  • Thank you very much. I'm interested in the actual logic of these heuristics, under the assumption that they are documented somewhere or at least implemented in the ES source code. – Ofek Hod May 29 '20 at 13:50
  • Sure, give me some time and I will let you know. – Amit May 29 '20 at 14:02
  • @OfekHod, I looked into the Elasticsearch source code and found https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/action/bulk/BulkProcessor.java#L48. According to the Javadoc of this class, it allows you "to easily set when to 'flush' a new bulk request (either based on number of actions, based on the size, or time)". – Amit May 30 '20 at 10:33
  • @OfekHod, you can find references to this bulk processor in multiple places, but ultimately the flushing is done by a scheduler, which the calling system can configure based on three parameters (actions, size, and time). I hope this information answers your question. – Amit May 30 '20 at 10:35
  • @OfekHod, you can also see Watcher.java at https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/watcher/src/main/java/org/elasticsearch/xpack/watcher/Watcher.java#L672 calling this flush operation in its close() method. – Amit May 30 '20 at 10:40
  • @OfekHod, thanks for accepting the answer. Glad I could help. – Amit May 30 '20 at 12:23