3

I need to fetch more than 10,000 records from Elasticsearch, but I'm unable to set index.max_result_window in Elasticsearch 7.2 from Python.

In Elasticsearch 6 I used the following command to set the window limit to 100,000, and it worked:

es.indices.create(index=prod_index, body={"settings": {"index.mapping.total_fields.limit": 50000, "index.max_result_window": 100000}})

The same command does not work in Elasticsearch 7.2.

Harish

3 Answers

4

It's better not to, and that is why the limit is set to 10,000. Increasing index.max_result_window is not a good idea, because it can lead to cluster latency or crashes. For a request with a given from and size, each shard has to build a priority queue of from + size entries in heap memory before it returns anything. Those entries stay in RAM, so unless you have strong hardware and a large heap, it is best not to raise the limit: doing so can slow down or crash your cluster.
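To put rough numbers on this, here is a small back-of-the-envelope sketch. It assumes the simplified model described above (every shard collects from + size entries and the coordinating node merges all of them); the function name and the shard count are made up for illustration, and real memory use depends on much more than entry counts.

```python
def deep_page_cost(from_: int, size: int, shards: int) -> dict:
    """Estimate how many sorted entries a from/size request touches.

    Simplified model: each shard keeps a queue of from_ + size entries,
    and the coordinating node merges the queues of all shards.
    """
    per_shard = from_ + size
    return {"per_shard": per_shard, "coordinator": per_shard * shards}


# Last page of a 100,000-document window on a hypothetical 5-shard index.
cost = deep_page_cost(from_=99_000, size=1_000, shards=5)
print(cost)
```

Under this model, asking for the last 1,000 documents of a 100,000-document window makes each shard collect 100,000 entries just to return one page.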

Alternatives are the scroll API, from/size pagination, or search_after (probably the preferable option; see its description).

You can check this solution. It helped me fetch more than 700k documents without bringing down the cluster. Also, you can check this answer.
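As an illustration of the search_after approach, here is a minimal sketch. It is not taken from the linked solution. It assumes you pass in a search callable that accepts index and body keyword arguments the way the 7.x Python client's es.search does, and that the sort you supply ends in a field that is unique per document; the function name and field names are hypothetical.

```python
from typing import Any, Callable, Dict, Iterator, List


def iter_hits_search_after(
    search: Callable[..., Dict[str, Any]],
    index: str,
    query: Dict[str, Any],
    sort: List[Dict[str, str]],
    page_size: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield every hit for `query`, one page at a time, using search_after."""
    body: Dict[str, Any] = {"size": page_size, "query": query, "sort": sort}
    while True:
        response = search(index=index, body=body)
        hits = response["hits"]["hits"]
        if not hits:
            return  # an empty page means everything has been read
        yield from hits
        # The sort values of the last hit mark where the next page starts.
        body["search_after"] = hits[-1]["sort"]
```

With a real client this could be called as iter_hits_search_after(es.search, "my-index", {"match_all": {}}, [{"timestamp": "asc"}, {"doc_id": "asc"}]), where timestamp and doc_id stand in for your own sort fields. Each request stays at page_size documents, so index.max_result_window never needs to change.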

dejanmarich
0

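You just need to follow the structure below:

from elasticsearch import Elasticsearch

# hosts, port and the credentials are placeholders for your own connection details
es = Elasticsearch(hosts=hosts, port=port, http_auth=("username", "password"), timeout=1000000)

# Raise max_result_window on an existing index through the update-settings API
es.indices.put_settings(index="index_name", body={"index": {"max_result_window": 20000000}})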
halfelf
0
In Spring Data, withTrackTotalHits(true) asks Elasticsearch for the exact total number of hits:
var query = new NativeQueryBuilder().withTrackTotalHits(true).build();
SearchHits<Object> searchHits = elasticSearchTemplate.search(query, Object.class);
long totalHits = searchHits.getTotalHits();
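Since the question uses Python: the same flag can be sent from the Python client by adding "track_total_hits": True to the search body. One thing to watch when moving from 6.x to 7.x is that hits.total in the response is now an object with value and relation rather than a plain integer. A small sketch of reading it either way (the helper name is made up for illustration):

```python
from typing import Any, Dict, Tuple


def total_hits(response: Dict[str, Any]) -> Tuple[int, bool]:
    """Return (count, is_exact) from a search response.

    Handles both the 7.x shape {"value": N, "relation": "eq" | "gte"}
    and the plain integer returned by 6.x.
    """
    total = response["hits"]["total"]
    if isinstance(total, dict):
        return total["value"], total["relation"] == "eq"
    return total, True


# Example response shapes, written by hand rather than captured from a cluster.
print(total_hits({"hits": {"total": {"value": 10000, "relation": "gte"}}}))
print(total_hits({"hits": {"total": 123}}))
```

Note that this only affects the reported count; it does not let a single request return more than index.max_result_window documents.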
  • Remember that Stack Overflow isn't just intended to solve the immediate problem, but also to help future readers find solutions to similar problems, which requires understanding the underlying code. This is especially important for members of our community who are beginners, and not familiar with the syntax. Given that, can you edit your answer to include an explanation of what you're doing and why you believe it is the best approach? – Jeremy Caney Feb 03 '23 at 00:18