Following the documentation, I created an index and then ran:

PUT indexname/_settings
{
      "index.max_result_window": 20000
}

When I fetch the settings, I can see that the value is set, but whenever I run a query against the index, I never get more than 10,000 results. I also tried creating the index with the setting already in place, but that didn't work either.

I also tried including a size parameter of 11,000 in the search query, and it still did not return more than 10,000.

What do I have to do to get more than 10,000 results returned?

Is there some setting I have to apply to the node, or some other setting applied to the index to get it to work?

I am using the latest version, 7.3.1.

Rolando
  • If you successfully set `index.max_result_window` and then used `size: [number_larger_than_maxresultwindow]`, you should have gotten an error when running the query. Something like `Result window is too large, from + size must be less than or equal to: [20000] but was [21000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.` You should call `GET indexname/_settings` to make sure `max_result_window` is set. Then, your `_search` request should use the larger number `size: 11000`. – Andrei Stefan Aug 27 '19 at 07:20
  • How many `hits` does the result show? – Andrei Stefan Aug 27 '19 at 07:20
  • 10k maximum. I want to show more than that, but it’s as if ES is just not honoring the setting set on the index. – Rolando Aug 27 '19 at 12:34
  • I got that @Rolando. My question was referring to the actual `hits` from the JSON answer from Elasticsearch. – Andrei Stefan Aug 27 '19 at 13:08
  • Also, how do you run the query? Dev tools from Kibana? `curl`? Something else? – Andrei Stefan Aug 27 '19 at 13:08
  • My comment remains the same. I am only getting 10,000 hits from the actual 'hits' from the JSON answer even though I know I could get up to 11000. I am using nodejs elasticsearch to execute the query. Though the same limit happens using the console in the latest release of Kibana. – Rolando Aug 27 '19 at 20:32
  • Weird, setting `index.max_result_window` to 20000 and then querying with `from: 0` and `size: 20000` I do get 20000 hits in my response... as expected. – Val Aug 28 '19 at 06:34
  • My guess is that the search request parameter `size: 20000` gets lost somewhere or is not properly configured, assuming the author is actually setting this value (I haven't seen any code). The question also doesn't say whether setting a value greater than 10000 in the search request produces the error I mentioned earlier (just to confirm that `max_result_window` has been applied). It would also be worth testing whether the query (with `size: 11000`) works outside nodejs ;-). – Andrei Stefan Aug 28 '19 at 08:56
  • Did you check that `index.max_result_window: 20000` has been applied to the indices? – hamid bayat Aug 28 '19 at 12:40
  • I also tested with 20000 and got 20000 hits in the response. – hamid bayat Aug 28 '19 at 12:40
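
The checks suggested in the comments above can be sketched as two Kibana console requests. This is only an outline, assuming the index is called `indexname` as in the question and that a `match_all` query stands in for the real one. The first request confirms that the setting is present. The second asks for 11,000 documents and an exact total, so the length of the `hits.hits` array can be compared with `hits.total.value`.

```
# Confirm that index.max_result_window is set on the index
GET indexname/_settings

# Request more than 10,000 documents in one call (match_all is a stand-in query)
GET indexname/_search
{
  "from": 0,
  "size": 11000,
  "track_total_hits": true,
  "query": { "match_all": {} }
}
```

If the second request is rejected with the "Result window is too large" error, the setting has not been applied to the index being searched.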

3 Answers

You can use the scroll API to retrieve more than 10,000 records from Elasticsearch, since 10,000 is the default upper limit on the number of documents a single search can return.

The scroll API fetches documents in chunks. The size parameter sets how many documents each chunk contains, and the scroll parameter sets how long the search context is kept alive between calls. The actual calls take the following forms:

1st Call

In the first call, specify the chunk size (say, 5000 docs) and a scroll parameter giving how long the search context should stay alive before it times out (here 1m, one minute).

POST /index/_search?scroll=1m
{
    "size": 5000,
    "query": {
        "match": {
            "title": "Something"
        }
    }
}

2nd Call ( and every other subsequent call)

In the first call's response, we get a _scroll_id which can be used to retrieve the next chunk of documents.

POST /_search/scroll
{
    "scroll": "1m",
    "scroll_id": "XGSKJKSVNNNNNDDDD1233BBNMBBNNN==="
}
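
The second call is repeated, each time with the `_scroll_id` from the most recent response, until the returned `hits.hits` array is empty. As a small addition that the answer does not cover, the search context can then be released explicitly instead of waiting for the timeout. The sketch below reuses the placeholder scroll ID from above.

```
# Release the search context once all chunks have been read
DELETE /_search/scroll
{
    "scroll_id": "XGSKJKSVNNNNNDDDD1233BBNMBBNNN==="
}
```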

Also, check this answer.

Tarek Essam
  • I know the scroll API exists, but I want to return 11000 in one request without using the scroll API. The documentation suggests the index.max_result_window as the solution, but it doesn't appear to work. What is it supposed to affect then? Is there an alternative to be able to get more than 10k in a single call? – Rolando Aug 24 '19 at 05:13
  • @Rolando I believe that index.max_result_window will set the max results for any value you give, up to 10k. It is affecting the same setting but the 10k limit has priority over it. If you set it to 5k, you'll be limited to 5k instead of 10k. I personally do not believe there is an alternative to get more than 10k results in a single query. – littledaxter Aug 30 '19 at 02:02

If you only need the total number of hits in the JSON response and not the actual documents, add "track_total_hits": true to the search request to get the exact total. Since Elasticsearch 7.0, hits.total is otherwise only counted accurately up to 10,000.

POST indexname/_search
{
    "from": 0,
    "size": 0,
    "track_total_hits": true
}
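
For comparison, here is roughly what the `hits` section of the response looks like with and without the flag. Both fragments are abridged and illustrative only: fields such as `took` and `_shards` are omitted, and the count of 11000 is a placeholder taken from the number mentioned in the question, not real output.

```
# Without track_total_hits (7.x default), when more than 10,000 documents match
{
    "hits": {
        "total": { "value": 10000, "relation": "gte" },
        "hits": []
    }
}

# With "track_total_hits": true (11000 is a placeholder count)
{
    "hits": {
        "total": { "value": 11000, "relation": "eq" },
        "hits": []
    }
}
```

A `relation` of `gte` means the reported value is a lower bound, which can look like a hard 10,000 cap if only `hits.total.value` is read.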

Shivam

The scroll API is no longer recommended for deep pagination. The current way to page through more than 10k results is the search_after parameter of the Elasticsearch search API.
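
A minimal sketch of that approach follows, under some assumptions: the index has a sortable `timestamp` field and a unique `tie_breaker_id` field (both names are hypothetical), and the query is borrowed from the scroll answer above. The first request sorts the results. Each following request repeats the same query and sort, and passes the sort values of the last hit from the previous page in `search_after`. The values shown are placeholders.

```
# First page: sort on a field plus a unique tiebreaker (hypothetical field names)
GET indexname/_search
{
    "size": 5000,
    "query": { "match": { "title": "Something" } },
    "sort": [
        { "timestamp": "asc" },
        { "tie_breaker_id": "asc" }
    ]
}

# Next page: same query and sort, plus the "sort" values of the last hit received
GET indexname/_search
{
    "size": 5000,
    "query": { "match": { "title": "Something" } },
    "sort": [
        { "timestamp": "asc" },
        { "tie_breaker_id": "asc" }
    ],
    "search_after": [1566900000000, "placeholder-last-id"]
}
```

Because each page is an ordinary search, `size` still has to stay within `index.max_result_window`. Only the total depth is no longer limited.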

secana