I queried a large index with a very large size because I want to retrieve every matching document, but the request timed out after a long wait and returned no results. Is there another way to get all the data without timing out? My query:

{
  "size": 90000000,
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": { "term": { "isbn": 475869 } }
    }
  }
}
Nkosi
DevEx

1 Answer

You should use scrolling if you need to retrieve a large amount of data.

First, initiate the scroll with your query:

curl -XGET 'localhost:9200/your_index/your_type/_search?scroll=1m' -d '{
    "size": 5000,
    "query": {
        "term" : {
            "isbn" : "475869"
        }
    }
}'

The response contains the first 5000 documents along with a _scroll_id token, which you use for the subsequent requests.

Repeat the following request, each time passing the _scroll_id token from the previous response, to get the next batch of 5000 documents. Stop when a response comes back with no hits.

curl -XGET  'localhost:9200/_search/scroll' -d '{
    "scroll" : "1m", 
    "scroll_id" : "c2Nhbjs2OzM0NDg1ODpzRlBLc0FXNlNyNm5JWUc1" 
}'
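
The two requests above can be combined into a loop. Here is a minimal Python sketch of that loop, not a definitive implementation: the names http_post and scroll_all are hypothetical helpers made up for this example, the node is assumed to listen on http://localhost:9200, and the responses are assumed to have the usual hits.hits and _scroll_id fields. Only the standard library is used.

```python
import json
import urllib.request


def http_post(base_url):
    """Return a function that POSTs a JSON body to base_url + path
    and returns the parsed JSON reply (hypothetical helper)."""
    def post(path, body):
        req = urllib.request.Request(
            base_url + path,
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read().decode("utf-8"))
    return post


def scroll_all(post, index, query, batch_size=5000, keep_alive="1m"):
    """Yield every hit for `query`, fetching one batch per request."""
    # Initial search: opens the scroll context and returns the first batch.
    page = post(
        "/%s/_search?scroll=%s" % (index, keep_alive),
        {"size": batch_size, "query": query},
    )
    # An empty batch means the scroll is exhausted.
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            yield hit
        # Always pass the _scroll_id from the most recent response.
        page = post(
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": page["_scroll_id"]},
        )
```

Usage would look like: for hit in scroll_all(http_post("http://localhost:9200"), "your_index", {"term": {"isbn": "475869"}}): process(hit), where process is whatever you do with each document. Because it is a generator, only one batch is held in memory at a time.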

Since you're using Jest, there's a SearchScroll class you can use. See the Jest test cases for how that class is used.

Val