0

i want the result of the Query show all the information , i am now doing the query in Postman and eventually i will do it for python script,

I have check somewhere in stackoverflow i should do

{
  "query": {
      "match_all":{}
  }
}

and the results show the hits with

"hits": {
        "total": {
            "value": 10000,
            "relation": "gte"
        },

and the following only show few records.

And this is the same to my case , i want to shortlst all the device that have alarmstatus not equal to 0

{
  "query": {
    "bool": {
      "must_not": {
        "term": {
          "header.alarmStatus": 0
        }
      }
    }
  }
}

and the result show

"hits": {
        "total": {
            "value": 10000,
            "relation": "eq"
        },
        "max_score": 0.0,
        "hits": [
            {..........}

which have 3740 hit. I want to sort out all 3740 doc , how can i do this? Thanks

Jeff

Man Man Yu
  • 161
  • 3
  • 13

2 Answers2

0

By default, your query only returns the first 10 results. You can simply increase the size parameter to 10000 (which is the default maximum)

{
  "size": 10000,                     <--- add this
  "query": {
    "bool": {
      "must_not": {
        "term": {
          "header.alarmStatus": 0
        }
      }
    }
  }
}

If you happen to have more than 10000 hits to return, there are a few better options available

Quick example on how to use the scroll API for your use case.

First run your query normally, but by specifying a scroll timeout scroll=1m. That will create a scroll search context that you can iterate on:

POST <your-index-name>/_search?scroll=1m
{
  "query": {
    "bool": {
      "must_not": {
        "term": {
          "header.alarmStatus": 0
        }
      }
    }
  }   
}

In the response, you'll get a field called _scroll_id, and you'll need to copy that value and then run the following command to get the next results:

GET /_search/scroll
{
  "scroll_id" : "<value_of_scroll_id>",
  "scroll": "1m"
}

And so on, you need to repeat this second query until you have retrieved all your hits. Don't forget to always use the latest value of _scroll_id that you received in the previous response.

Val
  • 207,596
  • 13
  • 358
  • 360
  • thanks you. But what if my case is Over 10K , i saw some use Scoll API , however scroll API didnt show all the DOC to me tho, thanks – Man Man Yu Nov 17 '20 at 08:28
  • sorry , i still didnt understand how scroll API will work in my case – Man Man Yu Nov 17 '20 at 08:34
  • You can find a complete example [here](https://stackoverflow.com/questions/64708844/how-to-scroll-data-using-scroll-api-elasticsearch/64708989#64708989) – Val Nov 17 '20 at 08:35
  • sorry i am new to ES,,, can you show me in the case of alarmstatus !=0 (i modify the case for 10K hit) how can i extract all the doc for i needed? thanks – Man Man Yu Nov 17 '20 at 08:42
0

You can give the size of 10000(also limited by index.max_result_window setting), but it has some performance issues(high size means, ES needs to fetch the number of records based on size and transfer it over network ), that's the reason the default is just 10.

If you care about the performance, it's better for you to take advantage of from/size param, if you have even more documents to search and requires deep-pagination than there are even better options available.

From/size example for your use-case.

First query

{
  "from": 0, "size": 1000,                     
  "query": {
    "bool": {
      "must_not": {
        "term": {
          "header.alarmStatus": 0
        }
      }
    }
  }
}

Second query

{
  "from": 1000, "size": 1000,                     
  "query": {
    "bool": {
      "must_not": {
        "term": {
          "header.alarmStatus": 0
        }
      }
    }
  }
}

And like above you can batch your query according to your requirements.

Amit
  • 30,756
  • 6
  • 57
  • 88