Elasticsearch with and without filter by id

Question

I have a document type user, and I would like to understand why a query with a filter takes more time than the same query without any filter.

For example, imagine I have 1 billion documents: I am comparing a query that scans the whole billion records with one that filters by a set of ids.

Example query:

{
    "from" : 0,
    "size" : 10000,
    "stored_fields" : ["first_name", "last_name"],
    "query" : {
        "bool" : {
            "filter" : {
                "ids" : {
                    "type" : "user",
                    "values" : [
                        "547303",
                        "another 200k ids"
                    ]
                }
            }
        }
    }
}

Current benchmark:

1 - Without the filter: around 400 ms
2 - With the filter, passing 200k ids: around 2100 ms

So you want to retrieve 200K documents, but size only allows you to get 10000 documents. How do you expect to achieve this without using scroll/scan? Besides, filtering on 200K ids is not really a good idea and not what this filter is meant for. It makes no sense to have 200K ids in there, since you'll only be able to retrieve 10000 at most unless you use scan/scroll. - Val, Jan 13 '17 at 04:37
@Val Yes, I will use scroll for the pagination. But it is one of the requirements to pass 200k ids as a filter to the query. Right now the query takes around 100 ms without any filter, while with 200k ids it takes around 2 seconds to return the result. I just want to know why it is slower. - odin88, Jan 13 '17 at 07:01
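
To make the numbers in the comments above concrete, here is a minimal Python sketch that builds a request body shaped like the query in the question and works out how many pages of 10000 hits a scroll would need to return 200k documents. It only constructs the body and never contacts a cluster. The helper names (build_ids_filter_query, pages_needed) and the consecutive example ids are hypothetical, chosen for illustration, and the "type" key is kept only because the question's query uses it.

```python
import json
import math


def build_ids_filter_query(ids, size=10000, doc_type="user"):
    """Build a request body with the same shape as the query in the question."""
    return {
        "from": 0,
        "size": size,
        "stored_fields": ["first_name", "last_name"],
        "query": {
            "bool": {
                "filter": {
                    "ids": {
                        "type": doc_type,
                        # The question passes ids as strings, so do the same here.
                        "values": [str(i) for i in ids],
                    }
                }
            }
        },
    }


def pages_needed(total_hits, page_size):
    """Number of pages required to read every hit at the given page size."""
    return math.ceil(total_hits / page_size)


# Hypothetical ids: 200k consecutive numbers starting at the one id shown
# in the question. The real ids are not given there.
example_ids = range(547303, 547303 + 200_000)
body = build_ids_filter_query(example_ids)

values = body["query"]["bool"]["filter"]["ids"]["values"]
print(len(values))                          # number of ids in the filter
print(pages_needed(len(values), body["size"]))  # pages of `size` hits to read them all
print(len(json.dumps(body)))                # characters in the serialized request body
```

With 200k ids and a page size of 10000, pages_needed returns 20, so a single request can return at most one twentieth of the matching documents; the last line simply reports how large the serialized request becomes with that many ids.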

0 Answers