I am trying to improve the performance of an Elasticsearch query. The goal of the query is just to retrieve the documents that match it, so the score does not matter. It is also important to mention that we have one index per day, so the quer. As far as I know, in such cases it is better to use a filter, which avoids calculating scores, but I have also just read that there is an alternative: using a filter inside a query, so that every document is returned with a score of 1. The first query I made was the following:

{
  "filter": {
    "bool": {
      "must": [
        {
          "match": {
            "from": "john.doe@example.com"
          }
        },
        {
          "range": {
            "receivedDate": {
              "gte": "date1",
              "lte": "date2"
            }
          }
        }
      ]
    }
  }
}

Then I ran my first test: I changed "filter" to "query", and most of the time I got better times using "query" than "filter". That is my first question: why? What am I doing wrong in my query that makes the filter slower than the query?

After that I kept reading, trying to improve it, and I got this:

{
    "query": {
        "bool": {
            "must": {
                "match_all": {}
            },
            "filter": {
                "bool": {
                    "must": [{
                            "match": {
                                "from": "john.doe@example.com"
                            }
                        }, {
                            "range": {
                                "receivedDate": {
                                    "gte": "date1",
                                    "lte": "date2"
                                }
                            }
                        }
                    ]
                }
            }
        }
    }
}

With the latter I have the impression that performance has improved a little. So, in your experience, could you tell me which one is better (at least in theory) for getting faster results? Also, is there a chance that one of these queries caches its results, speeding up subsequent queries? Is there a better way to write this query? Thanks in advance for your help. I forgot to mention that I am using Elasticsearch v2.3.

Joseratts
  • What is the mapping of the `from` field? – Val Mar 29 '17 at 11:07
  • I forgot to mention that I'm also testing replacing the match clause with term, because the field is not analyzed and, as far as I know, this would improve it. So the `from` field is a not_analyzed string and `receivedDate` is a date – Joseratts Mar 29 '17 at 11:13

1 Answer

In your first query, you were only using a post_filter. Your second query is the way to go, but it can be optimized to this (no need to wrap bool/filter inside bool/must):

{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "receivedDate": {
              "gte": "date1",
              "lte": "date2"
            }
          }
        },
        {
          "term": {
            "from": "john.doe@example.com"
          }
        }
      ]
    }
  }
}
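
If the request body is assembled in client code, the same structure can be produced programmatically. The following is only a minimal Python sketch under that assumption: `build_filter_query` is a hypothetical helper name (not part of any Elasticsearch client), and `date1`/`date2` are the placeholders from the question. It uses the standard library only and does not contact a cluster.

```python
import json


def build_filter_query(sender, start, end):
    """Return a search body whose clauses all run in filter context (no scoring).

    Hypothetical helper for illustration; field names are taken from the question.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"range": {"receivedDate": {"gte": start, "lte": end}}},
                    # term rather than match: the field is not_analyzed,
                    # so the value is compared exactly as given
                    {"term": {"from": sender}},
                ]
            }
        }
    }


if __name__ == "__main__":
    body = build_filter_query("john.doe@example.com", "date1", "date2")
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request; how it is sent (client library, curl, etc.) is left open here.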
Val