
My search query works fine (I hope), but sometimes I get too many results, with scores like 1.5, 0.7, 0.6... or 0.1, 0.001, 0.001... Is it possible to block low-relevance results? A fixed threshold is unsuitable because it depends on the maximum _score (the score of the most relevant result). It should work like "block all results whose _score is less than half the maximum _score".

{
    "query": {
        "bool": {
            "disable_coord": true,
            "must": [
                {
                    "match": {
                        "ObjectTypeSysName": {
                            "query": "participant"
                        }
                    }
                },
                {
                    "match": {
                        "_all": {
                            "query": "text-to-find",
                            "operator": "and",
                            "fuzziness": "AUTO",
                            "minimum_should_match": 1
                        }
                    }
                }
            ],
            "should": [
                {
                    "multi_match": {
                        "query": "text-to-find",
                        "type": "best_fields",
                        "fields": [
                            "*NAME",
                            "ObjectData.EXTERNALID",
                            "ObjectData.contactList.VALUE",
                            "*SERIES",
                            "*NUMBER"
                        ],
                        "operator": "or",
                        "boost": 2
                    }
                }
            ]
        }
    }
}
Ilya P
  • You cannot block it in a single query: the criterion you set requires two passes over the data. You'd have to make one pass to find that most relevant score, and the second to filter out the low results. I suggest that you simply pass the results of the above search to a second query, using that to filter the data. – Prune Nov 13 '15 at 15:54
  • Is it possible to make second pass via ES or should I write code not connected with ES to filter low results? – Ilya P Nov 16 '15 at 10:52
  • This is a design decision I cannot make for you. Do you have another tool that can more easily find that maximum value? If not, general programming practice suggests that you should use ES for both passes: don't use two interfaces where one will do. This is what I do. However, it's an easy decision for me. TrustedAnalytics handles these operations in a single line, and I'm on the development team. – Prune Nov 16 '15 at 18:35
  • Prune, how can I limit the score via ES on the second pass? – Ilya P Nov 19 '15 at 16:02
  • ES does this with the "range" operator, something like `"filter":{"range" : { "relevance" : { "from":0, "to": threshold } } }` where **threshold** is the value from the first pass. – Prune Nov 19 '15 at 18:07
