
For my project I need to find out which search results count as "good" matches. Currently, the scores vary wildly depending on the query, so I need to normalize them somehow. Normalized scores would let me select only the results above a given threshold.

I found a couple of solutions for Lucene:

How would I apply the same technique to Elasticsearch? Or is there already a solution for score normalization that works with ES?

Datageek
  • Can you show a query where you want to normalize the scores? – DrTyrsa Apr 11 '16 at 12:24
  • Both of the solutions you've linked to strongly recommend that you **"Don't do this."** That's a solution you can, *and should*, apply directly to Elasticsearch as well. – femtoRgon Apr 11 '16 at 14:52
  • If you still have the problem, you may be interested in a solution for score normalization in [this answer](https://stackoverflow.com/a/56389964/3262646). – Pierre-Nicolas Mougel May 31 '19 at 06:52
  • @Pierre-NicolasMougel suggested to look at this answer as well: https://stackoverflow.com/a/56389964/3262646 – Datageek May 31 '19 at 12:02

2 Answers


As far as I could find, there is no built-in way to get a normalized score out of Elasticsearch. You will have to hack it by making two queries. The first is a pilot query (preferably with size 1, but with all other attributes the same) that fetches the max_score. Then you run your actual query and use function_score to normalize the score: pass the max_score from the pilot query in params to function_score and divide every score by it. Refer: This article snippet
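The two-query approach described above can be sketched roughly as follows. This is only an illustration, not a tested recipe from the linked article: the helper names `build_pilot_query` and `build_normalized_query` are hypothetical, and the script assumes an Elasticsearch version where `function_score` accepts a Painless `script_score` with `source` and `params`. The helpers only build request bodies; sending them is left to whatever client you use.

```python
import copy


def build_pilot_query(search_body):
    """Copy the search body, but fetch only the top hit (hypothetical helper)."""
    pilot = copy.deepcopy(search_body)
    pilot["size"] = 1          # only the best hit is needed to read max_score
    pilot.pop("from", None)    # paging is irrelevant for the pilot query
    return pilot


def build_normalized_query(search_body, max_score):
    """Wrap the original query in function_score so that every score is
    divided by the max_score obtained from the pilot query."""
    if max_score is None or max_score <= 0:
        raise ValueError("max_score must be a positive number")
    body = copy.deepcopy(search_body)
    body["query"] = {
        "function_score": {
            "query": body["query"],
            "script_score": {
                "script": {
                    # Assumption: Painless script with the value passed in params
                    "source": "_score / params.max_score",
                    "params": {"max_score": max_score},
                }
            },
            # Use the script result as the final score instead of combining it
            "boost_mode": "replace",
        }
    }
    return body
```

In use, you would send the pilot body first, read `hits.max_score` from the response, and pass that value to `build_normalized_query`. The top hit then scores 1.0 and everything else falls between 0 and 1, provided the index does not change between the two requests.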

GooDeeJAY
Genapshot

It's a bit late. We needed to normalise the ES score for one of our use cases. So, we wrote a plugin that overrides the ES Rescorer feature.

It supports min-max and z-score normalization.

Github: https://github.com/bkatwal/elasticsearch-score-normalizer

Usage: Min-max

{
  "query": {
    ... some query
  },
  "from" : 0,
  "size" : 50,
  "rescore" : {
      "score_normalizer" : {
        "normalizer_type" : "min_max",
        "min_score" : 1,
        "max_score" : 10
      }
   }
}

Usage: z-score

{
  "query": {
    ... some query
  },
  "from" : 0,
  "size" : 50,
  "rescore" : {
      "score_normalizer" : {
        "normalizer_type" : "z_score",
        "min_score" : 1,
        "factor" : 0.6,
        "factor_mode" : "increase_by_percent"
      }
   }
}

For complete documentation, check the GitHub repository.
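For readers unfamiliar with the two normalization types named above, here is a minimal sketch of the textbook definitions applied to a plain list of scores. It is not the plugin's code, and it does not reproduce the plugin's `min_score`, `factor` or `factor_mode` semantics; the function names and the `lo`/`hi` parameters are made up for illustration.

```python
from statistics import mean, pstdev


def min_max(scores, lo=0.0, hi=1.0):
    """Rescale scores linearly so the smallest maps to lo and the largest to hi."""
    s_min, s_max = min(scores), max(scores)
    if s_max == s_min:
        # All scores are equal; map them all to the upper bound (a choice made here)
        return [hi for _ in scores]
    return [lo + (s - s_min) * (hi - lo) / (s_max - s_min) for s in scores]


def z_score(scores):
    """Express each score as a number of population standard deviations from the mean."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # No spread in the scores; every z-score is zero
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]
```

For example, `min_max([2.0, 4.0, 6.0], 1, 10)` gives `[1.0, 5.5, 10.0]`. Note that both functions normalize only over the hits they are given, so the result depends on how many hits are retrieved.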

Bikas Katwal