I'd like to write a custom Elasticsearch scorer that takes all terms from the indexed document and all terms from the query, and calculates the score from them using some custom logic.

After some research, it seems that the most straightforward way to implement a custom scorer for Elasticsearch in Java is to use its "native scripting" functionality (i.e. implementing AbstractDoubleSearchScript). The problem is that I can't find a way to access the original query object in such a script; I can only access the matching document and its fields. Is there some way to get access to the query object that was used for the search?

Alternatively, what is the best way to run custom Java code per result and score the match using my own (complex) algorithm that needs to know the complete term list for both the query and the document?

Lukáš Lalinský
  • Have you considered writing a custom plugin that extends `org.elasticsearch.common.lucene.search.function.ScoreFunction` and putting all your custom logic there? – Andrei Stefan Apr 14 '15 at 23:12
  • I have not considered `ScoreFunction` yet. From the docs I got the impression that the "script scoring" should be the most flexible option and I can write them in Java, but to be honest I'm a little lost in all the elasticsearch APIs. – Lukáš Lalinský Apr 15 '15 at 16:37
  • Have you seen/considered doing the scoring with a custom script, using the helpers from http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html ? – Lee H Apr 17 '15 at 22:14
  • That is the API I was trying to use, but from Java, not Groovy. I can't find a way to access the query terms. – Lukáš Lalinský Apr 17 '15 at 22:36
  • Instead of trying to access the query object, why can't you pass the query string values (search texts) to the native script that you use to calculate the score? Is there any complication in implementing this? – Arun Prakash May 27 '15 at 17:16
  • Calculating the score based on all the terms is a costly approach; if you go that way, it's better to use a native script in an ES plugin. – Arun Prakash May 27 '15 at 17:19
  • You could pass the query terms into a custom script using the "params" field as explained here: https://www.elastic.co/guide/en/elasticsearch/guide/current/script-score.html – Mr Hash Jul 11 '15 at 23:19
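
The suggestion in the last few comments, handing the query terms to the script through `params` instead of looking for the query object, can be sketched in plain Java. This is only an illustration of the scoring step, not Elasticsearch API code: the class `ParamTermScorer`, the `query_terms` key, and the "fraction of query terms found in the document" formula are all assumptions made up for the example. A native script would call something like this with its own `params` map and the terms it reads from the document.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; not part of the Elasticsearch API. */
final class ParamTermScorer {

    /**
     * Reads the query terms from the params map (assumed key: "query_terms")
     * and returns the fraction of them that also occur in the document terms.
     */
    static double score(Map<String, Object> params, List<String> docTerms) {
        Object raw = params.get("query_terms");
        if (!(raw instanceof List<?>)) {
            return 0.0; // no usable query terms were passed in
        }
        Set<String> doc = new HashSet<>(docTerms);
        int total = 0;
        int matched = 0;
        for (Object term : (List<?>) raw) {
            total++;
            if (doc.contains(String.valueOf(term))) {
                matched++;
            }
        }
        return total == 0 ? 0.0 : (double) matched / total;
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("query_terms", Arrays.asList("custom", "scorer", "lucene"));
        List<String> docTerms = Arrays.asList("custom", "scorer", "plugin");
        System.out.println(score(params, docTerms));
    }
}
```

The placeholder formula is where the "complex algorithm" from the question would go; the point is only that both term lists are available in one place once the client sends the query terms along with the request.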

1 Answer

Implement a custom Query class and wrap the actual query (for example, a boolean query) as its sub-query. The Query class gives you an API for implementing a custom scorer, in which you have access to both the query and the document currently being scored. For finer-grained control over the score, implement a custom similarity class.
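
The wrapping idea can be sketched without any Lucene dependency. In Lucene itself the corresponding pieces are `Query`, `Weight`, and `Scorer`; the code below does not use them and is not a drop-in implementation. The interface `SubQueryScorer`, the class `WrappingScorer`, and the Jaccard-overlap formula are hypothetical names and choices made for illustration. It shows only the shape: the wrapper holds the query terms, delegates to the wrapped scorer, and then adjusts the result with logic that sees both term lists.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the per-document score of the wrapped sub-query. */
interface SubQueryScorer {
    double score(List<String> docTerms);
}

/** Hypothetical wrapper: delegates to the sub-query, then applies custom logic. */
final class WrappingScorer implements SubQueryScorer {
    private final SubQueryScorer inner;
    private final Set<String> queryTerms;

    WrappingScorer(SubQueryScorer inner, Collection<String> queryTerms) {
        this.inner = inner;
        this.queryTerms = new HashSet<>(queryTerms);
    }

    @Override
    public double score(List<String> docTerms) {
        double base = inner.score(docTerms); // score from the wrapped query

        // Custom logic (assumed for the example): Jaccard overlap of the term sets.
        Set<String> doc = new HashSet<>(docTerms);
        Set<String> union = new HashSet<>(doc);
        union.addAll(queryTerms);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(doc);
        intersection.retainAll(queryTerms);
        return base * intersection.size() / union.size();
    }

    public static void main(String[] args) {
        SubQueryScorer constant = docTerms -> 2.0; // pretend sub-query score
        SubQueryScorer wrapped = new WrappingScorer(constant, Arrays.asList("a", "b"));
        System.out.println(wrapped.score(Arrays.asList("b", "c")));
    }
}
```

A real implementation would read the document terms from the index (for example from term vectors) rather than receive them as a list, which is the part this sketch leaves out.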

redragons