As I understand it, Solr scores matches in multi-valued fields based on a few factors. In particular, shorter fields score higher than longer ones, even if the search term appears nearer the beginning of the longer field.

The scoring factors I found in the above link:

termFreq: how often a term appears in the document
idf: how often the term appears across the index
fieldNorm: importance of the term, depending on index-time boosting and field length
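
To make the field-length effect concrete, here is a rough sketch. It assumes Lucene's classic TF-IDF similarity, where the length norm is 1/sqrt(number of terms in the field); newer Solr versions default to BM25, which normalizes length differently. It also treats each list value as a single term, which is a simplification, and it ignores index-time boosts and the lossy encoding of norms. The counts are those of the two example products in this question.

```python
import math


def length_norm(num_terms: int) -> float:
    """Classic TF-IDF style length norm: 1 / sqrt(number of terms in the field)."""
    return 1.0 / math.sqrt(num_terms)


# Number of values in each example list (assumption: one value = one term).
product_1_terms = 28
product_2_terms = 15

norm_1 = length_norm(product_1_terms)
norm_2 = length_norm(product_2_terms)

# The shorter field gets the larger norm, regardless of where the term sits.
print(f"PRODUCT 1 norm: {norm_1:.3f}")
print(f"PRODUCT 2 norm: {norm_2:.3f}")
```

Under these assumptions the norm depends only on the length of the field, so the position of herceptin in the list plays no part in it.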

However, I would like to boost values in a multi-valued field when the value is nearer the start of the list. For example:

When searching for documents containing herceptin, PRODUCT 1 should rank higher than PRODUCT 2. However, PRODUCT 2 scores higher due to its shorter field length.

PRODUCT 1

"herceptin", "succinimidyl", "radiolabeling", "labeling", "stability", "discovery", "potent", "cb2 agonists", "agonists", "linkers", "yield", "esters", "agent", "syntheses", "elimination", "ligands", "analogue", "chemistry", "functionality", "formation", "proteins", "product", "oxidizing", "agonist", "conjugated", "receptor", "activity", "model".

PRODUCT 2

"trastuzumab", "breast", "cancer", "patients", "breast cancer", "treatment", "growth", "antibody", "receptor", "human", "clinical", "chemotherapy", "herceptin", "combination", "results".

Any ideas on how I could achieve this?

Thanks

Thinkpad
  • You can attach a numeric payload (the position of the token in the list) to each token and then use that for scoring or boosting: https://solr.apache.org/guide/solr/latest/query-guide/function-queries.html#payload-function and https://solr.apache.org/guide/solr/latest/query-guide/other-parsers.html#payload-score-parser – MatsLindh Oct 04 '22 at 21:24
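
A minimal sketch of the payload idea from the comment above, under several assumptions: the target field uses a delimited payload filter with `|` as the delimiter and whitespace tokenization, the weight decays linearly with list position, and multi-word values are joined with underscores so they survive as one token. The helper name `to_payload_field`, the `min_weight` parameter, and the decay scheme are hypothetical and not part of Solr.

```python
def to_payload_field(values, min_weight=0.1):
    """Encode an ordered list as 'term|weight' tokens; earlier values weigh more."""
    n = len(values)
    tokens = []
    for i, value in enumerate(values):
        # Linear decay from 1.0 (first value) down to min_weight (last value).
        weight = 1.0 - (1.0 - min_weight) * i / max(n - 1, 1)
        # Assumption: whitespace tokenization, so join multi-word values.
        term = value.replace(" ", "_")
        tokens.append(f"{term}|{weight:.2f}")
    return " ".join(tokens)


# Shortened versions of the two example lists from the question.
product_1 = ["herceptin", "succinimidyl", "radiolabeling"]
product_2 = ["trastuzumab", "breast cancer", "herceptin"]

print(to_payload_field(product_1))
print(to_payload_field(product_2))
```

The resulting string would be indexed into a single payload-enabled field instead of the multi-valued one. At query time, something like `{!payload_score f=keywords_payload func=max v=herceptin}` could then rank by the stored weight, where `keywords_payload` is a made-up field name; the exact parser parameters should be checked against the linked Solr documentation.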
