We're updating our search system from Solr to Elasticsearch. We've already improved lots of things, but something we haven't got right yet is boosting a document's (product's) score by the popularity of the product (it's an ecommerce website).
This is what we have currently (with lots of irrelevant bits stripped out):
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query": "renal dog food",
          "fields": [ "family_name^20", "parent_categories^2", "description^0.2", "product_suffixes^8", "facet_values^5" ],
          "operator": "and",
          "type": "best_fields",
          "tie_breaker": 0.3
        }
      },
      "functions": [{
        "script_score": {
          "script": "_score * log1p(1 + doc['popularity_score'].value)"
        }
      }],
      "score_mode": "sum"
    }
  },
  "sort": [
    { "_score": "desc" }
  ]
}
The popularity_score field contains the total number of orders containing this item in the last 6 weeks. Some items will never have been ordered, while some will have had up to 30,000 orders (potentially a lot more as we continue to grow the business), so it's quite a big range.
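For reference, this is roughly how the multiplier in our script behaves across that range (just a quick Python illustration of the maths, not part of the query):

import math

# How the multiplier in our script ("_score * log1p(1 + popularity)")
# varies across the popularity range we actually see.
for orders in (0, 5, 100, 3000, 30000):
    multiplier = math.log1p(1 + orders)
    print(f"{orders:>6} orders -> multiplier {multiplier:.2f}")

# output:
#      0 orders -> multiplier 0.69
#      5 orders -> multiplier 1.95
#    100 orders -> multiplier 4.63
#   3000 orders -> multiplier 8.01
#  30000 orders -> multiplier 10.31

So even after log1p the multiplier ranges from roughly 0.7 up to about 10, which seems to be enough to swamp the differences in text relevance.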
The problem we have is that a document (product) might be a really good text match but not very popular, while another not-very-relevant product that only just matches the query jumps up the list because it is very popular. What we are looking for is something that will allow the popularity_score to be taken relative to the popularity_score of the other matching results, i.e. some form of normalisation, rather than being used as an absolute value (log1p doesn't seem to be enough on its own). Does anyone have any suggestions or ideas?
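To make what we mean by "relative" a bit more concrete, this is the kind of behaviour we have in mind, written as a Python sketch (the hit structure and the text_score / popularity_score fields here are purely illustrative; we don't know how to express this inside Elasticsearch, which is really the question):

import math

def rescore(hits):
    """Hypothetical rescoring: scale popularity by the highest popularity
    seen among the *matching* results, so the boost is relative to this
    result set rather than an absolute order count."""
    max_pop = max(hit["popularity_score"] for hit in hits) or 1
    for hit in hits:
        relative_pop = hit["popularity_score"] / max_pop  # now 0.0 .. 1.0
        # Text relevance stays dominant; popularity only nudges the ranking.
        hit["final_score"] = hit["text_score"] * (1 + relative_pop)
    return sorted(hits, key=lambda h: h["final_score"], reverse=True)

# i.e. the most popular matching product can at most double its text score,
# instead of multiplying it by ~10 as in our current script.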
Thank you!