SOLR - Rank shorter documents with fewer "EXTRA" words higher

Question

My SOLR documents are wine entities. When a user searches for the keywords "Haut Bailly" (a wine from Bordeaux), I would like the closely matching document with the shorter title to come first, e.g.:

"Château Haut-Bailly - Pessac-Léognan"
"Château Haut-Bailly La Parde de Haut Bailly - Pessac-Léognan"

However, with default Solr queries, the keywords "haut bailly" return this ranking:

"Château Haut-Bailly La Parde de Haut Bailly - Pessac-Léognan"
"Château Haut-Bailly - Pessac-Léognan"

Are there any parameters I could tune to increase the score of a match that is closer in length to the searched phrase and sits in a shorter field (here, the title), so that the right wine ("Château Haut-Bailly - Pessac-Léognan") comes first in the ranking?

Thank you!

You should share the schema for the field you are searching on. — Evan, Sep 18 '12 at 19:25

score 0 · Answer 1 · answered Sep 18 '12 at 17:13

I think default scoring would already do that, as long as you are not omitting norms with omitNorms. The longer document scores higher because it contains the search terms twice (most likely it is matching Haut-Bailly as well, or one of the words, depending on the tokenizer/parser you are using).
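To make that concrete, here is a rough sketch of the two per-field factors involved, assuming the classic Lucene TF-IDF similarity (tf = sqrt(freq), lengthNorm = 1/sqrt(number of terms)). The `tokenize` helper is a hypothetical stand-in for your real analyzer, and idf, queryNorm, coord and the lossy one-byte norm encoding are all left out, so the numbers are illustrative only and will not match Solr's debug output.

```python
import math

def tf(freq):
    # Classic Lucene TF-IDF: the term-frequency factor is sqrt(freq).
    return math.sqrt(freq)

def length_norm(num_terms):
    # Classic Lucene TF-IDF: 1/sqrt(number of terms in the field), boosts ignored.
    return 1.0 / math.sqrt(num_terms)

def tokenize(text):
    # Hypothetical analysis chain (an assumption, not your schema):
    # lowercase, then split on whitespace and hyphens.
    return text.lower().replace("-", " ").split()

def field_score(query, title):
    # Sum of tf * lengthNorm over the query terms; idf is identical for
    # both documents, so it is omitted here.
    tokens = tokenize(title)
    norm = length_norm(len(tokens))
    return sum(tf(tokens.count(q)) * norm for q in tokenize(query))

short_title = "Château Haut-Bailly - Pessac-Léognan"
long_title = "Château Haut-Bailly La Parde de Haut Bailly - Pessac-Léognan"

short_score = field_score("haut bailly", short_title)
long_score = field_score("haut bailly", long_title)

print(len(tokenize(short_title)), short_score)
print(len(tokenize(long_title)), long_score)
```

Under these assumptions the short title has 5 terms and the long one 10, but the long one contains each query term twice, so sqrt(2)/sqrt(10) equals 1/sqrt(5) and the two factors come out the same. The length penalty is cancelled by the repeated terms, which is consistent with the shorter title not winning automatically.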

score 0 · Answer 2 · answered Sep 19 '12 at 04:03


Using a duplicate-removal token filter might work: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.RemoveDuplicatesTokenFilterFactory
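For reference, a small sketch of what that filter does as far as I understand it: it drops a token only when it has the same text and the same position as a token already emitted. Tokens are modelled here as hypothetical (text, position_increment) pairs; this is a simplified illustration in Python, not the Lucene implementation.

```python
def remove_duplicates(tokens):
    """Drop tokens that repeat the text of an earlier token at the same position.

    tokens: list of (text, position_increment) pairs, where an increment
    of 0 means "stacked on the same position as the previous token".
    """
    kept = []
    seen_at_position = set()
    for text, pos_inc in tokens:
        if pos_inc > 0:
            # Moved to a new position: start a fresh set of seen texts.
            seen_at_position = set()
        elif text in seen_at_position:
            # Same position, same text: this is the duplicate that gets dropped.
            continue
        seen_at_position.add(text)
        kept.append((text, pos_inc))
    return kept

# Repeated words at different positions (as in the long title) are kept.
title_tokens = [("haut", 1), ("bailly", 1), ("la", 1), ("parde", 1),
                ("de", 1), ("haut", 1), ("bailly", 1)]
print(remove_duplicates(title_tokens) == title_tokens)

# A stacked duplicate (e.g. produced by a stemmer or synonym filter) is removed.
stacked = [("haut", 1), ("bailly", 1), ("bailly", 0)]
print(remove_duplicates(stacked))
```

If that reading is right, the second "haut" and "bailly" in the longer title sit at different positions and would survive the filter, so it is worth checking the analysis page in the Solr admin UI before relying on it.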

d whelan

