
I have a Solr collection whose documents have a field called "key_phrase".

I know it is easy to find all documents that contain every word in a search query (e.g. by setting mm=100% in edismax).

However, what I want is the reverse: return documents whose "key_phrase" contains only words from the search query and nothing else. The "key_phrase" field is also multivalued.

For example, given the search query 'kids soccer gear', the query should return a document whose "key_phrase" field contains "kids soccer". It should also return a document that has two "key_phrase" values such as 'kids gear' and 'any other word', since one of those values contains no word that is missing from the search query.

On the other hand, it should not return a document whose only "key_phrase" value is 'kids soccer gear for boy', since that value contains words ('for', 'boy') that are not present in the search query.
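To pin down the rule being asked for, here is a minimal sketch in plain Python. It is not Solr code; the function name `phrase_matches` and the lowercase, whitespace-based tokenization are assumptions made only for illustration.

```python
def phrase_matches(query, key_phrases):
    """Return True if at least one key_phrase value uses only words from the query.

    Hypothetical helper: tokenization is a plain lowercase whitespace split,
    which is an assumption and not how a Solr analyzer necessarily tokenizes.
    """
    query_words = set(query.lower().split())
    for phrase in key_phrases:
        words = phrase.lower().split()
        # A value qualifies when it is non-empty and every word is in the query.
        if words and all(word in query_words for word in words):
            return True
    return False


if __name__ == "__main__":
    query = "kids soccer gear"
    print(phrase_matches(query, ["kids soccer"]))
    print(phrase_matches(query, ["kids gear", "any other word"]))
    print(phrase_matches(query, ["kids soccer gear for boy"]))
```

The three calls correspond to the three documents described above: the first two should match and the third should not.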

1 Answer


You can try indexing the field with the ShingleFilterFactory.

For example:

<filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true"/>

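For details, refer to the Solr documentation for ShingleFilterFactory. A minimal analyzer that uses the filter with its default settings (shingles of up to two tokens, unigrams included) looks like this: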

<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.ShingleFilterFactory"/>
</analyzer>

Given the following input, the analyzer produces these tokens (the number in parentheses is the token position):

In: "To be, or what?"

Tokenizer to Filter: "To"(1), "be"(2), "or"(3), "what"(4)

Out: "To"(1), "To be"(1), "be"(2), "be or"(2), "or"(3), "or what"(3), "what"(4)
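The shingling step shown above can be imitated in a few lines of plain Python. This is only an illustrative sketch, not Solr's implementation: the function `shingles` and its parameters `max_size` and `output_unigrams` are hypothetical names that loosely mirror the filter's maxShingleSize and outputUnigrams attributes.

```python
def shingles(tokens, max_size=2, output_unigrams=True):
    """Build word n-grams (shingles) from a list of tokens.

    Hypothetical helper for illustration; the defaults mimic the default
    analyzer shown above (shingles of up to two tokens, unigrams kept).
    """
    min_size = 1 if output_unigrams else 2
    result = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            # Only emit a shingle if it fits inside the token list.
            if start + size <= len(tokens):
                result.append(" ".join(tokens[start:start + size]))
    return result


if __name__ == "__main__":
    # Tokens as the tokenizer would hand them to the filter.
    print(shingles(["To", "be", "or", "what"]))
    # prints ['To', 'To be', 'be', 'be or', 'or', 'or what', 'what']
```

Passing max_size=3 would additionally emit three-word shingles such as "To be or", which corresponds to the maxShingleSize="3" setting in the first snippet.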

Abhijit Bashetti