Some compound words can be written either as two words ("hand bag") or as one ("handbag"). If "handbag" is in the Solr index, how can I return it when the user searches for "hand bag"? I have tried the multi-word synonym parser, but that requires adding a rule such as handbag=>hand bag to the synonym file. The list of such words is very long, so I cannot keep adding entries to it by hand.

divibisan
Kamal Kishore

1 Answer

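Solr already provides a dictionary-based decompounding filter, solr.DictionaryCompoundWordTokenFilterFactory. Applied at index time, it splits a compound such as "handbag" into the dictionary words it contains ("hand", "bag") while keeping the original token, so a query for "hand bag" can match. See the Solr wiki for more details: https://wiki.apache.org/solr/LanguageAnalysis#Decompounding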
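As a rough sketch of what such a field type could look like in the schema: the field type name `text_decompound` and the file name `compound-dictionary.txt` are placeholders I made up, and the size limits are only starting values that you would tune for your data. The dictionary file is assumed to be a plain-text word list, one word per line, that contains at least "hand" and "bag".

```xml
<!-- Hypothetical field type; the name and dictionary file are illustrative only. -->
<fieldType name="text_decompound" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Keeps the original token and adds any dictionary words found inside it,
         e.g. "handbag" is indexed as "handbag", "hand" and "bag". -->
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="compound-dictionary.txt"
            minWordSize="5"
            minSubwordSize="3"
            maxSubwordSize="15"
            onlyLongestMatch="false"/>
  </analyzer>
  <analyzer type="query">
    <!-- No decompounding at query time: "hand bag" is simply tokenized
         into "hand" and "bag", which match the subwords indexed above. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Decompounding only on the index side keeps queries cheap, and documents need to be reindexed after the field type changes. Note that `minSubwordSize` must not exceed the length of the shortest part you want to detect ("bag" has three characters).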

spyk
  • With solr.DictionaryCompoundWordTokenFilterFactory I have to provide a dictionary file, and only the words listed in that file will be handled. That is no better than solr.SynonymExpandingExtendedDismaxQParserPlugin, where I have to add the words to the synonyms file manually. I don't want to maintain any such word file; I want all words to be handled dynamically. – Kamal Kishore Mar 14 '14 at 06:48
  • But you will need to provide some kind of dictionary anyway, so that the filter can detect valid word boundaries. I usually use the enable2k word list for English text. [link](http://www.morewords.com/help/) – spyk Mar 16 '14 at 11:50
  • Is it not possible to do this without providing any dictionary? That is, it should generate compound words for every string, regardless of whether the parts are dictionary words. – Kamal Kishore Mar 20 '14 at 16:29