Some compound words can be written either as two words ("hand bag") or as one ("handbag"). If "handbag" is in the Solr index, how can a search for "hand bag" return the "handbag" results? I have tried the multi-word synonym parser, but that requires adding a rule such as handbag=>hand bag to the synonym file. The list of such words is very long, so I cannot maintain it by hand.
1 Answer
Solr already provides a dictionary-based decompounding filter. Have a look at the Solr wiki for more details: https://wiki.apache.org/solr/LanguageAnalysis#Decompounding
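
To illustrate the idea behind dictionary-based decompounding, here is a minimal Python sketch. It is not Solr's actual implementation: the function name `decompound`, its parameters, and the tiny `WORDS` set are assumptions made up for this example. It keeps the original token and also emits every dictionary word found inside it, so a query for "hand bag" can match a document that contains "handbag".

```python
def decompound(token, dictionary, min_word_size=5,
               min_subword_size=2, max_subword_size=15):
    """Return the token followed by every dictionary word found inside it.

    Illustrative sketch only; the parameter names are hypothetical and
    merely resemble the kind of limits a decompounding filter uses.
    """
    parts = [token]
    lowered = token.lower()
    # Short tokens are left alone.
    if len(lowered) < min_word_size:
        return parts
    # Try every substring within the allowed subword lengths.
    for start in range(len(lowered) - min_subword_size + 1):
        longest = min(max_subword_size, len(lowered) - start)
        for length in range(min_subword_size, longest + 1):
            candidate = lowered[start:start + length]
            # Skip the whole token so it is not emitted twice.
            if candidate in dictionary and candidate != lowered:
                parts.append(candidate)
    return parts


# Hypothetical two-word dictionary, just for the example.
WORDS = {"hand", "bag"}

indexed_tokens = decompound("handbag", WORDS)
query_tokens = "hand bag".split()

# Every query token is now present among the indexed tokens.
print(indexed_tokens)
print(all(t in indexed_tokens for t in query_tokens))
```

The sketch also shows why the word list matters: without it, there is no way to tell that "hand" and "bag" are valid parts while "andb" or "dba" are not.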

spyk
-
With solr.DictionaryCompoundWordTokenFilterFactory, I have to provide a dictionary file, and only the keywords in that file will be handled. But the same can be done with solr.SynonymExpandingExtendedDismaxQParserPlugin, where I have to add keywords to the synonyms file manually. I don't want to maintain any such keyword file. I want all words to be handled dynamically. – Kamal Kishore Mar 14 '14 at 06:48
-
But you will need to provide some kind of dictionary anyway, so that the filter can detect valid word boundaries. I usually use the enable2k word list for English text: [link](http://www.morewords.com/help/) – spyk Mar 16 '14 at 11:50
-
Is it not possible to do this without providing any dictionary? That is, it should generate compound words for every string, regardless of any dictionary words. – Kamal Kishore Mar 20 '14 at 16:29