I am using Solr with Solarium and I am trying to implement searching for different words with the same meaning. For example, if a user searched for photo, it would also return results for photograph and photographs.

I have tried implementing the Hunspell and Snowball filter factories. Both seem to take care of plural forms of words.

Here is the entry from my schema:

<fieldType name="text_general" class="solr.TextField" multiValued="true" positionIncrementGap="100">
<analyzer type="index">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  <filter class="solr.SynonymFilterFactory" expand="true" synonyms="synonyms.txt" ignoreCase="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.HunspellStemFilterFactory" dictionary="en_US.dic" affix="en_US.aff" ignoreCase="true" />
</analyzer>
</fieldType>

Thanks!

MatsLindh
kyle
  • You are already using SynonymFilterFactory, aren't you? https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory Why not add 'photo', 'photograph', 'picture', etc. to the 'synonyms.txt' file so they mean the same? – Fbma Oct 16 '15 at 11:15
  • Yes, I am using SynonymFilterFactory. I am probably going to end up going this route, but we are still playing around with the stemming and trying to get that to be a better solution in the long run. Thanks! – kyle Oct 19 '15 at 21:43

1 Answer

Stemming reduces a word to its stem or root form. You have already tried SnowballPorterFilterFactory and HunspellStemFilterFactory; Solr also offers PorterStemFilterFactory, KStemFilterFactory and EnglishMinimalStemFilterFactory. Stemming filters cannot handle synonyms, though. To match different words with the same meaning, use SynonymFilterFactory and add the synonym words to synonyms.txt. Replacement synonyms, one-way expansion synonyms and multi-way expansion synonyms can all be defined there.
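
As a rough sketch of how this could look, here is the field type from the question with a stemmer added next to the synonym filter. The synonyms.txt entries in the comment are hypothetical examples, not taken from your setup, and PorterStemFilterFactory is only one of the stemmers listed above; swap in whichever you settle on. The stemmer is placed in both analyzers on the assumption that you want indexed and queried tokens reduced the same way, which normally also means reindexing.

```xml
<!--
  Hypothetical synonyms.txt entries, one rule per line:

    pic => photo                  (replacement: "pic" becomes "photo")
    snapshot => snapshot, photo   (one-way expansion)
    photo, photograph, picture    (multi-way expansion when expand="true")
-->
<fieldType name="text_general" class="solr.TextField" multiValued="true" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumption: stem at index time too, so indexed tokens match stemmed query tokens -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <!-- Synonyms are expanded before stemming, so each expanded term is stemmed as well -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

With rules like these, a query for photo should be expanded to photo, photograph and picture by the synonym filter, while the stemmer takes care of plurals such as photographs.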

Supimi