
I am using Solr 8.3.

After I added the Synonym Graph Filter to my managed-schema file, I noticed that if the query string contains a multi-word synonym, Solr treats that multi-word synonym as a single term and does not break it up, which suppresses the default search behavior.

Here "soap powder" is the search query which is also a multi-word synonym in the synonym file.

s(104254535,1,'soap powder',n,1,1).
s(104254535,2,'built-soap powder',n,1,0).
s(104254535,3,'washing powder',n,1,0).

I am sharing some screenshots to illustrate the problem:

Without Synonym Graph Filter (2 docs returned): [screenshot]

With Synonym Graph Filter (2 docs expected, only 1 returned): [screenshot]

Has anyone experienced this before? If yes, is there any workaround?

atinjanki
  • You can use a WordDelimiterGraphFilter to break your tokens after being processed by the synonyms filter. The reference guide is currently down so I can't link the documentation, but the WDGF allows you to configure how terms should be broken up, even after tokenization (which only happens at the start) – MatsLindh Mar 02 '20 at 07:42
  • Thanks @MatsLindh! I will try that and update here. At the moment I am using solr.StandardTokenizerFactory, which should ideally also treat whitespace as a delimiter. – atinjanki Mar 03 '20 at 13:02
  • Yes, but that happens _before_ the synonym filter is invoked. The expansion that happens in your synonym filter doesn't get processed by the standard tokenizer. – MatsLindh Mar 03 '20 at 13:16
  • @MatsLindh, I have added - `` However, this now does not expand the second synonym, 'built-soap powder'. The query is expanded as `Synonym(_text_:built _text_:soap) ((+_text_:soap +_text_:powder) (+_text_:washing +_text_:powder))`. If **generateWordParts** is set to 0, then 'built-soap powder' is ignored completely: `((+_text_:soap +_text_:powder) (+_text_:washing +_text_:powder))` – atinjanki Mar 17 '20 at 16:21
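
The parsed query quoted in the last comment may explain the missing document: each multi-word synonym becomes a group of required terms, so a document that contains only one of the words no longer matches. Here is a minimal Python sketch of that difference. It is a toy model, not Solr code; the two documents and all helper names are hypothetical, and the hyphenated synonym is left out for simplicity.

```python
# Toy model of the two query shapes discussed above. This is NOT Solr code.
# The documents are hypothetical and only illustrate the matching logic.
DOCS = {
    "doc1": "soap powder for washing clothes",
    "doc2": "liquid soap dispenser",
}


def tokens(text):
    """Split on whitespace and lowercase, a rough stand-in for a tokenizer."""
    return set(text.lower().split())


def match_any(query_terms, doc_tokens):
    """Default OR behavior: a document matches if it contains any query term."""
    return any(term in doc_tokens for term in query_terms)


def match_expanded(synonym_phrases, doc_tokens):
    """Shape of (+soap +powder) (+washing +powder): all terms of at
    least one synonym phrase must be present in the document."""
    return any(
        all(term in doc_tokens for term in phrase.split())
        for phrase in synonym_phrases
    )


def search(matcher, query):
    """Return the sorted names of the documents accepted by the matcher."""
    return sorted(
        name for name, text in DOCS.items() if matcher(query, tokens(text))
    )


# Without the synonym filter: soap OR powder
without_synonyms = search(match_any, ["soap", "powder"])

# With the synonym filter: (+soap +powder) OR (+washing +powder)
with_synonyms = search(match_expanded, ["soap powder", "washing powder"])

print(without_synonyms)  # ['doc1', 'doc2']
print(with_synonyms)     # ['doc1']
```

Under these assumptions the plain OR query matches both documents, while the expanded query matches only the one that contains both words of a synonym.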
