11

I want stopwords excluded, except when the search term is within double quotes.

E.g. "just like that" should also search for "that".

Is this possible?

halfer
Ruth

2 Answers

16

It depends on the configuration of the field you are querying.

If the index-time analyzer includes a StopFilterFactory, the stopwords are simply not indexed, so you cannot query for them afterward. But since Solr keeps the position of each term in the index, you can instruct it to increment the position value of the remaining terms to reflect the fact that there were originally other terms in between.

The enablePositionIncrements attribute is the key to achieving that:

<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>

If the querying analyzer also has the StopFilterFactory configured with the same settings, your query should work as expected.
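
To make that concrete, here is a rough sketch of a field type where both analyzers share the same stop filter settings. The field type name `text_stop`, the choice of `StandardTokenizerFactory`, and the `positionIncrementGap` value are assumptions for illustration, not something taken from the question. Also note that, according to a comment further down, the `enablePositionIncrements` attribute is rejected when luceneMatchVersion is 5.0 or later, so this only applies to older setups.

```xml
<!-- Hypothetical field type: the name and tokenizer are assumptions. -->
<fieldType name="text_stop" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: stopwords are dropped, but the remaining terms keep their original positions. -->
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
  </analyzer>
  <!-- Query time: same filter, same settings, so phrase positions line up with the index. -->
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
  </analyzer>
</fieldType>
```

Keep in mind that with this setup the stopword itself is still not in the index; a quoted phrase only matches because the gap left by the removed word is preserved on both sides.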

See this link for details: http://www.lucidimagination.com/search/document/CDRG_ch05_5.6.18

javanna
Pascal Dimassimo
  • So hard to find exact definition of the enablePositionIncrements attribute. Thanks dude! – BFree Mar 31 '11 at 18:12
  • **Deprecated**: This argument is invalid if luceneMatchVersion is 5.0 or later https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#FilterDescriptions-KeepWordFilter – Afshin Mehrabani Apr 16 '18 at 15:20
  • For Solr9 I was able to get this to work by adding `termVectors="true" termPositions="true" termOffsets="true" storeOffsetsWithPositions="true"` to my field definition. You may not need `storeOffsetsWithPositions` but I did need `term*`. See the docs on highlighting for some details on the options, though it applies to regular search from what I can tell. https://solr.apache.org/guide/solr/latest/query-guide/highlighting.html#schema-options-and-performance-considerations – markson edwardson Aug 13 '22 at 05:46
2

I've also had luck achieving similar results with the CommonGramsFilterFactory, by putting this in the appropriate place in the fieldType declaration:

<filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>

I'm not sure how well it works with enablePositionIncrements="true" set in the StopFilterFactory. You also need to be running Solr 1.4 or later to use this.
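
For context, the "appropriate place" might look something like the sketch below. The field type name `text_commongrams` and the tokenizer are assumptions, and the use of `solr.CommonGramsQueryFilterFactory` on the query side is an addition that is not part of the two lines above; it is the query-time counterpart that ships with Solr, but treat the overall layout as untested.

```xml
<!-- Hypothetical field type: the name and tokenizer are assumptions. -->
<fieldType name="text_commongrams" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Emits bigrams such as like_that alongside the single terms. -->
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <!-- Removes the standalone stopwords; the bigrams containing them stay. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Assumption: query-side variant, so phrase queries search on the bigrams. -->
    <filter class="solr.CommonGramsQueryFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>
```

The idea is that a quoted phrase like "just like that" should be matched through the like_that bigram, while a bare stopword in an unquoted query is still dropped by the stop filter.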

Philip Southam