
In my Rails application I have a Question model, set up with Sunspot Solr, with a field "text". I'd like to search that field with a logical OR between words. I've found that setting minimum_match to 1 solves my problem, but I'd also like to order the results so that questions matching more than one word are boosted. Is there a way to do this with Solr? The documentation isn't very helpful about ranking functions.

Edit: this is the full query I'm performing in the controller:

@questions = Question.solr_search do
  fulltext params[:query], :minimum_match => 1
end.results

2 Answers


According to http://wiki.apache.org/solr/SchemaXml,

The default operator used by Solr's query parser (SolrQueryParser) can be configured with

<solrQueryParser defaultOperator="AND|OR"/>

The default operator is "OR" if unspecified. It is preferable to not use or rely on this setting; instead the request handler or query LocalParams should specify the default operator. This setting here can be omitted and it is being considered for deprecation.

You can change defaultOperator in solr/conf/schema.xml, or you could use LocalParams to specify OR, with syntax like that shown at https://github.com/sunspot/sunspot/wiki/Building-queries-by-hand

It is true that Sunspot's default operator is "AND", as referenced in https://github.com/sunspot/sunspot/blob/master/sunspot_solr/solr/solr/conf/schema.xml

konyak

Logical OR is the default behavior of the Dismax request handler used in Sunspot.

Plus, the more words match, the higher the document's score (which sounds like what you want).

Question.search do
  fulltext 'best pizza'
end

...should return results that match one or both words (returning the ones that match both first):

  1. "Joe's has the best pizza by the slice in NYC"
  2. "It's hard to say which pizza place is the best"
  3. "Pizza isn't the best food for you"
  4. "I don't care whether pizza is bad for you!"
  5. "What do you think the best type of fast food is?"

minimum_match is useful only if you want to filter out low relevance results (where only a certain low number or percentage of terms were actually matched). This doesn't affect scoring or logical OR/AND behavior.
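To make the idea concrete, here is a toy sketch in plain Ruby of OR matching where documents that match more query terms rank first, with a minimum-match cutoff applied as a filter. It is not Solr's or Sunspot's actual scoring (which also weighs term frequency, inverse document frequency, and field length); the helper names `tokenize` and `rank_by_matched_terms` and the `DOCUMENTS` list are hypothetical and exist only for illustration.

```ruby
# Toy illustration only: this is NOT how Solr computes relevance.
# It just counts how many distinct query terms each document contains.

# Split text into lowercase word tokens (hypothetical helper).
def tokenize(text)
  text.downcase.scan(/[a-z0-9']+/)
end

# Returns [document, matched_term_count] pairs, best matches first.
# Documents matching fewer than minimum_match terms are dropped.
# Ties keep their original order because the index is part of the sort key.
def rank_by_matched_terms(documents, query, minimum_match: 1)
  terms = tokenize(query).uniq
  scored = documents.each_with_index.map do |doc, index|
    [doc, (terms & tokenize(doc)).size, index]
  end
  scored.select { |_, count, _| count >= minimum_match }
        .sort_by { |_, count, index| [-count, index] }
        .map { |doc, count, _| [doc, count] }
end

DOCUMENTS = [
  "I don't care whether pizza is bad for you!",
  "Joe's has the best pizza by the slice in NYC",
  "What do you think the best type of fast food is?",
  "Nothing relevant here"
].freeze

rank_by_matched_terms(DOCUMENTS, 'best pizza').each do |doc, count|
  puts "#{count} term(s): #{doc}"
end
```

In this sketch, raising `minimum_match` to 2 only removes the documents that match a single term; the relative order of the remaining documents does not change.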

Peter Dixon-Moses
  • In my experience, and as also noted [here](http://blog.websolr.com/post/1299174416/how-do-i-query-with-boolean-logic-using-sunspot) and [here](http://sunspot.github.com/sunspot/docs/Sunspot/DSL/StandardQuery.html) in the minimum match paragraph, the default behavior of the search is a logical AND between words. Also, I'm using this kind of filter `` for text fields, and I don't know whether it can cause behavior different from the default... – Matteo Depalo May 23 '12 at 21:39
  • It's easier to think about dismax scoring as favoring the documents which match the most terms from the query. This is the search behavior most people expect. The first link you posted explains it pretty well. `:minimum_match => 1` should be the default. It only comes into effect if you increase it (which limits the results to documents where more terms match). – Peter Dixon-Moses May 26 '12 at 02:43
  • NGramFilterFactory is only really useful in a couple of specific situations (EdgeNGramFilterFactory helps with prefix search/auto-complete). You're probably matching a bunch of results you don't want (e.g. searching for 'zz' will match anything with 'pizza'). Maybe you can post more about what you're trying to do. – Peter Dixon-Moses May 26 '12 at 02:54
  • My users search with a function that updates the results as they type, so I'm interested in partial matches to avoid showing them nothing while they are in the middle of typing a word. Maybe, as you note, EdgeNGramFilterFactory is better suited for this purpose. – Matteo Depalo May 26 '12 at 06:59
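The difference between the two filters discussed in the comments above can be sketched in plain Ruby. This is only an approximation of what the Solr filters emit at index time; the helper names `ngrams` and `edge_ngrams` are hypothetical, and the gram sizes (2 to 3) are assumed values, not taken from the asker's schema.

```ruby
# Rough sketch of the tokens each filter style would produce for one word.
# Gram sizes are illustrative assumptions, not the asker's real configuration.

# Every substring of length min..max, from any position in the word.
def ngrams(word, min:, max:)
  (min..max).flat_map do |n|
    (0..word.length - n).map { |start| word[start, n] }
  end
end

# Only substrings anchored at the start of the word (prefixes).
def edge_ngrams(word, min:, max:)
  upper = [max, word.length].min
  (min..upper).map { |n| word[0, n] }
end

all_grams  = ngrams('pizza', min: 2, max: 3)
edge_grams = edge_ngrams('pizza', min: 2, max: 3)

puts "n-grams:      #{all_grams.inspect}"
puts "edge n-grams: #{edge_grams.inspect}"
puts "'zz' matches n-grams?      #{all_grams.include?('zz')}"
puts "'zz' matches edge n-grams? #{edge_grams.include?('zz')}"
```

Because edge n-grams are all prefixes, a partially typed word such as 'pi' still matches 'pizza', while a mid-word fragment such as 'zz' does not.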