
I am using Solr 5.2.1 in one of my projects and have some doubts about the mm parameter of the dismax / edismax parser.

Questions:

  • Does mm apply regardless of the total number of input terms? The documentation says yes, but when I set it to 3 and enter a single term, I still get records, so it does not seem to be independent of the term count.
  • What is the default value of mm? The documentation says it is 100%, but in my tests it behaves like 1. By the way, I didn't find any configuration for mm in solrconfig.xml or schema.xml.

Any help? Thx.


@Update:

Query URL for the 1st question:

http://localhost:8983/solr/demo/select?q=new+york&start=0&wt=json&indent=true&defType=edismax&qf=title&mm=3&stopwords=true&lowercaseOperators=true

There are 2 terms, new and york. The query results are:

  • mm not specified: returns 3 records,
  • mm = 2: returns 1 record,
  • mm = 3: also returns 1 record.

So I guess Solr first lowers mm to the number of terms in the query before running it.
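The three test requests above can be reproduced with a minimal sketch like the following. The host, core name (`demo`), and `qf=title` come from the URL above; the helper name `build_query_url` is hypothetical, and `debugQuery=true` is added so the parsed query shows up in the response. Note that `urlencode` turns the space in `new york` into `+`, while a literal `+` operator would be sent as `%2B`.

```python
from urllib.parse import urlencode

# Host and core name are taken from the question; adjust for your setup.
BASE_URL = "http://localhost:8983/solr/demo/select"


def build_query_url(q, mm=None):
    """Build an edismax select URL on the title field (hypothetical helper)."""
    params = {
        "q": q,
        "wt": "json",
        "indent": "true",
        "defType": "edismax",
        "qf": "title",
        "debugQuery": "true",  # include the parsed query in the response
    }
    if mm is not None:
        params["mm"] = str(mm)  # only send mm when it is explicitly set
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # One request without mm, then mm=2 and mm=3, as in the test above.
    for mm in (None, 2, 3):
        print(build_query_url("new york", mm))
```

Pasting each printed URL into a browser and comparing `numFound` and `parsedquery_toString` should show how the effective mm changes between requests.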

Eric
  • Can you provide the query URL you are using for the search in question 1? – YoungHobbit Aug 26 '15 at 09:28
  • @abhishekbafna I updated the question with the URL and added more detail about what happens when I use different values of `mm`. – Eric Aug 26 '15 at 10:55
  • I would like to correct the description of question 1: `new` and `york` will not be considered two terms, because you are specifying the `+` operator between them. It is treated as `whitespace` in Solr/Lucene, so `new york` and `new+york` are the same for Solr. It is treated as one term, which may later be decomposed into multiple tokens by your analyzer. You can verify this in your logs for the search query. – YoungHobbit Aug 26 '15 at 11:20
  • `+` operator [documentation](https://lucene.apache.org/core/3_6_0/queryparsersyntax.html#+). When it appears before a term, it works as an operator. Please try these two queries: `new york` and `new +york`. The first is essentially an OR of `new` and `york`, whereas in the second the term `york` is mandatory. – YoungHobbit Aug 26 '15 at 13:07
  • @abhishekbafna I did enter `new york` in the Solr admin, and it converted that to `new+york` automatically in the URL. I think that means 2 terms, right? I am not sure. – Eric Aug 26 '15 at 13:11
  • @abhishekbafna I just did a little test in the Solr admin: when I enter `new york` and enable `debugQuery`, the parsed query in the result is `"parsedquery_toString": "+(((title:new) (title:york))~1)"`. That's 2 terms, and I think the count in `mm` refers to the number of terms after the parser has processed the raw input. Also, I think a space is the basic separator between terms in Solr. As for URL encoding, a space becomes `+` and `+` becomes `%2B`, so they are different. – Eric Aug 27 '15 at 03:16

2 Answers


Answer to question 2: If no mm parameter is specified in the query or as a default in solrconfig.xml, the effective value of the q.op parameter (from the query, from a default in solrconfig.xml, or from the 'defaultOperator' option in schema.xml) determines the behavior. So the default value of mm depends on q.op: if q.op is effectively AND, then mm=100%; if q.op is OR, then mm=1.
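As a rough illustration of the fallback described in this answer (not Solr's actual code), the rule can be written as a tiny function. The name `default_mm` is hypothetical, and the sketch assumes q.op has already been resolved from the query, solrconfig.xml, or schema.xml.

```python
def default_mm(q_op="OR"):
    """Return the mm value assumed when mm is not set anywhere.

    Mirrors the rule stated in the answer: AND -> "100%", OR -> "1".
    q_op is the already-resolved default operator.
    """
    return "100%" if q_op.strip().upper() == "AND" else "1"


if __name__ == "__main__":
    print(default_mm("AND"))
    print(default_mm("OR"))
```

To avoid depending on this fallback, pass `mm` (and, if needed, `q.op`) explicitly in the request.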

YoungHobbit
  • That makes sense, because my `q.op` seems to be `OR`. But the documentation doesn't seem to mention that. – Eric Aug 26 '15 at 11:07
  • It is mentioned. Please check the `DisMax Parameters` table and the `The mm (Minimum Should Match) Parameter` section at the bottom of [DisMax Query Parser](https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser). – YoungHobbit Aug 26 '15 at 11:09
  • I saw it now, nice tip. – Eric Aug 26 '15 at 11:13

From the Min Number Should Match Specification Format:

No matter what number the calculation arrives at, a value greater than the number of optional clauses, or a value less than 1 will never be used. (ie: no matter how low or how high the result of the calculation result is, the minimum number of required matches will never be lower than 1 or greater than the number of clauses.)

This means the required number will never be less than one or greater than the number of terms present in the query. If there are three terms in the query and mm is five, a document that matches all three terms is still a hit, because the effective requirement is capped at the three terms available. A document that matches every term will always be returned; otherwise every query with fewer than mm terms would return zero results.
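The capping rule quoted above can be sketched as a small function. This is only a model of the quoted sentence, not Solr's implementation: the name `effective_mm` is hypothetical, and it handles plain integer mm values only (no percentages or conditional specs).

```python
def effective_mm(requested, optional_clauses):
    """Clamp an integer mm to the range [1, number of optional clauses].

    Follows the quoted rule: never lower than 1, never greater than
    the number of clauses. Returns 0 if there are no optional clauses.
    """
    if optional_clauses < 1:
        return 0
    return max(1, min(requested, optional_clauses))


if __name__ == "__main__":
    # Two terms (new, york): mm=2 and mm=3 end up with the same requirement.
    for mm in (2, 3):
        print(mm, effective_mm(mm, 2))
```

Under this model, mm=2 and mm=3 both resolve to 2 for the two-term query `new york`, which is consistent with both requests returning the same single record in the question.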

MatsLindh
  • That matches my test results, but page 258 of the Solr 5.2.1 reference says `Defines the minimum number of clauses that must match, regardless of how many clauses there are in total.`, which does not seem quite accurate. – Eric Aug 26 '15 at 11:11
  • Yes, it's a bit confusing. They're talking about the case where there are more clauses than the `mm` parameter. As long as there are fewer clauses than `mm`, a document matching all of them will give a result. – MatsLindh Aug 26 '15 at 11:42