
I am doing some data analysis with Solr and I am stuck on one part that could provide a lot of value to me.

I have a Solr collection with a number of numeric fields that represent ranges, for example:

pr_high_max = 10.35
pr_high_min = 8.15

pr_med_max = 12.55
pr_med_min = 10.40

Each min/max pair defines a price range. The high/med part is derived from the number of items in the current group that fall within that range; there is some funky math behind it that I am not going to go into.

I need to query Solr with an item price and get back a document that has that price in one of its ranges. I also need to be able to assign a weight to the match, so that high fields have priority over med fields. This is essentially a reverse range search.

I am querying other fields as well, so this match should be included in the overall weighting. It also can't go in fq, because an item that doesn't match this criterion may still match the others.

So far, I was able to assemble this function query:

 prboost:sum(
 if(and(query({!edismax v='pr_high_max:[8 TO *]' }),query({!edismax v='pr_high_min:[* TO 8]'})),5,0),
 if(and(query({!edismax v='pr_med_max:[8 TO *]' }),query({!edismax v='pr_med_min:[* TO 8]'})),3,0),
 if(and(query({!edismax v='pr_low_max:[8 TO *]' }),query({!edismax v='pr_low_min:[* TO 8]'})),1,0)
 )

Here 8 is the price that I pass in. The function checks whether the price falls within each of the ranges and returns 5 for high, 3 for med, and 1 for low; because the values are summed, a price that matches several ranges gets the combined value. Ideally, I would like to include this in the regular weighting, but I wasn't able to add it as a subquery. Additionally, when I tried to boost on it, I got back "Infinite Recursion detected parsing query 'pr_high_max:[8 TO *]'".
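To make the intended scoring concrete, here is a minimal plain-Python model of what the function query above is meant to compute for a single document. It is only a sketch for reasoning about the logic, not something Solr runs: `pr_boost` and `TIER_WEIGHTS` are hypothetical names, and it assumes Solr's `[a TO b]` range syntax is inclusive on both ends.

```python
from typing import Mapping

# Tier weights taken from the function query above: high=5, med=3, low=1.
TIER_WEIGHTS = {"high": 5, "med": 3, "low": 1}


def pr_boost(doc: Mapping[str, float], price: float) -> int:
    """Sum the weight of every tier whose [min, max] range contains price."""
    total = 0
    for tier, weight in TIER_WEIGHTS.items():
        lo = doc.get(f"pr_{tier}_min")
        hi = doc.get(f"pr_{tier}_max")
        # A tier with a missing bound is treated as "no match" (assumption).
        if lo is not None and hi is not None and lo <= price <= hi:
            total += weight
    return total


# Sample document using the field values from the question.
doc = {
    "pr_high_max": 10.35, "pr_high_min": 8.15,
    "pr_med_max": 12.55, "pr_med_min": 10.40,
}
print(pr_boost(doc, 9.0))   # 5: inside the high range
print(pr_boost(doc, 11.0))  # 3: inside the med range
print(pr_boost(doc, 8.0))   # 0: below every range
```

Note that with this sample document a price of 8 matches nothing, since the high range starts at 8.15.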

Has anyone run into something like this before? Any ideas?

Also, I have control of the data going in, so I can easily massage it to represent the ranges in a different way if that would make this easier to solve.

Thanks in advance

nick_v1

1 Answer

Alright, it took a while, but I figured it out: I had to add an empty boost parameter to each nested query (presumably because each nested edismax query otherwise inherits the request's boost, which contains that same query, hence the infinite recursion). Here is what works. The sum starts at 1 because, without it, any extra boost would result in a value less than 1 and actually penalize the document. Each subquery runs, and depending on which ranges match, the boost is increased by 1, 5, 10, or 15 percent.

 sum(1,
 if(and(query({!edismax boost='' v='pr_shigh_max:[$doc->{pr} TO *]'}),query({!edismax boost='' v='pr_shigh_min:[* TO $doc->{pr}]'})),0.15,0),
 if(and(query({!edismax boost='' v='pr_high_max:[$doc->{pr} TO *]'}),query({!edismax boost='' v='pr_high_min:[* TO $doc->{pr}]'})),0.1,0),
 if(and(query({!edismax boost='' v='pr_med_max:[$doc->{pr} TO *]'}),query({!edismax boost='' v='pr_med_min:[* TO $doc->{pr}]'})),0.05,0),
 if(and(query({!edismax boost='' v='pr_low_max:[$doc->{pr} TO *]'}),query({!edismax boost='' v='pr_low_min:[* TO $doc->{pr}]'})),0.01,0)
 )
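Since the expression above is the same clause repeated per tier with the price interpolated, it can be generated rather than written by hand. The following is a rough sketch in Python of one way to do that; `build_price_boost` and `TIER_BONUSES` are hypothetical names, the tier names and bonuses are copied from the expression above, and how the resulting string is sent to Solr is left out.

```python
from typing import Mapping

# Tier name -> bonus, as used in the working expression above.
TIER_BONUSES = {"shigh": 0.15, "high": 0.1, "med": 0.05, "low": 0.01}


def build_price_boost(price: float, tiers: Mapping[str, float]) -> str:
    """Build the sum(1, if(...), ...) boost expression for one price."""
    clauses = []
    for tier, bonus in tiers.items():
        # Both nested queries carry boost='' so they do not recurse.
        in_range = (
            f"and(query({{!edismax boost='' v='pr_{tier}_max:[{price} TO *]'}}),"
            f"query({{!edismax boost='' v='pr_{tier}_min:[* TO {price}]'}}))"
        )
        clauses.append(f"if({in_range},{bonus},0)")
    # Start at 1 so that a document matching no range is left unchanged.
    return "sum(1," + ",".join(clauses) + ")"


print(build_price_boost(8, TIER_BONUSES))
```

Keeping the tiers in one mapping means a new tier or a changed bonus is a one-line edit instead of another copy of the clause.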
nick_v1