2

In my project, we use Solr to index many different kinds of documents, for example Books and Persons, with some common fields (like the name) and some type-specific fields (like the category, or the group a person belongs to).

We would like to run queries that find both books and persons, with separate filters applied to each document type. Something like:

  • find all Books and Persons with "Jean" in the name and/or content
  • but only Books from category "fiction" or "fantasy"
  • and only Persons from the group "pangolin"
  • everything sorted by score

A very simple way to do that would be:

q = name:jean content:jean
&
fq= 
    (type:book AND category:(fiction fantasy)) 
    OR 
    (type:person AND group:pangolin)
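
For illustration, the request above could be assembled like this in Python. This is only a sketch: `build_params` is a hypothetical helper, not part of any Solr client, and it only produces the URL-encoded query string that would be appended to a select handler URL.

```python
from urllib.parse import urlencode, parse_qs

def build_params(q, filters):
    """Return a URL-encoded query string with one q and one fq per filter."""
    # A list of pairs (rather than a dict) lets the fq key repeat.
    pairs = [("q", q)] + [("fq", f) for f in filters]
    return urlencode(pairs)

single_fq = build_params(
    "name:jean content:jean",
    ["(type:book AND category:(fiction fantasy)) OR (type:person AND group:pangolin)"],
)
print(single_fq)
```

With this shape the whole OR expression is a single fq, so it is cached as one entry and is only reused when exactly the same combination is requested again.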

But since fq results are cached, I'd prefer something that lets me use simpler, and therefore more reusable, fq like:

  • fq=type:book,
  • fq=type:person,
  • fq=category:(fiction fantasy),
  • fq=group:pangolin.

Is there a way to tell Solr to merge or combine several filter queries, something like 'grouping' fq together?

I read a bit about nested queries with _query_, but the scarce documentation about it makes me think it's not the solution I'm looking for.

Xavier Portebois
  • I'm pretty sure this is not possible. Each filter query (fq) is calculated independently and results in a cached docset (an unordered set of doc ids). Fq's are useful (read: fast) because when you specify a couple of them, the docsets get intersected, which quickly truncates the search space. In other words: specifying multiple filter queries logically results in AND'ing them – Geert-Jan Oct 05 '11 at 18:28
  • It's a shame that Solr Fieldcollapsing http://wiki.apache.org/solr/FieldCollapsing doesn't support appending fq's AFTER you specified the field to group on (in your case 'type') that would have pretty much solved it. Still I hope that link might prove useful, as it's a good way to represent N top-documents per type. Although I realize that's not 100% what you're asking here.. – Geert-Jan Oct 05 '11 at 18:40
  • I was pretty sure too that this is not possible, but I was hoping I had missed something :) It would be really cool to be able to set groups of fq in a query, something like `fq={group:A}...&fq={group:A}...&fq={group:B}`, so Solr would run the query, then filter with something like "all fq from group A OR all fq from group B" instead of its simple "all fq". Well, I guess I have to use the non-present-field condition solution I describe in my comment on @Paige Cook's answer. Thanks anyway for the answer! – Xavier Portebois Oct 05 '11 at 19:53
  • Ahhh... https://issues.apache.org/jira/browse/SOLR-1223.. not really steaming hot with development though.... – Geert-Jan Oct 05 '11 at 19:56

3 Answers

3

As Geert-Jan mentioned in his comment, being able to OR filter queries together is a requested Solr feature, but it has seen very little activity so far: https://issues.apache.org/jira/browse/SOLR-1223

So I managed to simulate what I want in a simple way:

  • for each field a document type can have, we always have to set a value (so if, in my example, a Book can have no category, at index time we still have to set something like category=noCategoryCode)
  • when filtering on one of these fields in a query across multiple types, we add a non-present condition to the filter, so fq=category:fiction becomes fq=category:fiction (*:* AND -category:*)

This way, all other types (like Person) pass through this filter, and the filter stays fairly atomic and frequently reused, so caching is still useful.
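
As a rough sketch of that rewriting step, a small helper could generate the extended filter string from a field name and its accepted values. `with_missing_clause` is a hypothetical name introduced here for illustration; the sketch assumes the default operator between the two clauses is OR and does no escaping of special characters in the values.

```python
def with_missing_clause(field, values):
    """Build an fq matching the given values OR documents lacking the field."""
    wanted = " ".join(values)
    # The second clause lets documents without the field through,
    # e.g. Persons when filtering on category.
    return f"{field}:({wanted}) (*:* AND -{field}:*)"

fq_category = with_missing_clause("category", ["fiction", "fantasy"])
fq_group = with_missing_clause("group", ["pangolin"])
print(fq_category)
print(fq_group)
```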

So, my full example becomes:

q = name:jean content:jean
&
fq= type:(book person)
&
fq= category:(fiction fantasy) (*:* AND -category:*)
&
fq= group:(pangolin) (*:* AND -group:*)
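
To check the reasoning, here is a toy simulation in plain Python (no Solr involved). It models each fq as a set of document ids and intersects them, as described in the comments on the question. The documents and their values are made up, and every book is assumed to carry a category and every person a group, as required by the first point above.

```python
# Made-up documents; field names follow the question.
DOCS = [
    {"id": 1, "type": "book", "category": "fiction"},
    {"id": 2, "type": "book", "category": "cooking"},
    {"id": 3, "type": "person", "group": "pangolin"},
    {"id": 4, "type": "person", "group": "aardvark"},
]

def ids(pred):
    """Return the set of ids of the documents matching a predicate."""
    return {d["id"] for d in DOCS if pred(d)}

# fq=type:(book person)
f_type = ids(lambda d: d["type"] in ("book", "person"))
# fq=category:(fiction fantasy) (*:* AND -category:*)
f_cat = ids(lambda d: d.get("category") in ("fiction", "fantasy")
            or "category" not in d)
# fq=group:(pangolin) (*:* AND -group:*)
f_group = ids(lambda d: d.get("group") == "pangolin" or "group" not in d)

# Multiple fq parameters are intersected (AND'ed).
workaround = f_type & f_cat & f_group

# The single OR'd filter from the question, for comparison.
combined = ids(lambda d:
               (d["type"] == "book" and d.get("category") in ("fiction", "fantasy"))
               or (d["type"] == "person" and d.get("group") == "pangolin"))

# Plain per-field filters without the missing-field clause.
naive = (ids(lambda d: d.get("category") in ("fiction", "fantasy"))
         & ids(lambda d: d.get("group") == "pangolin"))

print(workaround, combined, naive)
```

On this toy data the three extended filters select the same documents as the single OR'd filter, while the plain per-field filters intersect to nothing.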

Still, I can't wait for SOLR-1223 to be patched :)

Xavier Portebois
0

You can apply multiple filter queries at the same time:

q=name:jean content:jean&fq=type:book&fq=type:person&fq=category:(fiction fantasy)&fq=group:pangolin

Paige Cook
  • Since 'book' and 'person' are disjoint, fq=type:book&fq=type:person would return 0 results. – Geert-Jan Oct 05 '11 at 18:21
  • I have to agree with @Geert-Jan: my whole problem is that `fq=category:(fantasy fiction)` will remove all possible persons, and `fq=group:pangolin` will throw away all books. A possible approach would be to add a non-present-field condition, something like `fq=group:pangolin (-group:[* TO *])`: it would take all persons in group "pangolin" and take also all documents without the field group (so books). I was just hoping for a better way to do it. – Xavier Portebois Oct 05 '11 at 19:46
0

Perhaps I am not understanding your issue, but the only difference between a query and a filter is that the filter is cached. If you don't care about the caching, just modify your query:

real query +((+type:book +category:fiction) (+type:person +group:pangolin))

Xodarap