I created a query to find parent documents in SOLR by filtering on both child and parent properties. I have simplified it for this example to:
{!parent which='content_type:"parent" AND field_a="value" AND field_b="value"'}((child_field_x:("VALUE" ) AND field_y:value))
Only parent documents have 'content_type:parent'. SOLR only returns parent documents, so that works.
Now I'm creating crossings between two other fields, let's say field_c and field_d. For every possible combination of values of C and D, I want to count the number of parent documents. For each combination of values I currently run this:
{!parent which='content_type:"parent" AND field_a="value" AND field_b="value" AND field_c="value" AND field_d="value"'}((child_field_x:("value" ) AND child_field_y:value))
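To show how I generate one query per crossing, here is a minimal Python sketch. The helper name build_crossing_query and the value lists are hypothetical placeholders, not my real data; the query text itself is just the simplified query above with the C and D values filled in.

```python
from itertools import product


def build_crossing_query(c_value, d_value):
    """Return the block-join query string for one (field_c, field_d) pair.

    The query text mirrors the simplified query from the post verbatim;
    only the field_c and field_d values are substituted.
    """
    parent_filter = (
        'content_type:"parent" AND field_a="value" AND field_b="value" '
        f'AND field_c="{c_value}" AND field_d="{d_value}"'
    )
    child_query = '((child_field_x:("value" ) AND child_field_y:value))'
    return "{!parent which='" + parent_filter + "'}" + child_query


# Hypothetical value lists; the real ones come from the index.
c_values = ["c1", "c2"]
d_values = ["d1", "d2", "d3"]

# One query per combination of C and D values.
queries = {
    (c, d): build_crossing_query(c, d)
    for c, d in product(c_values, d_values)
}

for pair, query in queries.items():
    print(pair, query)
```

Each generated query is sent separately, and the counts it returns are what I add up afterwards.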
When I add up the results of all these queries, however, I get a much larger number than with the original query above. The original query gives me 15k results; if I add up all the crossings I get 80k results.
I did some testing and noticed that, for one specific value of C and one specific value of D, these were the results:
Filtering only on C: 12.522 documents
Filtering only on D: 15.205 documents
Filtering on both (AND): 12.349 documents
Filtering on C and negating D: 3.265 documents -> I expected the difference between C and D, which would be 2.683
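To make the mismatch concrete, here is a minimal Python sketch that uses only the counts above, reading the dots as thousands separators (an assumption on the notation). If the filters behaved like plain set operations, every document matching C would either match D or not, so the "both" count and the "C and not D" count should add up to the "only C" count.

```python
# Counts reported above, with "." read as a thousands separator (assumption).
count_c = 12_522        # filtering only on C
count_c_and_d = 12_349  # filtering on both C and D
count_c_not_d = 3_265   # filtering on C and negating D

# Under set semantics: count(C) == count(C AND D) + count(C AND NOT D).
parts = count_c_and_d + count_c_not_d
surplus = parts - count_c

print("sum of the two parts:", parts)
print("documents more than 'only C':", surplus)
```

The two parts add up to more than the count for C alone, which is the same kind of overcounting I see when summing the crossings.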
Both field_c and field_d are single-valued.
If I remove the child query (everything after }), but leave the rest as {!parent which='(..) I do get the correct sum. It's only when I add the child document query that it doesn't add up anymore.
I just don't get it, why does this happen? I have a feeling I'm not getting something from the concept of child documents, but can't seem to find anything looking at examples and documentation. It does seem to correctly filter on the parent properties, but probably the child documents are not queried correctly, or so it seems.
UPDATE: I did some extra testing by looking at the results generated. There are no duplicates in the result set, and the parent documents returned are correct for the parent filters. I haven't been able to check the child documents that belong to those companies yet, but the problem seems to be there.
One thing I noticed: if I change the default query operator to 'AND' instead of 'OR', I get 0 results in every crossing. Since my query already uses only 'AND', I don't understand why this would be the case.