6

When I run a search with result grouping and set a group limit, numFound is the same as when I don't use the limit.

It looks like Solr first performs the search and calculates numFound, and only then limits the results.

Because of this I can't use pagination and similar features. Is there a workaround, or did I miss something?


Example:

======================================
| id |  publisher | book_title      |
======================================
| 1  | A1         | Title Book      |
| 2  | A1         | Book title 123  |
| 3  | A1         | My book         |
| 4  | B2         | Hi book title   |
| 5  | B2         | Another Book    |

If I perform query:

q=book_title:book
&group=true 
&group.field=publisher 
&group.limit=1
&group.main=true 

I will get numFound 5 but only 2 in the results.

"response": {
    "numFound": 5,
    "docs": [
        {
            "book_title": "My book",
            "publisher":  "A1"
        },
        {
            "book_title": "Another Book",
            "publisher":  "B2"
        }
    ]
}
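
To make the mismatch concrete, here is a small Python simulation of the example above. It is not Solr itself: the `matches` function is a crude stand-in for Solr's text analysis, and the sketch ignores sorting, so it says nothing about which document Solr picks within each group.

```python
from collections import defaultdict

# The five rows from the example table.
docs = [
    {"id": 1, "publisher": "A1", "book_title": "Title Book"},
    {"id": 2, "publisher": "A1", "book_title": "Book title 123"},
    {"id": 3, "publisher": "A1", "book_title": "My book"},
    {"id": 4, "publisher": "B2", "book_title": "Hi book title"},
    {"id": 5, "publisher": "B2", "book_title": "Another Book"},
]


def matches(doc, term):
    # Assumption: a case-insensitive whole-word match approximates q=book_title:book.
    return term.lower() in doc["book_title"].lower().split()


# Step 1: the query matches documents; this count is what numFound reports.
matched = [d for d in docs if matches(d, "book")]
num_found = len(matched)

# Step 2: matched documents are bucketed by the grouping field.
groups = defaultdict(list)
for d in matched:
    groups[d["publisher"]].append(d)

# Step 3: group.limit trims each bucket; it never changes num_found.
group_limit = 1
returned = [d for members in groups.values() for d in members[:group_limit]]

print(num_found, len(groups), len(returned))
```

The count of matched documents and the count of returned documents are computed at different steps, which is why limiting each group leaves numFound untouched.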
tasmaniski

5 Answers

4

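Set group.ngroups to true. The response will then include an ngroups count (the number of groups) next to matches (the number of matching documents):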

"grouped": {
"bl_version_id": {
  "matches": 53,
  "ngroups": 18,
  "groups": [
    {
...
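
A minimal Python sketch of reading both counts from such a response. The sample body is hypothetical: it is shaped like the snippet above but uses the `publisher` field and the counts from the question. `group.main` is left out of the parameters because the simplified format has no `grouped` section.

```python
import json
from urllib.parse import urlencode

# Query parameters for the grouped request.
params = {
    "q": "book_title:book",
    "group": "true",
    "group.field": "publisher",
    "group.limit": 1,
    "group.ngroups": "true",
    "wt": "json",
}
query_string = urlencode(params)

# Hypothetical response body (assumption, not real Solr output).
sample_body = """
{"grouped": {"publisher": {"matches": 5, "ngroups": 2, "groups": []}}}
"""


def group_counts(body, field):
    """Return (matches, ngroups) for one grouping field."""
    grouped = json.loads(body)["grouped"][field]
    return grouped["matches"], grouped["ngroups"]


matches, ngroups = group_counts(sample_body, "publisher")
print(query_string)
print(matches, ngroups)
```

With `group.limit=1`, `ngroups` is the number to paginate on, while `matches` plays the role numFound played before.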
Kanu
  • it's important to not use `group.main=true`, which discards this information – rabudde Jan 07 '16 at 20:22
  • @rabudde well said, the result obtained without it would have enough information to sort the groups manually. – SRB May 22 '16 at 15:15
1

I had the same problem, couldn't find a way to fix the root cause, but I will share my solution as a workaround.

What I did is

  1. Facet on the field I'm grouping by.
  2. Count the number of unique facet values. This matches the number of groups (2 in your case).

Add these faceting parameters to your query:

&facet=true
&facet.limit=-1
&facet.field=publisher

Notes:

  • This is a bit expensive, but it's the only way that worked for me (so far).
  • This will only work if publisher is not multi-valued
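
The counting step can be sketched in Python as follows. The sample response is hypothetical, and it assumes Solr's default JSON facet layout, a flat list that alternates value and count. Facet values with a count of 0 are skipped, since a publisher with no matching documents forms no group (`C3` is an invented publisher included only to show that case); setting `facet.mincount=1` on the request should have the same effect.

```python
import json

# Hypothetical facet section of a Solr JSON response (assumption).
sample_body = """
{"facet_counts": {"facet_fields": {"publisher": ["A1", 3, "B2", 2, "C3", 0]}}}
"""


def count_groups(body, field):
    """Count facet values that have at least one matching document."""
    flat = json.loads(body)["facet_counts"]["facet_fields"][field]
    counts = flat[1::2]  # odd positions hold the counts
    return sum(1 for c in counts if c > 0)


print(count_groups(sample_body, "publisher"))
```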
mjalajel
1

numFound indicates the total number of documents matched by the current query, so 5 is correct in your case. With group.limit=1, Solr returns at most 1 document per group, no matter how many documents that group contains. I suggest using group.limit=-1 in your query; it will return all 5 documents in the result.

For more information please check details given below.

solr fieldcollapsing and maximum group.limit

http://wiki.apache.org/solr/FieldCollapsing

Meet
1

group.limit isn't a real limit on the result set; it only sets the number of rows to return per group.

There is no easy solution implemented in Solr for my problem.

You may find an answer here: Solr User Group

tasmaniski
-1

numFound refers to the total number of documents Solr found for your query, which is also the value you need in order to paginate that query.

Pagination in Solr works much like it does in a regular RDBMS: you use the start and rows parameters. For instance, the following query fetches 10 documents starting from offset 20:

?q=your_key_word&start=20&rows=10

This query fetches the content for the target page (page 3 in this case, assuming 10 docs per page). And instead of executing another query to get the total number of documents and work out the number of pages, you get that value for free as "numFound".
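
The arithmetic can be sketched in Python like this. The helper names `page_params` and `total_pages` are made up for illustration, and pages are assumed to be numbered from 1.

```python
import math
from urllib.parse import urlencode


def page_params(query, page, rows=10):
    """Build start/rows parameters for a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return {"q": query, "start": (page - 1) * rows, "rows": rows}


def total_pages(num_found, rows=10):
    """Number of pages needed to show num_found documents."""
    return math.ceil(num_found / rows)


# Page 3 at 10 docs per page starts at offset 20.
print(urlencode(page_params("your_key_word", 3)))
print(total_pages(53))
```

Note that this page count is only right when numFound counts the same units that are displayed, which is exactly what breaks down in the grouped case from the question.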

Hope this helps

Ma'moon Al-Akash
  • But the value Solr reports isn't the one I need, because I have a limit on every group. I'm showing only a limited set of results, not all of them, so numFound is higher than expected. – tasmaniski Dec 25 '13 at 19:57
  • In any case, numFound refers to the TOTAL number of documents, you have to keep that in mind. – Ma'moon Al-Akash Dec 25 '13 at 21:07