
I am trying to paginate through the results of a very broad query, but Solr returns only 10 results even though the rows and start parameters are set.

http://localhost:8983/solr/patents/query?q=*:*&rows=10000000&start=9

This returns:

{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"*:*",
      "start":"9",
      "rows":"10000000"}},
  "response":{"numFound":10,"start":9,"docs":[
      {
        "date":"1980-07-10T00:00:00Z",
        "id":117008,
        "country":"US",
        "title":"Solr test",
        "_version_":1525967658488430592}]
  }}

What is the best way to paginate through several thousand documents?

Istvan
  • What is the goal of this? In any case, fetching documents in bulk from Solr with such a high rows value is not a good idea. Paging implies fetching small batches. If you want to route or transfer docs from your index somewhere else, there are better approaches. – cheffe Feb 15 '16 at 05:46

1 Answer

For deep paging, you should use cursors, because performance deteriorates with high values of start in standard requests: https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results

Your current query looks technically sound (albeit bad from a performance viewpoint); your index simply contains only 10 documents, as stated by numFound. With start=9 the first nine matches are skipped, which is why a single document comes back.
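As a rough illustration of the cursor approach, here is a minimal Python sketch. It assumes the core is called `patents` as in the question, that `id` is the collection's uniqueKey field (cursors require the sort to include it), and that the standard `/select` handler is available. The helper names `http_fetch` and `iter_docs` and the page size of 500 are made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: core name taken from the question, default /select handler.
SOLR_SELECT = "http://localhost:8983/solr/patents/select"


def http_fetch(params):
    """Send one request to Solr and return the decoded JSON response."""
    url = SOLR_SELECT + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def iter_docs(fetch=http_fetch, q="*:*", rows=500, sort="id asc"):
    """Yield every matching document, one cursor page at a time."""
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        page = fetch({
            "q": q,
            "rows": rows,
            "sort": sort,  # must include the uniqueKey field (assumed: id)
            "cursorMark": cursor,
            "wt": "json",
        })
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:
            # Solr returns the same mark once there is nothing left to read.
            break
        cursor = next_cursor
```

Calling `for doc in iter_docs(): ...` then walks the whole result set in batches of `rows`, without using the start parameter at all.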

Toke Eskildsen
  • Thank you very much. It seems that while the data import is running, Solr sees only 10 documents, probably because of the previous debug run. – Istvan Feb 12 '16 at 18:47
  • That sounds reasonable. Frequent commits are normally quite costly with Solr, so for batch indexing it makes sense to do them rarely or only once (at the end). If you wish to see the progress of your job, you can issue a manual commit: http://stackoverflow.com/questions/7815628/most-simple-way-url-to-trigger-solr-commit-of-all-pending-docs – Toke Eskildsen Feb 12 '16 at 21:11
  • Perform a softCommit ( `` ) if you want to track progress. That is way cheaper and faster. – cheffe Feb 15 '16 at 05:43
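
For reference, the manual commit and soft commit mentioned in the comments above can be triggered by a plain HTTP request to the update handler. The sketch below assumes the `patents` core from the question on the default port. The helpers `commit_url` and `trigger_commit` are hypothetical names for this example, not part of any Solr client.

```python
import urllib.parse
import urllib.request

# Assumption: core name taken from the question, default /update handler.
SOLR_UPDATE = "http://localhost:8983/solr/patents/update"


def commit_url(soft=False, base=SOLR_UPDATE):
    """Build the URL for a hard commit (commit=true) or a soft commit (softCommit=true)."""
    param = "softCommit" if soft else "commit"
    return base + "?" + urllib.parse.urlencode({param: "true"})


def trigger_commit(soft=False):
    """Ask Solr to make pending documents visible; returns the HTTP status code."""
    with urllib.request.urlopen(commit_url(soft)) as resp:
        return resp.status
```

`trigger_commit(soft=True)` would be the cheaper choice for checking progress during an import, with a regular `trigger_commit()` once the batch job is finished.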