In the original (Java) version of Lucene, there is no hard restriction on the size of the TopFieldDocCollector results. Any number greater than zero is accepted. Although memory constraints and performance degradation create a practical limit that depends on your environment, 5000 hits is trivial and shouldn't pose a problem outside of a mobile device.
Perhaps in porting Lucene, TopFieldDocCollector was modified to use something other than Lucene's "heap" implementation (called PriorityQueue, extended by FieldSortedHitQueue), something that imposes an unreasonably small limit on the results size. If so, you might want to look at the source code for TopFieldDocCollector, and implement your own similar hit collector using a better heap implementation.
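To give a rough idea of what such a collector involves, here is a minimal sketch of the bounded-heap technique in plain Java. It is not Lucene's code and does not use Lucene's API: BoundedTopCollector, Hit, collect, and topHits are hypothetical names, it sorts by a single float value rather than by sort fields, and it uses java.util.PriorityQueue in place of Lucene's own PriorityQueue class. In a port you would adapt the same idea to whatever hit collector interface your version exposes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical bounded top-N collector; an illustration, not part of Lucene. */
final class BoundedTopCollector {

    /** One collected hit: a document id plus the value it is ranked by. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Orders the weakest hit first: lowest score, ties broken so that the
    // higher document id counts as weaker.
    private static final Comparator<Hit> WEAKEST_FIRST = (a, b) -> {
        int c = Float.compare(a.score, b.score);
        return c != 0 ? c : Integer.compare(b.doc, a.doc);
    };

    private final int maxHits;
    private final PriorityQueue<Hit> heap = new PriorityQueue<>(WEAKEST_FIRST);
    private int totalHits = 0;

    BoundedTopCollector(int maxHits) {
        // The only restriction is that the size must be greater than zero.
        if (maxHits <= 0) {
            throw new IllegalArgumentException("maxHits must be greater than zero");
        }
        this.maxHits = maxHits;
    }

    /** Called once per matching document. */
    void collect(int doc, float score) {
        totalHits++;
        Hit hit = new Hit(doc, score);
        if (heap.size() < maxHits) {
            heap.add(hit);
        } else if (WEAKEST_FIRST.compare(hit, heap.peek()) > 0) {
            // The new hit beats the weakest retained hit, so it replaces it.
            heap.poll();
            heap.add(hit);
        }
    }

    /** Number of documents seen, including those that were not retained. */
    int getTotalHits() {
        return totalHits;
    }

    /** Retained hits, best first. */
    List<Hit> topHits() {
        List<Hit> out = new ArrayList<>(heap);
        out.sort(WEAKEST_FIRST.reversed());
        return out;
    }
}
```

The heap never holds more than maxHits entries, so memory grows with the requested result size rather than with the number of matching documents, and each collected hit costs at most O(log maxHits).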
I have to ask, however: why are you trying to collect 5000 results? No user in an interactive application is going to want to see that many. I figure that users willing to look at 200 results are rare, but I double it to 400 just as a factor of safety. Depending on the application, limiting the result size can hamper malicious screen scrapers and mitigate denial-of-service attacks too.