
Does anyone have a clue why Hibernate Search FullTextQuery (FullTextEntityManager) getResultSize() never matches the getResultList().size()?

I am not passing anything to setFirstResult or setMaxResults.

For example, when I query one field for the word "truck", getResultSize() reports 50,345, but getResultList().size() is 865. Does anyone know why these would be so far apart? I have cleared the Lucene indexes and rebuilt them, but the counts still don't match. I am baffled.

    QueryBuilder qb = this.inventoryRepo.getSearchManager()
            .getSearchFactory().buildQueryBuilder()
            .forEntity(Inventory.class).get();

    BooleanJunction<?> junction = this.builder.createAlgorithm(
            searchRequest, qb);

    org.apache.lucene.search.Query luceneQueryluceneQuery = junction.createQuery();

    }

    searchResult.setQuery(luceneQuery.toString());

    FullTextQuery jpaQuery = this.inventoryRepo.getSearchManager()
            .createFullTextQuery(luceneQuery, Inventory.class);

    jpaQuery.limitExecutionTimeTo(20000,
            TimeUnit.MILLISECONDS);

    List<Inventory> results = jpaQuery.getResultList();

    log.debug("Total Search Result Size: " + jpaQuery.getResultSize());
    searchResult.setTotalSize(jpaQuery.getResultSize());
cнŝdk
chrislhardin
  • If the index and the database are in sync, the results should indeed not differ. Since FullTextQuery#getResultSize() does not load entities from the database but just returns the hit count from the Lucene query, there is a potential risk of things being out of sync. Otherwise it should work, especially since you say that you rebuilt the index. Maybe it would help to see some code. – Hardy May 08 '15 at 19:58
  • I added in my relevant code above... I just left out the part where I add all the elements I am searching on in the query which shouldn't have anything to do with the resultSize() being totally far off from the list.size() – chrislhardin May 11 '15 at 18:23
  • Since you know the data, which is your expected result size? Do you have 865 entities in the database which match your search criteria? Or 50,345? – Hardy May 11 '15 at 20:20
  • Are you using sharding, multi-tenancy or anything like that? any Filter being enabled? Is it a Spatial query? Does it match when you select it all via queryBuilder.all().createQuery() ? – Sanne May 11 '15 at 23:00
  • The truth is in the middle. So if I search for "truck", I get 424,189, but I only have 53,556 records total in the database for the table. If I search for "pedestal", I get a list size of 27 but the resultSize is 567. The reality is there are about 27 records that match pedestal... – chrislhardin May 13 '15 at 11:28
  • No sharding, no multitenancy, no filters, no spatial and if I query all like that I get 187505, but I only have 53,556 total records in the database. – chrislhardin May 13 '15 at 11:47
  • Could it be that the mass indexer is set .purgeAllOnStart(false)? – chrislhardin May 13 '15 at 11:59
  • Can you please clarify something: You have `org.apache.lucene.search.Query luceneQueryluceneQuery = junction.createQuery();` but `searchResult.setQuery(luceneQuery.toString());` - is this a typo, or is there more code, or are there competing queries? If there is more than one query, this could interfere with your results – Drakes May 18 '15 at 14:47

1 Answer


Referring to the FullTextQuery documentation and the getResultSize method section:

int getResultSize()

Returns: the number of hits for this search.

Caution: The number of results might be slightly different from list().size() if the index is not in sync with the database at the time of query.

An out-of-sync index would explain a slight difference, but I don't think that is the case here, because the gap is huge. I think it is caused by the limitExecutionTimeTo call, which, as the documentation describes, stops the query before it has fetched all the results. So the difference comes from this code:

    jpaQuery.limitExecutionTimeTo(20000,
            TimeUnit.MILLISECONDS);

With this limit, the query runs for at most 20 seconds and then returns only partial results. That is why you get fewer results: the query had not finished fetching them all.

You can use hasPartialResults() to test whether all the results were fetched.
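As a rough sketch of that check: the class below compares the two counts and reports whether the time limit is the likely reason for a mismatch. To keep it runnable without Hibernate Search on the classpath, it uses a hypothetical `SearchQuery` interface that only mirrors the three `FullTextQuery` methods involved (`getResultList()`, `getResultSize()`, `hasPartialResults()`); `PartialResultCheck`, `diagnose` and `fake` are illustrative names of mine, not part of any library.

```java
import java.util.Collections;
import java.util.List;

public class PartialResultCheck {

    // Hypothetical stand-in: mirrors only the FullTextQuery methods used here,
    // so the sketch compiles without Hibernate Search.
    public interface SearchQuery<T> {
        List<T> getResultList();
        int getResultSize();
        boolean hasPartialResults();
    }

    // Compares the index hit count with the number of loaded entities and
    // names the most likely reason when they differ.
    public static <T> String diagnose(SearchQuery<T> query) {
        List<T> results = query.getResultList(); // execute the query first
        int hits = query.getResultSize();
        if (hits == results.size()) {
            return "counts match";
        }
        if (query.hasPartialResults()) {
            return "time limit hit: loaded " + results.size() + " of " + hits;
        }
        return "index out of sync? loaded " + results.size() + " of " + hits;
    }

    // Test double that returns fixed values instead of running a real search.
    public static SearchQuery<String> fake(final int loaded, final int hits,
                                           final boolean partial) {
        return new SearchQuery<String>() {
            public List<String> getResultList() {
                return Collections.nCopies(loaded, "x");
            }
            public int getResultSize() { return hits; }
            public boolean hasPartialResults() { return partial; }
        };
    }

    public static void main(String[] args) {
        System.out.println(diagnose(fake(2, 5, true)));
        System.out.println(diagnose(fake(2, 5, false)));
    }
}
```

In real code you would call the same three methods directly on `jpaQuery`. If `hasPartialResults()` returns false and the counts still differ, the time limit is not the cause and the index contents are the next thing to check.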

cнŝdk