3

I have a problem and I've been struggling for several days now. I just don't understand why Hibernate Search works fast enough for one query, but slow for another. I've read a lot of related posts on stackoverflow, hibernate search documentation, etc, but I can't find what's wrong.

I'm using the latest version of Hibernate Search (5.5.2.Final), and the latest versions of Hibernate ORM and PostgreSQL.

I'm only going to copy the important code (the one where you can see the changes between the queries), everything else is exactly the same (it's the same code, in the same function, ...)

    if(latitude != null && longitude != null){
        //Distance sort works very fast, no problems here
        Sort distanceSort = new Sort(
                new DistanceSortField(latitude,
                        longitude,
                        "location"));

        jpaQuery.setSort(distanceSort);
    } else {
        //THIS IS SLOW, i have no idea why
        Sort valueSort = new Sort(
                new SortField("documentValue", SortField.Type.DOUBLE, true)); //reverse, highest values first

        jpaQuery.setSort(valueSort);
    }

Those fields are defined in the Business.java file (important code follows).

    @Spatial(spatialMode = SpatialMode.HASH)
    @Transient
    public Coordinates getLocation() {
           ....     //latitude and longitude coordinates are stored in DB
    }


    @Field
    @NumericField
    @SortableField
    public Double getDocumentValue() {
            return documentValue;   //this number is stored in DB
    }

I have around 10 000 000 records. The distance sort query takes < 1 second, the documentValue (a numeric field, even marked as a SortableField) sort query takes > 30 seconds.

Thank you.

Hardy
  • 18,659
  • 3
  • 49
  • 65
0x42
  • 165
  • 1
  • 7
  • 1
    Turn on Hibernate SQL logging. How many SQL queries are fired when it takes a long time (expected answers are "not a lot", "a lot" or "oh my god it is loading the entire database")? To give a bit of background for that question: I once asked myself why it took an extraordinary long time to get a page of a 1000 search results. It turned out that due to the mappings on the entities, Hibernate was invoking 3001 queries to get the 1000 objects initialized. – Gimby Jan 05 '16 at 16:56
  • 1
    @Gimby: I think I'm gonna kill myself :) I do have logs available all the time during development, but your answer made me take another look. It appears that we collect another column or two in the views, and postgresql was doing a sequential scan instead of using an index. I incorrectly assumed that hibernate will make indexes on foreign keys, but it appears that it only does that with MySQL. Can you write what you said as an answer, so I can accept it, or can I directly accept a comment, or...? – 0x42 Jan 05 '16 at 17:26
  • actually I only gave you a hint, you came to the source of the problem yourself. I would self-answer, and feel free to include whatever you like from my comment. – Gimby Jan 06 '16 at 09:24
  • Hmm, do these two queries return roughly the same amount of entities? I am asking since your question implied a problem in Hibernate Search, but the actual search+sort in your case is done against a Lucene index. Once the ids of the matching entities are determined they are loaded from the DB (unless you are using projections). Looking at the comments above, the problem seems to be the actual DB retrieval. This would imply that you are retrieving different size result sets. Could you clarify your question and what your problem is? – Hardy Jan 06 '16 at 21:08
  • Hi @Hardy, this issue has been fixed, though (because I don't use SO often) I maybe didn't mark it properly as such. Basically, when someone does a search without distance, the view presents an extra line of information (address and phone). The problem here is that those were in another table, and the page was much slower because the db was doing a sequential scan to find the address and phone for that specific business because there wasn't an index on the foreign keys in that table. Once I fixed that (which I agree, is not HbSearch or Lucene's fault), everything was working properly. – 0x42 Jan 07 '16 at 21:55
  • Hi @0x42 that's good to know. For sake of other users on SO, could you please write down the solution as an answer to your own question, so you can mark the question resolved? Thanks – Sanne Jan 08 '16 at 12:04

1 Answers1

1

Ok, closing this question, as a solution to the problem has been identified. Basically, per Gimby's suggestion, I've look into the SQL logging (provided by Hibernate) in more detail, and noticed that additional information were referenced when someone is doing a value-sort. Basically, in the view of the application, they're showing an extra line of information (address and phone).

After looking at the logs, I noticed that, since the address and the phone were in another table, the page was much slower because the db was doing a sequential scan to find the address and phone because there wasn't an index on the foreign keys in that table. I incorrectly assumed that there were indexes on all foreign keys. Once that was fixed, everything was working properly.

As it turns out, the problem has nothing to do with Lucene, and I was looking into it too deep (since I've never used Hibernate Search before, i thought that I'm doing something wrong there). Instead, I should have investigated other possibilities as well (enabling logging, using a profiler, looking at a thread dump, etc), instead of wasting several days on analyzing every bit of Hibernate Search and Lucene. My fault.

0x42
  • 165
  • 1
  • 7