
I'm trying to improve the performance of my async transactional method.

In this task I have to read almost 7500 records from a table, process each one, and insert or update a corresponding row in another table.

I'm using spring data jpa with hibernate.

In order to get a ScrollableResults I inject the EntityManager into my service.

Here is how I get my ScrollableResults object:

Session session = em.unwrap(Session.class);
ScrollableResults res = session.createQuery("from SourceTable s")
        .setCacheMode(CacheMode.IGNORE)
        .scroll(ScrollMode.FORWARD_ONLY);

while (res.next()) {
    // ... em.flush() is called every 40 iterations
}

Iterating over the results takes about 60 seconds.

And here is the bottleneck. If I execute a simple query inside my loop:

query = em.createQuery("from DestTable d where d.item.id = :id", DestTable.class);

while (res.next()) {
    query.setParameter("id", myId).getSingleResult();
}

The execution becomes 10 times slower and takes about 600 seconds.

I've tried changing the flush mode on my Session or on my EntityManager: session.setFlushMode(FlushModeType.COMMIT); em.setFlushMode(FlushModeType.COMMIT);

This improves performance, and once I also remove the manual flush() calls the work is done in 40 seconds!

So my questions are:

  • What is the difference between calling setFlushMode on the Session and on the EntityManager?
  • Why does setFlushMode(FlushModeType.COMMIT) improve performance so much, and why can't I get the same performance just by flushing the EntityManager manually?
Markus Pscheidt
gipinani

1 Answer

The problem is that the default flush mode is FlushModeType.AUTO. In auto flush mode, Hibernate will flush before each query (only queries, not find operations). This means that in your example above, by default, Hibernate is flushing each time you call getSingleResult(). It does this because the changes you have made might affect the results of your query, so Hibernate flushes first to keep the query results as accurate as possible.

You don't see the performance hit in your first example because you are only issuing one query and scrolling through it. The best solution I have found is the one you mentioned which is just to set the flush mode to COMMIT. There should be no difference between calling setFlushMode on the Session or the EntityManager.

Pace
  • One possible difference that comes to mind is that setting it on the Session overrides the flush mode only for the current transaction, while setting it on the EntityManager sets it for all transactions. Could that be it? – gipinani Jan 28 '14 at 14:10
  • Nope, both the EntityManager and the Session are tied to the first-level context. They have the same lifecycle. If you were able to set the flush mode on the SessionFactory or EntityManagerFactory then that might be the case but as far as I know there is no way to do that. – Pace Jan 28 '14 at 14:41
  • The problem with a Hibernate flush is that it triggers a build of the action queue and then processes it. The problem becomes more visible the more managed entities you have at the time of a flush, because Hibernate checks every instance for changes. So if, as in your case, you know that your code does not affect the queried list, having one final flush (on commit) can solve the issue. Another option could be to call clear() on the EntityManager before your destination query to keep the flush loop as small as possible. A side effect is a smaller memory footprint. – Martin Frey Jan 28 '14 at 20:06
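
To make the cost argument in the answer and comments concrete, here is a rough toy model in plain Java. It is not Hibernate code: ToyContext, autoFlush, commitFlush and batchedFlushAndClear are hypothetical names invented for this sketch. It assumes that a flush inspects every managed instance once, and it ignores the scrolled source entities and all SQL cost. It only counts how many dirty checks each strategy would perform for 7500 rows under that assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: counts dirty checks, does not use Hibernate or JPA. */
public class FlushCostModel {

    /** Hypothetical stand-in for a persistence context. */
    static final class ToyContext {
        private final List<int[]> managed = new ArrayList<>(); // each int[] is a fake entity
        long dirtyChecks = 0;

        void persist(int[] entity) { managed.add(entity); }

        /** Assumption: a flush inspects every managed instance, changed or not. */
        void flush() { dirtyChecks += managed.size(); }

        /** Detaches everything, in the spirit of EntityManager.clear(). */
        void clear() { managed.clear(); }
    }

    /** AUTO-like: one implicit flush before the lookup query in every iteration. */
    static long autoFlush(int rows) {
        ToyContext ctx = new ToyContext();
        for (int i = 0; i < rows; i++) {
            ctx.flush();                 // flush triggered by the query
            ctx.persist(new int[] {i});  // row inserted or updated in this iteration
        }
        ctx.flush();                     // final flush at commit
        return ctx.dirtyChecks;
    }

    /** COMMIT-like: no flush inside the loop, a single flush at commit. */
    static long commitFlush(int rows) {
        ToyContext ctx = new ToyContext();
        for (int i = 0; i < rows; i++) {
            ctx.persist(new int[] {i});
        }
        ctx.flush();
        return ctx.dirtyChecks;
    }

    /** Manual flush and clear every batchSize iterations, then a final flush. */
    static long batchedFlushAndClear(int rows, int batchSize) {
        ToyContext ctx = new ToyContext();
        for (int i = 0; i < rows; i++) {
            ctx.persist(new int[] {i});
            if ((i + 1) % batchSize == 0) {
                ctx.flush();
                ctx.clear();
            }
        }
        ctx.flush();
        return ctx.dirtyChecks;
    }

    public static void main(String[] args) {
        int rows = 7500;
        System.out.println("AUTO-like dirty checks:        " + autoFlush(rows));
        System.out.println("COMMIT-like dirty checks:      " + commitFlush(rows));
        System.out.println("flush+clear every 40 checks:   " + batchedFlushAndClear(rows, 40));
    }
}
```

In this model the AUTO-like loop grows quadratically with the number of rows, while a single flush at commit, or a flush followed by clear() per batch, grows linearly. Real timings depend on much more than this count, so treat it only as an illustration of why flushing before every query gets expensive as the persistence context fills up.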