1

I am using a ScrollableResults object to scroll through about 500,000 to 1,000,000 rows from a table. Whilst scrolling I create a different entity using the resulting entity from each iteration and use session.save() to persist this object. Below is example code, where the real code is more complex but essentially doing the same thing.

Session = getSessionFactory().openSession();
Transaction tx = session.beginTransaction();
ScrollableResults results = session.createQuery("from Foo_Table f join f.bars b")
    .scroll(ScrollMode.FORWARD_ONLY);
int i = 0;
while(results.next())
{
    Foo foo = (Foo) results.get(0);
    Bar bar = new Baz(foo);
    bar.setFoo(foo);

    session.save(bar)

    if(i % 50 == 0)
    {
        session.flush();
        session.clear();
    }
}

tx.commit();
session.close();

The important entities:

@Entity
@Table(name = "FOO_TABLE")
public class Foo_Entity implements Serializable {
    @Id    
    @Column(name = "Foo_ID", nullable=false)
    private String id;

    @OneToMany(fetch = FetchType.EAGER, //FetchType.LAZY fixes the slow down 
               mappedBy = "fooParent", cascade = CascadeType.ALL)
    private Set<Bar> bar_entities = new HashSet<>(0);
}

@Entity
@Table(name = "BAR_TABLE")
public class Bar_Entity implements Serializable {
    @Id
    @GeneratedValue
    @Column(name="Id")
    private Long id;

    @ManyToOne
    @JoinColumn(name="foo_pk")
    private Foo fooParent;

    // setFoo, getFoo...

}

When I time this transaction, the running time starts at about 100ms per 500 iterations but gradually rises to a few seconds per 500 iterations after about 20,000 iterations. As a result the transaction has extremely poor performance. The only line of code which is taking any time is the results.next(), which gradually takes longer and longer to execute.

The issue is resolved if I change the fetch type for Bar entities in Foo from eager to lazy. I don't understand why using an eager fetch type for a set that is not yet filled causes problem with scrolling through entities that contain the relationship. The set is indeed filled during scrolling on session.flush(), but in my scenario the set is typically filled with only one to two elements, which is why I would prefer to have this fetch type as eager.

Does anyone know why this slow down happens for this particular scenario?

Note that this question was first posted before I realised that changing the fetch type solved the problem, so the question has now shifted from "How can I fix this" to "why is this a problem?"

MattLBeck
  • 5,701
  • 7
  • 40
  • 56
  • Have you tried to use a profiler for memory leaks? Even without a profiler I bet you could figure it out with [jmap](http://docs.oracle.com/javase/7/docs/technotes/tools/share/jmap.html) and [jhat](http://docs.oracle.com/javase/7/docs/technotes/tools/share/jhat.html). Also if you use eclipse I strongly recommend [Eclipse Memory Analyzer](https://eclipse.org/mat/) – durron597 Apr 17 '15 at 16:04
  • @durron597 Thanks I'll have a look in java VisualVM. But how might a memory leak slow down the next() execution time? – MattLBeck Apr 17 '15 at 16:13
  • Not Oracle per chance? http://stackoverflow.com/questions/12890305/oracle-jdbc-performance-of-resultset – BretC Apr 17 '15 at 16:15
  • @Bret No, h2, sorry i forgot to mention this. – MattLBeck Apr 17 '15 at 16:16
  • @Bret also, I only require FORWARD_ONLY as the scroll mode. – MattLBeck Apr 17 '15 at 16:21
  • I have updated the question with the solution that I found. Its now more a question of "why does this solution fix the problem". – MattLBeck Apr 18 '15 at 11:08

2 Answers2

0

Missing index on the BAR_TABLE.foo_pk column would slow things down with an eager fetch since a full table scan would be performed to load the BAR entity associated with each FOO entity,

0

First, If fetch is eager which means lazy loading is false then Bar_Entity's are loaded whenever Foo_Entity's are loaded. So either remove join in query or make fetch to lazy. Having both is redundant.

Second, Regarding slow down. Since you are opening a stateful session. every object is cached in memory due to hibernate first level cache. In this scenario Slow down has nothing to do with lazy or eager or join. Slow down is due to number of objects being holded in cache(Memory) by hibernate. Try using stateless Session. Then slow down should go away. Please refer to below URL

https://docs.jboss.org/hibernate/orm/3.3/reference/en/html/batch.html

user3435860
  • 59
  • 1
  • 5