1

This is how I am doing bulk inserts using EntityManager in JPA. Batch size is currently set to 50. I cannot call commit for each batch size. Either the whole transaction should be committed, or nothing at all.

The method bulkInsert() is being called by a separate method in another class and marked as @Transactional

How can I make the inserts much faster? I am inserting upto 200K rows.

class MyRepo{

@PersistenceContext
    private EntityManager em;

    @Value(value = "${batch_size}")
    private int batchSize;

    @Override
    public List<Object> bulkInsert(List<Object> objects) {

        IntStream.range(0, objects.size()).forEach(index -> {
            em.persist(objects.get(index));

            if (index + 1 % batchSize == 0) {
                em.flush();
                em.clear();
            }
        });

        em.flush();
        em.clear();

        return objects;
    }
}

I also have the following properties set to avoid the postgres connectivity loss, which I still face. The connection closes by itself inspite of the properties below.

spring.datasource.test-while-idle=true
spring.datasource.test-on-borrow=true
spring.datasource.validation-query=SELECT 1
spring.datasource.tomcat.validation-interval=0
Righto
  • 855
  • 3
  • 11
  • 32
  • you can make inserts faster by using JDBC, so the objects are not being "managed" by the persistence mechanism hence much less overhead. Then you instead use JPA for what it was intended for ... the persistence, update, deletion and querying ... but not the bulk – Neil Stockton Apr 12 '17 at 06:06
  • Thanks, can you please provide an example or refer some link for that? – Righto Apr 12 '17 at 06:07
  • You mean something like this - http://www.javarticles.com/2016/01/spring-boot-jdbc-example.html – Righto Apr 12 '17 at 06:13
  • Have you enabled JDBC batch inserts? – Arnold Galovics Apr 12 '17 at 06:13
  • how do you do that? – Righto Apr 12 '17 at 06:53
  • What about calling commit for each batchSize? Actually the problem could be too big transaction (it keeps more and more data in a rollback segment) – StanislavL Apr 12 '17 at 07:33
  • I actually cannot do that. I want either the whole thing to go through or nothing at all. Also, wondering if removing this property will make it faster - hibernate.order_inserts. I had set it to true but I feel that order may not be important in these batch inserts – Righto Apr 12 '17 at 07:35
  • Does your entity use an `IDENTITY` primary key? – Naros Apr 12 '17 at 15:54
  • yes it uses a identity primary key. Does that mean that I should keep the hibernate.order_inserts property and set it true? I am just thinking that because either the whole transaction goes through or nothing does - order is not important for me. Eventually, all rows will be inserted. – Righto Apr 12 '17 at 16:08

0 Answers0