
I'm working on a Spring Boot application that uses JPA (Hibernate) for the persistence layer.

I'm currently implementing migration functionality. We dump all the existing entities of the system into an XML file, and this export includes the ids of the entities as well.

The problem I'm having is on the other side: reimporting the existing data. In this step the XML is transformed back into Java objects, which are then persisted to the database.

When trying to save the entity, I'm using the merge method of the EntityManager class, which works: everything is saved successfully.

However, when I turn on Hibernate's query logging I see that before every insert query, a select query is executed to check whether an entity with that id already exists. This is because the entity already has an id that I provided.

I understand this behavior, and in general it makes sense. In my case, however, I'm sure the ids will not exist, so the select is pointless. I'm saving thousands of records, which means thousands of select queries on large tables, and that slows the import down drastically.

My question: Is there a way to turn this "checking if an entity exists before inserting" off?


Additional information:

When I use entityManager.persist() instead of merge, I get this exception:

org.hibernate.PersistentObjectException: detached entity passed to persist

To be able to use a supplied/provided id I use this id generator:

@Id
@GeneratedValue(generator = "use-id-or-generate")
@GenericGenerator(name = "use-id-or-generate", strategy = "be.stackoverflowexample.core.domain.UseIdOrGenerate")
@JsonIgnore
private String id;

The generator itself:

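import java.io.Serializable;
import java.util.Properties;

import org.hibernate.MappingException;
import org.hibernate.engine.spi.SessionImplementor;
import org.hibernate.id.UUIDGenerator;
import org.hibernate.service.ServiceRegistry;
import org.hibernate.type.Type;

public class UseIdOrGenerate extends UUIDGenerator {

  private String entityName;

  @Override
  public void configure(Type type, Properties params, ServiceRegistry serviceRegistry) throws MappingException {
    // Remember the entity name so the persister can be looked up in generate()
    entityName = params.getProperty(ENTITY_NAME);
    super.configure(type, params, serviceRegistry);
  }

  @Override
  public Serializable generate(SessionImplementor session, Object object) {
    // Read the id that is already set on the entity, if any
    Serializable id = session
        .getEntityPersister(entityName, object)
        .getIdentifier(object, session);

    // Keep a supplied id; otherwise fall back to a generated UUID
    if (id == null) {
      return super.generate(session, object);
    } else {
      return id;
    }
  }
}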
Maciej Kowalski
Geoffrey De Vylder

2 Answers

2

If you are certain that you will never be updating any existing entry on the database and all the entities should be always freshly inserted, then I would go for the persist operation instead of a merge.

Regarding your update:

In that case (the id field being set up as auto-generated), the only way would be to remove the generation annotations from the id field and leave the configuration as:

@Id
@JsonIgnore
private String id;

This sets the id up to always be assigned manually. The persistence provider will then consider your entity transient even when the id is present, meaning persist would work and no extra selects would be generated.
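One way to soften the impact of dropping the generator might be to let the entity assign its own id by default, so that only the import has to set one explicitly. The following is just a sketch under assumptions: `ImportableEntity` is a hypothetical class, and the JPA annotations are left out so that it runs on its own (in the real mapping the field would carry only `@Id`, as shown above).

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical entity; in a real mapping the id field would be annotated with @Id only.
class ImportableEntity {

    // Assigned at construction time, so ordinary callers never hold an entity with a null id.
    private String id = UUID.randomUUID().toString();

    String getId() {
        return id;
    }

    // Used by the import to restore the id that was read from the XML dump.
    void setId(String id) {
        this.id = Objects.requireNonNull(id, "id must not be null");
    }
}
```

With this shape, code elsewhere in the application would create entities as before and receive a random UUID, while the import would call `setId(...)` with the exported value before calling persist.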

Maciej Kowalski
  • Thanks, I forgot to mention that. I've tried using persist but it throws an exception. I've updated my original post with the details. – Geoffrey De Vylder Jul 12 '17 at 10:19
  • If you are able to make changes, check my update – Maciej Kowalski Jul 12 '17 at 11:08
  • That's a good idea but if I understand correctly, this would break inserting new items in all the other parts of the application because the ids will be empty. I'd have to manually do a setId(generateId()) everywhere which is not an option right now. – Geoffrey De Vylder Jul 13 '17 at 06:49
  • Do those selects happen after every call to merge, or all at once at the end of the transaction? Or is every insert part of a single transaction? – Maciej Kowalski Jul 13 '17 at 08:09
  • Everything happens in one transaction, first I can see selects being called for everything that is being saved (multiple entities of all kinds), then I see a list of inserts. – Geoffrey De Vylder Jul 13 '17 at 09:20
1

I'm not sure whether you set the ID yourself or not. If you set it on the application side, check the answer here. I copied it below:

Here is the code of Spring's SimpleJpaRepository, which is what you are using when you go through a Spring Data repository:

@Transactional
public <S extends T> S save(S entity) {

    if (entityInformation.isNew(entity)) {
        em.persist(entity);
        return entity;
    } else {
        return em.merge(entity);
    }
}

It does the following:

By default Spring Data JPA inspects the identifier property of the given entity. If the identifier property is null, then the entity will be assumed as new, otherwise as not new.

Link to Spring Data documentation

So if one of your entities has a non-null ID field, Spring will call merge instead of persist, and Hibernate will issue a SELECT first.

You can override this behavior in either of the two ways listed in the same documentation. An easy way is to make your entity implement Persistable (instead of Serializable), which requires you to implement the isNew method.
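As a rough illustration of the Persistable approach, here is a minimal sketch that runs without Spring or JPA on the classpath. The local `Persistable` interface only mirrors the two methods of `org.springframework.data.domain.Persistable`; `ImportedItem`, `markNotNew` and `SaveSketch` are hypothetical names, and `SaveSketch.save` merely imitates the branch in the `save` method quoted above by logging which call would be made.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in with the same two methods as Spring Data's Persistable<ID>.
interface Persistable<ID> {
    ID getId();
    boolean isNew();
}

// Hypothetical entity whose "new" state does not depend on the id being null.
class ImportedItem implements Persistable<String> {
    private final String id;

    // In a real JPA entity this flag would be marked @Transient so it is not mapped to a column.
    private boolean isNew = true;

    ImportedItem(String id) {
        this.id = id;
    }

    @Override
    public String getId() {
        return id;
    }

    @Override
    public boolean isNew() {
        return isNew;
    }

    // In a real entity this could be invoked from @PostPersist and @PostLoad callbacks.
    void markNotNew() {
        isNew = false;
    }
}

// Imitates the persist-or-merge decision; it records the chosen call instead of touching a database.
class SaveSketch {
    final List<String> log = new ArrayList<>();

    <S extends Persistable<String>> S save(S entity) {
        if (entity.isNew()) {
            log.add("persist " + entity.getId());
        } else {
            log.add("merge " + entity.getId());
        }
        return entity;
    }
}
```

The point of the design is that the entity, not the presence of an id, decides whether it is new. Whether persist then accepts an entity with a pre-set id still depends on how the id is mapped.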

Bertrand88