
My project is using Hibernate with the Spring transaction manager, and my database is PostgreSQL (which might be irrelevant).

I'm trying to read big XML files, construct objects out of them (the objects are not big, but the amount is), and insert them into the database.

If by some chance one of my objects violates a database constraint, the whole process stops. How can I skip the ones which violate the database constraint, or alternatively log their id (or whatever) to a log file?

Question update:

I've been browsing through SO and found that for batch inserts it's recommended to use a StatelessSession, but I still get the same issue and the insert stops:

May 26, 2012 4:45:47 PM org.hibernate.util.JDBCExceptionReporter logExceptions
SEVERE: ERROR: duplicate key value violates unique constraint "UN_FK"
  Detail: Key (fid)=(H1) already exists.

Here are the relevant parts of my code for parsing the XML and inserting into the DB. For simplicity, let's assume I'm inserting movies:

//class field
@Autowired
private SessionFactory sessionFactory;

@Override
public void startDocument() throws SAXException {
    session = sessionFactory.getCurrentSession();
}

@Override
public void endElement(String uri, String localName, String qName) throws SAXException {
    if (qName.equalsIgnoreCase("FILM")) {
        movie.setCategory(category);
        movie.setAdded(new Date());
        session.insert(movie);
    }
}

And I have the hibernate.jdbc.batch_size property set to 100 in the app-ctx. Is it really necessary to do a select before insert in order to avoid this?

Update 2:

If I use a StatelessSession instead of a Session, I get around 20 inserts and then the processing stops indefinitely, without any exception or anything.

I assume the number 20 is because I'm pooling connections with Tomcat and have maxActive="20".

Bounty update:

I'd really love to see someone offer a solution (without the defensive select, if possible), using either StatelessSession or a plain Session.

– Gandalf StormCrow

6 Answers


Most types of constraints, such as whether a column is nullable or has a maximum width, can be checked using Hibernate Validator. Just manually execute the validation on the object before attempting to persist it.

For some things, particularly unique constraints, you need to either execute a 'defensive' select to see if a collision exists, or maintain an in-memory set of values already inserted.
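
For illustration, here is a rough sketch of both checks combined; Movie, its fid property, parsedMovies and the HQL query are assumptions borrowed from the question, not part of this answer:

import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.validation.ConstraintViolation;
import javax.validation.Validation;
import javax.validation.Validator;
import org.hibernate.Session;

void insertValidMovies(Session session, List<Movie> parsedMovies) {
    Validator validator = Validation.buildDefaultValidatorFactory().getValidator();
    Set<String> seenFids = new HashSet<String>();   // keys already inserted in this run

    for (Movie movie : parsedMovies) {
        // bean-level constraints (not-null, max length, ...) via Hibernate Validator
        Set<ConstraintViolation<Movie>> violations = validator.validate(movie);
        if (!violations.isEmpty()) {
            continue;                               // log the violations and skip
        }
        // unique constraint, part 1: duplicate within the same XML file?
        if (!seenFids.add(movie.getFid())) {
            continue;
        }
        // unique constraint, part 2: defensive select against rows already in the DB
        Long count = (Long) session.createQuery(
                "select count(m) from Movie m where m.fid = :fid")
                .setParameter("fid", movie.getFid())
                .uniqueResult();
        if (count == 0) {
            session.save(movie);
        }
    }
}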

– Affe

I think it is impossible to validate something so completely that you can guarantee a successful insert. In some cases, no matter what you do, someone else can insert something into the DB between your validation and your insert, causing a constraint violation.

In most cases I would just recommend handling the exception like any other.

– Izzy

To insert a large number of objects from an XML file, you should consider using Spring Batch. The skip-limit parameter lets you specify how many erroneous lines you can have in your XML before the batch process stops. Also check skip-policy and skippable-exception; see Configuring a Step in the Spring documentation.
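
As a rough sketch of what such a step could look like with the Java-based configuration of later Spring Batch versions (the XML equivalent is the skip-limit attribute on the chunk element); movieReader and movieWriter are placeholder beans, not something from the question:

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.dao.DataIntegrityViolationException;

@Bean
public Step importMoviesStep(StepBuilderFactory steps,
                             ItemReader<Movie> movieReader,
                             ItemWriter<Movie> movieWriter) {
    return steps.get("importMovies")
            .<Movie, Movie>chunk(100)                       // matches hibernate.jdbc.batch_size
            .reader(movieReader)                            // e.g. a StaxEventItemReader over the XML
            .writer(movieWriter)                            // e.g. a HibernateItemWriter
            .faultTolerant()
            .skip(DataIntegrityViolationException.class)    // skippable-exception
            .skipLimit(50)                                  // skip-limit: fail the step after 50 bad records
            .build();
}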

If you don't want to use Spring Batch, just use a simple try/catch that will allow your process to continue to the end.

– Sebastien Lorber
  • A simple try catch statement won't do. Once an exception is thrown by Hibernate, the session state is incoherent, the transaction must be rolled back, and the session closed. Moreover, the exception will only be thrown at flush time, long after the faulty record has been persisted. – JB Nizet May 26 '12 at 11:52
  • This is why it's recommended to use a Stateless Hibernate Session for such cases. It will prevent incoherent state and also reduce the memory consumption (or you won't have to evict already treated entities) – Sebastien Lorber May 26 '12 at 13:24

Why do you think this all has to be one big transaction? Everything you describe implies to me that you in fact have numerous transactions here. For objects that "error out", just evict that entity and roll back the transaction. It gets a little more complex if the "FILM" element defines a graph of objects, but the idea is the same.
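
A minimal sketch of that idea, assuming one short transaction per movie (Movie, getFid() and parsedMovies are taken from the question for illustration):

// one short transaction per movie; ConstraintViolationException is
// org.hibernate.exception.ConstraintViolationException
for (Movie movie : parsedMovies) {
    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();
    try {
        session.save(movie);
        tx.commit();                    // only this movie is committed
    } catch (ConstraintViolationException e) {
        tx.rollback();                  // only this movie is lost
        System.err.println("Skipped " + movie.getFid() + ": " + e.getMessage());
    } finally {
        session.close();                // don't reuse the session after the exception
    }
}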

– Steve Ebersole

I think Affe has the right approach with a defensive select before insert.

You could consider using a savepoint before each insert and rolling back to the savepoint if an exception is thrown. There will be a performance overhead of creating a savepoint. The savepoint will need to be released when you are done.

See AbstractTransactionStatus.setSavepoint() or, if you have access to the JDBC connection from the pool, setSavepoint(), rollback(Savepoint) and releaseSavepoint(Savepoint).
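
A minimal sketch of the plain JDBC variant, assuming conn is an open java.sql.Connection and insertMovie(...) is a hypothetical helper that does the actual insert:

// conn is a java.sql.Connection with auto-commit disabled
conn.setAutoCommit(false);
for (Movie movie : parsedMovies) {
    Savepoint savepoint = conn.setSavepoint("beforeMovie");
    try {
        insertMovie(conn, movie);            // e.g. a PreparedStatement INSERT
        conn.releaseSavepoint(savepoint);    // done with this savepoint
    } catch (SQLException e) {
        conn.rollback(savepoint);            // undo only the failed insert, keep the rest
    }
}
conn.commit();                               // everything that succeeded is kept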

– pd40

Though it's an old question, I faced a similar situation recently and here's what I did.

I used stateless sessions too, but here's what I did differently. I issued a session.insert which either succeeds or fails due to a constraint violation; upon a constraint violation a ConstraintViolationException is thrown, and catching it gives you the object that failed the insert, so you can do whatever you want with the failed object. Along with transactions I also used savepoints for every insert (savepoints are cheap and don't cause a big performance impact), so when a transaction fails due to some issue the commit is done up to the last savepoint (in the catch clause).

My code looked something like this:

try {
    // set a savepoint before each insert so a failed row can be rolled back on its own
    session.connection().setSavepoint("lastSaved");
    session.insert(obj);
}
catch (ConstraintViolationException e) {
    log.error(obj.getUserId());
    // ...
}
// ...
try {
    tx.commit();
}
catch (TransactionException e) {
    tx.rollback();
}
– Aryan