So this is my maiden question, so here it goes... I am doing a large number of update statements against a database table. In SQL each one would be a simple
update table_name set col1 = 123 where col2 = 456 and col1 is null
Since there are millions of these, it seemed best to batch some of them together. I followed the directions here:
http://docs.jboss.org/hibernate/orm/3.3/reference/en-US/html/batch.html
and a few other pages I found here on Stack Overflow and elsewhere, but they all read much the same.
My idea was to skip the read, issue the update directly, and keep looping like this:
sessionFactory = new Configuration().configure("finaldetail/hibernate.dev.cfg.xml")
        .addPackage("com.xxx.app.ftm.domain.event")
        .addAnnotatedClass(FinalTrainDetail.class)
        .addAnnotatedClass(AbstractDetail.class)
        .addAnnotatedClass(FinalTrainDetailWork.class)
        .buildSessionFactory();
try {
    inputStream = new BufferedReader(new FileReader(new File(args[0])));
    session = sessionFactory.openSession();
    transaction = session.beginTransaction();
    String s;
    int count = 0;
    while ((s = inputStream.readLine()) != null) {
        // note "is null", not "= null": in HQL a "= null" comparison never matches
        Query query = session.createQuery("update FinalTrainDetail detail set detail.trainSummary "
                + "=:summaryId where detail.trainDetail=:detailId and detail.trainSummary is null");
        query.setParameter("summaryId", Long.parseLong(s.substring(9, 18)));
        query.setParameter("detailId", Long.parseLong(s.substring(0, 9)));
        query.executeUpdate();
        count++;
        if (count % 20 == 0) {
            log.debug("should commit");
            session.flush();
            session.clear();
        }
    }
    transaction.commit();
    System.out.println("exit");
} catch (IOException e) {
    transaction.rollback();
    log.error(e.toString());
} catch (Throwable t) {
    System.out.print(t);
    log.error("exception caught during updating offline", t);
    System.exit(2);
} finally {
    if (inputStream != null)
        inputStream.close();
    session.close();
}
So my understanding here is that flush will cause every batch of 20 updates to be committed, and that clear then empties the first-level cache to avoid an OutOfMemoryError.
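An alternative I considered (just a sketch, reusing the same `session`/`transaction` variables as above, not code I have run) is to commit explicitly every 20 records and open a fresh transaction, instead of relying on flush:

```java
// Sketch only: inside the read loop, the Hibernate calls would be (hypothetical):
//
//     if (count % BATCH_SIZE == 0) {
//         transaction.commit();                     // actually ends the DB transaction
//         transaction = session.beginTransaction(); // start the next batch
//         session.clear();                          // drop the first-level cache
//     }
//
// The small main() below just prints the record counts at which such a
// commit would fire, as a sanity check on the cadence.
public class CommitCadence {
    static final int BATCH_SIZE = 20;

    // true when the running record count hits a batch boundary
    static boolean shouldCommit(int count) {
        return count % BATCH_SIZE == 0;
    }

    public static void main(String[] args) {
        StringBuilder commits = new StringBuilder();
        for (int count = 1; count <= 65; count++) {
            if (shouldCommit(count)) {
                commits.append(count).append(' ');
            }
        }
        System.out.println(commits.toString().trim()); // 20 40 60
    }
}
```

With this pattern a final `transaction.commit()` after the loop would still be needed for the leftover partial batch.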
The config I have so far:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE hibernate-configuration PUBLIC
"-//Hibernate/Hibernate Configuration DTD 3.0//EN"
"http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
<session-factory>
<!-- SQL dialect -->
<property name="dialect">org.hibernate.dialect.Oracle10gDialect</property>
<!-- Database connection settings -->
<property name="connection.driver_class">oracle.jdbc.driver.OracleDriver</property>
<property name="connection.url">jdbc:oracle:thin:@dev264.oracle.XXXX.com:1521:DEV264</property>
<property name="connection.username">XXXX</property>
<property name="connection.password">XXX</property>
<property name="connection.shutdown">true</property>
<!-- JDBC connection pool (use the built-in one) -->
<property name="connection.pool_size">1</property>
<!-- Enable Hibernate's automatic session context management -->
<property name="current_session_context_class">thread</property>
<!-- Disable the second-level cache -->
<property name="cache.provider_class">org.hibernate.cache.NoCacheProvider</property>
<!-- Enable JDBC statement batching -->
<property name="jdbc.batch_size">20</property>
<!-- Echo all executed SQL to stdout -->
<property name="show_sql">true</property>
</session-factory>
</hibernate-configuration>
show_sql is enabled for debugging purposes. What I don't get, or am not seeing, is that when I set this in log4j:
<logger name="org.hibernate.transaction">
<level value="debug"/>
<appender-ref ref="file"/>
<appender-ref ref="console"/>
</logger>
I only see
[DEBUG] [main] [org.hibernate.transaction.JDBCTransaction] [commit]
[DEBUG] [main] [org.hibernate.transaction.JDBCTransaction] [commit]
[DEBUG] [main] [org.hibernate.transaction.JDBCTransaction] [commit]
at the very end of the log file, and not at the points where the flush occurs. So what I am wondering is: is commit really being called every 20 records? Am I building up too many objects in memory, and will I get an OutOfMemoryError in production, given that I don't have hundreds of thousands of test records to try this against?