Does the mysql jdbc driver class know to execute multiple inserts in bulk?

Question

I'm trying to move a large number of records from one MySQL instance two another inside RDS. They are on different VPCs and different AWS accounts, so I can't create a data pipeline that would do the copy for me.

I've written a quick java program that connects to both the import database and the export database and does the following:

query the import database for the highest in table.primary_key with SELECT MAX(primary_key) FROM table
get a result set from the export table with SELECT * FROM table WHERE(primary_key > max_from_import) LIMIT 1000000
create a PreparedStatement object from the import connection and set the queryString to INSERT INTO table (col1....coln) VALUES (?....n?)
iterate over the result set and set the prepared statement columns to the ones from the result cursor (with some minor manipulations to the data), call execute on the PreparedStatement object, clear its' parameters, then move to the next result.

With this method I'm able to see around 100000 records being imported an hour, but I know that from this question that a way to optimize inserts is not to create a new query each time, but to append more data with each insert. i.e.

INSERT INTO table (col1...coln) VALUES (val1...valn), (val1...valn)....(val1...valn);

Does the jdbc driver know to do this, or is there some sort of optimization I can make on my end to improve insert run time?

UPDATE: Both answers recommended using the add and execute batch, as well as removing auto commit. Removing auto commit saw a slight improvement (10%), doing the batch yielded a run time of less than 50% of the individual inserts.

Unless you are using MyISAM tables, you will see enormous speed gain if you wrap your insert statements inside transactions, turn autocommit off and e.g. issue a commit every 100 or every 1000 insert. The exact detail illudes me right now, but pertaining to your question is the `rewriteBatchedStatements` connection option documented here https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-configuration-properties.html — nos, Jun 14 '17 at 17:17
@CBredlow - are you able to speed up insert using my method? — Fathah Rehman P, Jun 14 '17 at 17:34
@FathahRehmanP writing it as we speak, I was mostly curious if the driver did the bulk itself, but the fact that you guys have all mentioned to turn autocommit off leads me to believe that that wasn't the case. — CBredlow, Jun 14 '17 at 17:36
@CBredlow - if you do bulk insert with auto commit off you can insert really fast. If you try to insert as in my answer, you will get huge benefit in insert speed. — Fathah Rehman P, Jun 14 '17 at 17:40

Fathah Rehman P · Answer 1 · 2017-06-14T19:07:30.513

1

First create a JDBC connection to Destination database and make its auto commit property to false.

After that in a loop do the following

Read N(for example 1000) number of rows from Source database and write that to destination database.

After some inserts commit destination database connection.

Sample code to get more idea is given below

Connection sourceCon = getSourceDbConnction();
Connection destCon = getDestinationDbConnction();
destCon.setAutoCommit(false);
int i=0;
String query;
while((query=getInsertQuery()!=null)
{         
     statement.executeUpdate(query);
     i++;
     if(i%10 == 0)
     {
       destCon.commit();
       i=0;
     }
}
destCon.commit();

The getInsertQuery function should give string in INSERT INTO table (col1...coln) VALUES (val1...valn), (val1...valn)....(val1...valn); format. Also it should return null, if all tables are processed.

If you are using Prepared Statements, you can use addBatch and executeBatch functions. Inside loop add values using addBatch function. After some inserts call executeBatch.

edited Jun 14 '17 at 19:07

answered Jun 14 '17 at 17:04

Fathah Rehman P

8,401
4
40
42

@CBredlow - You can change that 10 in i%10 statement with a higher value to see big difference. – Fathah Rehman P Jun 14 '17 at 18:01
@CBredlow - How many rows you are inserting in a single insert statment? – Fathah Rehman P Jun 14 '17 at 18:04
@CBredlow - If you are using Prepared Statements please use addBatch and executeBatch functions in your code. – Fathah Rehman P Jun 14 '17 at 18:17
@CBredlow - what about the insert time now? are you able to insert fast? – Fathah Rehman P Jun 14 '17 at 18:30
Insert for 1000 went from 39 to almost 16 seconds – CBredlow Jun 14 '17 at 18:31
Accepted answer. Mind including something about doing addBatch, then executeBatch? – CBredlow Jun 14 '17 at 18:37
@CBredlow - Just confirmed the problem is fully solved. – Fathah Rehman P Jun 14 '17 at 18:39
@CBredlow - added info related to addBatch and executeBatch – Fathah Rehman P Jun 14 '17 at 19:08
@CBredlow - is that 1000 _rows_ per commit, or 1000 _batches_ per commit? – Rick James Jun 29 '17 at 02:59

score 1 · Accepted Answer · answered Jun 14 '17 at 18:34

You need to use batch insert. Internally, Connector/J (MySQL JDBC driver) can rewrite batch inserts into multi values insert statements.

(Note that this is the default Connector/J behavior. You can add the option useServerPrepStmts=true to the JDBC url to enable server side prepared statements)

The code looks like the following:

 try(PreparedStatement stmt = connection.prepareStatement(sql)) {
    for(value : valueList) {   
      stmt.clearParameters();     
      stmt.setParameter(1, value);
      stmt.addBatch();
    }
    stmt.executeBatch();
 }

The code above will generate a multi value insert:

INSERT tablename(field) VALUES(value1), (value2), (value3) ...

Does the mysql jdbc driver class know to execute multiple inserts in bulk?

2 Answers2