
I am trying to migrate data from an Oracle database to SQL Server. I'm currently using a JdbcCursorItemReader to read the Oracle data and a JdbcBatchItemWriter to write it to SQL Server.

My problem is that this is taking far too long. For a table of ~200,000 rows, it takes almost an hour (and I have to run four of these queries, each around 200K rows).

@Bean
public JdbcCursorItemReader<DataPOJO> dataReader() throws Exception {
    final JdbcCursorItemReader<DataPOJO> dataReader = new JdbcCursorItemReader<>();
    dataReader.setDataSource(oracleDataSource);
    dataReader.setSql(Constants.DATA_QUERY);
    dataReader.setRowMapper(new BeanPropertyRowMapper<DataPOJO>(DataPOJO.class));
    return dataReader;
}

@Bean
public JdbcBatchItemWriter<DataPOJO> dataWriter() throws Exception {
    JdbcBatchItemWriter<DataPOJO> dataWriter = new JdbcBatchItemWriter<>();
    dataWriter.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<DataPOJO>());
    dataWriter.setSql(Constants.DATA_MERGE);
    dataWriter.setDataSource(mssDataSource);
    return dataWriter;
}

Does anyone have any tips for improving performance on these kinds of jobs?

Gavin
  • That definitely looks like a DB problem. How long does it take for the SELECT query to return if you run it directly in a SQL/DB browser? Furthermore, have you set autocommit to false? What is the commit size of your step? Generally, you should be able to read and write a couple of thousand entries per second. – Hansjoerg Wingeier Sep 19 '16 at 08:04

2 Answers


Does anyone have any tips for improving performance on these kinds of jobs?

I would approach it this way:

1. Check the wait stats for the running process:

select * from sys.dm_exec_requests where session_id = <<your session id>>

To identify your session id, you can check sys.sysprocesses:

select spid, waittime, lastwaittype, status,
       hostname,     -- the three columns below help you identify your query/spid;
       program_name, -- you can replace spid in the query above to get live status
       nt_username
from sys.sysprocesses

2. Troubleshoot based on the wait type.

TheGameiswar

I would first insert your records into a source table (personally, I'd use a temp table) and then leverage the MERGE command to push the records from that "source" table into your "target" table.

A couple key performance concerns:

  • Make sure you match on an indexed key on your target table
  • Assuming you want to update MATCHED rows, do NOT update the index column(s) as part of your UPDATE statement
  • Set your commit interval decently high. Your DB should be able to merge 1000 rows without breaking a sweat. Committing too frequently will add significant overhead.
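In Spring Batch terms, the commit interval is the chunk size of your step. A minimal sketch of a step honoring the advice above, assuming a StepBuilderFactory and the reader/writer beans from the question (the step name and chunk size are illustrative):

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.context.annotation.Bean;

public class MigrationStepConfig {

    // Each transaction on the SQL Server side now covers 1000 items
    // instead of committing row by row.
    @Bean
    public Step migrateDataStep(StepBuilderFactory stepBuilderFactory,
                                JdbcCursorItemReader<DataPOJO> dataReader,
                                JdbcBatchItemWriter<DataPOJO> dataWriter) {
        return stepBuilderFactory.get("migrateDataStep")
                .<DataPOJO, DataPOJO>chunk(1000) // commit interval
                .reader(dataReader)
                .writer(dataWriter)
                .build();
    }
}
```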

Now to accomplish this with Spring Batch, you'll probably need a composite ItemWriter (one comes out of the box with the framework). Delegate writer 1 would be the JdbcBatchItemWriter you're already using and would insert to that source table. Delegate writer 2 would be custom and would simply execute your MERGE command.
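A sketch of that wiring, under some assumptions: `Constants.DATA_MERGE_FROM_STAGING` is a hypothetical MERGE statement from the staging table into the target, the first delegate is the question's JdbcBatchItemWriter repointed at the staging table, and CompositeItemWriter is the out-of-the-box composite writer mentioned above (the `List`-based ItemWriter signature is from Spring Batch 4 and earlier):

```java
import java.util.Arrays;
import java.util.List;
import javax.sql.DataSource;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.support.CompositeItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.jdbc.core.JdbcTemplate;

// Delegate 2: a custom writer that fires the MERGE once each chunk is staged.
class MergeItemWriter implements ItemWriter<DataPOJO> {

    private final JdbcTemplate jdbcTemplate;

    MergeItemWriter(DataSource mssDataSource) {
        this.jdbcTemplate = new JdbcTemplate(mssDataSource);
    }

    @Override
    public void write(List<? extends DataPOJO> items) {
        // Hypothetical constant: MERGE from the staging table into the target.
        jdbcTemplate.execute(Constants.DATA_MERGE_FROM_STAGING);
    }
}

public class CompositeWriterConfig {

    @Bean
    public CompositeItemWriter<DataPOJO> compositeWriter(
            JdbcBatchItemWriter<DataPOJO> dataWriter, DataSource mssDataSource) {
        CompositeItemWriter<DataPOJO> writer = new CompositeItemWriter<>();
        writer.setDelegates(Arrays.asList(
                dataWriter,                        // 1) insert chunk into staging table
                new MergeItemWriter(mssDataSource) // 2) MERGE staging into target
        ));
        return writer;
    }
}
```

The delegates run in order for every chunk, so each commit both stages the rows and merges them, keeping the staging table small.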

Dean Clark