1

I have a table with atleast 12 million rows of data in a table that looks like this:

Id (varchar(10) | Image (varbinary(max)
---------------- -------------------------
X123456789      | 0x....
1121132121      | 0x....
JF13232131      | 0x....

What would be an efficient way to move data from this table to another table in chunks. I'm writing a simple .net console app to do this and i fear that when i try to load all 12 million rows at once my app will crash.

Thanks for any help

Addendum #1: The source db is oracle and the destination db is sql server

Addendum #2: I'm also converting the image data from tiff to jpeg

zSynopsis
  • 4,854
  • 21
  • 69
  • 106
  • Note that hthe poster put into a ocmment that this is NOT about copying data - total misrepresentation. The binary data is TIFF which has to be written into the target database as JPEG. So, this is NOT a data transfer problem. – TomTom Dec 04 '10 at 14:03

5 Answers5

2

First of all, I wouldn't attempt to move 12M rows via a .NET app. I would use a tool like SSIS (SQL Server Integrated Services) to do this. SSIS is very fast, can support transactions and supports sophisticated data transformations. SSIS was designed to do this sort of thing.

However, if you must do this within a .NET application. I would probably read something like 1K to 5K rows at a time, insert them into the target database, and get the next 1K to 5K rows etc. But this is going to be much slower than via a tool like SSIS.

Randy Minder
  • 47,200
  • 49
  • 204
  • 358
  • how would you pull 1k to 5k rows at a time given i have only 2 fields?This is the part i'm actually having trouble understanding on how to implement – zSynopsis Dec 03 '10 at 19:02
  • I also need to perform some converting/translating on the data after the initial pull which is why i'm not using ssis. – zSynopsis Dec 03 '10 at 19:03
  • @zSysop - What difference does the number of columns you have in each row make? You move 1-5K rows at a time. Each row has two columns of data. – Randy Minder Dec 03 '10 at 19:11
  • How would i pull 1k and the next 1k etc. I can't just do select * from table where row = first 1k, second 1k... – zSynopsis Dec 03 '10 at 19:14
  • 1
    @zSysop - You are going to want to retrieve rows from your database in 'pages'. You didn't say what database you are using, but if you are using SQL Server, you'll want to use the Row_Number() function. Take a look at this article: http://blog.sqlauthority.com/2007/04/03/sql-server-t-sql-paging-query-technique-comparison-sql-2000-vs-sql-2005/ – Randy Minder Dec 03 '10 at 19:18
  • Sorry about that but i'm using oracle. – zSynopsis Dec 03 '10 at 19:24
  • @zSysop - Sorry to hear that. I don't know much about Oracle, but I do believe that have a similar function. Take a look at this thread: http://stackoverflow.com/questions/827108/how-can-i-speed-up-row-number-in-oracle – Randy Minder Dec 03 '10 at 19:33
2

You should have a look at the SqlBulkCopy class, you may be able use it to move the entire table in one shot.

T3hc13h
  • 602
  • 6
  • 15
2

"What would be an efficient way to move data from this table to another table in chunks"

You think you need to move the data in chunks because you're hand-rolling a tool instead of using SSIS. You think you can't use SSIS because you want to do "some converting/translating on the data after the initial pull". What transformations do you want to apply which you think you can't do with SSIS? I mean SSIS can do loads of things. Find out more.

Because using a purpose-built tool is way more efficient than trying to write your own. And the other thing is, migrating the whole table in a single operation is a much safer bet with regards to avoiding data corruption and data loss.


"I'm converting the images from tiffs to jpegs. Can SSIS do this for me?"

Okay, so clearly that is a rather specialized requirement and not the sort of thing we would expect SSIS to have as a built-in. But SSIS does support custom transformations so you could write a component which converts the images.

APC
  • 144,005
  • 19
  • 170
  • 281
  • I'm converting the images from tiffs to jpegs. Can SSIS do this for me? – zSynopsis Dec 04 '10 at 05:36
  • No. But then your speed problem is not one of readin and writing data - the tifff / jpeg recoding will burn up your cpu's. Nice point in misrepresenting the problem. – TomTom Dec 04 '10 at 14:03
1

The way I'd tackle the problem would be to connect to the Oracle database in .Net and use a DataReader to read the data out row by row. Then process each row to do your image conversion and then simply insert the new data into your Sql Server database. Now, I'd imagine that this would not be a quick thing to do, but I can't see any reason why it would crash as the data is streamed from Oracle and then pushed straight into Sql Server.

If you want to make it run a bit quicker then it wouldn't be too hard to have multiple threads doing the transformation and insert into Sql Server.

Jon Mitchell
  • 3,349
  • 5
  • 27
  • 37
0

Have you looked at BCP? We had a similar problem and it worked fine for us.

Chuck Conway
  • 16,287
  • 11
  • 58
  • 101