
From what I've read:

The Table or View data access mode commits one row at a time as its own transaction, which makes a package that transfers 5 million rows take a long time (30+ minutes).

The Fast Load data access mode lets you specify the rows per batch and the commit size when inserting into the destination. For example, inserting the same 5 million records takes just over 2 minutes.

Now here is the situation: one of the SSIS packages that loads the DW uses the Table or View data access mode in its OLE DB destination. From my understanding, this is so the error rows can be picked up and redirected (via the error output) into an error records table. As a result, we have a process that takes over 30 minutes, whereas Fast Load would take less than 2 minutes for the same load.

If I understand correctly, Fast Load cannot tell which row in a batch caused the error, so the whole batch fails? If so, is there an alternative approach where the batch containing the error row is redirected by the error output and then reprocessed, so that the good records in the batch still reach the correct destination while the error record is sent to the error log table? Is it even a good idea to do so, or is it better to bite the bullet, so to speak, and accept the time it takes?

Thank you in advance.

Ramon Gonzalez

2 Answers


What I've seen done in that situation is a cascading failure approach: attempt to insert into the OLE DB Destination in successively smaller batches, so that as much as possible goes in via batch mode before you start the singleton inserts.

Assume you have a commit size of 10k rows (an arbitrary number; test for your situation). Redirect failing rows to a second OLE DB Destination, still in Fast Load mode but with a commit size of 2.5k rows. Add yet another OLE DB Destination with a commit size of maybe 100, and then have a final destination that runs in RBAR (row-by-row) mode. At that point you can identify the failing rows and do whatever needs to be done with them.
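
In SSIS this is wired up with a chain of OLE DB Destinations and error outputs, but the control flow may be easier to see in plain code. Below is a minimal Python/pyodbc sketch of the same cascading idea, assuming hypothetical dbo.FactSales and dbo.FactSales_Errors tables and a placeholder connection string; it illustrates the pattern, not the actual SSIS implementation.

    # Minimal sketch of the cascading-batch idea outside SSIS, using pyodbc.
    # Table names, columns, and the connection string below are placeholders.
    import pyodbc

    INSERT_SQL = "INSERT INTO dbo.FactSales (SaleId, Amount) VALUES (?, ?)"
    ERROR_SQL = ("INSERT INTO dbo.FactSales_Errors (SaleId, Amount, ErrorMessage) "
                 "VALUES (?, ?, ?)")

    def load(conn, rows, batch_size):
        """Insert rows in batches; when a batch fails, retry it at a quarter of
        the size, eventually falling back to singleton inserts that capture the
        real error for each bad row."""
        cur = conn.cursor()
        for start in range(0, len(rows), batch_size):
            batch = rows[start:start + batch_size]
            try:
                cur.executemany(INSERT_SQL, batch)   # fast, set-based attempt
                conn.commit()
            except pyodbc.DatabaseError as exc:
                conn.rollback()
                if batch_size > 1:
                    # Cascade: retry only the failed batch with a smaller commit size
                    load(conn, batch, max(1, batch_size // 4))
                else:
                    # Singleton insert failed: log the row and the actual error
                    cur.execute(ERROR_SQL, (*batch[0], str(exc)))
                    conn.commit()

    # Example: start with the 10k commit size from the answer above.
    # conn = pyodbc.connect("DSN=DW;Trusted_Connection=yes")  # placeholder DSN
    # load(conn, all_rows, batch_size=10_000)

Only the batches that actually contain a bad row pay the cost of the smaller retries, so the bulk of the load still goes in at batch speed.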

billinkc
  • I'm wondering why you don't commit X rows at a time and, if there are errors, not fail the component but simply redirect the error rows directly to a flat file? See El.HAM's answer. – codeputer Nov 06 '17 at 19:25
  • @codeputer I'm not failing the component. We have the same answer, just different terms. The idea is that you attempt to commit in a large (efficient) size and then pare down to smaller segments. The reason for paring down the failing batches is that there are still some good rows in there and you don't want to throw them out. Where you ultimately direct the rows (flat file, table, etc.) is based on the architecture of your ETL – billinkc Nov 07 '17 at 15:01
  • I understand about the good rows, but if the rows that failed were redirected, would the batch not succeed without the rows that were redirected? – codeputer Nov 07 '17 at 17:51

Well, I used a destination in Fast Load mode and redirected its error rows to another destination pointing at the same destination table, but in row-by-row mode.

It was not as slow as pure row-by-row and not as fast as pure Fast Load, but it worked for me!
I also get the error description and the real error for each failing row.
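
For reference, here is a hypothetical Python/pyodbc sketch of this simpler two-stage variant, with the same placeholder table names as the sketch in the other answer: one bulk attempt per batch, then a row-by-row retry that keeps the real error description for each bad row.

    # Two-stage variant: a bulk "Fast Load" attempt per batch, then a row-by-row
    # fallback that records the error description. Table names are placeholders.
    import pyodbc

    INSERT_SQL = "INSERT INTO dbo.FactSales (SaleId, Amount) VALUES (?, ?)"
    ERROR_SQL = ("INSERT INTO dbo.FactSales_Errors (SaleId, Amount, ErrorMessage) "
                 "VALUES (?, ?, ?)")

    def load_two_stage(conn, rows, batch_size=10_000):
        cur = conn.cursor()
        for start in range(0, len(rows), batch_size):
            batch = rows[start:start + batch_size]
            try:
                cur.executemany(INSERT_SQL, batch)       # bulk insert, like Fast Load
                conn.commit()
            except pyodbc.DatabaseError:
                conn.rollback()
                for row in batch:                        # row-by-row fallback
                    try:
                        cur.execute(INSERT_SQL, row)
                        conn.commit()
                    except pyodbc.DatabaseError as exc:
                        conn.rollback()
                        cur.execute(ERROR_SQL, (*row, str(exc)))  # keep the real error
                        conn.commit()

The trade-off matches what this answer describes: clean batches go in at bulk speed, and only batches that contain a bad row pay the row-by-row cost.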

El.Hum