2

I am new to SSIS and would like the best method for an ETL package creation to load all data initially and then next run to only load changes or new items. I will use the package on a schedule SQL job. I know that I could set SSIS package to truncate the destination table but for me that is a waste of resources and if large tables are concern the transaction log would be huge.

Thanks in advance!

Hadi
  • 36,233
  • 13
  • 65
  • 124
Brian Tran
  • 115
  • 10
  • Usually this is done by having something like a `LastUpdated` date in your source table. You then compare that to the `LastUpdated` value in your destination, and if the source record has a newer date, load it, if not, ignore it. – digital.aaron Nov 01 '18 at 16:10

1 Answers1

3

I think there are more than one method to achieve that:

  1. Adding a reference table that contains the last inserted ID (or primary key value), and this table must be update each time.
  2. Using Change Data Capture (CDC), you can refere to the this article for more information about it: Introduction to Change Data Capture (CDC) in SQL Server 2008 (Note that the database must be configured to enable CDC - may not work with your situation)
  3. Adding a LastUpdate column to the table, and read only date where LastUpdate column value is higher than the last running SSIS job date.
Hadi
  • 36,233
  • 13
  • 65
  • 124