I have created an SSIS package to extract and load data from access table (.mdb) named TmShipping to a SQL table say, TmShippingImport, this SQL table has Id and importedDate as additional columns. I have scheduled the package to run for every 30 minutes.
TmShipping.mds
----------------------------------------
OrderId CarrierId TotalCharge
TmShippingImport (SQL table)
-------------------------------------------------------
Id OrderId CarrierId TotalCharge importedDate
In the Data Flow Task:
I am getting the data from source using OLD EB connection and extracting all row data from the access table and the output of this is connected to a Recordset Destination so that I can extract each row.
In the Control flow task:
I have a loop container (connected to data flow task's o/p) which inserts each row into the sql table by a sql query and loads all the row data along with the current datetime.
Package execution
The SSIS package when executed for the first time loads each row into the SQL Table and add a DateTime to imporatDatetime Column. When new records are created in the Access table, the package now takes all the rows in the MS access (rows that were extracted previously and new rows) and loads them in the SQL table again. How to avoid duplicates? My primary key in SQL table is ID which is not present in the MS Access Table.
Tried using Lookup table in the Dataflow in between source and recordset destination but it failed saying I can't connect the available column to BLOB.
Should I be trying with Lookup Merge in the dataflow or should I make changes in the foreachloop container so that it checks for duplicates before inserting the rows into the sql table or...?