I am working on a task where my source is AWS RDS - SQL Server and my target is Azure SQL Server.
There's a table with 80M records in my source that needs to be merged with my target table.
This merging will happen every 15 mins and based on the business key
, I need to -
- Update the target table if the key is updated in the source table.
- Insert a new key into the target table.
- Mark
isDeleted
astrue
in the target if the key is no more present in the source.
IMP Note - The source row is hard-deleted and no history is maintained.
Since this merging happens every 15 mins and the source table is pretty big, I use lastUpdated
column to select only limited records in the source query of my merge query.
With this, I am able to perfectly handle the "upsert" scenario, but on delete, it is deleting all the records from the target which is not desirable.
I have tried the below option -
Read the entire source table in a temp_table
every 15 mins and then perform merge from temp_table
to the target table. But this is very costly in terms of processing and time.
Is there any better way to handle this scenario? I am happy to share more information as needed.