I need to merge two applications that reside in entirely separate SQL Server databases. Each database maintains its own internal IDs, which are sequential integers in the 5,000,000 range. I am using a tool that imports the data via standard APIs.
The key challenge is preserving referential integrity between parent and child records once the datasets are merged; see the example below. Both applications have identical data structures, but their data is unique. We would like all of the data to reside in one system.
One example is the project object, which has multiple tasks, and each task can have multiple resources assigned. The value in parentheses is the internal database sequence ID (primary key) for each object type. Each project is unique and each task is unique per project, but the same resource can be assigned across projects and tasks.
System 0
  Project 1 (PROJECT ID 5000001)
    Task A (TASK ID 5000001)
      Resource X (RESOURCE ID 5000001)
      Resource Y (RESOURCE ID 5000002)
    Task B (TASK ID 5000002)
      Resource Y (RESOURCE ID 5000002)
      Resource Z (RESOURCE ID 5000003)
  Project 2 (PROJECT ID 5000002)
    Task A (TASK ID 5000003)
      Resource Z (RESOURCE ID 5000003)
    Task B (TASK ID 5000004)
      Resource X (RESOURCE ID 5000001)
System 1
  Project 3 (PROJECT ID 5000001)
    Task C (TASK ID 5000001)
      Resource F (RESOURCE ID 5000001)
      Resource G (RESOURCE ID 5000002)
    Task D (TASK ID 5000002)
      Resource G (RESOURCE ID 5000002)
      Resource H (RESOURCE ID 5000003)
  Project 4 (PROJECT ID 5000002)
    Task A (TASK ID 5000003)
      Resource H (RESOURCE ID 5000003)
    Task B (TASK ID 5000004)
      Resource F (RESOURCE ID 5000001)
From the data above, you can see how merging Project 1 from System 0 would overwrite Project 3 in the target System 1 due to the pre-existing PROJECT ID 5000001.
My question is: how can I merge the data while maintaining referential integrity? My first thought is to create a view that combines the data from both systems for each object type, somehow assign new internal IDs to the records that will be imported, store each new ID alongside the old one, and cascade the new ID to the related objects. Is there a simpler approach?
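To make the remapping idea concrete, here is a minimal sketch in Python of what I have in mind. It is an illustration only: the row layout, the column names (`id`, `project_id`, `task_id`, `resource_id`), the `assignment` junction rows, and the `remap` function are all hypothetical and not the real schema. New IDs are allocated as "highest target ID + 1" purely for the example; in practice I would expect the import API or an identity column to hand back the new ID.

```python
from itertools import count

# Hypothetical in-memory rows standing in for the source system (System 0).
# Column names are illustrative, not the real schema.
source = {
    "project": [{"id": 5000001, "name": "Project 1"}],
    "task": [
        {"id": 5000001, "project_id": 5000001, "name": "Task A"},
        {"id": 5000002, "project_id": 5000001, "name": "Task B"},
    ],
    "resource": [
        {"id": 5000001, "name": "Resource X"},
        {"id": 5000002, "name": "Resource Y"},
        {"id": 5000003, "name": "Resource Z"},
    ],
    # Junction rows: which resource is assigned to which task.
    "assignment": [
        {"task_id": 5000001, "resource_id": 5000001},
        {"task_id": 5000001, "resource_id": 5000002},
        {"task_id": 5000002, "resource_id": 5000002},
        {"task_id": 5000002, "resource_id": 5000003},
    ],
}

# Assumed highest ID already used in the target (System 1), per object type.
target_max_id = {"project": 5000002, "task": 5000004, "resource": 5000003}


def remap(source, target_max_id):
    """Return copies of the source rows with fresh IDs and rewritten foreign keys."""
    id_map = {}  # (object type, old ID) -> new ID
    merged = {}

    # Pass 1: give every project, task, and resource a new primary key
    # and remember the old-to-new mapping.
    for kind in ("project", "task", "resource"):
        next_id = count(target_max_id[kind] + 1)
        merged[kind] = []
        for row in source[kind]:
            new_row = dict(row)  # copy so the source rows stay untouched
            new_row["id"] = id_map[(kind, row["id"])] = next(next_id)
            merged[kind].append(new_row)

    # Pass 2: rewrite foreign keys using the map built in pass 1.
    for task in merged["task"]:
        task["project_id"] = id_map[("project", task["project_id"])]
    merged["assignment"] = [
        {
            "task_id": id_map[("task", a["task_id"])],
            "resource_id": id_map[("resource", a["resource_id"])],
        }
        for a in source["assignment"]
    ]
    return merged, id_map


merged, id_map = remap(source, target_max_id)
```

The point of keeping `id_map` keyed by object type is that the same number (for example 5000001) means a different thing for projects, tasks, and resources, so each type needs its own old-to-new lookup. The two passes matter because a child row cannot be rewritten until its parent's new ID is known.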
Are there tools which can automate this?