0

I have a Data Warehouse which uses internal surrogate keys and type 2 slowly changing dimensions. In the clearing we just have the business keys from the erp-system, like this:

Image of Clearing, sorry don't have enough reputation to post it directly or more than two links

In the Data Warehouse we want to use the surrogate keys instead (Note: Article price changed from 500$ to 1000$ and articles is using surrogate keys where possible, here only for manufacturer).

Image of Data Warehouse

If we were just using the business keys it's no problem, just compare, update old entries, insert new entries. But what's the best way to do this with surrogate keys?

Get existing Ids (0 or -1 for not existing yet) from the Data Warehouse in the clearing and then compare the entries?

enter image description here

Keep the business keys in the Data Warehouse aswell, compare them and update Ids then in the Data Warehouse?

i.pictr.com/3kqn6wb9ou.png

mmarie
  • 5,598
  • 1
  • 18
  • 33
Josh Alvo
  • 96
  • 1
  • 6
  • 1
    not clear what is your requirement. If you want to load data from source clearing to dwh - you have to generate surrogate keys - for this you also keep source system primary keys in dwh table along with surrogate keys. and while loading you do lookup of data in dwh table based on source/natural keys and if record is existing you compare columns you want to track changes for and incase of change of values you update existing row's validuntil and insert a new row with new validfrom.. – garpitmzn Feb 17 '14 at 12:16
  • So you mean best practice would be to keep the business keys in the data warehouse as in the last [screenshot](http://i.pictr.com/3kqn6wb9ou.png) ? – Josh Alvo Feb 17 '14 at 12:21
  • 1
    yes it makes sense to keep source business keys in dwh – garpitmzn Feb 17 '14 at 17:02

1 Answers1

0

to be able to do lookups when loading tables - just like when referencing a manufacturer while loading the articles, you have to store the natural/business keys in the DWH. From my experience, this is always done.

But you should store the business keys of a source entity only in the destination entity. Let me clarify, business key of the manufacturer should only appear in the Manufacturer table in your DWH, not elsewhere. When you need to reference the manufacturer in different table, such as Article, you use your surrogate key of the manufacturer.

So, you got it right in the second screenshot.

Then, when you load Article table and you need to know if the manufacturer changed for a specific article, you first look up the manufacturer's surrogate key based on its business key and compare the surrogate key with the key in the Article table. This is how it is usually done.

michalko
  • 166
  • 2