I am building my first datawarehouse in SQL 2008/SSIS and I am looking for some best practices around loading the fact tables.
Currently in my DW I have about 20 Dimensions (Offices, Employees, Products, Customer, etc.) that are of Type 1 SCD. In my dw structure, there are a few things I have already applied:
- No Nulls (replaced with blank for text or 0 for numeric during staging)
- unknown key members populated in each dimension (SK ID 0)
- UPSERT for SCD Type 1 loading from stage to production table
- SELECT DISTINCT for my loading of dimensions
In my Fact loading SSIS project, the current method I have for loading dimensions is having multiple lookups (20+) to each of the DIMs, then populating the FACT table with the data.
For my lookups I set:
- Full Cache
- Ignore Failures for "no matching entries"
- Derived Transformation with "ISNULL(surrogate_idkey) ? 0 : surrogate_idkey" for each SK so that if lookups fail they will default to the SK ID 0 (unknown member).
- Some of my dimension lookups have more than one business key
Is this the best approach? Pictures attached to help with my description above.