0

We are implementing a new dwh solution. I have many dimensions that require slowly changing type 2 attributes. I was considering implementing a combination of Type 2 and Type 1 attributes in my dimension. That is for some dimension attributes, we track history by inserting new rows in the dim table (Type2), for other attributes we will just update the existing row for any changes (Type1)

Questions:

  • Is this a good practice? is it OK to have a combination of SCD 1 and 2 for the same dim?

  • Is there any limit on the number of SCD 2 attributes in a dimension? My dimension is pretty wide, like 300 cols, out of which one of the users is requesting for about 150 cols to be tracked by scd type 2. Is it OK to have so many scd2 attributes in a dim? Is there going to be any impact on performance of downstream reporting BI solutions like cubes and dashboards because of this?

  • In the OLTP system, we maintain an "audit" table to log any updates. Though this is not in a very easily queryable format, we get answers to most of our questions related to changes from this. We don't need much reporting on data changes. Of course there are some important columns like Status for which we definitely need SCD2 but rest of the columns, I am not sure having history for lot of other columns in the DWH adds any value. My question is when we have this audit table in OLTP, how do I decide what attributes need SCD 2 in the DWH?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
siri
  • 1

1 Answers1

1

Good practice? Yes. Standard feature of dimensional modelling that is overlooked too often. I've seen dimensions with combinations of SCD0, SCD1 and SCD2, and there's nothing to prevent other SCD-types being used as well.

No limit on columns, but that does seem a little excessive. You probably want to use a "hash" method to detect the SCD2 changes, where you calculate a hash over the SCD2 columns, and use this value to detect if any of the columns have changed.

Sorry, but I don't understand the question about audit logs. Are these logs your data source?

Ron Dunn
  • 2,971
  • 20
  • 27