0

I have a question in regards to star schema design, whether I need to use snowflake (which I read should be avoided). I have the following three dimension tables:

  • Main list dim. - contains list of people
  • Sub lists dim. - contains all sorts of combinations from the main list
  • Program dim. - identify lists of programs, each program could be connected to a sub list

Each row in the fact table will contain keys from the following three tables (and metrics), but the issue is this - some sub lists can be the exact list (in terms of list content) but pointing to different programs. So should I create in the sub list dimension repetition of the same content or should I use snow flake to connect between the sub lists and the programs? Example - assuming my main list contain 100K records, and I have 3 programs A, B and C. Program A has 10K sub list so I will have 10K entries in the sub list dimension, however program B and C have the same sub list with 30K records, so should I create 60K entries, 30K per each?? Important to note that there are other attributes in the program DIM that differentiate between each program, and the fact data is at the program level.

Thanks!

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
karel lahmy
  • 397
  • 1
  • 5
  • 15
  • If the question is, "Is it OK to have multiple records in a dimension table that have the same attributes?", then the answer would be, "No". Are these dimensions independent of each other, or is this a hierarchy? I'm having trouble understanding the situation here. – David Aldridge Aug 25 '15 at 07:23
  • The records in sub lists dimension will have different sub list id. You can look at sub list and main list as hierarchy as sub list is always consists of records from that table but there could be endless permutations of those, so not sure whether it is pure hierarchy. – karel lahmy Aug 25 '15 at 12:41
  • Could you show the table structure for each table and some example data for each? The data can be fake if privacy is an issue, as long as it mimics the nature of the real data. – Jo Douglass Aug 28 '15 at 15:22
  • I think I found a solution to the issue, it was business related rather than technical one, thanks for the help anyway. – karel lahmy Sep 03 '15 at 19:29

0 Answers0