I have a question in regards to star schema design, whether I need to use snowflake (which I read should be avoided). I have the following three dimension tables:
- Main list dim. - contains list of people
- Sub lists dim. - contains all sorts of combinations from the main list
- Program dim. - identify lists of programs, each program could be connected to a sub list
Each row in the fact table will contain keys from the following three tables (and metrics), but the issue is this - some sub lists can be the exact list (in terms of list content) but pointing to different programs. So should I create in the sub list dimension repetition of the same content or should I use snow flake to connect between the sub lists and the programs? Example - assuming my main list contain 100K records, and I have 3 programs A, B and C. Program A has 10K sub list so I will have 10K entries in the sub list dimension, however program B and C have the same sub list with 30K records, so should I create 60K entries, 30K per each?? Important to note that there are other attributes in the program DIM that differentiate between each program, and the fact data is at the program level.
Thanks!