0

I have to create Data Warehouse for travel agency. I'm doing it for the first time. I've learned all the basics about star, snowflake and constellation schema and about creating data warhouses. I would like to ask what could be changed for better and if this design is good overall.

Here are my dimensions hierarchy:

enter image description here

Here is what I've achived for now (creating schema in MySQL Workbench):

enter image description here

anton86993
  • 638
  • 1
  • 9
  • 26
  • 1
    Can you be more specific about which parts of this you're having difficulty with? Do your diagrams show all your fields in those dims? Also I don't think payment type should really include year/month/day/hour/minute – Rich Apr 02 '17 at 14:07
  • 1
    I'm not a big fan of sub-dimensions. Once you start to introduce these the database design quickly resembles the standard [OLTP](https://en.wikipedia.org/wiki/Online_transaction_processing). This removes many of the benerfits offered by a star schema. An exmaple: data via visualisation tools (like [PowerBI](https://powerbi.microsoft.com/en-us/), [QlikView](http://www.qlik.com/en-gb), etc) prefer the flatter dimension tables, recommended by [Kimball](https://en.wikipedia.org/wiki/Dimensional_modeling). – David Rushton Apr 03 '17 at 12:44
  • Would like to point out that the first two comments are based on a previous version of the question. – Rich Apr 04 '17 at 13:01

2 Answers2

1

To take DimClient as an example. You have a nice surrogate key in there. Next you need to fill in all the things about a client (including the clientID) and then also include district, city, region and country. When you have all that in there, that dimension is complete.

You link to it in your Fact table by the ClientKey, so you need to put that Key in the Fact table as a foreign key.

Go through a similar process with your other dimensions, filling out both the dimensions and facts, and you'll be in a good shape. You don't need subdimensions to reflect your hiearchies: dimensions are denormalised.

Edit: The question was originally quite different, hence the answer above which was relevant to its original form.

Rich
  • 2,207
  • 1
  • 23
  • 27
  • Why should I create star schema here instead of snowflake? Our lecturer told us to make it more like normalized schema (snowflake). Why? – anton86993 Apr 02 '17 at 14:32
  • I don't know why your lecturer told you to do that, but if you want good marks, you should probably do what they said! If you want to do a dimensional model in the Kimball style, you should avoid snowflaking (normalising) where possible, because that reduces the benefits of doing things a dimensional way. If you want to snowflake, make subdimensions, also with surrogate keys, and link to them with foreign keys from your dimensions, like a 'normal' normalised model. Really happy to help if you have specific questions- just ask. – Rich Apr 02 '17 at 14:47
  • As soon as I get home I'll make it and show here my results :) – anton86993 Apr 02 '17 at 14:59
  • I'll look forward to it! – Rich Apr 03 '17 at 08:50
  • I've updated my post. I've created DW as you said and based it on Star Schema. – anton86993 Apr 03 '17 at 14:54
  • Thanks! However your question now doesn't state what problems you have and what you are asking. It is too broad to just say 'give me suggestions to improve my model'. Although I can spot lots of suggestions I'd made to improve it, really this should be led by you – Rich Apr 04 '17 at 07:45
  • You've said in your question that you have a problem with its final form. What are your problems? Or are you just after general advice and comments on what could be done better? – Rich Apr 04 '17 at 11:15
  • I would like to ask if the final form is in a good shape, what could be changed for better. Any advices would be nice :) – anton86993 Apr 04 '17 at 11:46
0

Here's a new answer based on the revised question. There are a number of things you might want to look at for this design. Here's a few pointers but not a complete list:

  • What granularity is your DimTime dimension supposed to be? Normally you have a date dimension at the day/date granularity, but in your table it looks like weeks.

  • You could create a separate time of day dimension if that is important for analysis of when sales or satisfaction reviews were.

  • The loyalty fact seems to be a summary of customer behaviour over a time period- is that supposed to be weeks? If so you could go for an extra dimension at the week level

  • Why does payment type have seconds of the day in it? That doesn't seem right- payment types aren't to do with seconds in a day. Perhaps this is your missing time of day dimension, and payment type should be separate?

  • Should the product dimension have a regional hierarchy? Are you saying a product is different if it is in a different City? You might want to look at that again.

I'm sure other suggestions could be found, good luck with your course!

Rich
  • 2,207
  • 1
  • 23
  • 27
  • 1.We wanted to make date starting from year then quarter, month, week then day of the week without time of the day; 2. I guess it is just for educational purposes not for implementation in real life, so it does not have to be so detailed :); 3.I guess it is good. We wanted to check customer loyalty at max day lvl cause it is travel agency.; 4. PaymentType does not have seconds. It has Minutes then PaymentId or I don't get it; 5. In my country we have voivodeships so I guess it should be called Province instead of Region am I right? – anton86993 Apr 04 '17 at 13:56
  • 1. I agree, you don't want time of day, but it should at least be 'date'. 4. Why minutes at all? A payment type is a type of payment- it has nothing to do with minutes, I'd think! 5. Products aren't regional at all, is what I'm saying. Good luck with your studies – Rich Apr 04 '17 at 14:16