Data Warehouse - MDX query to calculate top 5 universities

Question

I am trying to create a star schema to analyze rankings on universities.

I created star schema which is at the below;

fact table;

Ranking

University_id
Time_id
Classification_id
ranking(measure)

the dimensions and hierarchies;

University_id

state
city
university_name

time_id

year
quarter
month

classification_id

category( it can be Engineering, or Social sciences etc.)
type ( it can be computer engineering, civil engineering etc.)
type_info (info about the type)

I want to write a MDX query that lists the top-5 universities in California in terms of Computer Engineering in 2011. The result should consists of university name and it's rating.

Here is my MDX query;

SELECT Measures.ranking ON COLUMNS
HEAD (ORDER ({[University_id].[California].[Members].[Children],
Measures.[ranking], BDESC, 5}) ON ROWS
WHERE (Measures.[ranking], [Time_id].[Year].[2011], [classification_id].[engineering].  [computer Engineering])

Is my design of Star schema and my relevant mdx query correct and efficient? Otherwise how can I improve my solution or make it correct? I also created the star schema therefore we can change it to improve the design.

Why care about efficiency? How many universities are there? How long does it take? How many universities will there be in 50 years? This data should be so tiny, there is no reason to doubt the SQL optimizer will do a good enough job, without *benchmarking*. — Has QUIT--Anony-Mousse, Apr 20 '14 at 22:49
Yes, you are right. Is it correct then and written in an widely accepted style? — yns, Apr 21 '14 at 05:43

score 0 · Answer 1 · answered Apr 20 '14 at 23:06

0

Instead of combining head/order, I would use the TopCount function.

answered Apr 20 '14 at 23:06

Marc Polizzi

9,275
3
36
61

Thanks for your suggestion. But combining head/order(my query) will also work truly, right? – yns Apr 21 '14 at 05:45
Should yes; TopCount might be faster. – Marc Polizzi Apr 21 '14 at 22:59

score 0 · Accepted Answer · answered Apr 23 '14 at 12:03

The general idea is fine. Some remarks, however:

I would call the dimensions "University", "Time", etc. instead of "University_id" etc. The id is an internal key that is in many cases not exposed to users, and users think in terms of "University", not in terms of "id".
If you are using measures on the columns, you may not use them in the slicer axis (aka WHERE clause).
I am not sure how your ranking measure looks like. But if it really is a rank in the sense that lower numbers are better, you should order ascending instead of descending, or use BottomCount.

Data Warehouse - MDX query to calculate top 5 universities

2 Answers2