
I need a mental process to design an OLAP database...

Essentially for standard relational it'd be (loosely):

Identify Entities
Identify Relationships
Identify Properties of Entities

For each property:

Ensure property can be related to only one entity
Ensure property is directly related to entity

For OLAP databases, I understand the terminology, the motivation and the structure; however, I have no clue as to how to decompose my relational model into an OLAP model.

R. Barzell
stevenrcfox

2 Answers


Identify Dimensions (or "By's"). These are anything that you may want to analyse or group your report by. Every table in the source database is a potential Dimension. Dimensions should be hierarchical if possible, e.g. your Date dimension should have a Year, Month, Day hierarchy. Similarly, Location should have, for example, a Country, Region, City hierarchy. This will allow your OLAP tool to calculate aggregations more efficiently.

Identify Measures. These are the KPIs, or the actual numerical information your client wants to see. They are usually capable of being aggregated, so any non-flag, non-key numeric field in the source database is a potential Measure.

Arrange in a star schema, with Measures in the central 'Fact' table and FK relations to the applicable Dimension tables. Measures should be stored at the lowest level of each dimension hierarchy.
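
As a rough illustration of the three steps above, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`dim_date`, `dim_location`, `fact_sales`, `amount`) and the sample rows are made up for the example; a real warehouse would normally sit on a dedicated OLAP engine rather than SQLite.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension tables: each carries a hierarchy to group by.
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,   -- e.g. 20140813
            year INTEGER, month INTEGER, day INTEGER
        );
        CREATE TABLE dim_location (
            location_key INTEGER PRIMARY KEY,
            country TEXT, region TEXT, city TEXT
        );
        -- Fact table: measures plus one FK per dimension,
        -- stored at the lowest level (day, city).
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date(date_key),
            location_key INTEGER REFERENCES dim_location(location_key),
            amount REAL
        );
    """)
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?, ?)",
        [(20140812, 2014, 8, 12), (20140813, 2014, 8, 13), (20140901, 2014, 9, 1)],
    )
    conn.executemany(
        "INSERT INTO dim_location VALUES (?, ?, ?, ?)",
        [(1, "UK", "Scotland", "Glasgow"), (2, "UK", "England", "Leeds")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(20140812, 1, 10.0), (20140813, 1, 5.0),
         (20140813, 2, 7.5), (20140901, 2, 20.0)],
    )
    return conn


def sales_by(conn, level):
    """Sum the 'amount' measure grouped by one dimension hierarchy level."""
    # Whitelist of groupable levels, so no user text is placed into the SQL.
    columns = {
        "year": "d.year", "month": "d.month",
        "country": "l.country", "region": "l.region", "city": "l.city",
    }
    expr = columns[level]
    rows = conn.execute(
        f"""SELECT {expr}, SUM(f.amount)
            FROM fact_sales f
            JOIN dim_date d ON d.date_key = f.date_key
            JOIN dim_location l ON l.location_key = f.location_key
            GROUP BY {expr} ORDER BY {expr}"""
    ).fetchall()
    return dict(rows)
```

Because the facts are stored at day and city level, the same table can be rolled up to any higher level, e.g. `sales_by(conn, "month")` or `sales_by(conn, "region")`.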

Identify the 'Grain' of the fact table. This is essentially the 'level of detail' held. It is usually determined by the reporting requirements, the data granularity available in the source and the performance requirements of the reporting solution. You may identify the grain as you go, or you may approach it as a final step once all the important data has been identified. I tend to have a final step to ensure the grain is consistent between my fact tables.

The final step is identifying slowly changing dimensions and the requirements for these. For example, if the customer dimension includes an element of their address and they move, how is that to be handled?
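
One common way to handle the customer-moves case is to keep history by closing the old dimension row and adding a new one, often called a "Type 2" slowly changing dimension. That choice is an assumption here, not something the steps above prescribe, since overwriting the old value is equally valid when history does not matter. The sketch below uses hypothetical names (`CustomerRow`, `apply_move`, `city_as_of`):

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerRow:
    surrogate_key: int               # key the fact table would reference
    customer_id: str                 # business key from the source system
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current row


def apply_move(rows: List[CustomerRow], customer_id: str,
               new_city: str, moved_on: date) -> List[CustomerRow]:
    """Close the customer's current row (if any) and append a new version."""
    current = next(
        (r for r in rows if r.customer_id == customer_id and r.valid_to is None),
        None,
    )
    if current is not None:
        if current.city == new_city:
            return rows              # nothing changed, keep history as is
        current.valid_to = moved_on  # end-date the old version
    next_key = max((r.surrogate_key for r in rows), default=0) + 1
    rows.append(CustomerRow(next_key, customer_id, new_city, moved_on))
    return rows


def city_as_of(rows: List[CustomerRow], customer_id: str,
               when: date) -> Optional[str]:
    """Return the city that was valid for the customer on a given date."""
    for r in rows:
        if (r.customer_id == customer_id and r.valid_from <= when
                and (r.valid_to is None or when < r.valid_to)):
            return r.city
    return None
```

Facts recorded before the move keep pointing at the old surrogate key, so historical reports still group them under the old city.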

stevenrcfox
  • Non-numeric fields can be aggregated too with the count function, right? – kazinix Aug 13 '14 at 06:48
  • @dpp, agreed. Non-numeric fields can indeed be counted, which may be useful for text fields with a limited number of options (e.g. status fields). Numeric fields provide many more aggregation options (avg, percentage, std. dev., etc.). Note that there is also numeric data that should not be aggregated (average of averages!!) or for which only some types of aggregation make sense. – stevenrcfox Aug 19 '14 at 16:45
  • Which book is a good one for explaining the above concepts, @stevenrcfox? – Sudhir N Mar 21 '17 at 16:46
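
The "average of averages" warning in the comments above can be checked with a few lines of Python. The store names and order values are invented for the illustration:

```python
from statistics import mean

# Hypothetical order values per store; store A has many more orders than B.
orders_by_store = {"A": [10.0, 20.0, 30.0, 40.0], "B": [100.0]}


def average_of_averages(groups):
    """Average the per-group averages (ignores how many rows each group has)."""
    return mean([mean(values) for values in groups.values()])


def true_average(groups):
    """Average over all rows, computed from the additive parts: sum and count."""
    total = sum(sum(values) for values in groups.values())
    count = sum(len(values) for values in groups.values())
    return total / count


print(average_of_averages(orders_by_store))  # 62.5
print(true_average(orders_by_store))         # 40.0
```

The two results differ because store B's single order gets the same weight as store A's four. Storing the sum and the count as measures and dividing at query time avoids the problem.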

One important point when identifying the Dimensions and Measures is the final cardinality (grain) that you choose for the model. Let's say data is entered into your relational database throughout the day. Maybe you don't need to visualize or aggregate the measures by hour, or even by day. You can choose weekly granularity, monthly, etc.
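
As a small sketch of that idea, the function below collapses timestamped source rows to a monthly grain before they would be loaded into a fact table. The function name `roll_up_to_month` and the `(timestamp, amount)` row shape are assumptions for the example:

```python
from collections import defaultdict
from datetime import datetime


def roll_up_to_month(events):
    """Sum amounts per (year, month), discarding day and time of day."""
    totals = defaultdict(float)
    for timestamp, amount in events:
        totals[(timestamp.year, timestamp.month)] += amount
    return dict(totals)


# Hypothetical source rows entered at different times of day.
events = [
    (datetime(2014, 8, 12, 9, 30), 10.0),
    (datetime(2014, 8, 12, 17, 5), 5.0),
    (datetime(2014, 9, 1, 8, 0), 20.0),
]
monthly = roll_up_to_month(events)
```

The trade-off is that once the rows are stored at month level, daily or hourly reports are no longer possible without going back to the source.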

Norberto108