I have a denormalized table, say Sales, that looks like this:
SalesKey, plus SalesOfParts, SalesOfEquipments and CostOfSales as numeric measures, and Industry, Country, State, Sales area, Equipment id, Customer id, year of sale, month of sale and a few more similar columns as dimensions (12 dimensions in total).
I need to support aggregation queries on Sales, such as the total number of sales in a given year or month, their total cost, and so on. These aggregates also need to be filtered, e.g. total sales in year 2013, month 04, belonging to the Manufacturing industry for customer XYZ.
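To make the query shape concrete, here is a minimal sketch of one such filtered aggregate. It uses Python's built-in sqlite3 module purely as a stand-in for Hive/Impala, and the table name, column names and sample rows are made up for illustration; they are not my real schema.

```python
import sqlite3

# In-memory SQLite stands in for Hive/Impala; all names below are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales (
        sales_key INTEGER, sales_of_parts REAL, sales_of_equipments REAL,
        cost_of_sales REAL, industry TEXT, customer_id TEXT,
        sale_year INTEGER, sale_month INTEGER
    )
""")

# Made-up sample rows (only a few of the 12 dimensions are shown).
rows = [
    (1, 100.0, 50.0, 80.0, "Manufacturing", "XYZ", 2013, 4),
    (2, 200.0, 0.0, 120.0, "Manufacturing", "XYZ", 2013, 4),
    (3, 300.0, 10.0, 150.0, "Retail", "XYZ", 2013, 4),
    (4, 400.0, 20.0, 90.0, "Manufacturing", "ABC", 2013, 5),
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Number of sales and their total cost for one (year, month, industry, customer).
query = """
    SELECT COUNT(*) AS num_sales, SUM(cost_of_sales) AS total_cost
    FROM sales
    WHERE sale_year = ? AND sale_month = ? AND industry = ? AND customer_id = ?
"""
num_sales, total_cost = conn.execute(
    query, (2013, 4, "Manufacturing", "XYZ")
).fetchone()
print(num_sales, total_cost)  # prints: 2 200.0
```

The real queries can filter on any subset of the 12 dimensions, which is why precomputing every combination looks infeasible to me.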
The dimension tables and the fact table are in Hive/Impala.
I do not think I can build a cube over all the dimensions. I read a paper on how to do OLAP over many dimensions: http://www.vldb.org/conf/2004/RS14P1.PDF
It basically suggests materializing cubes over small fragments and doing some kind of runtime computation when a query spans multiple cubes.
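Here is how I currently picture that idea, as a toy Python sketch. This is only my reading, not necessarily the paper's exact method: I assume the dimensions are split into small fragments, each fragment keeps the set of row ids for every group-by cell within it, and a query that spans fragments intersects those sets and aggregates the surviving rows at runtime. The fragment split, the sample rows and the function names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Toy fact rows; the list position serves as the row id.
ROWS = [
    {"industry": "Manufacturing", "country": "US", "year": 2013, "month": 4, "cost": 80.0},
    {"industry": "Manufacturing", "country": "DE", "year": 2013, "month": 4, "cost": 120.0},
    {"industry": "Retail", "country": "US", "year": 2013, "month": 5, "cost": 150.0},
    {"industry": "Manufacturing", "country": "US", "year": 2012, "month": 4, "cost": 90.0},
]

# Hypothetical split of the dimensions into small fragments.
FRAGMENTS = [("industry", "country"), ("year", "month")]


def build_fragment_cubes(rows, fragments):
    """For each fragment, map every group-by cell to the set of matching row ids."""
    cubes = {}
    for fragment in fragments:
        cells = defaultdict(set)
        # Cover every non-empty subset of the fragment's dimensions.
        for size in range(1, len(fragment) + 1):
            for dims in combinations(fragment, size):
                for row_id, row in enumerate(rows):
                    key = (dims, tuple(row[d] for d in dims))
                    cells[key].add(row_id)
        cubes[fragment] = cells
    return cubes


def query_total_cost(rows, cubes, filters):
    """Return (row count, SUM(cost)) for equality filters spanning any fragments."""
    known = {d for fragment in cubes for d in fragment}
    unknown = set(filters) - known
    if unknown:
        raise ValueError(f"dimensions not covered by any fragment: {sorted(unknown)}")

    matching = set(range(len(rows)))
    for fragment, cells in cubes.items():
        # Filtered dimensions of this fragment, kept in fragment order.
        dims = tuple(d for d in fragment if d in filters)
        if not dims:
            continue
        key = (dims, tuple(filters[d] for d in dims))
        # Runtime step: intersect the row-id sets coming from different fragments.
        matching &= cells.get(key, set())
    return len(matching), sum(rows[i]["cost"] for i in sorted(matching))


cubes = build_fragment_cubes(ROWS, FRAGMENTS)
result = query_total_cost(
    ROWS, cubes, {"industry": "Manufacturing", "year": 2013, "month": 4}
)
print(result)  # prints: (2, 200.0)
```

What I cannot see is how to map these per-fragment structures and the runtime intersection onto Hive/Impala tables and queries.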
I am not sure how to implement this model in Hive/Impala. Any pointers or suggestions would be awesome.
EDIT: I have about 10 million rows in the Sales table. The number of dimensions is nowhere near 100; it is around 12 (and might go up to 15), but each dimension has a fairly high cardinality.