I am considering using a Dimensional Modeling/Star Schema warehouse to support an existing Business Intelligence reporting process (and obviously I am hoping to support other BI tasks that haven't even been though of yet), and I have a very basic problem understanding how my end users would write what I think should be a very simple query.
Consider a fact table in a SQL star schema relating to the business process of sales with a grain of a single order. The current BI report collates data by time and geography (a common one would be one row per each week for each US state). It reports the actual sales numbers in many columns that collectively are exhaustive and mutually exclusive (MECE: if you were to add them all together you would know how many sales happened in that week in that state).
There could be many of these columns, in some reports there are hundreds as it is sliced by different products, sales channels, etc.
How do end users write queries like this easily, and ensure they are MECE? Obviously you can use large CASE ... WHEN
blocks, but that relies on the user or BI tool enforcing rules that catch everything. What is a common technique that would mirror the task of for each combination of sales channel and product in this list of combinations, make a column. For any combination not in the list, group into 'other'
?
I can imagine some PL/* scripts that will write unholy CASE ... WHEN
blocks where the last column will have hundreds of conditional statements, but I have a hard time believing that is an efficient way to do this.
Sample query using case when blocks:
SELECT
date.week,
geography.state,
CASE WHEN channel.channel = 'DIRECT'
AND product.name = 'ITEM 1'
THEN SUM(sales.values)
ELSE 0 as direct_item_1,
CASE WHEN channel.channel = 'AFFILIATE'
AND product.name = 'ITEM 5'
THEN SUM(sales.values)
ELSE 0 as affiliate_item_5,
CASE WHEN NOT ((channel.channel = 'DIRECT'
AND product.name = 'ITEM 1') OR (channel.channel = 'AFFILIATE'
AND product.name = 'ITEM 5'))
THEN SUM(sales.values)
ELSE 0 as other
FROM
sales,
channel,
product,
date,
geography
WHEN
date.date_key = sales.date_key AND
geography.geo_key = sales.geo_key AND
channel.channel_key = sales.channel_key AND
product.product_key = sales.product_key
GROUP BY
date.week,
geography.state