0

I am considering using a Dimensional Modeling/Star Schema warehouse to support an existing Business Intelligence reporting process (and obviously I am hoping to support other BI tasks that haven't even been though of yet), and I have a very basic problem understanding how my end users would write what I think should be a very simple query.

Consider a fact table in a SQL star schema relating to the business process of sales with a grain of a single order. The current BI report collates data by time and geography (a common one would be one row per each week for each US state). It reports the actual sales numbers in many columns that collectively are exhaustive and mutually exclusive (MECE: if you were to add them all together you would know how many sales happened in that week in that state).

There could be many of these columns, in some reports there are hundreds as it is sliced by different products, sales channels, etc.

How do end users write queries like this easily, and ensure they are MECE? Obviously you can use large CASE ... WHEN blocks, but that relies on the user or BI tool enforcing rules that catch everything. What is a common technique that would mirror the task of for each combination of sales channel and product in this list of combinations, make a column. For any combination not in the list, group into 'other'?

I can imagine some PL/* scripts that will write unholy CASE ... WHEN blocks where the last column will have hundreds of conditional statements, but I have a hard time believing that is an efficient way to do this.

Sample query using case when blocks:

SELECT date.week, geography.state, CASE WHEN channel.channel = 'DIRECT' AND product.name = 'ITEM 1' THEN SUM(sales.values) ELSE 0 as direct_item_1, CASE WHEN channel.channel = 'AFFILIATE' AND product.name = 'ITEM 5' THEN SUM(sales.values) ELSE 0 as affiliate_item_5, CASE WHEN NOT ((channel.channel = 'DIRECT' AND product.name = 'ITEM 1') OR (channel.channel = 'AFFILIATE' AND product.name = 'ITEM 5')) THEN SUM(sales.values) ELSE 0 as other FROM sales, channel, product, date, geography WHEN date.date_key = sales.date_key AND geography.geo_key = sales.geo_key AND channel.channel_key = sales.channel_key AND product.product_key = sales.product_key GROUP BY date.week, geography.state

ajknox
  • 13
  • 5
  • 1
    You could define the relationship using a SSAS cube. End users would not write queries. They would select the measures and dimensions from the cube. – devlin carnate Apr 25 '16 at 16:12
  • get started you can always get help optimizing it.. – RanchiRhino Apr 25 '16 at 16:14
  • you should share sample of your data and table structure for more info... – RanchiRhino Apr 25 '16 at 16:16
  • I haven't actually built this yet, so there is no sample data. It is currently supported with a non-DW process and I am trying to figure out what these queries would look like if we moved to a star schema – ajknox Apr 25 '16 at 16:37
  • So... what you're wanting is to select grouping by `week`, `state`, `channel` and `product`, summing the `sales.values` column, filtered to certain combinations of these dimensions, and your issue is that you want it to come out as columns instead of rows? – Jeff May 30 '16 at 06:33

0 Answers0