I'm designing on my first real Kimball-style star schema dimensional model. I've slogged through a lot of his books, and I'm part-way through Star Schema - The Complete Reference by Adamson (which I find much more practical and straightforward than the Kimball books). I've not been able to find an answer to this question. Please help! A reference to an article or book would be fine.
In dimensional modeling, what is the canonical way to model the relationship between a fact and its corresponding baseline(s), projection(s), or goal(s)?
For instance, suppose that, for company A, in year 2016, its actual sales were $1 million. That's clearly a fact.
Additionally suppose that, in 2014, the company projected sales for 2016 to be $1.2 million, and in 2015, the company projected sales for 2016 to be $1.1 million.
But it turns out that the 2014 (older) projections are the one against which the 2016 sales must be measured. In other words, we need to explicitly specify the relationship between actual sales and their related projected sales estimate. So "drill-across" won't work, because we aren't sure which projections are the right ones to be comparing to.
Essentially, it seems like a fact needs to be explicitly related to another fact, which (according to the literature) is verboten?
So which of these implementations is the canonically best?
Create fact_sales and fact_sales_projection (at the same grain). Include sales_projection_key in fact_sales, essentially linking a fact to a fact (which is supposedly not a good idea).
Create fact_sales and dim_sales_projection (at the same grain), so call the projections dimensions, even though they contain the same numbers as the facts they support. Again include sales_project_key in fact_sales, but it's OK now, because semantically, it's a dimension.
Just create fact_sales with a dimension called SalesType which is either "Actual" or "Projected". Include both the actual and projected sales in the same table, with a self-joining key from "actual" sales records to their correct "projected" sales record.
Just create fact_sales, but add additional fact columns which contain the projections as well as actual sales. This will result in considerable duplication of projection data but ensures that actual data are kept side-by-side with the projection that "matters".
Of these, I'm pretty sure that #3 is not the best solution. I've waffled a lot as to whether #1, #2, or #4 is best, although (to me) #1 seems like a better idea than #2.
Any input is appreciated. Also, I'm not entirely clear if this type of questions is better for the "Stack Overflow" forum or the "Database Administrators" forum?