0

I have a Fact table that shows the results of KPIs. There are several KPIs, and some of these have a similar output.

My current columns are something like this:

KPI_ID, DOCUMENT_ID, TRUE_FALSE_FLAG1, TRUE_FALSE_FLAG2, DURATION_3, DURATION_4

So, for KPI number 1 (true false output), the last three columns will be NULL- values. Should I combine TRUE_FALSE_FLAG1 and TRUE_FALSE_FLAG2? What is BEST PRACTICE?

In total, there are 18 columns, where 12 of them are either true/false- flags or durations in the shape of "number of days" (integer).

picture of the two alternatives

EDIT: KPI 3 could be "duration of problem", and you'd have a bunch of problems, each with a documentID, represented as a row. Dur_3 would be like 5 days, 3 days, 10 days, etc. KPI 4 would be "Delay of fix after repair was ordered", and the answer would still be an integer in days. But completely non- related to KPI 3.

Reporting could be "average delay of fix". So roughly a select AVG() from table where KPI_ID = 3 group by KPI_ID.

  • With no domain knowledge about what the columns mean, the second alternative looks better to me. – Gordon Linoff Jan 23 '17 at 12:17
  • Gordon Linoff: KPI 3 could be "duration of problem", and you'd have a bunch of problems, each with a documentID, represented as a row. Dur_3 would be like 5 days, 3 days, 10 days, etc. KPI 4 would be "Delay of fix after repair was ordered", and the answer would still be an integer in days. But completely non- related to KPI 3. Reporting could be "average delay of fix". So roughly a select AVG() from table where KPI_ID = 3 group by KPI_ID. – fredskogsBI Jan 23 '17 at 12:23
  • As long as every KPI is only True/False, or has only one duration to report, you are better with Alternative 2. If there is a chance that any KPI would need to store multiple values, then you want to consider Alternative 1 or something between Alternative 1 and Alternative 2. You say "there are 18 columns, where 12 of them are either true/false- flags or durations in the shape of "number of days" (integer)." What are the other 6 columns. That seems that it would have a bearing on this. – Josh F Jan 23 '17 at 13:09
  • @fredskogsBI . . . It sounds like the flags actually represent a different dimension, which is why I like the second solution. – Gordon Linoff Jan 23 '17 at 13:12
  • Josh F: Thank you for your reply. The other six columns are just metadata, such as INSERT_DATE and UPDATE_DATE, and a few other non- relevant columns. I've made the ETL such that the KPI would not need to store values in multiple columns. That is; a KPI will not be needing to store both a T/F and a duration. – fredskogsBI Jan 23 '17 at 13:16

1 Answers1

0

Based on your latest comment, you are best with Alternative 2. Specifically, as long as every KPI is only True/False, and has only one duration to store, you are better with Alternative 2.

EDIT: with Alternative 2, each KPI can store one True/False value AND one duration value

Josh F
  • 110
  • 1
  • 13