0

Here's the situation, in the source database we have more than 600K active rows for a dimension but in reality the business only uses 100 of them.

Unfortunately the list of values that they might use is not known and we can't manually filter on those values to populate the dimension table.

I was thinking, what if I include the dimension columns for that table in the fact table and then when we send that to staging area, just seperate it from the fact and send it to it's own table.

This way, I will only capture the values that are actually used.

P.S. They have a search function in the application that help users navigate through 600K values. it's not like a drop-down field !

Do you have a better recommendation?

Ariox66
  • 620
  • 2
  • 9
  • 29
  • A dimension table shouldn't need to be updated regularly. Can you describe why your dimension table has to be regularly updated? 600K rows are not too large. – Gilbert Le Blanc Oct 12 '20 at 23:53
  • Humm, you're right...it doesn't have to be updated regularly...once a week or fortnight would be enough...the problem is loading it into BI (Qlikview) . 600K is too big for a dimension table in Qlikview. would slow down everything – Ariox66 Oct 13 '20 at 00:02

1 Answers1

1

Yes - you could build the Dimension from the fact staging table. A couple of things to consider:

  1. If the only attribute for the Dimension is the field in the fact staging table then you can keep this as a degenerate dimension in the fact table; no need to build a dimension table for it - unless you have other requirements that require a standalone dimension table, such as your BI tool needs it.
  2. If there are other attributes you need to include in the dimension then you are still going to need to bring in the source dimension table - but you can filter it using the the values in the fact staging table and only load the used values into your dimension
NickW
  • 8,430
  • 2
  • 6
  • 19