  1. In a data warehouse project, why do we need to use Data Vault modelling before transferring the data from the landing/staging area databases into the data marts (which use the Kimball methodology)? That is, why can't we use the Kimball methodology to move the data straight from the landing/staging area databases into the final data marts?

  2. Is it a must to do this?

Explorer

2 Answers


Well, you don't need to use the Data Vault methodology. You don't need to use Kimball or 3NF either.

This all depends on the requirements you have in your environment, such as:

  • the data structure,
  • data complexity,
  • sizing,
  • schedules,
  • changing source formats,
  • need for historization,
  • requirements for reports, dashboards or other ETL structures...

There is no 'need' to do Data Vault specifically.

It all depends on what you want to do and what your requirements are.

tobi6
  • "All depends" is key here. If you need to truly track where the data comes from, so that you can audit both good and bad data before it hits the data mart, then a data vault will come in handy. For example, if you're storing medical records, a data vault is handy for tracking which source systems the data came from, when it was first seen, what it actually is, and all the other relevant information that is critical before it gets to the mart. The same goes for data that did not make it to the mart (i.e., bad data). – Fastidious Sep 04 '17 at 02:56
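
To make the auditing point in the comment above concrete, here is a minimal in-memory sketch of the idea: a hub that records when and from which source system a business key was first seen, plus an insert-only satellite that keeps every delivered version of the attributes. The names `PatientVault`, `HubRow`, `SatelliteRow`, and `hash_key`, and the use of SHA-256 for the key, are assumptions made for illustration; this is not a full Data Vault 2.0 implementation, and it assumes loads arrive in chronological order.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def hash_key(business_key: str) -> str:
    """Deterministic key derived from the normalized business key (assumed convention)."""
    return hashlib.sha256(business_key.strip().upper().encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class HubRow:
    hub_key: str
    business_key: str
    load_date: datetime   # when the key was first seen
    record_source: str    # which source system delivered it first


@dataclass(frozen=True)
class SatelliteRow:
    hub_key: str
    load_date: datetime
    record_source: str
    attributes: tuple     # attribute values stored as delivered, good or bad


class PatientVault:
    """Hypothetical in-memory stand-in for a hub table and one satellite table."""

    def __init__(self):
        self.hub = {}        # hub_key -> HubRow
        self.satellite = []  # insert-only history of SatelliteRow

    def load(self, business_key, attributes, record_source, load_date):
        key = hash_key(business_key)
        # Hub rows are written once, so the first-seen metadata is never overwritten.
        if key not in self.hub:
            self.hub[key] = HubRow(key, business_key, load_date, record_source)
        # Satellite rows are appended only when the attributes differ from the latest version.
        attrs = tuple(sorted(attributes.items()))
        latest = next((r for r in reversed(self.satellite) if r.hub_key == key), None)
        if latest is None or latest.attributes != attrs:
            self.satellite.append(SatelliteRow(key, load_date, record_source, attrs))
        return key


vault = PatientVault()
first_seen = datetime(2017, 9, 1, tzinfo=timezone.utc)
later = datetime(2017, 9, 2, tzinfo=timezone.utc)
patient_key = vault.load("P-001", {"blood_type": "A"}, "clinic_system", first_seen)
vault.load("P-001", {"blood_type": "unknown"}, "lab_system", later)
```

Nothing is updated or deleted, so the questionable `"unknown"` value from the second source stays in the history with its origin and load date, even if a later rule keeps it out of the mart.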

I've noticed that the Data Vault 2.0 methodology has lately become quite prevalent in EDW projects. Kimball/star-schema data models are still very much used, but mainly as the top-level abstraction on top of the Data Vault that allows for reporting.
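
As a rough illustration of that layering, the sketch below derives a current-state, Kimball-style dimension from a hub and a satellite by picking the latest satellite row per hub key. The table contents, the column names, and the `build_dim_customer` function are hypothetical; a real project would typically do this with SQL views or an ETL tool.

```python
from datetime import date

# Hypothetical raw vault rows: one hub and one history-keeping satellite.
hub_customer = [
    {"hub_key": "h1", "customer_id": "C-1"},
    {"hub_key": "h2", "customer_id": "C-2"},
]
sat_customer = [
    {"hub_key": "h1", "load_date": date(2017, 1, 1), "city": "Berlin"},
    {"hub_key": "h1", "load_date": date(2017, 6, 1), "city": "Munich"},
    {"hub_key": "h2", "load_date": date(2017, 3, 1), "city": "Helsinki"},
]


def build_dim_customer(hub_rows, sat_rows):
    """Build a current-state dimension: one row per hub key with its newest attributes."""
    latest = {}
    for row in sat_rows:
        current = latest.get(row["hub_key"])
        if current is None or row["load_date"] > current["load_date"]:
            latest[row["hub_key"]] = row

    dimension = []
    for hub in hub_rows:
        sat = latest.get(hub["hub_key"])
        if sat is None:
            continue  # no descriptive data delivered yet for this key
        dimension.append({"customer_id": hub["customer_id"], "city": sat["city"]})
    return dimension


dim_customer = build_dim_customer(hub_customer, sat_customer)
```

The full history stays in the vault, and the dimension can be rebuilt at any time with different rules without reloading from the sources.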

Kent Graziano has a great primer to understand DV modelling here, which is an excerpt from Dan Linstedt's book (also a great read).

Karri