0

I am working on building a simple data vault model to upload it into snowflake. I am having a some sort of problem in all satellites. I know that adding auto incement surrogate keys into a sat is neither wrong nor practical, which means we can add it, but it will not represent anything to the whole design. Each SAT contains a hash key coming from the related HUB and represented as FK.

  1. I think selecting LOAD_DATE as PK will result in duplicated PK values.
  2. And it is not logical to add a FK as PK in the same table where there is a one-to-many relationship between the HUB and its SATs.

Can we skip the part of adding a primary key to a SAT in DV2.0 ?

alim1990
  • 4,656
  • 12
  • 67
  • 130

1 Answers1

4

Not sure where you are getting your DV modeling information but, adding an auto incremented surrogate key to a Sat is not in keeping with the DV standards at all. Never was.

PK of a Sat is ALWAYS the PK from the Hub (the Hub Hash key) plus LOAD_DTS. Hub to Sat relationship is ALWAYS modeled as 1:M.

So no - you cannot and should not skip having a PK definition for your Sat.

If you do not believe me - please read any of my blogs on data vault or the published data vault books from Amazon. You might also look at DataVaultAlliance.com

  • I believe you sir. but for the DDLs, should I set the LOAD_DATE as primary key within the DDL scripts? – alim1990 Jan 07 '21 at 07:04
  • And keep the hash as the FK coming from the hub ? I mean as in the DDLs – alim1990 Jan 07 '21 at 07:15
  • 1
    You need a compound PK with the Hash Key + LOAD_DATE plus keep the FK on the Hash Key to the Hub. See the diagram in the Hash Key section of this post https://www.snowflake.com/blog/tips-for-optimizing-the-data-vault-architecture-on-snowflake/ – Kent Graziano Jan 07 '21 at 15:06