0

Our sales representatives call Leads to propose services. Leads are stored in Crm with attributes: LeadId, PrimaryContactNumber, SecondaryContactNumber. All calls are done via Teams and recorded. Calls could be extracted via Microsoft Graph Api and have attributes: CallId, UserId, CalleeNumber, CallerNumber, Duration. Users have these attributes: UserId, Username, Email. How can I generate LINK (relation between User, Lead) using data vault modeling when foreign keys are not known? I thought my design should look like this: User(hub)-Call(link)-Lead(hub), but my call has only userId, leadId could only be inferred from one of the Lead attributes(PrimaryContactNumber or SecondaryContactNumber) What is the best solution to this problem? Or should I model Call as a hub aswell and perform filter when loading data to datamart?

DmitriBodiu
  • 1,120
  • 1
  • 11
  • 22

1 Answers1

0

A call is not a business entities, it's rather a relation between a sales person and a lead. So, your first thought is right.

To answers your question, you might want take a look at the zero/ghost records concept. http://roelantvos.com/blog/unknown-keys-zero-keys-or-ghost-keys-in-hubs-for-dv2-0/

Basically, add a "Unknown" entry in your Lead(Hub). Then, when you generate the link and the Lead is not known, simply link it to the unknown entities. Make sure you add a satellite to you link, so you can track the period it was unknown, and when it is known.

MLeblanc
  • 1,816
  • 12
  • 21
  • "Then, when you generate the link and the Lead is not known" - how do I know if my Lead is known or not? Only by comparing CalleeNumber from Calls with PrimaryContactNumber or SecondaryContactNumber on Lead, right? That would mean I must load Lead before the CallLink, and adds coupling. I would like to load them in parrallel. Could Exploration Link concept help me here? – DmitriBodiu Oct 25 '21 at 14:33
  • Depending on your architecture, this can be done in parallel. You already have the information whether the lead is known or not in the staging area. If the PrimaryContactNumber or SecondaryContactNumber match with the CalleeNumber, hash the associate business key, otherwise, hash "Unknown", or whatever the BK you set. The Exploration link won't help you here, basically, you will fall into the same problem, how can i generate the link when the lead is not known. – MLeblanc Oct 25 '21 at 16:43
  • Sounds nice. Could you please elaborate how should my flow be? Imagine I have an etl job which load leads inrementaly every day, and another one which loads Calls every 5mins. When there are no leads all calls will be links to unknown leads. When I load leads, what should I do to change unknown links to known ones? Could you explain this step please. – DmitriBodiu Oct 25 '21 at 17:52
  • 1
    Zero keys and ghost records are not the same thing, Roelant needs to update that article. https://patrickcuba.medium.com/data-vault-mysteries-zero-keys-ghost-records-3be7cb05ee94 Ghost records need to pre-populate satellites with a single record, zero keys do not --- they naturally occur if you are staging them first. – patrick_at_snowflake Nov 14 '21 at 22:57