2

Is it possible to create an Slowly Changing Dimension mechanism using Delta Live Tables? I would like to implement something like this https://docs.databricks.com/_static/notebooks/merge-in-scd-type-2.html

But in the DLT docs i found "Processing updates from source tables, for example, merges and deletes, is not supported. To process updates, see the APPLY CHANGES INTO command" - but https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-cdc.html it's only for CDC, do you have any idea for SCD?

repcak
  • 113
  • 8
  • your first link is broken – Oliver Angelil Jul 06 '23 at 16:19
  • 1
    @OliverAngelil They removed this notebook. It was very similar to this example: https://towardsdatascience.com/handling-slowly-changing-dimensions-scd-using-delta-tables-511122022e45 Here u can find docs about merge in delta lake docs: https://docs.delta.io/latest/delta-update.html#id2 And in Databricks docs: https://docs.databricks.com/delta/merge.html – repcak Jul 07 '23 at 07:03

3 Answers3

3

As you noticed right now DLT supports only SCD Type 1 (CDC). Support for SCD Type 2 is currently in the private preview, and should be available in near future - refer to the Databricks Q2 public roadmap for more details on it. If you have solutions architect or customer success engineer in your account, ask them to include you into private preview.

Update, September 2022: SCD type 2 is generally available since June 2022: https://www.databricks.com/blog/2022/06/29/delta-live-tables-announces-new-capabilities-and-performance-optimizations.html

Alex Ott
  • 80,552
  • 8
  • 87
  • 132
  • Are you familiar with the workaround example? Can it be done by using CDC or something? or should i create it using this method https://docs.databricks.com/_static/notebooks/merge-in-scd-type-2.html – repcak Jun 02 '22 at 09:29
  • I know that implementation but i haven't used with DLT – Alex Ott Jun 02 '22 at 10:46
2

DLT is now supporting SCD Type 2 in public preview. You can learn about it and see examples in this document

Mojgan Mazouchi
  • 355
  • 1
  • 6
  • 15
1

Merge is supported in DLT , please see the sample code below

mergeDF1
  .as("merge_tbl1")
  .merge(mergeDF2.as("merge_tbl2"), "merge_tbl1.key = merge_tbl2.key")
  .whenMatched("merge_tbl1.isCurrent = true AND (merge_tbl2.country <> merge_tbl1.country OR merge_tbl2.region <> merge_tbl1.region)")
  .updateExpr(Map(
    "isCurrent" -> "false",
    "endDate" -> "<date>",
  .whenNotMatched()
  .insertExpr(Map(
    "personId" -> "<personId>",
    "personName" -> "<personName>",
    "country" -> "<country>",
    "region" -> "<region>,
    "isCurrent" -> "true",
    "effectiveDate" -> "<date>",
    "endDate" -> "null"))
  .execute()
Sandy
  • 223
  • 1
  • 2
  • 8