0

I am trying to implement SCD Type 2 in Data Fusion. Can someone help me perform the inserts and updates using pipeline transformations, actions, or conditions? I tried generating a hash with Wrangler for both the source and the target and then joining them. I am stuck here: I am not sure how to determine which rows should be updated and which inserted, or how to handle each case. My target is BigQuery tables.

Manasa

1 Answer

0

Welcome to Stack Overflow, @Manasa!

Since you are using BigQuery as the warehouse for the target sink and want SCD Type 2 merging, I would first work out how to express the desired logic as a SQL statement. To that end, I encourage you to study this great article by @Hariharan Ramachandran, which explains SCD techniques with some hands-on examples that use BigQuery as the destination.
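To make the hash comparison from the question concrete, here is a minimal Python sketch of how joined rows could be split into inserts and updates. It is only an illustration of the logic, not Data Fusion or Wrangler code; the function names, the `key` and `tracked_columns` parameters, and the use of MD5 are all assumptions of mine.

```python
import hashlib


def row_hash(row, tracked_columns):
    """Hash the tracked attribute values of one row, in a fixed column order."""
    joined = "|".join(str(row[col]) for col in tracked_columns)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def classify(source_rows, current_target_rows, key, tracked_columns):
    """Split source rows into (inserts, updates, unchanged).

    current_target_rows is assumed to hold only the current version
    of each business key (hypothetical layout, one row per key).
    """
    target_hashes = {
        row[key]: row_hash(row, tracked_columns) for row in current_target_rows
    }
    inserts, updates, unchanged = [], [], []
    for row in source_rows:
        business_key = row[key]
        if business_key not in target_hashes:
            # Key not present in the target: brand-new record.
            inserts.append(row)
        elif target_hashes[business_key] != row_hash(row, tracked_columns):
            # Key present but tracked attributes differ: changed record.
            updates.append(row)
        else:
            # Same key and same hash: nothing to do.
            unchanged.append(row)
    return inserts, updates, unchanged
```

Note that with SCD Type 2 an "update" is really two steps: the current target row is closed (for example by setting an end date or a current flag), and the changed source row is inserted as the new current version.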

According to the Data Fusion documentation, there is a rich set of plugins that extend the standard data processing workflows. In this particular case, I would recommend the BigQuery Execute plugin, which you can add to your pipeline to run SQL queries against BigQuery and perform the SCD2-specific actions.
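As a rough idea of the kind of statement such a SQL step could run, the sketch below uses a small Python helper to render one common SCD2 `MERGE` pattern. Everything specific here is an assumption for illustration: the helper name `build_scd2_merge`, a staging table that already carries a `row_hash` column, and the target columns `row_hash`, `valid_from`, `valid_to`, and `is_current`. Adapt it to your own schema and test it on a copy of the table first.

```python
def build_scd2_merge(target, staging, key, tracked_columns):
    """Render a BigQuery MERGE statement for one common SCD Type 2 pattern.

    All table and column names are hypothetical placeholders.
    """
    cols = [key] + list(tracked_columns) + ["row_hash"]
    insert_cols = ", ".join(cols + ["valid_from", "valid_to", "is_current"])
    insert_vals = ", ".join(f"S.{c}" for c in cols) + ", CURRENT_TIMESTAMP(), NULL, TRUE"
    return f"""
MERGE `{target}` T
USING (
  -- Every staged row, joined on its business key.
  SELECT s.{key} AS join_key, s.* FROM `{staging}` s
  UNION ALL
  -- Changed rows a second time with a NULL key, so that they fall into
  -- the NOT MATCHED branch and get inserted as the new current version.
  SELECT NULL AS join_key, s.*
  FROM `{staging}` s
  JOIN `{target}` t ON s.{key} = t.{key}
  WHERE t.is_current AND s.row_hash != t.row_hash
) S
ON T.{key} = S.join_key AND T.is_current
WHEN MATCHED AND T.row_hash != S.row_hash THEN
  -- Close the old version of a changed row.
  UPDATE SET is_current = FALSE, valid_to = CURRENT_TIMESTAMP()
WHEN NOT MATCHED THEN
  -- New keys, plus the new version of changed rows.
  INSERT ({insert_cols})
  VALUES ({insert_vals})
"""


if __name__ == "__main__":
    print(build_scd2_merge(
        "my_dataset.customer_dim",    # hypothetical target table
        "my_dataset.customer_stage",  # hypothetical staging table
        "customer_id",
        ["name", "city"],
    ))
```

The rendered text could then be pasted into the SQL field of the plugin. Rows whose hash is unchanged match on the key but fail the `T.row_hash != S.row_hash` condition, so they are left alone.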


Nick_Kh