0

We have multiple dimensional and fact tables in Exadata. The requirement is to perform CDC on certain attributes from these 5 different dimensional and fact tables. So basically, we will have to collate all the required attributes from 5 different tables and perform CDC on daily basis. My question here is -

If I have to create one unique history table with all the required attributes from the 5 different tables(dimensional & fact), then creating the daily incremental table from the same attributes will be a pain, which would take whole lot of time. Is there a better way to achieve this instead of creating the history and incremental load separately from the existing tables? And is it a good idea to create a unique table combining both dimensional and fact tables?

SecretAgentMan
  • 2,856
  • 7
  • 21
  • 41
  • Fact tables are typically *insert only* so the CDC is trivial. The often used way is to introduce an *auditing dimension* defining all rows inserted in a specific ETL load. The dimensions are *per definition* independent, so you may process them independently. The best way is to define a SCD type 2 table for each dimension and run them daily. – Marmite Bomber Feb 16 '20 at 20:15
  • SCD2 in Hive is a pain, and the question tagged Hive. No non-equi joins. Avoid SCD2 in Hive – leftjoin Feb 17 '20 at 04:54

0 Answers0