I have the scripts to pull data from RDBMS and populate the Data warehouse. It works. I was wondering how Temp table comes in between. What are the steps to ETL exactly? Even though my data warehouse is populated, my teacher says that we need to use a temp table. why is it important?
Please help me. I am very confused right now. Thank you.
We need to pull the data from the databases of two different offices in two different locations. I will give the details of the table below.
lds_job_role
job_role_id:integer
job_role_desc:varchar
key_skill_1:INEGER
key_skill_2:INEGER
key_skill_3:INEGER
recommended_sal:INTEGERlds_account
account_id:INTEGER
acc_name:VARCHAR
acc_postcode:VARCHARlds_placement
placement_id:INTEGER
plt_short_desc: VARCHAR
plt_required_start_date:DATE
plt_estimated_end_date:DATE
plt_actual_start_date:DATE
plt_renewal_no:INTEGER
plt_to_permanent:VARCHAR
max_salary: INTEGER
min_salary:INTEGER
actual_salary:INTEGERmch_job_role
job_role_id:INTEGER
job_role_desc:VARCHAR
recommended_sal:INTEGERmch_account
account_id:INTEGER
acc_name:VARCHAR
acc_postcode:VARCHARmch_placement
placement_id:INTEGER
plt_short_desc:INTEGER
plt_required_start_date:DATE
plt_estimated_end_date:DATE
plt_actual_start_date:DATE
plt_actual_end_date:DATE
plt_renewal_no:INTEGER
plt_to_permanent:VARCHAR
max_salary:INTEGER
min_salary:INTEGER
actual_salary:INTEGER
supervisor_name:VARCHARBelow are the facts and dimensions of the Star schema:
job_roe_dim
job_role_id
job_role_desctime_dim
time_id
yearaccount_dim
account_id
account_namefact_accounts
Report_id
no_of_placements
salary
FK1_time_id
FK2_account_id
FK3_job_role_id
The exercise tells us to "deal with data quality issues, measures for FACTs, identifiers etc ". I THINK I have already done that with the scripts, but I do not know how to show it. Perhaps that is why the the temp table is required?