I'm new to ETL process. At this moment I'm working with kettle more precisely with geokettle to make a ETL process. I'm building a transformation to insert data in a normalized structure on my database target from a descriptive table of a database source.
Please take into consideration the following example.
Example:
I'm trying to insert in cascade on my database a contact record. Notice that in the source database I have the following contact table:
I would like to transform into the following structure:
So the main idea is:
step one: Insert the phone in CI_Telephone
table and get t_id
.
step two: In parallel with the previous step insert institutionAddress
and country
on CI_Address
and get a_id
third step: after that, i need to get those ids (t_id
and a_id
) to be inserted on CI_Contact
(if not exist yet) and get the c_id
fourth step: Insert (if not already exists) on CI_ResponsibleParty
the instituionName
=> organizationName
and name
=> individualname
, and insert the correct foreign key (contactInfo) that references CI_contact
table.
This was my idea about the ETL process:
Question: How can you do a "cascade" insertion using geokettle or kettle? ( i dont know if this is the correct term) Does this transformation make sense to you?
I couldn't find any similar example to this case. Any suggestion about this issue would be great. Please, share your experience about ETL process and how to normalize a database from another that isn't normalized.