There is a debate between our ETL team and a Data Modeler over whether a table should be normalized, and I was hoping to get some perspective from the online community.

Currently the tables are set up as follows:
```
MainTable            LookupTable
---------            -----------
PrimaryKey (PK)      Code (PK)
Code (FK)            Name
OtherColumns
```
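For concreteness, here is a minimal sketch of that schema as DDL (SQLite via Python; the column types are assumptions, and OtherColumns stands in for whatever other attributes the table carries):

```python
import sqlite3

# Sketch of the current normalized schema. Column types are assumed,
# and OtherColumns is a placeholder for the remaining attributes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LookupTable (
    Code TEXT PRIMARY KEY,
    Name TEXT
);
CREATE TABLE MainTable (
    PrimaryKey   TEXT PRIMARY KEY,
    Code         TEXT REFERENCES LookupTable (Code),
    OtherColumns TEXT
);
""")
```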
- Both tables are populated only by a periodic file (from a 3rd party) through an ETL job
- A single record in the file contains all attributes in both tables for a single row
- The file populating these tables is a delta (only rows with some change in them appear in the file)
- One change to one attribute for one record (again, made only by the 3rd party) results in the full record, with all of its attributes, appearing in the file
- The domain values for Code and Name are not known in advance, so LookupTable cannot be pre-populated
Question: Should the LookupTable be denormalized into MainTable?
- ETL team: Yes. With the normalized setup, every row from the file first has to be checked against LookupTable to see whether its Code is already there (and inserted if it is not), and only then can the MainTable row be added: more code, worse performance, and, admittedly, the denormalized version takes slightly more space. But normalization buys us nothing here: if the 3rd party changes a LookupTable.Name, the periodic file will contain every affected row anyway, so we still have to process each of those rows. If everything is lumped into MainTable, each record is a simple insert or update (see the sketch after this list).
- Data Modeler: This is standard good database design.
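To make the ETL team's comparison concrete, here is a hedged sketch of the two load paths for one delta-file record, using SQLite's ON CONFLICT upsert syntax as a stand-in for whatever engine is actually in use. MainTableWide, the field values, and the types are all hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LookupTable (Code TEXT PRIMARY KEY, Name TEXT);
CREATE TABLE MainTable (
    PrimaryKey TEXT PRIMARY KEY,
    Code TEXT REFERENCES LookupTable (Code),
    OtherColumns TEXT
);
-- Hypothetical denormalized alternative: Name folded into one wide table.
CREATE TABLE MainTableWide (
    PrimaryKey TEXT PRIMARY KEY,
    Code TEXT, Name TEXT, OtherColumns TEXT
);
""")

# One delta-file record; per the post, every attribute is always present.
rec = {"PrimaryKey": "42", "Code": "A1", "Name": "Alpha", "OtherColumns": "..."}

# Normalized load: two statements per record. First make sure the lookup
# row exists (refreshing Name in case the 3rd party changed it) ...
conn.execute(
    "INSERT INTO LookupTable (Code, Name) VALUES (:Code, :Name) "
    "ON CONFLICT (Code) DO UPDATE SET Name = excluded.Name", rec)
# ... then upsert the main row.
conn.execute(
    "INSERT INTO MainTable (PrimaryKey, Code, OtherColumns) "
    "VALUES (:PrimaryKey, :Code, :OtherColumns) "
    "ON CONFLICT (PrimaryKey) DO UPDATE SET "
    "Code = excluded.Code, OtherColumns = excluded.OtherColumns", rec)

# Denormalized load: a single upsert per record.
conn.execute(
    "INSERT INTO MainTableWide (PrimaryKey, Code, Name, OtherColumns) "
    "VALUES (:PrimaryKey, :Code, :Name, :OtherColumns) "
    "ON CONFLICT (PrimaryKey) DO UPDATE SET Code = excluded.Code, "
    "Name = excluded.Name, OtherColumns = excluded.OtherColumns", rec)
```

The normalized path also imposes an ordering constraint (lookup row before main row), while the denormalized path is one statement but repeats Name on every row.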
Any thoughts?