I have the following entities in my data mart:
- Location: Contains information like Country, City, etc.
- Users: Contains information about a user this includes their address, which includes Country and City
Both these dimensions contain attributes regarding country and city. Storing the attributes only in the location dimension would make more sense, so we don't duplicate data and we can retrieve the user's location information by joining the tables surrogate keys together through the fact table. However, doing it this way would increase the overhead because we would need to perform multiple joins.
I’m inexperienced with dimensional modelling and data warehousing, so I don’t know which one is better practice:
- Let both of the dimensions contain the same attributes
- Put the attributes in the table that it belongs to (in this case the Location dimension)