0

I have the following dataset ready to load to a data warehouse:

CUSTOMER_KEY | PRODUCT_KEY | PRODUCT_DESCRIPTION | DATE | VOLUME | AMOUNT

I am confused with the Date dimension. The date format in the dataset is 'YYYY-MM-DD'. I want to create a dimension for date with year, month and day instead of having just the date in the FACT_SALES table.

For example, would this star schema work?

  • FACT_SALES: CUSTOMER_KEY (PK), PRODUCT_KEY (FK), DATE (FK), VOLUME, AMOUNT

  • DIM_PRODUCT: PRODUCT_KEY (PK), PRODUCT_DESCRIPTION

  • DIM_DATE: DATE (PK), YEAR, MONTH, DAY

When I load my dataset to the data warehouse (where the Date column is in 'YYYY-MM-DD' format), will the YEAR, MONTH and DAY be automatically populated based on the star-schema design I created?

Or do I need to create YEAR, MONTH and DAY columns in the dataset before loading it to the data warehouse?

2 Answers2

1

You would normally create and populate your date dimension as a one-off activity when you first set up your DWH

NickW
  • 8,430
  • 2
  • 6
  • 19
-1

When you are first creating your data warehouse you generate the date dimesion as a table. Here's an example of the SQL (using dbt) I used for Snowflake to generate the data.

WITH CTE_MY_DATE AS (
  SELECT DATEADD(DAY, SEQ4(), '2000-01-01') AS MY_DATE
    FROM TABLE(GENERATOR(ROWCOUNT=>10000))  -- Number of days after reference date in previous line
)
SELECT MY_DATE::DATE                AS date,
       YEAR(MY_DATE)::NUMBER        AS year,
       MONTH(MY_DATE)::NUMBER       AS month,
       MONTHNAME(MY_DATE)::VARCHAR  AS month_name,
       DAY(MY_DATE)::NUMBER         AS day_of_month,
       DAYOFWEEK(MY_DATE)::VARCHAR  AS day_of_week,
       WEEKOFYEAR(MY_DATE)::NUMBER  AS week_of_year,
       DAYOFYEAR(MY_DATE)::NUMBER   AS day_of_year
  FROM CTE_MY_DATE

If you search data warehouse date spline you'll probably find some examples for whichever db you're using.

Jonathan Porter
  • 1,365
  • 7
  • 34
  • 62