For a cluster of 8 segments, design a calendar ( calendar_dt, month_nm, day_of_week_nm,holiday_flg ) and a table with deposits( deposit_id,cuxtomer_id,term, amount, open_dt(the field, which is fk for calendar) ) so as to maximize the performance of analytical Requests, placement on the study of the behavior of clients depending on the day of the week.
Asked
Active
Viewed 133 times
1 Answers
0
As a general rule of thumb, your distribution key should be the key used to join with other tables. So, if the open_dt column in the deposit table is the same as the calendar_dt column in the calendar table, the calendar table should be distributed by calendar_dt.
If you are primarily going to be grouping on the day_of_week column, you might make the calendar table columnar.
Hope this helps. Jim

user9142937
- 46
- 1
-
1The rule of thumb is to distribute the table by the natural key of the table so you get even distribution. So for a calendar dimension, you would distribute by the lowest grain column (calendar_dt) and for deposits, you would distribute by (deposit_id). – Jon Roberts Apr 05 '18 at 02:20