0

I have two tables in a data warehouse: balances and dates. Balances has the following structure:

Surrogate Key Date | Date       | Account | Balance
1                  | 2017-02-01 | 100     | 1234
1                  | 2017-02-01 | 200     | 5151
2                  | 2017-02-02 | 100     | 5123
2                  | 2017-02-02 | 200     | 8234

And dates has the following structure:

Surrogate Key Date | Date       | Weekday | Week in Year | ... other columns 
1                  | 2017-02-01 | Wed     | 5            |
2                  | 2017-02-02 | Thu     | 5            |

The Surrogate Key Date column is type INT, and the Date column is type DATE in both tables.

The surrogate key in the balances table is used in OLAP queries, and the date is used for regular reports.

Now, I need to develop a program that uses the database intensively (it's a batch process) and it needs to access repeatedly the balance table through the date column. Should I use in this process the Surrogate Key Date column or the Date column? I need to filter by date. Is the INT access in the where clause more efficient than the DATE access? Should I ignore the Surrogate Key Date column when not using OLAP?

ps0604
  • 1,227
  • 23
  • 133
  • 330

1 Answers1

1

From a performance point of view, it likely doesn't matter at all on SAP HANA. Generally speaking, HANA doesn't work on the actual data types when calculating joins and result sets but with internal reference IDs.

Actually, I don't see a reason at all for having a surrogate key in the first place when it maps 1:1 to the Date column. Also, as it seems to allow duplicates it's not a key at all. Not sure, how you are supposed to address specific records from the same date in this model.

Lars Br.
  • 9,949
  • 2
  • 15
  • 29
  • The key of the `balances` table is `Surrogate Key Date + Account number`, it doesn't allow duplicates. I need the `Surrogate Key Date` to run OLAP, and the `Date` column to run reports (as the `Surrogate Key Date` doesn't make any sense to the end user that builds the reports) – ps0604 May 08 '17 at 11:44
  • I see - well, in that case, still, there is no benefit in using the surrogate key in the first place. Create a compound primary key based on `DATE` and `Account number` and use that. HANA will internally create a hidden column for this concatenation and all processing of the compound key will be done on that single column. The main disadvantage obviously is the additional space requirements, but that you get with the surrogate key as well. – Lars Br. May 09 '17 at 02:05