I am starting to build a star schema, and I like it ^^
I have a design problem with dimensional modeling.
I have a Fact table for each transaction in the star schema (highest grain) Something like that (simplified version)
transaction_facts
- id
- account_dim
- date_dim
- status_dim
- amount
status_dim
- id
- code
- description
- final
For a transaction, the status is not clearly defined at process time. Most all of the status fall into these cases:
- the transaction is ok
- the transaction is ko
- the transaction is ok, but to be confirmed.
The last status is the problematic one, since I can receive the confirmation of the transaction few days (up to 10 and sometime, even more) after the original transaction.
How should I handle this kind of late change? Intuitively, I would be tempted to just reaffect the existing transactions to the new dimension, but it make me think of 2 things:
- Is it a good practice? (do not rewrite history etc...)
- How to handle this kind of change in BigQuery or Redshift or any append only system ? On a very high amount of rows, it will be a problem since these systems don't work well with updates