0

I'm working on the design of a new periodic snapshot fact table. I'm looking into health insurance claims and the amount of money people owe to the insurance company and the amount they've already paid. Data in the table will look like this.

CLAIM_ID    TIME_KEY   AMOUNT_OWED     PAID
123        31.1.2000          1000     0
123        28.2.2000           900     100
123        31.3.2000           800     200
123        30.4.2000             0     1000
123        31.5.2000             0     1000
123        30.6.2000             0     1000
123        31.7.2000             0     1000
123        31.8.2000             0     1000
...

As you can see after 30.4.2000 it doesn't make sense to insert new data for claim_id 123 as it no longer changes (there is a reasonable degree of certainty this won't happen). Is it a good idea to stop inserting data for this claim or should I do so till the end of time :)?

I'm mainly concerned about sticking to best practices when designing Data Warehouse tables.

Thanks for any answer!

Marmite Bomber
  • 19,886
  • 4
  • 26
  • 53
xcelm
  • 541
  • 1
  • 6
  • 19

2 Answers2

0

just a few thoughts...

  1. Unless you can have multiple payments in a day against a claim (and potentially other transactions e.g. interest that increases the amount owed), then what you have shown is not really a snapshot fact, it is a transactional fact. The normal example given is a bank account where you have multiple in/out transactions per day and then a snapshot of the end-of-day (or end-of-month) position. Obviously I don't know your business model but it seems unlikely that there would be multiple transactions per day against a single claim
  2. If there have been no changes to a claim since the last fact record was created there seems little point creating a new fact record
NickW
  • 8,430
  • 2
  • 6
  • 19
  • I see your point - I forgot to say I use dd.mm.yyyy format :). So these are in fact monthly snaphshots. I edited my post so it is more clear – xcelm Jan 04 '21 at 14:28
  • OK, that makes more sense. I assume that the amt paid in April should be 800 not 1000? Your data also looks wrong for subsequent months as they are not paying 1000 every month. If the amount owed and the amount paid are both zero for a particular month then I see no point in creating a monthly snapshot record for that claim – NickW Jan 04 '21 at 15:02
0

Typically you choose a periodic snapshot if you have

a) a large number of transactions and

b) you need an effective access to the data at some point of time (end of the month in your case)

If you have say 50 claim transactions per month and the claim is active one year on average, you will profit from this design even if you will hold the inactive claims for 50 years (which you will probably will not do;)

Your doubts suggest that you have not so much transactions per claim life cycle. It that case you should consider a fact table storing each transaction.

You will have definitively no overhead for inactive claims, but to get a snapshot information at a specific time you'll have to read the whole table.

On the contrary the periodic snapshot is typically partitioned on the snapshot time, so the access is very affective. get no free lunch with saving space and an effective access.

Marmite Bomber
  • 19,886
  • 4
  • 26
  • 53