Late arriving fact - best way to deal with it

Question

I have a star schema that tracks roles in a company, e.g. which department the role is under, the employee assigned to the role, when they started, and when (or if) they finished up and left.

I have two time dimensions, StartedDate & EndDate. While a role is active, the end date is null in the source system. In the star schema I set any null end dates to 31/12/2099, which is a dimension member I added manually.

I'm working out the best way to update the EndDate when a role finishes or an employee leaves.

Right now I'm:

1. Populating the fact table as normal, doing lookups on all dimensions.
2. Doing a lookup against the fact table to find duplicates, excluding the EndDate from this lookup. Non-matched rows are new, so they are inserted into the fact table.
3. Sending matching rows into a conditional split that checks whether the current EndDate differs from the new EndDate. If it differs, the rows are inserted into an updateStaging table and a proc is run to update the fact table.
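The steps above boil down to a three-way split of the incoming rows. A minimal sketch of that logic in plain Python, outside SSIS, may make it easier to reason about; the key layout (`role_id`, `employee_id`, `start_date`) and the function name `classify_rows` are assumptions made for illustration, not anything from the actual package.

```python
from datetime import date

# Placeholder end date for roles that are still active (31/12/2099 in the question).
OPEN_END_DATE = date(2099, 12, 31)


def classify_rows(incoming, fact):
    """Split incoming rows into fact inserts and end-date updates.

    `fact` maps a business key (every lookup column except the end date)
    to the end date currently stored in the fact table.
    """
    to_insert, to_update = [], []
    for key, end_date in incoming:
        new_end = end_date or OPEN_END_DATE  # null in the source -> placeholder member
        if key not in fact:
            to_insert.append((key, new_end))  # no match: a new fact row
        elif fact[key] != new_end:
            to_update.append((key, new_end))  # match, but the end date changed
        # match with an identical end date: nothing to do
    return to_insert, to_update


# Hypothetical sample data; keys are (role_id, employee_id, start_date).
fact = {
    ("R1", "E1", date(2015, 1, 5)): OPEN_END_DATE,
    ("R2", "E2", date(2015, 3, 2)): OPEN_END_DATE,
}
incoming = [
    (("R1", "E1", date(2015, 1, 5)), date(2016, 2, 1)),  # role has since ended
    (("R2", "E2", date(2015, 3, 2)), None),              # still active, unchanged
    (("R3", "E3", date(2016, 1, 11)), None),             # brand-new role
]
inserts, updates = classify_rows(incoming, fact)
```

With this sample data, `inserts` holds only the R3 row and `updates` holds only the R1 row; the unchanged R2 row is dropped, which is the role the conditional split plays in the data flow.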

Is there a more efficient or tidier way to do this?

You say "two time dimensions, StartedDate & EndDate" ... don't you mean two time columns? – Marcus D, Feb 18 '16 at 16:08

score 0 · Answer 1 · answered Feb 17 '16 at 00:01

How about putting all that in a foreach container? It would iterate through and be much more efficient.

DrHouseofSQL

I don't understand. I can't see how a foreach container helps optimize the workflow, or how it fits in at all. Can I ask for a bit more detail, please? Thanks! – JD_Sudz Feb 17 '16 at 15:16

score 0 · Answer 2 · answered Feb 18 '16 at 16:29

I think it is a reasonable solution. Personally, I would use a stored procedure instead for processing efficiency, but given the dimensional nature of your DWH and its implied Type 2 behaviour, this is a valid way to do it.
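To illustrate what a set-based, stored-procedure-style load could look like, here is a rough sketch that uses Python's built-in `sqlite3` module as a stand-in for SQL Server. The table names `fact_role` and `stg_role`, their columns, and the function `load_fact` are all hypothetical, and plain date strings replace the surrogate date keys a real fact table would carry. In T-SQL the same idea would more likely be written with `UPDATE ... FROM` or `MERGE`.

```python
import sqlite3

OPEN_END = "2099-12-31"  # placeholder end date for active roles

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_role (role_id TEXT, employee_id TEXT, start_date TEXT, end_date TEXT,
                        PRIMARY KEY (role_id, employee_id, start_date));
CREATE TABLE stg_role  (role_id TEXT, employee_id TEXT, start_date TEXT, end_date TEXT);
-- hypothetical sample data
INSERT INTO fact_role VALUES ('R1', 'E1', '2015-01-05', '2099-12-31'),
                             ('R2', 'E2', '2015-03-02', '2099-12-31');
INSERT INTO stg_role  VALUES ('R1', 'E1', '2015-01-05', '2016-02-01'),
                             ('R2', 'E2', '2015-03-02', NULL),
                             ('R3', 'E3', '2016-01-11', NULL);
""")


def load_fact(conn):
    """Apply staged rows to the fact table in two set-based statements."""
    params = {"open_end": OPEN_END}
    with conn:  # one transaction: both statements commit or neither does
        # 1. Matching rows whose end date changed: update in place.
        conn.execute("""
            UPDATE fact_role
            SET end_date = (
                SELECT COALESCE(s.end_date, :open_end)
                FROM stg_role s
                WHERE s.role_id = fact_role.role_id
                  AND s.employee_id = fact_role.employee_id
                  AND s.start_date = fact_role.start_date)
            WHERE EXISTS (
                SELECT 1
                FROM stg_role s
                WHERE s.role_id = fact_role.role_id
                  AND s.employee_id = fact_role.employee_id
                  AND s.start_date = fact_role.start_date
                  AND COALESCE(s.end_date, :open_end) <> fact_role.end_date)
        """, params)
        # 2. Non-matching rows: insert as new facts.
        conn.execute("""
            INSERT INTO fact_role (role_id, employee_id, start_date, end_date)
            SELECT s.role_id, s.employee_id, s.start_date,
                   COALESCE(s.end_date, :open_end)
            FROM stg_role s
            WHERE NOT EXISTS (
                SELECT 1
                FROM fact_role f
                WHERE f.role_id = s.role_id
                  AND f.employee_id = s.employee_id
                  AND f.start_date = s.start_date)
        """, params)


load_fact(conn)
rows = conn.execute(
    "SELECT role_id, end_date FROM fact_role ORDER BY role_id"
).fetchall()
```

The update runs before the insert so that freshly inserted rows are not scanned again by the update, and both statements share one transaction so a failure leaves the fact table untouched.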

The other way is to keep the "no match" leg of the SSIS package as is, but in the "match" leg insert the row into the actual fact table, then have a post-process T-SQL step that updates the two records needed.
