I have an ETL process I'm building and I'm trying to figure out the best way to build it. For context, i'm working with retail orders related data at a distribution center, so these orders start with an "open" status, meaning orders that need to be fulfilled, "WIP" meaning orders in process and then "shipped" meaning the order is complete and on a truck out of the DC.
I'm building an ETL process to create a Fact table that will have all order related information. The issue I'm having is that the primary key between the two tables (Table A, Table B) I'm trying to join changes once the order is "shipped". Example, Table A is the main transaction table, and Table B is the "task" information table. So once the order is "shipped" in table A, the primary key of Table B changes (it is called a sequence number, some WMS system update causes the sequence number to change) and then I can't join table B anymore. So basically, I can only track the more detailed Table B (Tasks) related information along side the rest of the transaction information as long as the order hasn't shipped. I can't figure out how best to go about this. Do I create a process that doesn't update the primary key in my staging tables? if so, how I would I do that? I'm just kind of stuck and am looking for some advice/guidance on this. Appreciate any help!
I tried making a copy of the column with the idea being that I'll prevent it from updating, but the issue is that the next day when I pull the data from the DW, the primary key's stored there are the updated ones, so basically I don't get to have more than 1 day of visibility to the joined data. I'm thought of many ways but nothing's working out.