0

I have a question regarding the best practice regarding the loading of invoice header and line data inside InvoiceLineFact table. I am following the Kimball recommendation where I bring all the dimensionality of the header down to the line items like shown on the picture :

HeaderLineFact table example

For data extraction part we use store procedures created on our DWH SQL Server which are automatically executed via job in sequential order. What we do in every procedure is that we target the transaction tables on remote/linked server (DynamicsNAV db sql server):

Example:

SELECT *
FROM LinkedServer.NAVDB.dbo.InvoiceHeader
INNER JOIN LinkedServer.NAVDB.dbo.InvoiceLine

The problem is that the our ETL is started to slow down from day to day.

I wanted to ask if this is the efficient thing to do or there is a better way of doing this, like extracting the raw data in separate table and then join it on the DWH server, or maybe placing it inside SSIS package as it is.

starball
  • 20,030
  • 7
  • 43
  • 238
stevo92
  • 11
  • 3
  • I forgot to write down the join clause but I think that you should get the point of my question. It is joined by InvoiceID attribute. – stevo92 Jan 17 '23 at 08:00
  • 1) From Source To Staging Table Then 2) From Staging To DWH! Divide the package and rule it better! – Ozan Sen Jan 17 '23 at 21:06

1 Answers1

0

if this is the efficient thing to do

No. Joining distributed tables and sending complex queries to operational systems are both problematic.

or there is a better way of doing this,

Yes

like extracting the raw data in separate table and then join it on the DWH server

Yes. This is standard best-practice for building data warehouses.

David Browne - Microsoft
  • 80,331
  • 6
  • 39
  • 67