I've been doing some searching on Stack Overflow as well as Google and haven't quite found the answer to my question, so here we go:
It's been a minute since I've done a 'from the ground up' data warehouse project, so I'm dusting off some of my past knowledge, but am blanking on a solution to one of my data load scenarios.
I am creating a Fact Table (factOrderLines) with of course many dimensions joined to it. One of the dimensions I would like to link to factOrderLines is the dimItem. The problem is an Item is unique based on either the item's vendor and vendor part number, manufacturer and manufacturer part number, or an identifier from a subset of items called ManagedItems (MngItemID).
source ex:
Vendor VendorPartNo Manufacturer ManufacturerPartNo MngItemID 100 3456 NULL NULL 67 100 3254 03 1234 23 NULL NULL 03 1235 24 NULL NULL 15 5120 NULL
Problem is when I do my join to the dimItem table from my source table to populate the factOrderLines table I have three lookup scenarios. This is causing the numbers to inflate and performance to be horrible.
LEFT OUTER JOIN dimItem AS i ON
(i.Vendor = src.Vendor AND i.VendorPartNo = src.VndrItemID) OR
(i.Manufacturer = src.Manufacturer AND
(i.ManufacturerPartNo = src.MfgItemID) OR (i.MngItemID = src.MngItemID)
Is there a more efficient/better approach to this scenario than what I have started to implement?
edit: Full INSERT query (for better understanding)
INSERT INTO fctOrderLine
(PurchaseOrderKey
,DateKey
,PurchaseOrderLineNo
,VendorKey
,ManufacturerKey
,ItemKey
,UnitPrice
,Qty
,UnitOfMeasure
,LineTotal)
SELECT PurchaseOrderKey = po.PurchaseOrderKey
,DateKey = ISNULL(c.DateKey, 19000101)
,PurchaseOrderLineNo = ISNULL(p.POLineNbr, -1)
,VendorKey = ISNULL(v.VendorKey, -1)
,ManufacturerKey = ISNULL(m.ManufacturerKey, -1)
,ItemKey = ISNULL(i.ItemKey, -1)
,UnitPrice = ISNULL(p.UnitPrice, -1.00)
,Qty = ISNULL(p.POQty, -1.00)
,UnitOfMeasure = ISNULL(p.ANSI_UOM, N'UNKNOWN')
,LineTotal = ISNULL(p.LineTotalCost, -1)
FROM stgOrders AS p
INNER JOIN dimPurchaseOrder AS po ON po.OrderNo = p.PONumber
LEFT OUTER JOIN dimCalendar AS c ON c.Date = (CASE WHEN p.DT_PO IS NULL OR ISDATE(REPLACE(p.DT_PO, '''', '')) = 0 THEN CAST('19000101' AS DATETIME) ELSE REPLACE(p.DT_PO, '''', '') END)
LEFT OUTER JOIN dimVendor AS v ON v.VendorID = p.VendorID
LEFT OUTER JOIN dimManufacturer AS m ON m.ManufacturerID = p.MfgID
LEFT OUTER JOIN dimItem AS i ON (i.VendorKey = v.VendorKey AND i.VendorPartNo = p.VndrItemID) OR (i.ManufacturerKey = m.ManufacturerKey AND i.ManufacturerPartNo = p.MfgItemID) OR (i.MngItemID = p.MngItemID)