I'm reading Kimball's data-warehouse book at the moment and at the same time, I'm looking at designing a data warehouse for the following example:
1: Client can buy products which would generate an invoice which then would generate a inventory row. Note: The inventory row is a accumulating table, i.e. if they order Product A with a quantity of 5, and then another 5, there would be 2 invoice lines, and one inventory row of 10 units.
So based on the scenario described, the following design would be appropriate: 1. Client dimension 2. Product dimension 3. Date dimension 3. Invoice fact 4. Inventory fact
Now, let's say I want to find out answers to questions like 1. What clients currently have zero inventory but have generated invoices in the last 3 months . There will be other similar questions like these as well. 2. What clients have product A but don't have product B
Based on Kimballs book, it seems like there would need be some sort of joining between the two fact tables, like a subset?
This would seem rather inefficient, especially the sheer size of records between the 2 fact tables. Is this the right approach?