We have Purchase order management data as mentioned below. The order comes from different sources Retail, Wholesale, and online. Assume the addition of source is infrequent. Assume we don't have information on source ids as it is coming from a different system. it is just a plain text
PURCHASE_ID | RETAIL_ID | WHOLE_SALE_ID | ONLINE_ID | ITEM_ID | QUANTITY | AMOUNT | PURCHASE_DATE |
---|---|---|---|---|---|---|---|
P1001 | R384894 | NULL | NULL | IT541 | 45 | 450 | 11/12/2022 |
p1002 | NULL | W6479 | NULL | IT541 | 50 | 500 | 15/12/2022 |
P1003 | NULL | NULL | 035822 | IT541 | 10 | 100 | 20/02/2022 |
P1004 | 37828 | NULL | NULL | IT679 | 500 | 1500 | 18/06/2023 |
P1005 | NULL | W64757 | NULL | IT230 | 23 | 679 | 18/06/2023 |
P1006 | NULL | NULL | O5363 | IT222 | 67 | 4784 | 18/06/2023 |
A purchase ID refers to a purchase from a single source. For example, PURCHASE_ID P1001 is the purchase done by RETAIL STORE ID R38494 so the other two columns(WHOLE_SALE_ID, ONLINE_ID) are null. ITEM_ID is the ITEM number being ordered. We are having 10 million records in the database and the table is partitioned by PURCHASE_DATE.
I can think of two DB Schema designs options. Would like to know which is the correct way to design the table for faster data retrieval and performance. Please let me know other options if any
Option 1: Three Columns that allow NULLS with needed indexes
Purchase: PURCHASE_ID (Primary Key) RETAIL_ID WHOLESALE_ID ONLINE_ID ITEM_ID (Foreign Key referencing Items table) QUANTITY AMOUNT PURCHASE_DATE Items: ITEM_ID (Primary Key) ITEM_NAME Other item-related information Indexes: Index on PURCHASE_DATE column in the Purchase table. Index on ITEM_ID column in the Purchase table. Indexes on columns RETAIL_ID , WHOLESALE_ID , ONLINE_ID
Option 2: Composite ID with needed index
Purchase: PURCHASE_ID (Primary Key) SOURCE_TYPE_ID (Foreign Key referencing SourceTypes table) SOURCE_ID ITEM_ID (Foreign Key referencing Items table) QUANTITY AMOUNT PURCHASE_DATE SourceTypes: SOURCE_TYPE_ID (Primary Key) SOURCE_TYPE_NAME (e.g., "Retail", "Wholesale", "Online") Items: ITEM_ID (Primary Key) ITEM_NAME Other item-related information Indexes: Index on PURCHASE_DATE column in the Purchase table. Index on ITEM_ID column in the Purchase table. Indexes on composite columns (e.g., SOURCE_TYPE_ID, SOURCE_ID) in the Purchase table.