0

Given that there are two kinds of products X and Y. X has A, B and C as the primary key whereas Y has A and D as it's primary key. Should I put them in the same table ? Why should I and if I should not, then why is that ?

I have currently put them in two separate tables but some colleagues are suggesting that they belong in the same table. My question is should I consider putting them in the same table or continue with different tables?

Below I have given example tables for the above case.

CREATE TABLE `product_type_b` (
    `PRODUCT_CODE` VARCHAR(50) NOT NULL,
    `COMPONENT_CODE` VARCHAR(50) NOT NULL,
    `GROUP_INDICATOR` VARCHAR(50) NULL DEFAULT NULL,
    `RECORD_TIMESTAMP` DATE NULL DEFAULT NULL,
    PRIMARY KEY (`PRODUCT_CODE`, `COMPONENT_CODE`)
)
COLLATE='utf8mb4_general_ci'
ENGINE=InnoDB
;
CREATE TABLE `product_type_a` (
    `PRODUCT_CODE` VARCHAR(50) NOT NULL,
    `CHOICE_OF_COVER` VARCHAR(50) NOT NULL,
    `PLAN_TYPE` VARCHAR(50) NOT NULL,
    `RECORD_TIMESTAMP` DATE NULL DEFAULT NULL,
    `PRODUCT_TENURE` INT(11) NULL DEFAULT NULL,
    PRIMARY KEY (`PRODUCT_CODE`, `CHOICE_OF_COVER`, `PLAN_TYPE`)
)
COLLATE='utf8mb4_general_ci'
ENGINE=InnoDB
;

As you can see there are certain fields that are not common to both tables but are part of the primary key. There are also some other fields which are not common to both tables.

Here is the bigger picture of the system in consideration.

  • Each product type has a different source from where it is sent to the system.
  • We need to store these products in the database.
  • I would like to have a balance between normalization and performance so that my read-write speeds aren't compromised much due to over normalization.
  • There is also a web-app which will have a page where these products are searchable by the user.
  • User will populate certain column fields as filters based on which we need to fetch the products and show on the UI.
  • variations in subtypes is currently 2 and is not expected to increase beyond 4-5 which again is go ig to be over a decade maybe. This again is an approximation.n I hope this presents a bigger picture of the system.

I want to have good read and write speeds without compromising much. So should I go ahead with this design ? If not, what design should be implemented ?

jackfr0st
  • 551
  • 2
  • 8
  • 19
  • Possible duplicate of [How can you represent inheritance in a database?](https://stackoverflow.com/questions/3579079/how-can-you-represent-inheritance-in-a-database) – philipxy Jul 11 '19 at 19:06
  • The question better suites dba.stackexchange.com . Anyway a bigger picture of the system in question would help to value pros and contras of the options. – Serg Jul 11 '19 at 19:11
  • @Serg updated the post with the bigger picture. – jackfr0st Jul 12 '19 at 04:12
  • Please clarify via edits, not comments. Please ask new questions in new posts. Ask one question per post. Please don't insert EDITs/UPDATEs, just make your post the best presentation as of right now. PS You have given no information for us to address 2NF, so this doesn't reflect an understanding of what it is. Ask a question where stuck following your textbook. But this sort of rearrangement for subtypes/inheritance is not normalization anyway. So again, your question doesn't reflect an understanding of normalization. – philipxy Jul 12 '19 at 04:33
  • Apologies, as already stated I am not sure of that part where I mention 2NF. I'll edit the post so as to ask just "one question". – jackfr0st Jul 12 '19 at 04:40

2 Answers2

1

This is typical Category/SubCategory model issue. There are a few options:

  1. Put everything in one table, which will have some columns NULLable because different subtypes do not have the same attributes;

  2. One parent table for all the common attributes, and also with the column of the type indication column. Then each sub type has its own table just for the columns the Subtype has.

  3. Each subtype has its own table, including all the common columns of all the sub type.

(1) is good if the sub type is very limited;
(3) is suitable if the variations of the sub types are very limited.

The advantage of (2). is it is easy to return all the records with the common columns. And if an artificial key (like auto-increment id) is used, it ensures all records, regards less the sub type, has unique id.

In your case, no artificial PK is used, I think your choice is not bad.

PeterHe
  • 2,766
  • 1
  • 8
  • 7
  • It is also affects your application development. Put everything into one table makes the app development easier. – PeterHe Jul 11 '19 at 19:17
  • Variations of subtypes are limited indeed. Currently, there are two but I want the design to easily integrate new subtypes if they come up tomorrow. – jackfr0st Jul 12 '19 at 04:14
  • As for application development. My idea was to put these tables in the same DB and then while fetching, the choice as to which table to fetch from would be done in the backend code. Furthermore, if I go with 2 wouldn't that mean that while fetching data, I will need to have joins in place which might affect the performance and while writing data into the tables, I will have to update at multiple tables which again will affect writing speeds ? – jackfr0st Jul 12 '19 at 04:17
  • 1
    Each approach has its pros and cons, and totally depends on the the nature of the data, how they are going to be used and managed, application development, architecture etc. If your subtypes do not have wide variations, it is OK to put them into one table.But if they have wide variations, (2) is the best choice. Performance impact is very minimum. – PeterHe Jul 12 '19 at 13:27
1

For a trading system and taking into account max 5 product types and very limited number of attributes I'd prefer a single table for all products with a surrogate PK. Think about references to products from trading transactions, this is the biggest part of the total DB content in a long run.

A metadata table describing every product-specific attribute and its mapping to the general table column would help to build UI and backend/frontend communications.

Search indexes would reflect most popular user seraches depending on product type.

Serg
  • 22,285
  • 5
  • 21
  • 48