
I have a database design that uses compound primary keys to ensure uniqueness; the key columns are also foreign keys.

These tables are then linked to other tables in the same way, so that in the end the compound key can get up to 4 or 5 columns. This led to some rather large JOINs, so I thought a simple solution would be to use an autoincrement column which is not part of the primary key but which is used as part of the primary key of other table(s).

Here is some pseudocode showing the general layout:

CREATE TABLE Item (
id AUTO_INCREMENT,
...
PRIMARY KEY (id)
) ENGINE = InnoDB;

CREATE TABLE PriceCategory (
id AUTO_INCREMENT,
...
PRIMARY KEY (id)
)

CREATE TABLE ItemPriceCategory (
itemId,
priceCategoryId,
id AUTO_INCREMENT,
...
UNIQUE INDEX (id),
PRIMARY KEY (itemId, priceCategoryId)
)

CREATE TABLE ClientType (
id AUTO_INCREMENT,
...
PRIMARY KEY (id)
)

CREATE TABLE Price (
itemPriceCategoryId,
clientTypeId,
id AUTO_INCREMENT,
...
UNIQUE INDEX (id),
PRIMARY KEY (itemPriceCategoryId, clientTypeId)
)

CREATE TABLE Purchase (
priceId,
userId,
amount,
PRIMARY KEY (priceId, userId)
)

The names of tables have been changed to protect the innocent ;-) Also the actual layout is a little deeper in terms of references.

So, my question is: is this a viable strategy from a performance and data integrity point of view? Or is it better to have all keys from all the referenced tables in the Purchase table?

Thanks in advance.

ianaré

2 Answers


Generally, the advice on primary keys is to have "meaningless", immutable primary keys with a single column. Auto incrementing integers are nice.

So, I would reverse your design - your join tables should also have meaningless primary keys. For instance:

CREATE TABLE ItemPriceCategory (
itemId,
priceCategoryId,
id AUTO_INCREMENT,
...
PRIMARY KEY (id),
UNIQUE INDEX (itemId, priceCategoryId)
)

That way, the itemPriceCategoryId column in Price is a proper foreign key, linking to the primary key of the ItemPriceCategory table.

You can then use foreign keys (http://dev.mysql.com/doc/refman/5.5/en/innodb-foreign-key-constraints.html) to ensure the consistency of your database.
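As a rough illustration of the layout above, here is a minimal sketch using Python's built-in sqlite3 module so that it runs standalone. It is not the asker's real schema: the `name` columns, the sample rows and the `try_insert` helper are made up for the example, and SQLite syntax differs from MySQL (an `INTEGER PRIMARY KEY` column stands in for `AUTO_INCREMENT`, and there is no `ENGINE = InnoDB`). It shows the unique index still rejecting a duplicate (itemId, priceCategoryId) pair, and the foreign key rejecting a Price row that points at a missing ItemPriceCategory.

```python
import sqlite3

# In-memory SQLite database, used only so the sketch runs standalone.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.executescript("""
CREATE TABLE Item          (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE PriceCategory (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ClientType    (id INTEGER PRIMARY KEY, name TEXT);

-- Surrogate primary key; the natural key is kept as a UNIQUE constraint.
CREATE TABLE ItemPriceCategory (
    id              INTEGER PRIMARY KEY,
    itemId          INTEGER NOT NULL REFERENCES Item(id),
    priceCategoryId INTEGER NOT NULL REFERENCES PriceCategory(id),
    UNIQUE (itemId, priceCategoryId)
);

-- Price references the single-column primary key of ItemPriceCategory.
CREATE TABLE Price (
    id                  INTEGER PRIMARY KEY,
    itemPriceCategoryId INTEGER NOT NULL REFERENCES ItemPriceCategory(id),
    clientTypeId        INTEGER NOT NULL REFERENCES ClientType(id),
    UNIQUE (itemPriceCategoryId, clientTypeId)
);
""")


def try_insert(sql, params):
    """Hypothetical helper: True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Made-up sample rows.
conn.execute("INSERT INTO Item (id, name) VALUES (1, 'ticket')")
conn.execute("INSERT INTO PriceCategory (id, name) VALUES (1, 'standard')")
conn.execute("INSERT INTO ClientType (id, name) VALUES (1, 'adult')")

ipc_sql = "INSERT INTO ItemPriceCategory (itemId, priceCategoryId) VALUES (?, ?)"
price_sql = "INSERT INTO Price (itemPriceCategoryId, clientTypeId) VALUES (?, ?)"

first_pair = try_insert(ipc_sql, (1, 1))       # accepted, gets id 1
duplicate_pair = try_insert(ipc_sql, (1, 1))   # rejected by the UNIQUE constraint
valid_price = try_insert(price_sql, (1, 1))    # accepted, parent row 1 exists
orphan_price = try_insert(price_sql, (99, 1))  # rejected by the foreign key

print(first_pair, duplicate_pair, valid_price, orphan_price)
```

In MySQL the same idea would be expressed with `AUTO_INCREMENT` columns and `FOREIGN KEY ... REFERENCES` clauses on InnoDB tables, as described in the linked manual page.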

In terms of performance, broadly speaking, this strategy should be faster than querying compound keys in a join, but with a well-indexed database, you may not actually notice the difference...

Neville Kuyt
  • I like this solution, simple and effective. Let me see what others may contribute before accepting though. – ianaré Jun 15 '11 at 13:50

I think that something has been lost in translation over here, but I did my best to make an ER diagram of this.

In general, there are two approaches. The first one is to propagate keys and the second one is to have an auto-increment integer as a PK for each table.

The second approach is often driven by ORM tools which use a DB as object-persistence storage, while the first one (using key propagation) is more common for hand-crafted DB design.

In general, the model with key propagation offers better performance for "random queries", mostly because you can "skip tables" in joins. For example, in the model with key propagation you can join the Purchase table directly to the Item table to report purchases by ItemName. In the other model you would have to join Price and ItemPriceCategory tables too -- just to get to the ItemID.
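The "skip tables" point can be sketched as follows, again with Python's built-in sqlite3 module so the example runs standalone. The column names, sample rows and the two connection names are assumptions for illustration, and lookup tables such as PriceCategory and ClientType are left out to keep it short. It only compares the shape of the two queries; it does not measure performance.

```python
import sqlite3

# Model A: key propagation. Each child's primary key contains its parent's key.
propagated = sqlite3.connect(":memory:")
propagated.executescript("""
CREATE TABLE Item (itemId INTEGER PRIMARY KEY, itemName TEXT);
CREATE TABLE ItemPriceCategory (
    itemId INTEGER, priceCategoryId INTEGER,
    PRIMARY KEY (itemId, priceCategoryId)
);
CREATE TABLE Price (
    itemId INTEGER, priceCategoryId INTEGER, clientTypeId INTEGER,
    PRIMARY KEY (itemId, priceCategoryId, clientTypeId)
);
CREATE TABLE Purchase (
    itemId INTEGER, priceCategoryId INTEGER, clientTypeId INTEGER,
    userId INTEGER, amount INTEGER,
    PRIMARY KEY (itemId, priceCategoryId, clientTypeId, userId)
);
INSERT INTO Item VALUES (1, 'ticket'), (2, 'poster');
INSERT INTO ItemPriceCategory VALUES (1, 1), (2, 1);
INSERT INTO Price VALUES (1, 1, 1), (2, 1, 1);
INSERT INTO Purchase VALUES (1, 1, 1, 10, 2), (1, 1, 1, 11, 3), (2, 1, 1, 10, 1);
""")

# Model B: an independent integer key for each table.
surrogate = sqlite3.connect(":memory:")
surrogate.executescript("""
CREATE TABLE Item (id INTEGER PRIMARY KEY, itemName TEXT);
CREATE TABLE ItemPriceCategory (
    id INTEGER PRIMARY KEY, itemId INTEGER, priceCategoryId INTEGER,
    UNIQUE (itemId, priceCategoryId)
);
CREATE TABLE Price (
    id INTEGER PRIMARY KEY, itemPriceCategoryId INTEGER, clientTypeId INTEGER,
    UNIQUE (itemPriceCategoryId, clientTypeId)
);
CREATE TABLE Purchase (
    priceId INTEGER, userId INTEGER, amount INTEGER,
    PRIMARY KEY (priceId, userId)
);
INSERT INTO Item VALUES (1, 'ticket'), (2, 'poster');
INSERT INTO ItemPriceCategory VALUES (1, 1, 1), (2, 2, 1);
INSERT INTO Price VALUES (1, 1, 1), (2, 2, 1);
INSERT INTO Purchase VALUES (1, 10, 2), (1, 11, 3), (2, 10, 1);
""")

# Model A: Purchase already carries itemId, so one join reaches Item.
direct = propagated.execute("""
    SELECT i.itemName, SUM(p.amount)
    FROM Purchase p
    JOIN Item i ON i.itemId = p.itemId
    GROUP BY i.itemName
    ORDER BY i.itemName
""").fetchall()

# Model B: the same report has to walk Price and ItemPriceCategory first.
via_chain = surrogate.execute("""
    SELECT i.itemName, SUM(pu.amount)
    FROM Purchase pu
    JOIN Price pr              ON pr.id  = pu.priceId
    JOIN ItemPriceCategory ipc ON ipc.id = pr.itemPriceCategoryId
    JOIN Item i                ON i.id   = ipc.itemId
    GROUP BY i.itemName
    ORDER BY i.itemName
""").fetchall()

print(direct)
print(via_chain)
```

Both queries return the same totals per item; the difference is that the second one needs three joins where the first needs one, at the cost of a four-column primary key on Purchase in the first model.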

The model with key propagation is essentially relational, while the other one is object-driven. ORM tools either prefer or enforce the model with a separate ID per table (the second approach), but they offer other advantages for development.

Your example seems to be trying to combine the two. That is not necessarily bad, but it would help if you could talk to the original designer.


With key propagation: [ER diagram; image not available]

Independent keys for each table: [ER diagram; image not available]

Damir Sudarevic
  • Thanks for taking the time to make the diagrams. I am working with the designers of the original database to improve it. The idea is indeed to get a best of both worlds approach, we want to have good data integrity AND use the tables in an ORM way, specifically an AR setup. Having the autoinc ID is very useful in AR as it allows knowing immediately the status of the object, but we don't want to solely trust the PHP to keep data integrity. – ianaré Jun 15 '11 at 13:45