Pros and cons of a junction table between three tables, only two of which at any given time

Question

I am going to be have a many-to-many relationship across three tables, but the relationship will be between only one table (say TableA) and only one of the other two tables (say TableB and TableC) at a time. That is, I can have either one junction table

id | TableA_id | TableB_id | TableC_id

with a constraint to make sure that either TableB_id is null or TableC_id is null, or two junction tables

id | TableA_id | TableB_id
id | TableA_id | TableC_id

What would be good criteria to determine which of these two possibilities I should use?

Care to explain why the data's laid out the way it is, and how you plan on interacting with it? Also, are TableB and TableC similar in their fields ? ... and you may want to ask this on [dba.se] — Joe, Nov 18 '11 at 16:30
TableC is actually in a many-to-one relationship with TableB, for what I am using this. The main point of the question is whether it is better to have fewer tables with more possibilities for NULLs or more tables with less possibilities for NULLs. — cm007, Nov 18 '11 at 16:37
If `TableB` is related to `TableC`, then _why_ would you create a join table relating it to `TableA`? Although you can of course have 'through' relationships (Say, `TableA` 1-to-Many `TableC` because `TableA` 1-to-1 `TableB` 1-to-many `TableC`), you usually only create the table for the _direct_ relationships. — Clockwork-Muse, Nov 18 '11 at 16:44
TableA is related to TableB and to TableC in different ways. — cm007, Nov 18 '11 at 16:49
Unless you're going for something like 5th normal form (which is... tricky/wierd), you usually define the relationships between two tables either directly, or indirectly, through 1 (or more) other tables. For instance, if my workplace wanted to know the count of all homemade lunches brought to work, they'd have to ask all the employees - they wouldn't have access to whether we brought a lunch on their own. — Clockwork-Muse, Nov 18 '11 at 17:31

score 1 · Answer 1 · answered Nov 18 '11 at 16:29

1

Good database design methodology is to never store NULL data (in the real world you can of course over normalise). Therefore option 2 is better.

answered Nov 18 '11 at 16:29

satnhak

9,407
5
63
81

Option 2 is terrible. How do you tell if the listed id is from `tableB` or `tableC`? This is a multi-domain table, which can/will lead to all sorts of problems. – Clockwork-Muse Nov 18 '11 at 16:41
@X-Zero: eh? You know because the table is called something like TableATableBJunction and because the columns are called TableA_Id and TableB_Id. It is pretty basic database design that you give your tables and columns meaningful names. – satnhak Nov 18 '11 at 17:12
#smacks self# Gaaahhh... somehow I read that as two _rows_ in the _same_ table, not two _tables_. I seem to be on a roll today... – Clockwork-Muse Nov 18 '11 at 17:22

score 1 · Answer 2 · answered Nov 18 '11 at 16:30

Since there is no stated direct relation between Table B and Table C I would keep the relations in seperate tables, which is more like a star schema / snowflake schema. From a data modelling perspective it will be far easier to understand than a table which can join in 2 different directions.

Pros and cons of a junction table between three tables, only two of which at any given time

2 Answers2