12

Suppose I have the following tables in my database:

tables

Now all my queries depend on Company table. Is it a bad practice to give every other table a (redundant) relationships to the Company table to simplify my sql queries?

Edit 1: Background is a usage problem with a framework. See Django: limiting model data.

Edit 2: No tuple would change his company.

Edit 3: I don't write the mysql queries. I use a abstraction layer (django).

Glorfindel
  • 21,988
  • 13
  • 81
  • 109
svenwltr
  • 17,002
  • 12
  • 56
  • 68
  • 1
    The word "relation" doesn't mean what you apparently intended it to mean. A *relation* in the relational database model corresponds to what most people call a table. It has nothing to do with relationships between data in different tables. – nvogel Jul 27 '10 at 19:25
  • Yes, you're right. I've corrected it. – svenwltr Jul 27 '10 at 19:53
  • @SvenWalter please update the question with the table in a code block (`
    `) as the external image file is no longer available. I share the pain of not having proper tables in SE (the devs are hard-bent on never making it available), but at least we can emulate them with code blocks instead of relying on external tools.
    – ADTC May 14 '14 at 10:07

7 Answers7

11

It is bad practice because your redundant data has to be updated independently and therefore redundantly. A process that is fraught with potential for error. (Even automatic cascading has to be assigned and maintained separately)

By introducing this relation you effectively denormalize your database. Denormalization is sometimes necessary for the sake of performance but from your question it sounds like you're just simplifying your SQL.

Use other mechanisms to abstract the complexity of your database: Views, Stored Procs, UDFs

Paul Sasik
  • 79,492
  • 20
  • 149
  • 189
  • Many RDBMs have cascading updates. Changing the company ID in the company table will cascade to all tables that reference it. – siride Jul 27 '10 at 19:21
  • 1
    siride: That's a good point but maintaining the cascadability of your links is still an extra maintenance step that has to be done and has to be done correctly. Simpler is better in my book. Stick with your foreign keys and live with the extra JOIN in your queries. – Paul Sasik Jul 27 '10 at 20:01
  • I see it is a bad idea. Thanks for your help ... I have to find an other solution. – svenwltr Jul 27 '10 at 21:10
  • 1
    It is not always a bad practice. It is for example pretty common to do so with the tenant identifier when dealing with multitenancy - as it doesn't change (not counting tenant merging and rare stuff like that), the downsides are more or less non existent. Another valid use case would be denormalizing for performance reasons. – Maxem Aug 25 '11 at 09:35
  • PLus cascade update should never be used if you have large child tables as you can cause serious performance problems. – HLGEM Aug 15 '13 at 18:59
8

What you are asking is whether to violate Third Normal Form in your design. Doing so is not something to be done without good reason because by creating redundancy you create the possibility for errors and inconsistencies in your data. Also, "simplifying" the model with redundant data to support some operations is likely to complicate other operations. Also, constraints and other data access logic will likely need to be duplicated unnecessarily.

nvogel
  • 24,981
  • 1
  • 44
  • 82
5

Is it a bad practice to give every other table a (redundant) relation to the Company table to simplify my sql queries?

Yes, absolutely, as it would mean updating every redundant relation when you update the relations customer to company or section to company -- and if you miss any such update, you now have a database full of redundant data. It's a bad denormalization.


If your point is to just simplify your SQL, consider using views to "bring along" parent data. Here's a view that pulls company_id into contract, by join through customer:

create view contract_customer as
select 
  a.*, 
  b.contract_id, b.company_id
from 
  contract a 
  join customer b on (a.customer_id = b.customer_id);

This join is simple, but why repeat it over and over? Write it once, and then use the view in other queries.

Many (but not all) RDBMSes can even optimize out the join if you don't put any columns from customer in the select list or where clause of the query based on the view, as long as you make contract.customer_id have a foreign key referential integrity constraint on customer.customer_id. (In the absence of such a constraint, the join can't be omitted, because it would then be possible for a contract.customer_id to exist which did not exist in customer. Since you'll never want that, you'll add the foreign key constraint.)

Using the view achieves what you want, without the time overhead of having to update the child tables, without the space overhead of making child rows wider by adding the redundant column (and this really begins to matter when you have many rows, as the wider the row, the fewer rows can fit into memory at once), and most importantly, without the possibility of inconsistent data when the parent is updated but the children are not.

tpdi
  • 34,554
  • 11
  • 80
  • 120
4

If you really need to simplify things, this is where a View (or multiple views) would come in handy.

Having a column for the company in your employee view would not be poorly normalized providing it is derived from a join on section.

Mark Peters
  • 80,126
  • 17
  • 159
  • 190
2

If you mean add a Company column to every table, it's a bad idea. It'll increase the potential for data integrity issues (i.e. it gets changed in one table but not the other 6 where it should).

JNK
  • 63,321
  • 15
  • 122
  • 138
1

I'd say not in the OP's case, but sometimes it's useful (just like goto ;).

An anecdote:

I'm working with a database where most tables have a foreign key pointing to a root table for the accounts. The account numbers are external to the database and aren't allowed to be changed once issued. So there is no danger of changing the account numbers and failing to update all references in the DB. I also find that it is also considerably easier to grab data from tables keyed by account number instead of having to do complex and costly joins up the hierarchy to get to the root account table. But in my case, we don't have so much a foreign key as an external (i.e., real world) identifier, so it's not quite the same as the OP's situation and seems suitable for an exception.

siride
  • 200,666
  • 4
  • 41
  • 62
-1

That depends on your functional requirements for 'Performance'. Is your application going to handle heavy demand? Simplifying JOINS boasts performance. Besides hardware is cheap and turn-around time is important.

The more deeper you go in database normal forms - you save space but heavy on computation

ankitjaininfo
  • 11,961
  • 7
  • 52
  • 75
  • YOu would have to have huge database before joins are causing performance problems if you index correctly. Oddly the people I know who manage databases with trillions of records still use joins but that is because the database designers of those types of systems were competent. – HLGEM Aug 15 '13 at 19:01