6

Possible Duplicate:
What exactly does database normalization do?

Can someone please clarify data normalization? What are the different levels? When should I "de-normalize"? Can I over normalize? I have a table with millions of records, and I believe I over-normalized it, but I'm not sure.

Community
  • 1
  • 1
Moderator71
  • 385
  • 7
  • 16

3 Answers3

5

If you have million columns you probably under-normalized it.
What normalizing means is that

every non-key attribute "must provide a fact about the key, the whole key, and nothing but the key."

If you have a column that depends on anything but the key, you should normalize your table.
see here.

Added to reply to comment:
If you have ProductID | ProductType | ProductTypeID, where ProdcutTypeID depends only on ProductType, you should make a new table for that:
ProductID | ProductTypeID and on the other table: ProductTypeID | ProductTypeName .
So to answer your question, pertaining to Product isn't accurate enough, in my example at the first case, I was pertaining to the Product as well. All columns should pertain only to ProductID (you may say you only describe product, but not describing anything else, even if it's related to product - that's accurate).
Number of rows, generally speaking isn't relevent.

Oren A
  • 5,870
  • 6
  • 43
  • 64
  • So, if I have a key called 'ProductID', then the columns within that table should only pertain to a Product? Is that what 3 level Normalization is? I have over a million records, not columns, btw – Moderator71 Oct 06 '10 at 15:38
  • Okay, so what do you call it when I need to create new tables because the number of rows are growing rapidly in a table (i.e. Log table)? I need to create a new table for logging Login entries, Exception entries, etc. FYI, the Log table is very generic containing LogID, LogReason, LoggedBy, LoggedOn, etc. – Moderator71 Oct 06 '10 at 15:58
  • That's a different story. I suggest you ask a new question with all relevant details about your log, and what exactly you're trying to achieve. – Oren A Oct 06 '10 at 16:07
  • Here's my new question, if you can help me: http://stackoverflow.com/questions/3874603/log-table-with-millions-of-rows-now-what – Moderator71 Oct 06 '10 at 16:13
  • Got your answers there. Have nothing to do with normalizing. gl. – Oren A Oct 06 '10 at 16:16
4

Normalization is about reducing data duplication in a relational database. The most popular level is third normal form (it's the one described by "the key, the whole key, and nothing but the key"), but there are a lot of different levels, see the Wikipedia entry for a list of the main ones. (In practice people seem to think they're doing well to achieve third normal form.) Denormalizing means accepting more data duplication, typically in exchange for better performance.

Nathan Hughes
  • 94,330
  • 19
  • 181
  • 276
  • 1
    +1 for covering denormalization. – MaxVT Oct 06 '10 at 15:59
  • +1, good concise answer giving the purpose for normalization, not just the definition, i.e., why do we do it? I too like the denormalization mention. – DCookie Oct 06 '10 at 16:37
0

As others said Database normalization is about reduction of data duplication and more generic data models (that can easily answer to queries unexpected at design time). Normalisation of a database is allow a formal enough process. When you are experimented you mostly follow data analysis methods and get a normalized database at the end.

Normalizing database is usually a good idea, but there is a catch. In many case it involve creation of new tables and JOIN relationships between tables. JOIN is known to have a (very) high performance cost at runtime, henceforth for big volumes of data you may want to denormalize.

Another cost may also be the need to write more complex requests to access to the data you need and that can be a problem for SQL beginners. The best idea is probably to stick with normalization anyway (Third Normal Form is usually enough, as there is several levels of normalization as others said) and to become more skilled with SQL.

kriss
  • 23,497
  • 17
  • 97
  • 116