General SQL question about Primary Keys

Question

I know this is pretty elementary but here it goes.

I would like to know how you know what columns are a primary key in a table that does not have a primary key? Is there a technique or something that I should read?

Thank you in advance

This question is like asking what type of car I should buy. So many "it depends" variables, it isn't funny. Basically you look at the data and you ask yourself, how can I uniquely identify a row? — Aaron Bertrand, Sep 13 '11 at 16:01

score 4 · Accepted Answer · answered Sep 13 '11 at 16:07

4

You need to take a look at your data structures.

A primary key must:

never be NULL (no exceptions)
reliably and uniquely identify each single row

and it helps if it's

small and easy to use
stable (doesn't change at all, or at least not often)
a single column (or at most two)

Check your data - which columns or set of columns can fulfill these requirements??

Once you have those potential primary keys (the "candidate keys") - think about how you will access the data, and what other data might need to be associated with this one entity in question - what would make sense as a foreign key? Do you want to reference your department by its name? Probably not a good idea, since the name could be misspelled, it might change over time etc. By the department's office location? Bad choice, too. But something like a unique "department ID" might be a good idea.

If you don't find any appropriate column(s) in your actual data that could serve as primary key and would make sense, it's a common practice to introduce a "surrogate key" - an extra column, often an INT (and often something like an "auto-increment" INT) that will serve as an artificial identifier for each row. If you do this, one common best practice is to never show that artificial key on any data screen - it has no meaning whatsoever to the users of your system - so don't even show it to them.

Checking these requirements, and a lot of experience, will help you find the right primary key.

answered Sep 13 '11 at 16:07

marc_s

732,580
175
1,330
1,459

Thank you marc_s this sums it up as well as some other suggestions on here. – Maximus Sep 13 '11 at 16:37
Minimality is also a requirement (a candidate key must be *irreducibly* unique and not just unique). – nvogel Sep 14 '11 at 10:22
"a single column (or at most two)" seems to be entirely arbitrary. – onedaywhen Sep 14 '11 at 12:42
@onedaywhen: have you ever had a table with a compound primary key of six columns? Have you tried referencing that from a child table? The foreign key relation will be **madness**, and the JOIN queries across such a FK relationship are unbelievably messy.... yes - arbitrary - but based on years of (practical) experience in the trenches ... – marc_s Sep 14 '11 at 13:20
@marc_s: I am not afraid of compound keys and yes have probably got up to six columns with valid state temporal tables. Now, my turn: have you ever had a table which should have had a six column natural key but instead had to join back to five tables, each with their own surrogate key, just be able to get the key data? And have you tried writing a simple data constraint that needed data from those five tables? Madness. – onedaywhen Sep 14 '11 at 13:53
@onedaywhen: yes - and I prefer four or five tables, if the JOINs between them are simple, over a massive four or six column JOIN over two tables ..... – marc_s Sep 14 '11 at 13:55
1

@marc_s: sure these are just personal preferences. But did you consider the case of having to write a constraint using data from five tables? – onedaywhen Sep 14 '11 at 14:56

score 1 · Answer 2 · answered Sep 13 '11 at 16:03

1

It really depends on the data itself. You need to determine what fields can be used to identify the record uniquely.

answered Sep 13 '11 at 16:03

Taryn

242,637
56
362
405

Thank you bluefeet I was leaning this way but was a little confused by the data. – Maximus Sep 13 '11 at 16:08
1

@Maximus it will absolutely vary from table to table. Instead of asking a general question, why not get some practice by asking about the specific table where you're confused by the data? – Aaron Bertrand Sep 13 '11 at 16:09
@Aaron I would like to do that but the table is huge and the data is HR related so I don't think my employer would like me to post it :-) – Maximus Sep 13 '11 at 16:38
If it is HR data, I am sure there is not a whole lot of secret columns in your table, and there are probably only a few candidate columns anyway... have you considered looking into industry standards to help guide your schema decisions? Those things are out there for a reason... :-) – Aaron Bertrand Sep 13 '11 at 16:41
@Maximus you can always scrub the data prior to posting. I have to do the same thing. :) – Taryn Sep 13 '11 at 16:42

score 0 · Answer 3 · answered Sep 13 '11 at 16:03

In SQL server it'll have a key next to it. It's typically ID or something with ID in it. It's also unique and typically increments. When you look at it in SQL. Server management studio under table design you'll see it towards the top of the list of columns with the Lil key icon.

It's a unique identifier that deciphers each record from one another. Kind of like how each person has a ssn.

General SQL question about Primary Keys

3 Answers3