1

I am having a gender table as: gender_id | gender_label
Will it be a bad practice if I remove the gender_id column, as gender_label is also unique and can be used as a primary key?
I know surrogate keys should be used wherever possible as they are efficient and fast to search. But will it make any difference for above scenario? How?

The Coder
  • 3,447
  • 7
  • 46
  • 81
  • "I know surrogate keys should be used wherever possible" Why would you think that? There are two kinds of database administrators. Those that like surrogate keys and those that hate them. Both sides have good arguments. Then there are the Codd fanatics that say there is no such thing as a surrogate key, (and from the relational model point of view they are correct). The only good thing in surrogate keys is that you don't have to repeat the actual value in any table referencing your own. – Zohar Peled Feb 22 '18 at 18:09
  • 1
    Surrogate keys are good when you can't be sure if your natural keys will stay unique, or change; and you can almost never be sure of that. "gender" is a good example of this; 10 or so years ago, very few would not have thought twice about "hard-coding" M and F as a bool. – Uueerdo Feb 22 '18 at 18:15
  • See [What are the design criteria for primary keys?](https://stackoverflow.com/questions/3632726/what-are-the-design-criteria-for-primary-keys) – reaanb Feb 22 '18 at 22:06

1 Answers1

1

I have written hundreds of tables. 2/3 of them have a "natural" key; 1/3 have a "surrogate key".

From that, I deduce that the answer to your question is "it depends".

OK, so what does it depend on?

  • Is there a reliable natural key -- one that is truly UNIQUE? If so, probably you should use it.
  • Is performance an issue -- Well, this goes both ways.
  • Is the natural key "small"? A 2-character country_code taken from an international standard is, in my opinion, better than a 4-byte INT.
  • If the natural key is "big" and there are multiple secondary keys, then keep in mind that the PK is included in every secondary key. (In InnoDB)
  • A lookup by a secondary key involves (in InnoDB) first a drill-down in the secondary key's BTree, then a drill-down in the PK's BTree. So, there is a slight-to-large performance benefit in having a PK that you use a lot. (That might be surrogate or it might be natural.)

In your simple case, is gender_id strings like M/F/etc? That's what I would use. Actually, I might not have the table at all; instead have gender in other tables as ENUM('unknown', 'male', 'female', 'other')

Rick James
  • 135,179
  • 13
  • 127
  • 222