3

It seems like the "_id" suffix is quite common in names of foreign keys. I wonder about the reasons behind this.

What are the benefits of posts(id, user_id, title, text) over the simpler posts(id, user, title, text)?

Emanuil Rusev
  • 34,563
  • 55
  • 137
  • 201
  • 1
    http://stackoverflow.com/questions/2240929/what-is-a-proper-naming-convention-for-mysql-fks http://dba.stackexchange.com/questions/39817/mysql-column-naming-conventions-for-foreign-keys – sergio Jan 23 '15 at 14:55
  • 1
    There are only two hard things in computer science: cache invalidation, naming things, and off-by-one fence post errors. – spencer7593 Jan 23 '15 at 15:16
  • 1
    There are three kinds of people in the world. Those who know how to count, and those who don't. – Walter Mitty Jan 24 '15 at 12:10
  • There are two types of programmers. **1.** those that believe counting should start at one. and **1.** those that believe counting should start at zero. – spencer7593 Aug 13 '20 at 19:03

3 Answers3

5

The big benefit of following a naming convention like this: it helps makes SQL statements that are wrong "look wrong".

(This blog post from Joel Spolsky "Making Wrong Code Look Wrong" doesn't mention SQL, but I think the SQL naming convention that you asked about follows along the same principles that Joel espouses.)

http://www.joelonsoftware.com/articles/Wrong.html


The normative join is a foreign key to primary key.

When we follow the naming convention that uses id as the primary key, and use tablename_id (with _id on the end of foreign key column name), that leads to SQL that looks like this:

 FROM user
 JOIN post
   ON post.user_id = user.id

That's a very familiar pattern: surrogate primary key has a name of id, and the foreign key columns get a name of <referencedtable>_id, or sometimes <referencedtable>_<role>_id.

Looking at this SQL, we expect that user_id column in post table is a foreign key reference to the id column in the user table. When we religiously follow this style naming convention, then SQL that doesn't follow that pattern just looks "wrong".

As an example, compare:

 FROM somedoohickey
 JOIN gibberish
   ON troubador = minstrel

to:

 FROM somedoohickey s
 JOIN gibberish g 
   ON g.id = s.gibberish_id

Both may be equally valid SQL. But the second pattern conveys much more information to the reader. When we're used to the naming convention, the former just looks "wrong" to us, that is, we suspect that something might be amiss.

Compare that to:

 FROM somedoohickey s
 JOIN gibberish g 
   ON g.id = s.undecipherable_id

Something doesn't look right. Or even

 FROM somedoohickey s
 JOIN gibberish g 
   ON g.id = s.gibberish

Again, something just doesn't look right with that.

spencer7593
  • 106,611
  • 15
  • 112
  • 140
4

The biggest benefit of any naming convention is ease of learning. A maintainer who comes along later is going to have to learn what the data means, and some of that learning is going to be done subliminally, without even realizing it at the time. Having a foreign key tagged with an "Id" suffix helps the future investigator know at a glance which columns are foreign keys, and which are not.

Some shops use CamelCasing instead of underscore to make the suffix stand out from the stem.

Some shops include the Id suffix in the primary key column name. This has the advantage that a query on the metadata can locate all the primary keys that the foreign keys refer to. The same thing can be done by judicious use of named domains.

Some shops make it a point that all, or nearly all, foreign keys are declared as such in the data definitions, by including a reference constraint to prevent orphaned foreign keys. This has substantive consequences as well as aiding the investigator.

But the most important thing in making the system more learnable and therefore more accessible is consistency. If a convention is inconsistently adhered to, it can be more confusing that no convention at all.

Walter Mitty
  • 18,205
  • 2
  • 28
  • 58
3

This is very much opinion based, but I'd say it more accurately describes the contents of the column. 1902010 is not a user, it's a user ID. Calling it user would be as wrong as calling your title column title_id.

ceejayoz
  • 176,543
  • 40
  • 303
  • 368
  • "1902010 is not a user" - but isn't that clear in the context of a MySQL database? – Emanuil Rusev Jan 23 '15 at 14:57
  • 1
    @EmanuilRusev Not necessarily. I've seen some databases where a column like `user` contains contents along the lines of `{"id":1902010,"name":"ceejayoz"}`. – ceejayoz Jan 23 '15 at 14:58