2

I have a PostgreSQL 9.3 database with a users table that stores usernames in their case-preserved format. All queries will be case insensitive, so I should have an index that supports that. Additionally, usernames must be unique, regardless of case.

This is what I have come up with:

forum=> \d users
                      Table "public.users"
   Column   |           Type           |       Modifiers
------------+--------------------------+------------------------
 name       | character varying(24)    | not null
Indexes:
    "users_lower_idx" UNIQUE, btree (lower(name::text))

Expressed in standard SQL syntax:

CREATE TABLE users (
    name varchar(24) NOT NULL
);
CREATE UNIQUE INDEX "users_lower_idx" ON users (lower(name));

With this schema, I've satisfied all my constraints, albeit without a primary key. The SQL standard doesn't support functional primary keys, so I cannot promote the index:

forum=> ALTER TABLE users ADD PRIMARY KEY USING INDEX users_lower_idx;
ERROR:  index "users_lower_idx" contains expressions
LINE 1: ALTER TABLE users ADD PRIMARY KEY USING INDEX users_lower_id...
                              ^
DETAIL:  Cannot create a primary key or unique constraint using such an index.

But, I already have the UNIQUE constraint, and the column is already marked "NOT NULL." If I had to have a primary key, I could construct the table like this:

CREATE TABLE users (
    name varchar(24) PRIMARY KEY
);
CREATE UNIQUE INDEX "users_lower_idx" ON users (lower(name));

But then I'll have two indexes, and that seems wasteful and unnecessary to me. So, does PRIMARY KEY mean anything special to postgres beyond "UNIQUE NOT NULL," and am I missing anything by not having one?

  • Will you ever have tables that have a foreign key back to this table? Does your table only have one column? I guess if your example is really this simple then no you can probably get away with it. I worked in an environment with no primary keys before and it was a nightmare. – SQLChao Jul 31 '14 at 21:22
  • There will be a variety of other columns, but I omitted them here for simplicity. I do intend to have foreign keys back to this table, but... hmm. I assumed I could reference a column with a unique constraint, but maybe not in this case. Let me give it a try. [edit] Hey, it does work! – Alex French Jul 31 '14 at 22:54

3 Answers3

4

First off, practically every table should have a primary key.

citext

The additional module provides a data type of the same name. "ci" for case insensitive. Per documentation:

The citext module provides a case-insensitive character string type, citext. Essentially, it internally calls lower when comparing values. Otherwise, it behaves almost exactly like text.

It is intended for exactly the purpose you describe:

The citext data type allows you to eliminate calls to lower in SQL queries, and allows a primary key to be case-insensitive.

Bold emphasis mine.
Be sure to read the manual about limitations first. Install it once per database with

CREATE EXTENSION citext;

text

If you don't want to go that route, I suggest you add a serial as surrogate primary key.

CREATE TABLE users (
    user_id serial PRIMARY KEY
  , username text  NOT NULL
);

I would use text instead of varchar(24). Use a CHECK constraint if you need to enforce a maximum length (that may change at a later time). Details:

Along with the UNIQUE index in your original design (without type cast):

CREATE UNIQUE INDEX users_username_lower_idx ON users (lower(username));

The underlying integer of a serial is small and fast and does not have to waste time with lower() or the collation of your database. That's particularly useful for foreign key references. I mostly prefer that over some natural primary key with varying properties.

Both solutions have pros and cons.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Ah, for some reason I had dismissed citext early on. I'm glad you brought it back to my attention, because it seems like it may be the right way to go. Can you elaborate on why practically every table should have a primary key? As I understand it, a foreign key must be a PRIMARY KEY or UNIQUE. In my case, A foreign key in another table would point to user.name, which is unique. – Alex French Jul 31 '14 at 22:35
  • @AlexFrench: It's a convention, not a rule. There are a couple of functions that default to using a primary key, like syntax shorthand for FK creation. Or some clients require primary keys. for instance, pgAdmin won't let you edit a table without PK. On a different note, I had the same problem with case insensitive unique user names recently. Tough call, but I went with the second solution. – Erwin Brandstetter Jul 31 '14 at 22:46
3

I would suggest using a primary key, as you have stated you want something that is unique, and as you have demonstrated that you can put unique constraints on a username. I will assume that since this is a unique,not null username that you will use this to track your users in other parts of the Database, as well as allow usernames to be changed. This is where a primary key will come in handy, instead of having to go into all of your tables and change the value of the Username column, you will only have one place to change it. Example

   Without primary key:
   Table users
   Username
   'Test'


   Table thingsdonebyUsers
   RandomColumn AnotherColumn  Username
   RandomValue   RandomValue    Test

Now assume your user wants to change his username to Test1, well now you have to go find everywhere you used Username and change that to the new value before you change it in your users table since I'm assuming you will have a constraint there.

  With Primary Key
  Table users
  PK           Username
  1              'Test'


  Table thingsdonebyUsers
  RandomColumn   AnotherColumn     PK_Users
  RandomValue    RandomValue         1

Now you can just change your users table and be done with the change. You can still enforce unique and not null on your username column as you demonstrated. This is just one of the many advantages of having normalized tables, which requires your tables to have a Primary Key that is an unrelated value(forget what the proper name is for this right now).

As for what a PK actually signifies, it just a non nullable unique column that identifies the row, so in this sense you already have a Primary Key on your table. The thing is that usually PKs are INT numbers because of the reason that I explained above.

ZeRaTuL_jF
  • 582
  • 2
  • 4
  • 20
  • Ah, interesting, but as far as changing usernames goes, it's not a requirement. – Alex French Jul 31 '14 at 21:34
  • @Alex French like JChao commented if there is not going to be FK back to this table, then your table design would be fine. – ZeRaTuL_jF Jul 31 '14 at 21:39
  • Worth mentioning that [foreign key constraints with `ON UPDATE CASCADE`](http://www.postgresql.org/docs/current/interactive/ddl-constraints.html#DDL-CONSTRAINTS-FK) take care of updating depending entries automatically. Still, more expensive to have multiple depending rows changed ... – Erwin Brandstetter Jul 31 '14 at 22:07
0

Short answer: No, you don't need a declarative "PRIMARY KEY", since the UNIQUE index serves the same exact purpose.

Long answer:

The idea of having Primary Keys comes from database systems where the data is physically in key order. This requires having a single, "primary" key. MySQL InnoDB is this way, as are many older databases.

However, PostgreSQL does not keep the tables in key order; it separates the indexes, including the primary key index, from the heap, which is essentially unordered. As a result, in Postgres, there is no material difference between primary keys and unique indexes. You can even create a foreign key against a unique index, as long as that index covers the whole table.

That being said, some tools external to PostgreSQL look for primary keys and do not regard unique indexes as being equivalent. These tools may cause you issues because of not finding a PK.

FuzzyChef
  • 3,879
  • 2
  • 17
  • 17