Is it recommended to use a text field as userID in all tables of a database?

Question

I took a look at the database scheme of ownCloud and there is a table for all users.
The userID (their PK?) is the username, they do not have a numeric id or userID column.

Is this recommended somehow or why do they do it like this?
Because when I want to change the username of a user, I have to change all affected rows in all tables of the database and not just a single row in the users table.

In my opinion this is wrong and not good for the performance. https://github.com/owncloud/core/issues/9136 https://github.com/owncloud/core/blob/master/db_structure.xml#L1026 their uid is a text with a length of 64 but should be an integer field for example auto_increment

Is this professional/good practice or not?

Referencing also to this: Is it bad to use user name as primary key in database design?

Are they using this for multi-tenancy? (Hiding rows from other users). Are there views that lookup current user and compare to username value in rows? — Neil McGuigan, Jun 23 '14 at 17:16

score 1 · Answer 1 · answered Jun 22 '14 at 17:54

1

First...numbers are the best for storage memory Second...names maybe repeated!(duplicated)! So..Numbers are the best

answered Jun 22 '14 at 17:54

Obadah Hammoud

572
2
13

score 1 · Answer 2 · answered Jun 23 '14 at 10:31

It is perfectly okay to use the username as primary key. A username is not supposed to change anytime; it is the ID a user works with in the system. It is what identifies a user in the system, so this is the natural key and should be used as the primary key. With one exception: If you decide to have a database with technical keys (IDs) then you would create a user ID instead and have the username as a mere data column with a unique constraint.

A database with technical keys instead of natural keys is not better or worse than the other. It's just a different concept. Yes, an integer ID may be faster to index and join than a string. That's a plus for technical IDs. On the other hand you need fewer joins in a database with natural keys. For asking "give me all messages user MyName has posted", you will simply select from the messages table in a natural key based database, whereas in a technical key based db you would have to access the user table first to get the user ID you need in order to interpret the messages records.

It is also possible to mix both concepts in a database as it is obviously done in the database given. The jobs table has an ID whereas the users table uses the username. This is not at all uncommon. A database with natural keys has many advantages, but often you would like to work with codes. For instance: A country name is unique, so it seems a perfect primary key for a countries table. However, a country can be renamed; Myanmar and Burma are the same country, only renamed. So one would use IDs for the countries to make renaming easier. Same with jobs. Well, for the country you may find ISO codes you can use. For jobs you probably wouldn't, so you create a code. Thus both, the natural key database and the technical key database will use an ID for the jobs table. The only (small) difference here: In a database with natural keys you may show the ID to the user; it's an official unique code you can use with the db. In a database with technical IDs you would usually not show the IDs to the users. They are only technical gadgets with no meaning other than being a way to store relationships.

But you get very big tables very quickly when you use usernames in the columns instead of shorter numbers and that makes lookup much slower as the whole table has to be scanned. My concerns about big instances use this database design. One join is not the problem but the unneeded overhead and the text fields used as pk (think about many hundred users, much more paths - also contain the username, ... integer values and one single join should be much faster. — , Jun 23 '14 at 11:11
As to speed: I know that text is longer than just an integer, but it can be indexed just as well. To find a record in one billion records it takes 30 comparisions (binary search). Having integers instead of strings you may save some nanoseconds. But with the ID concept you would have to access a second table; that takes longer than the few nanoseconds just saved. As to memory: These are just some bytes. With terabyte harddisks nowadays and capable dbms, who cares? — Thorsten Kettner, Jun 23 '14 at 12:52

Rahul · Accepted Answer · 2014-06-22T19:25:19.590

Neither it's professional nor a good practice cause userID (their PK?) is the username

Apart from the indexing and storage criterion; First of all it's confusing, if it holds name of user then the column name must reflect the same.

second, making it PK (if at all); you are not allowing duplicate names where there could be users present with same name.

It's always recommended to use a column as PK which could uniquely determine other fields. username shouldn't be designate as PK since it may have duplicate(s).

EDIT:

Having string as primary key not recommended cause it takes more space (depends on the size) and comparison of indexes on it will be slower as it's a computationally intensive task than comparing integers; but by all means, if scenario requires the string column to be PK then that should be it.

From the GitHub link you have provided, index part of the table structure (one you are talking about) is as below

<index>
<name>pref_userid_appid_key_index</name>
<primary>true</primary>
<unique>true</unique>
<field>
<name>userid</name>
<sorting>ascending</sorting>
</field>
<field>
<name>appid</name>
<sorting>ascending</sorting>
</field>
<field>
<name>configkey</name>
<sorting>ascending</sorting>
</field>
</index>

Notice the index declaration part; from which Table structure could be depicted as

create table preferences (
userid text(64) not null,
appid text(32) not null,
configkey text(64) not null,
configvalue clob null,
constraint pref_userid_appid_key_index primary key(userid,appid,configkey)

As can be seen they are declaring both userid,appid,configkey as PK making it a composite primary key which is pretty valid and will make sure unique identification of all records.

hope this helps clearing your understanding.

Right. This is not quite the main problem, the main problem is the text field and that the username (their userid) is used in all tables so they should add a numeric and unique userid or autoincrement value (if you want to change the username, you have to change it in all affected rows in all affected tables). But you are right, you can see it in their database scheme: https://github.com/owncloud/core/blob/master/db_structure.xml#L1026 So would you use ownCloud (take a look at their trivial issues)? — , Jun 22 '14 at 18:28
@DanielRuf, sorry but I hardly could make out something from the link you mentioned in comment. — Rahul, Jun 22 '14 at 18:30
the xml is parsed to SQL, the uid field (line 1026) is a text field with a length of 64, and is used as primary key, this is where the username is stored (their interpretation of the userid - should be normally an integer value. and the value of this userid is used everywhere in their database tables — , Jun 22 '14 at 18:35
seems like the wrong index part (what you wrote is the part for the dbprefix*preferences table), I meant the *dbprefix*users table but you are right, this should be integer not text ;-) — , Jun 22 '14 at 19:48
@DanielRuf, OH! Ok. I thought you are pointing to `preferences` table. Just looked again and Yes correct, that should be a `int` column. Not sure why they chossed to be a `text` instead. that's indeed bad. Moreover, that have a separate `displayname` column. if they intend to store numeric data like `1 2 3 ...` in `uid` which is `text` column then that's real hell. — Rahul, Jun 22 '14 at 19:53

Is it recommended to use a text field as userID in all tables of a database?

3 Answers3