0

I'm storing usernames in my database but wish to retain the username casing so that users can have usernames such as "FooBAR".
I also wish for usernames to be unique (regardless of case) - "BaR" would cause "bar" to be unavailable.

I see three ways of doing this:

Firstly, I could query my Users table like so:

bool usernameAvailable = !_context.Users.Any(u => String.Equals(u.Username, username, StringComparison.OrdinalIgnoreCase));

Secondly, I could convert both usernames to upper or lower like so:

username = username.ToUpper();
bool usernameAvailable = !_context.Users.Any(u => u.Username.ToUpper() == username);

Thirdly, I could create a second column in my table that stores the upper/lowercase version of the username:

username = username.ToUpper();
bool usernameAvailable = !_context.Users.Any(u => u.UsernameUpper == username);

This would require more space in the database but would be the easiest to query.

Any insight would be appreciated.

This has almost certainly been answered previously, so please point me in the direction of any previous posts as i'm unable to find what i'm looking for.

brad
  • 1,407
  • 19
  • 33
  • 1
    The one that should be responsible to enforcing such rules should be the DB and DBMS. Doing this on the Client Side will only cause you issues. So wich DB are you using? – Christopher Apr 14 '18 at 17:08
  • A simple trick might be to store the Username both in the cased and lower-case versions. Use the Lower case as the one that must be unique. Use the higher case one as what is being displayed. Many modern Login systesm ahve a "Display Name" that can be freely choosen. – Christopher Apr 14 '18 at 17:09
  • I'm using EntityFrameworkCore. – brad Apr 14 '18 at 17:09
  • Yes, that was the method that I outlined in the second code snippet! – brad Apr 14 '18 at 17:09
  • 2
    EntityFrameworkCore is not DB, DB is sql server, postgresql and so on. Note that first query is especially bad, since EF cannot understand your `StringComparison.OrdinalIgnoreCase` comparision, so it pulls whole users table into memory and performs comparision there. – Evk Apr 14 '18 at 17:27
  • Sorry, my bad, i'm actually using sqlite for development. Interesting, so would you suggest the second approach? – brad Apr 14 '18 at 17:41

2 Answers2

2

Neither is a good option. This problem is a perfect match for a simple UNIQUE INDEX on the database side, together with a case-insensitive COLLATION on the username column. Thus, the client just needs to do a simple equality check and the DB will use the index to find if a matching name exists or not, including case-insensitive checks if you want.

Consider that you need such a check when you add a new user (to check for duplicates) and also when a user tries to login (so you check if the account exist). In both cases an index ensures the uniqueness needed for usernames and speed up the queries.

For the particular case of SQLite, an unique index that is case insensitive can be created like this:

CREATE TABLE users (
    UserName TEXT
    /*Other columns go here*/
    UNIQUE (UserName COLLATE NOCASE)
)

That'll prevent adding two usernames that only differ in case, in addition to speeding up retrieval. On the client side, just don't bother about casing. When users login, just do a normal comparison, and the DB will match them even if they just differ in casing. For the registration, do the INSERT right away, the DB will blow up and reject it if there is a casing difference, from where you can present the user a "username already taken" message.

Also look at this post (from where I took the basic idea): https://stackoverflow.com/a/20914992/2557263

Alejandro
  • 7,290
  • 4
  • 34
  • 59
  • Thank you. I'm using SQLite at the moment (to make development a breeze) with EntitiyFrameworkCore. – brad Apr 14 '18 at 17:57
  • @bradley-varol I've edited with an example for SQLite. – Alejandro Apr 14 '18 at 19:59
  • Thank you for updating your answer. I'll mark you as the best answer, but without writing SQL in strings (I'm using EntityFramework and LINQ), a Collation isn't possible. – brad Apr 14 '18 at 21:51
  • 1
    @bradley-varol Just issue the DDL for creating the DB, that allows for creating all the tables and fully utilizes all the capabilities SQLite provides. Don't let the ORM dictate your schema to its own needs. – Alejandro Apr 14 '18 at 22:15
  • I am using Code First. Would you recommend using Code Behind / Database First so that I can create my schema manually? – brad Apr 17 '18 at 12:46
  • 1
    @bradley-varol If they allow full DB control, yes. Anything that locks you into ORM-generated structure is going to give you problem down the line, specially because it precludes usage of any DB-specific construct that goes beyond the very basics ORMs know how to handle. – Alejandro Apr 17 '18 at 17:16
  • I'll look into moving to a database-behind approach to .net core. Thanks. – brad Apr 18 '18 at 08:40
1

It depends on what your priority is, should your app be fast or should it use less database space, i think the time difference is negligable here, you should use the first method.

Here's an article on wikipedia about this dilemma https://en.wikipedia.org/wiki/Space%E2%80%93time_tradeoff

David
  • 1,636
  • 19
  • 27
  • I'm surprised to hear that the difference is negligible. I'm inclined to test this myself. Interesting article, thank you. – brad Apr 14 '18 at 17:11
  • As always, you can do a simple test yourself: write a program that times how long it takes to write for example 10.000 records to the db with each method and compare the results. Also, how often do users log in and you have to add a new record... I'd say not THAT often, so those couple of miliseconds per 1k logins dont matter here if you ask me. – David Apr 14 '18 at 17:16
  • Doing a comparison together with a conversion will immediately cause any index on the username column to be ignored and a full table scan to be performed. This is the kind of well-known bad practices that seems to be good on a 20 users system and slows down to a crawl when you put 100k users. Not only this check has to be performed on user registration, but also on user login, as the username must be case insensitive. – Alejandro Apr 14 '18 at 17:27
  • I just tested this with a List of users containing a random string (10 chars, A-Za-z0-9) username and the same username as uppercase. The difference between Equals and ToUpper is an average of 6ms less using ToUpper when searching 1 million user objects. – brad Apr 14 '18 at 17:39