42

When I use IdentityUser model in Asp.Net Identity with EntityFramework, it creates some standard fields in the database. All the fields are self explanatory except for the below two fields.

  • NormalizedUsername - Which contains the uppercase value of the Username
  • NormalizedEmail - Which contains the uppercase value of the Email

My doubts are:

  1. Why do we need these Normalized fields? Where does it get used?
  2. What is the purpose of persisting it in the database?
Faraj Farook
  • 14,385
  • 16
  • 71
  • 97
  • 1
    Partially relevant to the [NormalizedUserName VS Username in DotNet Core](http://stackoverflow.com/questions/39651299/normalizedusername-vs-username-in-dotnet-core) – Divyang Desai Dec 31 '16 at 16:14
  • The reason for normalizing in uppercase rather than lowercase is due to culture specific issues. There's a good explanation of that [here](https://gingter.org/2018/07/10/how-to-correctly-normalize-strings-and-how-to-compare-them-in-net/). – Griff Apr 12 '23 at 07:07

2 Answers2

50

By my understanding, both fields are there for performance reasons. It's sort of explained in the following thread Normalization on UserName and Email causes slow performance and are used to validate the case insensitive uniqueness of the UserName and Email fields. They are persisted in the database in order be able to create index on them, thus making the lookups by the normalized user name and email sargable.

There is no other reason or usage of these fields.

Community
  • 1
  • 1
Ivan Stoev
  • 195,425
  • 15
  • 312
  • 343
  • 1
    whats the advantage of having indexes on normalized names rather than, indexing names without normalizing them ? – brainoverflow98 Dec 16 '18 at 21:04
  • 4
    @brainoverflow98 to be able to use the index for case insensitive searches – Ivan Stoev Dec 17 '18 at 07:04
  • 1
    It bugs me that it's using capitalized version in normalization. Performance-wise, we'd achieve the same result if we'd go lowercased too. It doesn't go any faster to index just because the computer SCREAMS AT US... – Konrad Viltersten Aug 02 '21 at 13:42
  • 3
    @KonradViltersten A wild guess could be for uppercase selection is that ordinary users generally tend to write their email in all lowercase. So, it may increase developer awareness about normalized fields when the developer look at data on the table and see normalized fields are all in uppercase. If lowercase was chosen for normalization, especially first-time developers could look into the table and say "Normalized fields are same with original fields, so what is the point?". When they saw all uppercase, they can at least say "something is fishy here". As I say, this is a only a guess by me. – MÇT Jan 21 '22 at 08:49
  • @MÇT Interesting point. I've got a comment from an (older) developer who had the idea that originally, waaay back, everything was upper-case for the simple reason that lower-case didn't exist in the set of available characters being processed. We're talking the days when T-Rex would be typing the code if the arms' length allowed. Apparently, according to him, they didn't have a full ASCII table to play with and mainly could process capitals and digits. Hence, the shouting syntax of SQL with it's `SELECT X FROM Y WHERE`... – Konrad Viltersten Jan 21 '22 at 16:21
  • @MÇT I'm not saying you are wrong. I'm just saying that when someone roams with T-Rex and lives to tell the story, you should take them seriously. I've seen that in *Jurassic Park*, you know... :) – Konrad Viltersten Jan 21 '22 at 16:22
  • @KonradViltersten "... your _software engineers_ were so preoccupied with whether or not they could that they didn't stop to think if they should." :) – MÇT Jan 21 '22 at 21:52
  • My guess is they're safeguarding against the collation of the username field. Some databases are case sensitive by default and some are not. If you use EF Core e.g., I believe you will get a case insensitive column in MS SQL and a case sensitive one in MySQL. They're avoiding this issue simply by adding an additional column and uppercasing everything. And you can get the original username with the correct casing the user used when signing up by accessing the non-normalized column. – HelloWorld Jun 11 '22 at 13:15
2

Because in String.Compare("a", "A"), "a" is greater than "A" based on the binary values. The above compare statment translates to String.Compare(01100001, 01000001). As you can see one value is greater than the other.

Normalization brings the property to UpperCase (I believe) where SQL can index the UserName binary values sequentially without any surprise out of sequence lower case letters and their respective binary values.

This improves performance when searching or querying the database.

Anchorman
  • 82
  • 3