13

Using nodejs and crypto, right now, when a user logs in, I generate a random auth token:

var token = crypto.randomBytes(16).toString('hex');

I know it's unlikely, but there is a tiny chance for two tokens to be of the same value.

This means a user could, in theory, authenticate on another account.

Now, I see two obvious methods to get pass this:

  • When I generate the token, query the user's database and see if a Token with the same value already exists. If it does, just generate another one. As you can see, this is not perfect since I am adding queries to the database.
  • Since every user has a unique username in my database, I could
    generate a random token using the username as a secret generator key. This way, there is no way of two tokens having the same value. Can crypto do that? Is it secure?

How would you do it?

Reza Mousavi
  • 4,420
  • 5
  • 31
  • 48
Dany D
  • 1,189
  • 2
  • 18
  • 44

2 Answers2

21

It's too unlikely to worry about it happening by chance. I would not sacrifice performance to lock and check the database for it.

Consider this excerpt from Pro Git about the chance of random collisions between 20-byte SHA-1 sums:

Here’s an example to give you an idea of what it would take to get a SHA-1 collision [by chance]. If all 6.5 billion humans on Earth were programming, and every second, each one was producing code that was the equivalent of the entire Linux kernel history (1 million Git objects) and pushing it into one enormous Git repository, it would take 5 years until that repository contained enough objects to have a 50% probability of a single SHA-1 object collision. A higher probability exists [for average projects] that every member of your programming team will be attacked and killed by wolves in unrelated incidents on the same night.

(SHA-1 collisions can be directly constructed now, so the quote is now less applicable to SHA-1, but it's still valid when considering collisions of random values.)

If you are still worried about that probability, then you can easily use more random bytes instead of 16.

But regarding your second idea: if you hashed the random ID with the username, then that hash could collide, just like the random ID could. You haven't solved anything.

Macil
  • 3,575
  • 1
  • 20
  • 18
  • You just gave me another idea... when the user logs in, I will search for which user has that certain token.. if the query finds more then one result, just make the user log in again - this way I can handle even the worst scenario - what do you think? – Dany D Sep 26 '14 at 19:36
  • 3
    The chance that your server will randomly generate a collision in tokens between two users is much less than the chance of a motivated malicious user guessing another user's token. If you believe the first case to be a potential problem, then you should believe the second case to be more of a problem and address it by using more random bytes. If you believe that 16 bytes is likely enough to stop a motivated brute-force attacker from guessing a targeted user's token, then it should follow that 16 bytes is likely enough to stop your server from generating a collision. – Macil Sep 26 '14 at 20:03
  • awesome answer in general to crypto – hauron Mar 06 '15 at 11:16
  • why dont you just hash it to the time it was created , that way it would be UNIQUE – TheAnimatrix Jul 26 '16 at 17:26
  • Multiple things can happen at the same time, and frequently do. – Macil Jul 26 '16 at 18:08
  • 1
    @TheAnimatrix, hashes of the time are no more unique than any other method. Pointless advice at best. – MacroMan Mar 12 '18 at 13:31
  • it's been a really long time however i guess i was talking about something like a salt format – TheAnimatrix Apr 07 '18 at 13:34
  • maybe you could also prefix `crypto.randomBytes` with the hashed unique username, so instead of `specialhash(randomBytes, username)` it would just be `hash(username) + randomBytes` that's how they try to prevent MAC address collisions I think – erp May 18 '18 at 01:00
2

You should always add a UNIQUE constraint to your database column. This will create an implicit index to improve searches for this column and it will make sure that none of two records will ever has the same value. So, in the worst-case scenario you will get a database exception and not a security violation.

Also, depending on how frequently unique tokens are needed to be created, I think it's perfectly fine in most cases to use database lookups during generation. If your column, again, is properly indexed, it will be a pretty fast query. Most databases a very well horizontally scalable, so if your are building a next Facebook it is again an option. Furthermore, you will probably need to do a query to check for E-Mail uniqueness anyway.

Finally, if you are really concerned about performance you could always pre-generate a one-million of unique tokens and store them in the separate database table for quick use. Just setup a routine to periodically check it's usage and insert more records to it as needed. However, as @MacroMan stated in the comments, this could have a security implications if someone will get access to the list of pre-generated tokens, so this practice should be avoided.

Slava Fomin II
  • 26,865
  • 29
  • 124
  • 202
  • "worst-case scenario you will get a database exception". No, worst case, you'll leak the info that the record already exists and allow session hijacking. It's also pretty bad advice to pre-generate tokens. What if the database is hacked and you don't know? Please don't follow this advice. – MacroMan Mar 12 '18 at 13:29
  • 1
    @MacroMan thank you for the point regarding pre-generated tokens list exposure. It totally makes sense, so I agree this practice should be avoided if tokens are used in security-sensitive environment (e.g. authentication). However, could you please explain the first case you've mentioned about session hijacking? The generating code should definitely watch for such exceptions and regenerate the token in case of failure (that's how I do it). But, even if exception will reach the surface and the client will see the error, how could she use it to hijack the session? – Slava Fomin II Mar 25 '18 at 17:26
  • Thanks. You helped confirm my point-of-view that the cost of insert for unique index on the DB is negligible. There will be many more reads than writes in my case, and this ensures no user can accidentally authenticate as another, even preventing being able to store two duplicate keys by copy/paste or manual row insert. – Lukus Sep 01 '22 at 02:35