
Consider the following scenario:

  • Users enter unique codes (for example, gift card codes) on a website.
  • The code corresponds to an object in the database which must be retrieved.
  • The code is a secret and cannot be stored as plain text.
  • Instead, the code will be hashed and the hash stored in the database. The hash algorithm will be SHA-512 or bcrypt combined with some salting strategy.

In order to look up the code, a hash of the user-entered code must be computed. In password authentication, the identity of the user is already known, so the salt can be retrieved from the database before computing the hash. In the scenario above, though, it's not possible to load the salt associated with the code, since we don't know which object in the database the code corresponds to. This seems to imply that no salting strategy with random per-record salts can work for this scenario.

I would like input on the following ideas:

Can we use a hash (say SHA-2) of the user-entered code as the salt?

salt = sha2(code)
hashedCode = hash(code + salt)

If the above has vulnerabilities, can including an additional global secret in the hash help mitigate the risk?

salt = sha2(code + globalSecret)
hashedCode = hash(code + salt)

Thanks!

TimJ

2 Answers


Hashing is absolutely the wrong approach here. I'll get to that in a sec, but first let me address the database lookup issue.

Looking up secrets in a database

This is the reason most gift cards have a code on the front and a scratch-off secret on the back.

The user inputs the code on the front, and your app pulls up the database record. Then the user inputs the scratch-off code from the back, and you compare it with the one stored in that record.

Another way to do this is to split the code so that the first part is the database record ID and the second part is the secret to compare. For example:

1234 5511 2121 1234 --- 12345511 is the record ID, and 21211234 is the secret code.
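
A minimal Python sketch of the split-code idea, assuming a 16-digit code whose first 8 digits are the record ID and whose last 8 digits are the secret. The `RECORDS` dictionary and the function names are hypothetical stand-ins for a real database table and lookup layer, and the secret is kept in plain form only to keep the sketch short.

```python
import hmac

# Hypothetical stand-in for the database table: record ID -> stored secret part.
RECORDS = {"12345511": "21211234"}

def split_code(code: str) -> tuple[str, str]:
    """Split a 16-digit code into (record ID, secret), ignoring spaces."""
    digits = code.replace(" ", "")
    if len(digits) != 16:
        raise ValueError("expected a 16-character code")
    return digits[:8], digits[8:]

def redeem(code: str) -> bool:
    """Look up the record by ID, then compare the secret part."""
    record_id, secret = split_code(code)
    stored = RECORDS.get(record_id)
    if stored is None:
        return False
    # compare_digest avoids leaking the position of the first mismatch via timing.
    return hmac.compare_digest(stored.encode(), secret.encode())
```

With this layout, `redeem("1234 5511 2121 1234")` finds record `12345511` first and only then checks the secret, so no scan over all records is needed.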


Hashing is the wrong approach

All that is kinda irrelevant to the question, which is about hashing. We tell people to hash passwords because we don't want someone to get hold of the database, reverse the passwords, and then use them on other websites.

In your scenario, you could end up in a state where someone steals your database and is then able to guess all the codes. Here is the problem with your solution: if I know the codes are between 1 and 100000000, I can enumerate 1 through 100000000, compute the hash for each of them, and recover all of the codes. With a fast hash like SHA-512 I can do that in minutes; bcrypt only makes it slower. Unless you're sending out codes with 128 bits of entropy (which I'd never type in), the solution is totally flawed.

Of course, this is why salts exist: with a per-record salt, I have to go to each gift code hash and run through all 100000000 values again. But this is still a couple of hours of work, not the years you'd like. A salt doesn't fix the problem at all.
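
To make the enumeration argument concrete, here is a rough Python sketch of the attack against unsalted hashes. It assumes SHA-512 as the hash and uses a toy keyspace of 100,000 numeric codes so it runs quickly; `fast_hash` and `recover_codes` are hypothetical names chosen for illustration.

```python
import hashlib

def fast_hash(code: str) -> str:
    """Unsalted SHA-512 of a code, hex-encoded (the scheme being attacked)."""
    return hashlib.sha512(code.encode()).hexdigest()

def recover_codes(stolen_hashes: set[str], keyspace_size: int) -> dict[str, str]:
    """Hash every candidate code once and match against the stolen hashes."""
    recovered = {}
    for n in range(1, keyspace_size + 1):
        candidate = str(n)
        digest = fast_hash(candidate)
        if digest in stolen_hashes:
            recovered[digest] = candidate
    return recovered

# Toy example: two "stolen" hashes, keyspace reduced to 100,000 for the demo.
stolen = {fast_hash("4242"), fast_hash("99999")}
found = recover_codes(stolen, 100_000)
```

One pass over the keyspace recovers every code at once because nothing in the hash input is unknown to the attacker. A per-record salt forces one pass per record, which multiplies the work but does not change its nature.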


How to do this thing then?

If your risk is someone stealing the database, and then using all the gift codes to steal money, one of two things happens. You either cancel all the codes (then a bunch of people holding gift cards get pissed), or you protect the codes.

Instead of storing the code in plain text, or the code hashed (which is basically plaintext because I can enumerate all the possible values in a few minutes), you want to use HMAC. HMAC is a keyed hash: it combines a hash function with a secret key.

giftcode_in_db = HMAC(<SECRETKEY>, giftcode_from_user)

Now, to protect all of your gift codes, you simply need to protect the secret key. Only hold it in memory, and either have an operator enter it by hand or use HashiCorp Vault to do the actual HMAC operation.

If someone steals your database, they also need to steal the secret key. If someone steals the secret key, they also need to steal the database.
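
A minimal sketch of the HMAC lookup described above, using Python's standard `hmac` module. HMAC-SHA-256 is an assumption here (the answer does not name a hash), and `SECRET_KEY`, `db`, `lookup_key`, and `find_gift_card` are hypothetical names. In a real deployment the key would come from an operator or a secrets manager, not from source code.

```python
import hashlib
import hmac

def lookup_key(secret_key: bytes, giftcode_from_user: str) -> str:
    """Compute HMAC(secret_key, code); the hex digest is what gets stored."""
    return hmac.new(secret_key, giftcode_from_user.encode(), hashlib.sha256).hexdigest()

# Demo key only: hard-coded purely so the sketch is self-contained.
SECRET_KEY = b"demo-key-not-for-production"

# Hypothetical stand-in for the database: HMAC of the code -> gift card record.
db = {lookup_key(SECRET_KEY, "ABCD1234EFGH5678"): {"balance": 25}}

def find_gift_card(code: str):
    """Recompute the HMAC of the entered code and use it as the lookup key."""
    return db.get(lookup_key(SECRET_KEY, code))
```

Because the HMAC is deterministic for a given key, the stored value can be indexed and looked up directly, which sidesteps the "which salt do I load?" problem. Without the key, the stored values cannot be reproduced by enumerating candidate codes.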

Jonathan
  • Let's consider the scenario of someone stealing the database. With HMAC, what prevents someone from computing HMACs over an enumeration of possible gift codes? How does this differ from using that SECRETKEY as a pepper in a salting strategy? I'm aware of some of the differences, such as `hmac("code1", "secret") != hmac("code", "1secret")` but `sha("code1" + "secret") == sha("code" + "1secret")`. Are there other reasons why these differ? – TimJ May 31 '18 at 00:11
  • I should also mention the codes are likely to be 16 alphanumeric characters long. – TimJ May 31 '18 at 00:15
  • Very similar. A salt, as a concept, is a value stored in plaintext; a secret key must stay secret. Practical attacks on a database, like SQL injection or loss of a backup, would not expose the secret key along with the data. HMAC also has some extra steps to keep the secret key secret. – Jonathan May 31 '18 at 00:16

If your 16-character alphanumeric codes (0-9 a-z A-Z) are generated truly randomly, they are strong enough to be hashed and stored without salting, even with a fast hash algorithm like SHA-256.
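
As a quick back-of-the-envelope check of that claim, the following Python snippet computes the size of the code space, assuming each of the 16 characters is drawn uniformly and independently from the 62-character alphabet. The constant names are only illustrative.

```python
import math

ALPHABET_SIZE = 10 + 26 + 26  # digits, lowercase letters, uppercase letters
CODE_LENGTH = 16

# Number of distinct codes and the equivalent entropy in bits.
keyspace = ALPHABET_SIZE ** CODE_LENGTH
bits = CODE_LENGTH * math.log2(ALPHABET_SIZE)

print(f"{keyspace:.3e} possible codes, about {bits:.1f} bits of entropy")
```

That works out to roughly 95 bits, far beyond what can be enumerated, which is why the speed of the hash matters little for codes of this strength.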

You can improve security by using a server-side key; whether you encrypt the hashes or use an HMAC is not that important. What matters is that an attacker needs additional privileges on the server to get the key before he can start cracking (SQL injection is not enough).

If the codes were weaker (shorter), you could also use key stretching to increase the time needed for brute-forcing. Even a few milliseconds per calculation thwart brute-force attacks, as long as the codes are not too short. Key stretching can be done independently of salting. Better, of course, is to use strong enough codes.
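
A hedged sketch of key stretching without a per-record salt, using PBKDF2 from Python's standard library as one possible choice. PBKDF2 requires a salt argument, so a fixed application-wide value is assumed here to keep the result deterministic for lookups. `APP_SALT`, `ITERATIONS`, and `stretched_lookup_key` are hypothetical, and the iteration count is a placeholder to be tuned for the target hardware.

```python
import hashlib

# Fixed, application-wide value (an assumption): the same input must always
# produce the same output, otherwise the result cannot be used for lookups.
APP_SALT = b"fixed-application-wide-value"

# Placeholder work factor; higher values make each guess more expensive.
ITERATIONS = 200_000

def stretched_lookup_key(code: str) -> str:
    """Derive a deliberately slow, deterministic lookup key from a code."""
    derived = hashlib.pbkdf2_hmac("sha256", code.encode(), APP_SALT, ITERATIONS)
    return derived.hex()
```

The cost per guess grows linearly with `ITERATIONS`, so an attacker enumerating candidate codes pays that cost for every candidate, while a legitimate lookup pays it once.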

martinstoeckli
  • Thank you for your explanation. Suppose there is a scenario in which shorter codes are required (perhaps a user uploading short existing codes, which might even be purely numeric). We'll use a `hashingSecret` which won't be stored in the database. Can you give any insight into using `HmacSHA512(code, hashingSecret)` vs. `SHA512(code+hashingSecret)`? I'm a little confused about whether HMAC is necessary, since we don't need to validate the authenticity of the hash. We're purely using it internally for lookup purposes. – TimJ Jun 01 '18 at 18:20
  • @TimJ - You should be aware that the server-side key adds only a small advantage: as soon as the attacker gets privileges on the server, nothing prevents him from successfully brute-forcing the codes except the time needed to calculate a single hash. So for short codes key stretching becomes more important (e.g. PBKDF2). Instead of using the secret in the HMAC or as a pepper, I would recommend encrypting the hashes (e.g. with AES); this allows changing the key should that become necessary. – martinstoeckli Jun 01 '18 at 19:02
  • @TimJ - Short numeric codes cannot be protected properly, they are too easy to brute force. – martinstoeckli Jun 01 '18 at 19:08