
I am new to cryptography.

In my application we are planning on securing various data using AES GCM to encrypt the data.

In addition, suppose I need to save some information in a database. This information must not be readable when someone looks at the database directly. It is not critical information like, say, a password.

Given the same input, I need to be able to look up the table row where this data is stored in its unreadable form, in an indexed column. It is not a problem if someone can run a query, match this unreadable value against another identical value, and thereby identify the original input.

If that is not an issue, is it OK to use SHA-256 hashing for this? If not, what's your suggestion for a better alternative?

I have tried searching for this, and the advice seems to keep changing over the years. To me it looks like it should be OK to do this. But in general I do see many posts discouraging the use of SHA. That said, I also saw a post saying that even Bitcoin uses SHA-256.

I have noted the very helpful inputs; thank you. So far, I have understood this much:

  1. Depending on the sensitivity of the data, plain SHA-256 can be used when cracking the data itself is not much of a concern (but then why hash it at all? That said, there may be cases where it is useful).
  2. Add a salt when hashing with SHA-256 to prevent lookups based on hashing the same plaintext input (for example, a password). This is still vulnerable to brute-force attacks because SHA-256 is fast, so don't use this approach either for password-like sensitive data.
  3. Use something like Argon2, bcrypt, or scrypt, which have built-in salting (at least bcrypt does, storing the salt in the hash itself). These can be configured to run slowly and use more CPU and memory, making them safer.
  4. Possibly the safest option, which I need to explore further, is what Topaco suggested: a blind index.
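
For reference, a blind index as in item 4 is usually a keyed hash (HMAC) of the value. This is a minimal Python sketch under stated assumptions: `INDEX_KEY`, `blind_index`, and the lowercase/strip normalization are hypothetical choices for illustration, not something taken from the comments.

```python
import hashlib
import hmac

# Hypothetical key for illustration only. In practice it would be a random
# 32-byte secret loaded from a secrets store, never hard-coded.
INDEX_KEY = bytes.fromhex("00" * 32)


def blind_index(value: str, key: bytes = INDEX_KEY) -> str:
    """Return a deterministic keyed digest to store in an indexed column."""
    # Assumption: lookups should ignore case and surrounding whitespace.
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same input and key always give the same 64-character hex string, so an equality lookup on the indexed column still works, but someone who sees only the database cannot recompute the digests without the key.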
Peter Mortensen
Raster R
  • Conventional hashing algorithms like SHA256 render data unreadable to everyone including you. If it's unreadable what's the point of storing it? In other words, I'm a little unclear on what you're storing in the database. – President James K. Polk May 22 '23 at 14:55
  • You say "discouraging use of SHA." I think you've misread your sources. Almost certainly what they're discouraging is **SHA-1**, and recommending one of the SHA-2 series (particularly SHA-256 or SHA-512) in its place. [SHA-1 has known (and demonstrated) attacks.](https://shattered.io/) SHA-2 does not. – Rob Napier May 22 '23 at 15:00
  • @PresidentJamesK.Polk This requirement is being given to me. Grappling with it. My understanding so far is- think of it as for an example- hashing a username and storing it and later looking up some information of the username. The secrecy of the username is a nice to have but not so critical that it must be secured like we would if it were a password. Does that make any sense or am I going totally in the wrong direction? – Raster R May 22 '23 at 15:36
  • @RobNapier Should I understand that you are saying yes go ahead use the hash as data in the indexed column for looking up the row? – Raster R May 22 '23 at 15:38
  • *...Its not an issue that anyone can run a query and match this unreadable format with another same value and identify its original input...* When security does not play a major role, why are you even concerned about whether the digest is insecure or not? That seems inconsistent. Maybe I misunderstand something. Note that a *blind index* usually uses an HMAC (with e.g. SHA256), where the key introduces additional security; see [here](https://stackoverflow.com/q/4961603/9014097). – Topaco May 22 '23 at 16:29
  • @Topaco Thanks for your advice. I am new to all this. Yes I will have to study HMAC with SHA256. – Raster R May 23 '23 at 10:53

1 Answer


If your intent is to have a unique key that can be consistently derived from an arbitrary input, but does not leak information about that input, then SHA-256 is an excellent choice. This is one of the key features of a secure hash, and unless you have specialized needs, your default choice of a secure hash should be SHA-256.
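
As a rough illustration of using the digest as a lookup key, here is a minimal Python sketch. The `accounts` table, its columns, and the helpers `lookup_key` and `find_note` are made up for the example, and an in-memory SQLite database stands in for whatever database you actually use.

```python
import hashlib
import sqlite3


def lookup_key(value: str) -> str:
    """Derive a fixed-length, deterministic key from the input."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical schema: the digest is the (indexed) primary key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name_digest TEXT PRIMARY KEY, note TEXT)")
conn.execute(
    "INSERT INTO accounts VALUES (?, ?)", (lookup_key("alice"), "first row")
)


def find_note(value: str):
    """Hash the input again and look the row up by equality on the digest."""
    row = conn.execute(
        "SELECT note FROM accounts WHERE name_digest = ?", (lookup_key(value),)
    ).fetchone()
    return row[0] if row else None
```

Calling `find_note("alice")` finds the row, while the table itself only ever contains the hex digest.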

Depending on what you're hashing, you may want to add a salt to your hash. You may also need to stretch your hash if the input is over a small domain (input space). It's not completely clear from your question what you're planning on hashing.
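
If you do salt and stretch, one way to keep the result usable as a lookup key is sketched below in Python. It assumes a single application-wide salt (a random per-row salt would give a different digest for the same input each time, so an equality lookup would no longer work). `APP_SALT`, `ITERATIONS`, and `stretched_key` are illustrative names and values, not recommendations.

```python
import hashlib

# Hypothetical application-wide salt; assumed to be fixed so that the same
# input always produces the same digest.
APP_SALT = b"example-app-wide-salt"

# Illustrative work factor; tune it to your own latency budget.
ITERATIONS = 200_000


def stretched_key(value: str) -> str:
    """Derive a deliberately slow, deterministic lookup key with PBKDF2."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", value.encode("utf-8"), APP_SALT, ITERATIONS
    )
    return digest.hex()
```

Each guess now costs an attacker `ITERATIONS` rounds of HMAC-SHA-256 instead of a single hash, which matters most when the possible inputs are few enough to enumerate.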

Rob Napier