0

I want to create a method in C# that returns a string of at most 10-12 characters. I have tried SHA1 and MD5, but they produce 160 and 128 bits respectively, i.e. 40- and 32-character hex strings, which doesn't fulfill my requirements. Security is not an issue; I just need a short string that will remain unique.

Ask
  • https://en.wikipedia.org/wiki/List_of_hash_functions also contains information about the output length. – thehennyy May 29 '18 at 11:35
  • 2
    Obviously you can't have a 32 or 64 bit hash code that is *unique* for all possible strings of 10-12 characters. – Matthew Watson May 29 '18 at 11:41
  • 1
    If you are not storing the hash code, you could just use `string.GetHashCode()`... (This is not good if you are storing it because the implementation of `string.GetHashCode()` may return different results for different runs of the same program.) – Matthew Watson May 29 '18 at 11:43
  • Thanks for your response. One question: if I truncate a 32-character hexadecimal string to 10-12 characters, what are the chances of collisions? I tried 500 different inputs and the resulting string was unique each time. I have to generate almost 5000 strings. – Ask May 29 '18 at 11:48
  • erm, 2 ^ 32 perhaps – Jodrell May 29 '18 at 11:58
  • If you want to get more binary data in a shorter string use a more compressed encoding. Ascii85 has a 5:4 compression ratio https://stackoverflow.com/questions/31817721/a-more-compact-representation-than-base64-for-byte-arrays – Jodrell May 29 '18 at 12:10

1 Answer

2

You can truncate the string (the hash) to the length you want. You'll only make it weaker: as an extreme example, if you truncate it to one byte (256 possible values), you can expect a collision after only about 16 to 20 elements are hashed, thanks to the birthday problem. Every part of a good hash is as "good" as every other part, so take the first x characters/bytes and live happy. See for example a discussion about this in security. There is an explanation here about how secure a truncated hash will be.
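As a rough illustration of the truncation described above, here is a minimal sketch, assuming MD5 as the underlying hash (the question says security is not a concern) and a hex encoding at 4 bits per character. The names `ShortHash` and `Compute` are hypothetical, not from any library.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ShortHash
{
    // Hypothetical helper: hashes the input with MD5 and keeps only the
    // first `length` hex characters (each hex character carries 4 bits).
    public static string Compute(string input, int length = 12)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (length < 1 || length > 32) throw new ArgumentOutOfRangeException(nameof(length));

        using (var md5 = MD5.Create())
        {
            // Encode the text as UTF-8 so the result does not depend on platform defaults.
            byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(input));

            // BitConverter yields "5D-41-40-..."; strip the dashes to get 32 hex characters.
            string hex = BitConverter.ToString(digest).Replace("-", "");

            // Any part of a good hash is as good as any other, so keep the prefix.
            return hex.Substring(0, length);
        }
    }
}
```

Calling `ShortHash.Compute("some input", 10)` returns a 10-character uppercase hex string, i.e. 40 bits of the digest. Unlike `string.GetHashCode()`, the result is stable across runs, so it can be stored. It is not guaranteed unique; it only makes collisions unlikely for small sets.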

xanatos
  • Thanks for your response. One question: if I truncate a 32-character hexadecimal string to 10-12 characters, what are the chances of collisions? I tried 500 different inputs and the resulting string was unique each time. I have to generate almost 5000 strings. – Ask May 29 '18 at 11:48
  • 1
    @ask It depends on the number of bits of the hash :-) If it is 6 bits/char (a base64), 10 chars is 60 bits, so using the square root approximation, sqrt(2^60) = 2^30 = 1,073,741,824. With about 1 billion elements you already have roughly a 40% chance of a collision (50% at about 1.18 times that). – xanatos May 29 '18 at 11:53
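
To put numbers on the 5000-string case from the comments, here is a small sketch of the usual birthday approximation, P ≈ 1 - exp(-n(n-1) / (2 · 2^bits)). The class and method names are hypothetical, and the formula assumes the truncated hash behaves like a uniformly random value.

```csharp
using System;

public static class Birthday
{
    // Approximate probability of at least one collision among `items`
    // values drawn uniformly from a space of 2^bits possible hashes.
    public static double CollisionProbability(double items, int bits)
    {
        double space = Math.Pow(2, bits);
        double exponent = items * (items - 1) / (2.0 * space);
        return 1.0 - Math.Exp(-exponent);
    }

    public static void Main()
    {
        // 10 hex chars = 40 bits, 12 hex chars = 48 bits, 10 Base64 chars = 60 bits.
        foreach (int bits in new[] { 40, 48, 60 })
        {
            double p = CollisionProbability(5000, bits);
            Console.WriteLine($"{bits} bits, 5000 items: {p:E2}");
        }
    }
}
```

Under these assumptions, 5000 strings truncated to 10 hex characters (40 bits) give a collision probability on the order of 1e-5, and each extra hex character lowers it by a factor of 16.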