I want to create a unique hash for a given string, and I was wondering whether MD5 and SHA-1 differ in how likely they are to produce duplicate hashes (collisions).

Let's assume, for the sake of argument, the following code:

foo = "gdfgkldng"
bar = "fdsfdsf"
md5(foo)
>>>> "25f709d867523ff6958784d399f138d9"
md5(bar)
>>>> "25f709d867523ff6958784d399f138d9"

Is there a difference in the probability of this occurring between SHA-1 and MD5? Also, if I use strings that overlap heavily ("blabla1", "blabla2"), is there a difference?

By the way, I am not interested in the security of the algorithms; I just want to create a hash that is as unique as possible.

RickyA
  • If this is not security related, you can consider using the original string instead. If the string is shorter than its hash value, there is no advantage in calculating a hash; the string will be more unique in every case. – martinstoeckli Feb 06 '13 at 20:08
  • That is true, but the string is not shorter, and I pass it in a GET request, so I don't want it to be "readable". This also has the nice side effect that the hash is already URL-escaped. – RickyA Feb 06 '13 at 21:25

1 Answer

MD5 has a digest size of 128 bits. SHA-1 has a digest size of 160 bits. Even ignoring discovered weaknesses, MD5 is more likely to produce a collision for the same number of inputs, simply because its output space is smaller (2^128 possible digests versus 2^160).
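
To put rough numbers on that, here is a minimal sketch using the standard birthday approximation, p ≈ n(n-1)/2 / 2^b for n inputs and a b-bit digest. It assumes the hash behaves like a uniformly random function and ignores the known attacks on MD5 and SHA-1; `collision_probability` and the figure of one billion strings are made up here for illustration.

```python
import hashlib


def collision_probability(n_items: int, digest_bits: int) -> float:
    """Birthday approximation: chance that any two of n_items inputs share a digest.

    Assumes an idealized, uniformly random hash (an assumption, not a property
    guaranteed by MD5 or SHA-1). Only valid while the result is much less than 1.
    """
    pairs = n_items * (n_items - 1) / 2
    return pairs / 2 ** digest_bits


n = 10**9  # hypothetical number of distinct strings being hashed
for name in ("md5", "sha1", "sha256"):
    bits = hashlib.new(name).digest_size * 8  # digest_size is in bytes
    print(f"{name}: {bits} bits, p ~ {collision_probability(n, bits):.3e}")
```

Under these assumptions the accidental-collision probability is vanishingly small for all three algorithms, even for a billion strings; the difference between them only matters in relative terms or when someone is deliberately constructing collisions.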

Consider using SHA-256 instead; it has a digest size of 256 bits (obviously), and furthermore hasn't been broken in a meaningful way.
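
As a rough sketch of what that could look like in Python, assuming the standard `hashlib` module and UTF-8 input (`url_token` is a hypothetical helper name, not something from the question):

```python
import hashlib


def url_token(text: str) -> str:
    """Return the SHA-256 digest of text as 64 lowercase hex characters.

    Hex output uses only 0-9 and a-f, so it needs no URL escaping.
    Encoding as UTF-8 is an assumption; use whatever encoding your data has.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The near-identical strings from the question.
a = url_token("blabla1")
b = url_token("blabla2")
print(a)
print(b)
print(a != b)
```

Inputs that differ by a single character still produce unrelated-looking digests, so heavily overlapping strings such as "blabla1" and "blabla2" are not expected to collide more often than any other pair.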

Cairnarvon