2

I have an MD5 function which I have confirmed works well for both files and strings. But when I use it on variable-sized chunks of very large files, it generates identical MD5 values even though the chunks differ in size.

I wonder whether there is any real probability that two chunks with different lengths, but perhaps the same content, result in the same MD5 fingerprint.

John

2 Answers

6

For any two given different inputs, the odds of this happening are 1 in 2^128, since MD5 is a 128-bit hash. That is about 1 in 3.4 x 10^38, so it's very unlikely but not impossible.
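To check the arithmetic, here is a quick Python sketch. The variable name `space` is just for illustration; it is the number of distinct values a 128-bit hash can take.

```python
# Number of distinct 128-bit hash values (illustrative variable name).
space = 2 ** 128

# Exact value, then the same number in scientific notation.
print(space)            # 340282366920938463463374607431768211456
print(f"{space:.1e}")   # 3.4e+38
```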

It's more probable, I think, that you're doing something wrong and you are actually calculating the MD5 of the same text/file every time.
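Since your code isn't shown here, this is only a minimal sketch of how per-chunk hashing might look in Python, assuming the chunks are read sequentially from a stream. The function name `chunk_digests` and the `sizes` parameter are made up for the example. It returns the length next to each digest, so a chunk that was read twice or a buffer that was never refilled shows up right away. Note also that MD5 appends the message length as part of its padding, so chunks of different lengths are always different inputs to the hash.

```python
import hashlib
import io

def chunk_digests(stream, sizes):
    """Yield (length, md5 hex digest) for consecutive chunks of the given sizes.

    Hypothetical helper for illustration, not the asker's actual function.
    """
    for size in sizes:
        chunk = stream.read(size)
        if not chunk:
            break  # end of stream
        # Hash exactly the bytes just read, with a fresh hash object per chunk.
        # Reusing one object would accumulate data across chunks instead.
        yield len(chunk), hashlib.md5(chunk).hexdigest()

# Stand-in for a large file: 104 bytes of repeated alphabet.
data = io.BytesIO(b"abcdefghijklmnopqrstuvwxyz" * 4)
results = list(chunk_digests(data, [10, 20, 30]))

for length, digest in results:
    print(length, digest)
```

If the printed lengths differ but the digests are identical, the bytes being hashed are most likely not the bytes that were just read.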

Roy Dictus
  • Maybe you can help me with http://stackoverflow.com/questions/9633322/what-is-wrong-with-the-following-code – John Mar 09 '12 at 11:31
2

You have practically no chance of getting the same MD5 hash for two different inputs unless you deliberately try to construct a collision.

Check here for more information about collisions: http://www.mscs.dal.ca/~selinger/md5collision/

Arnaud Bessems