More precisely, say we have two images and compute the MD5 or SHA-256 of each. Is there an algorithm to calculate the percentage difference/similarity between the two checksums, so that we could say, for example, that image_1 is 26% similar to image_2?

I don't necessarily need MD5 or SHA; any other fast mechanism will do.

Later edit: ANY fast mechanism for determining the percentage of difference/similarity between two large data strings will do (I think Damerau-Levenshtein would prove too slow).

Pobosa A
  • 2
    You can compute the difference between the hashes, but it won't tell you anything about the difference between the original images. – Oliver Charlesworth Mar 06 '18 at 22:21
  • @OliverCharlesworth I don't need to know the exact differences, only how different the images are. So you're saying that the distance between two hash strings won't match, as a percentage, the distance between the source images? – Pobosa A Mar 06 '18 at 22:22
  • 1
    Good checksums are explicitly designed to prevent this kind of comparison. – teolandon Mar 06 '18 at 22:23
  • 2
    There won't be any relationship at all: on average, changing a single input bit will change 50% of the bits in the output hash. – Oliver Charlesworth Mar 06 '18 at 22:28

1 Answer

1

Can you calculate the difference between two hashes?

Sure.

Can you use that difference to infer anything about the original files?

No. That is the entire point of a cryptographic hash: even a minor change to the input should produce a significantly different hash. Otherwise, hashes would lose their usefulness for security-related purposes.
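
To illustrate, here is a minimal Python sketch that flips a single bit in some input and compares how far apart the inputs are with how far apart their SHA-256 digests are. The helper name `bit_difference` and the sample data are assumptions made up for this example; it only uses `hashlib` from the standard library.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> float:
    """Return the fraction of bits that differ between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (len(a) * 8)


# Hypothetical stand-in for image data: 1000 identical bytes.
original = b"a" * 1000

# Same data with exactly one bit flipped in the first byte.
modified = bytearray(original)
modified[0] ^= 0x01
modified = bytes(modified)

# Difference between the inputs: 1 bit out of 8000.
input_diff = bit_difference(original, modified)

# Difference between the digests: typically close to half of the 256 bits.
hash_diff = bit_difference(
    hashlib.sha256(original).digest(),
    hashlib.sha256(modified).digest(),
)

print(f"input difference: {input_diff:.4%}")
print(f"hash difference:  {hash_diff:.4%}")
```

The inputs differ by a tiny fraction of a percent, while the digests should differ in roughly half of their bits, so the hash distance says nothing useful about how similar the original data is.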

Herohtar