Suppose H is some hash function (such as MD5 or SHA256 or whatever) and I have a collision for this hash: two different pieces of data x and y, that have the same hash.

In other words x≠y but H(x)=H(y).

Now if I append some arbitrary data z to both, will H(x+z) be the same as H(y+z)?

The idea is: x and y being a collision may imply that they happen to bring the H function into the same internal state (thus resulting in the same hash). From that point on, it doesn't matter what other data we append; their hashes will remain equal.

I tested the above for this MD5 collision and it seemed to work there, but I don't know whether it is true in general.

Brian Tompsett - 汤莱恩
Rogier
  • Short version: It depends on the hashing function. See this question: http://stackoverflow.com/questions/996495/hash-collision-and-appending-data?rq=1 – Oliver Matthews Oct 29 '13 at 15:15

3 Answers

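The property you describe is closely related to what is called a length extension attack. Whether or not a hash function is vulnerable depends on its construction. Hash functions based on the Merkle–Damgård construction, such as MD5 and SHA-1, are vulnerable: they process the input block by block, so if x and y have the same block-aligned length and leave the function in the same internal state, any common suffix z keeps the hashes equal. SHA-3 is not vulnerable, and HMAC constructions are also not vulnerable.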
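
To make the block-by-block argument concrete, here is a minimal sketch of a toy Merkle–Damgård hash. Everything in it is an assumption made for illustration: the 4-byte block, the 2-byte state, the all-zero IV, and a compression function built from truncated SHA-256. It is not MD5. The tiny state is only there so that a collision can be found by brute force.

```python
import hashlib

BLOCK = 4   # toy block size in bytes (assumption; MD5 uses 64)
STATE = 2   # toy state size in bytes, small enough to brute-force
IV = b"\x00" * STATE   # hypothetical initial state


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated SHA-256 of state || block."""
    return hashlib.sha256(state + block).digest()[:STATE]


def pad(length: int) -> bytes:
    """Merkle-Damgard style padding: 0x80, zeros, then a 2-byte length."""
    zeros = (-(length + 1 + 2)) % BLOCK
    return b"\x80" + b"\x00" * zeros + length.to_bytes(2, "big")


def toy_hash(msg: bytes) -> bytes:
    """Pad the message, then run every block through compress()."""
    data = msg + pad(len(msg))
    state = IV
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state


def find_state_collision():
    """Find two distinct one-block messages with the same internal state."""
    seen = {}
    # Pigeonhole: at most 2**16 distinct states, so this loop must succeed.
    for i in range(2 ** (8 * STATE) + 1):
        block = i.to_bytes(BLOCK, "big")
        state = compress(IV, block)
        if state in seen:
            return seen[state], block
        seen[state] = block
    raise RuntimeError("unreachable for a 2-byte state")


x, y = find_state_collision()
for z in (b"", b"a", b"some longer suffix"):
    # Same length, same state after the first block, same remaining blocks.
    print(z, toy_hash(x + z) == toy_hash(y + z))
```

Note that the search looks for a collision in the internal state after one full block, not merely in the final digest. With equal-length, block-aligned inputs the padding is identical, so the state collision carries through to the digest and to every extended message.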

Adam Rosenfield

Depends on the hash function. Since hash functions are not, in general, homomorphic (that is, they do not satisfy f(p + q) = f(p) + f(q)), it does not follow that f(x + z) and f(y + z) will map to the same item. Consider a counterexample:

Given the hash function

f(x) = (x * x) mod 6

then f(2) = 4 and f(4) = 4. Let z = 1. Then:

f(2 + z) = 3 and f(4 + z) = 1

thus f(2) = f(4) but f(2 + z) ≠ f(4 + z).

However, if the hash function were homomorphic, then by definition of homomorphism:

f(p + q) = f(p) + f(q)

and therefore:

f(x + z) = f(x) + f(z) 
f(y + z) = f(y) + f(z)

but since f(x) = f(y), as you initially stated:

f(x) + f(z) = f(y) + f(z)

and so their hashes would be the same.
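
The two cases can be checked by brute force. The sketch below uses two toy functions chosen purely for illustration (both are assumptions, not real hash functions): `square_mod6`, which is not homomorphic, and `mod6`, which satisfies f(p + q) = f(p) + f(q) when the addition on the right is taken modulo 6. Here `+` is integer addition, as in the argument above, not concatenation.

```python
from itertools import product


def square_mod6(x: int) -> int:
    """Toy non-homomorphic function: f(x) = x*x mod 6."""
    return (x * x) % 6


def mod6(x: int) -> int:
    """Toy homomorphic function: f(x) = x mod 6."""
    return x % 6


def find_counterexample(f, limit: int = 12):
    """Return (x, y, z) with f(x) == f(y) but f(x+z) != f(y+z), or None."""
    for x, y, z in product(range(limit), repeat=3):
        if x != y and f(x) == f(y) and f(x + z) != f(y + z):
            return x, y, z
    return None


print(find_counterexample(square_mod6))  # some (x, y, z) triple
print(find_counterexample(mod6))         # None: collisions always survive
```

For `mod6`, a collision means x and y differ by a multiple of 6, and adding the same z to both preserves that, so no counterexample exists at any search limit.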

sircodesalot

(Please be indulgent, it's my first answer :D) Not necessarily:

Consider the following data (as lists of numbers)

x = [8 0 4]
y = [8 1 0]
z = [5]

and the hashing function:

H([a b c]) = a + b*c
H([a b c d]) = H([b c d]) + H([a b c]) 

Then, a collision for x and y occurs:

H(x) = H([8 0 4]) = 8 + 0*4 = 8
H(y) = H([8 1 0]) = 8 + 1*0 = 8

But when appending z, the hashes aren't equal:

H(x + z) = H([8 0 4 5]) = H([0 4 5]) + H([8 0 4]) = 20 + 8 = 28
H(y + z) = H([8 1 0 5]) = H([1 0 5]) + H([8 1 0]) = 1 + 8 = 9
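
The arithmetic can be checked with a short sketch of this toy function. It implements only the two cases defined above (lists of three and four numbers); rejecting other lengths is an assumption of this sketch, since the definition does not cover them. It evaluates z both appended and prepended.

```python
def H(data):
    """Toy hash from the answer, defined only for 3 or 4 numbers."""
    if len(data) == 3:
        a, b, c = data
        return a + b * c
    if len(data) == 4:
        # H([a b c d]) = H([b c d]) + H([a b c])
        return H(data[1:]) + H(data[:3])
    raise ValueError("H is only defined for lists of length 3 or 4")


x = [8, 0, 4]
y = [8, 1, 0]
z = [5]

print(H(x), H(y))          # 8 8   (collision)
print(H(x + z), H(y + z))  # 28 9  (z appended)
print(H(z + x), H(z + y))  # 13 21 (z prepended)
```

Either way, the collision between x and y does not survive once z is added.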