How to test an MD5 implementation?

Question

I am considering using a JS MD5 implementation.

But I noticed that it comes with only a few tests. Is there a good way of verifying that the implementation is correct?

I know I can try it with a few different values and see if it works, but that only means it is correct for some inputs. I would like to see if it is correct for all inputs.

There are [these MD5 test cases](https://www.cs.bris.ac.uk/Research/CryptographySecurity/MPC/md5-test.txt), which should provide ample coverage for a working implementation. However, you should not use MD5 at all, since it's insecure. — Sani Huttunen, Oct 21 '15 at 23:57
There's no way to test any application and ensure it works for all inputs. You just have to try enough different inputs to convince yourself. If you test your code with a few MB of random data, it's very unlikely that an erroneous implementation would succeed. — Barmar, Oct 21 '15 at 23:59
@BSeven Is the requirement to generate unique strings containing 32 alphanumeric characters, without duplicates? — guest271314, Oct 22 '15 at 00:31
You are looking for [formal verification](https://en.wikipedia.org/wiki/Formal_verification), which is far more complex than testing. And I don't know of any framework that works with JS. — Bergi, Oct 22 '15 at 00:35
Generate a bunch of pairs of raw and hashed values from a known-good implementation, and just verify that you generate the same thing. Or just port over the tests from that implementation and see how you fare. — Joseph, Oct 22 '15 at 00:46

score 3 · Accepted Answer · edited Oct 07 '21 at 05:57

The corresponding RFC (RFC 1321) has a good description of the algorithm, an example implementation in C, and a handful of test values at the end. All three together let you make a good guess about the quality of the implementation you are examining, and that's all you can get: a good guess.

Testing an application with an infinite, or at least very large, input set as a black box is hard, and in most cases impossible. So you have to check whether the code implements the algorithm correctly. The algorithm is described in RFC 1321 (the RFC mentioned above), and that description is sufficient for an implementation. The algorithm itself is well known (in the scientific sense, i.e. many papers have been written about it and many flaws have been found) and simple enough that you can skip the formal part and just inspect the implementation.

Problems to expect with MD5 in JavaScript: input containing one or more zero bytes (you can check all one- and two-byte inputs exhaustively), endianness (should be no problem, but it is easy to check), and the signedness of the 32-bit integers JavaScript uses for bit manipulation (">>" vs. ">>>", also easy to check for). I would also test with a handful of inputs that have all bits set. The algorithm needs padding, too; you can check it with inputs of every length up to and a little beyond the 64-byte block size.
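These trouble spots can be probed by comparing the candidate against a reference on deliberately awkward inputs. Below is a rough sketch for Node.js under a few assumptions: Node's built-in `crypto` MD5 is trusted as the reference, the candidate is wrapped as a function that takes a `Buffer` and returns a lowercase hex string, and `referenceMd5`, `buildEdgeCases`, and `findMismatches` are hypothetical names chosen for this example.

```javascript
const crypto = require("crypto");

// Reference implementation: Node's built-in MD5 (assumed correct for this sketch).
function referenceMd5(bytes) {
  return crypto.createHash("md5").update(bytes).digest("hex");
}

// Builds inputs aimed at zero bytes, all-bits-set data, and padding boundaries.
function buildEdgeCases() {
  const cases = [];
  // Every one-byte and every two-byte input (includes all zero-byte combinations).
  for (let a = 0; a < 256; a++) {
    cases.push(Buffer.from([a]));
    for (let b = 0; b < 256; b++) {
      cases.push(Buffer.from([a, b]));
    }
  }
  // Lengths 0..130 cross the 55/56-byte padding boundary and the
  // 64-byte and 128-byte block boundaries.
  for (let len = 0; len <= 130; len++) {
    cases.push(Buffer.alloc(len, 0x00)); // only zero bytes
    cases.push(Buffer.alloc(len, 0xff)); // all bits set
    cases.push(Buffer.alloc(len, 0x61)); // plain ASCII "a"
  }
  return cases;
}

// candidate: (Buffer) => lowercase hex digest. Returns the inputs it gets wrong.
function findMismatches(candidate) {
  return buildEdgeCases().filter(
    (input) => candidate(input) !== referenceMd5(input)
  );
}

// Usage: the reference trivially agrees with itself; plug in the real candidate here.
console.log(findMismatches(referenceMd5).length); // 0
```

An implementation that uses ">>" where the algorithm needs an unsigned shift usually fails on the all-bits-set inputs, and string-based implementations that mishandle zero bytes fail on the short inputs, so a non-empty result points at the area to inspect.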

Oh, and for all of you dismissing the MD5 hash: it still has its uses as a fast non-cryptographic hash with a low collision rate and good mixing (some call the effect of the mixing "avalanche": a one-bit change in the input changes many bits in the output). I still use it for larger, non-cryptographic Bloom filters. Yes, one should use a special hash fitting the expected input, but constructing such a hash function is a pain in the part of the body Nature gave us to sit on.
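The avalanche effect is easy to observe. The sketch below, again for Node.js and using the built-in `crypto` MD5 as an assumed stand-in, flips each input bit in turn and averages how many of the 128 digest bits change; `md5Bytes`, `hammingDistance`, and `averageAvalanche` are names invented for this example. With good mixing the average should land near half of the digest bits.

```javascript
const crypto = require("crypto");

// Raw 16-byte MD5 digest of a Buffer.
function md5Bytes(bytes) {
  return crypto.createHash("md5").update(bytes).digest();
}

// Number of bit positions in which two equal-length buffers differ.
function hammingDistance(a, b) {
  let bits = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x !== 0) {
      bits += x & 1;
      x >>>= 1;
    }
  }
  return bits;
}

// Flips every bit of a non-empty input once and returns the mean number
// of digest bits that change per flip.
function averageAvalanche(input) {
  const base = md5Bytes(input);
  const totalBits = input.length * 8;
  let changed = 0;
  for (let bit = 0; bit < totalBits; bit++) {
    const flipped = Buffer.from(input); // copy, so the original stays intact
    flipped[bit >> 3] ^= 1 << (bit & 7);
    changed += hammingDistance(base, md5Bytes(flipped));
  }
  return changed / totalBits;
}

console.log(averageAvalanche(Buffer.from("The quick brown fox")));
```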
