I have a requirement to hash input strings and produce 14 digit decimal numbers as output.
The math I am using tells me I can have, at maximum, a 46 bit unsigned integer.
I am aware that a 46 bit uint means less collision resistance for any potential hash function. However, the number of hashes I am creating keeps the collision probability in an acceptable range.
I would be most grateful if the community could help me verify that my method for truncating a hash to 46 bits is solid. I have a gut feeling that there are optimizations and/or easier ways to do this. My function is as follows (where bitLength is 46 when this function is called):
public static UInt64 GetTruncatedMd5Hash(string input, int bitLength)
{
var md5Hash = MD5.Create();
byte[] fullHashBytes = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(input));
var fullHashBits = new BitArray(fullHashBytes);
// BitArray stores LSB of each byte in lowest indexes, so reversing...
ReverseBitArray(fullHashBits);
// truncate by copying only number of bits specified by bitLength param
var truncatedHashBits = new BitArray(bitLength);
for (int i = 0; i < bitLength - 1; i++)
{
truncatedHashBits[i] = fullHashBits[i];
}
byte[] truncatedHashBytes = new byte[8];
truncatedHashBits.CopyTo(truncatedHashBytes, 0);
return BitConverter.ToUInt64(truncatedHashBytes, 0);
}
Thanks for taking a look at this question. I appreciate any feedback!