7

I need to produce a hash value from a variable-length string that I can store in a field no longer than 16 characters (due to vendor requirements).

I am concatenating together several strings that are being passed through a C# script transformation in order to calculate the Hash. I am constrained by the vendor's file specification in that the output of the hash cannot be any longer than 16 characters.

Does anyone have any suggestions? As an example, an MD5 hash (128 bits) has a hex-encoded length of 32 characters.

Ian Boyd
Matt

5 Answers

19

Cryptographic hash functions are designed so that you may truncate the output to some size, and the truncated hash function remains a secure cryptographic hash function. For example, if you take the first 128 bits (16 bytes) of the output of SHA-512 applied to some input, those 128 bits form a cryptographic hash as strong as any other 128-bit cryptographic hash.

The solution is to choose some cryptographic hash function - SHA-256, SHA-384, and SHA-512 are good choices - and truncate the output to 128 bits (16 bytes).

--EDIT--

Based on the comment that the hash value must, when encoded to ASCII, fit within 16 ASCII characters, the solution is

  • first, to choose some cryptographic hash function (the SHA-2 family includes SHA-256, SHA-384, and SHA-512)
  • then, to truncate the output of the chosen hash function to 96 bits (12 bytes) - that is, keep the first 12 bytes of the hash function output and discard the remaining bytes
  • then, to base-64-encode the truncated output to 16 ASCII characters (128 bits)
  • yielding effectively a 96-bit-strong cryptographic hash.
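A minimal sketch of the steps above, assuming SHA-256 as the chosen hash function and UTF-8 as the string encoding (both are assumptions, and `ShortHash` and `Compute16` are hypothetical names, not part of any library). Because 12 bytes is a multiple of 3, Base64 encodes them to exactly 16 characters with no `=` padding, and the Base64 alphabet contains no tab character, so the result should be safe in a tab-delimited file.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ShortHash
{
    // Hypothetical helper: returns a 16-character Base64 string built from
    // the first 12 bytes (96 bits) of the SHA-256 digest of the input.
    public static string Compute16(string input)
    {
        using (var sha = SHA256.Create())
        {
            // Assumption: the concatenated fields are hashed as UTF-8 bytes.
            byte[] full = sha.ComputeHash(Encoding.UTF8.GetBytes(input));

            // Encode only bytes 0..11; 12 bytes map to exactly 16 Base64 characters.
            return Convert.ToBase64String(full, 0, 12);
        }
    }
}
```

Calling `ShortHash.Compute16(field1 + field2)` on the concatenated input would then give a value that fits the 16-character field. Whatever encoding and field separator you pick must stay the same on every run, or the hashes will not be comparable.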
Community
yfeldblum
  • 16 bytes converted to HEX is still 32 characters. Could you also provide a link to back your assertion that a truncated hash is as secure as a shorter hash of the same length? This may be true for SHA (I don’t know either way) but I don't think you can make a statement that it carries true across all hashes. – Matthew Whited Dec 02 '10 at 22:27
  • @Matthew Whited: I think @Justice's claim is generally considered true. It is hard to imagine an attack that works on the truncated hash but not on the full hash, other than brute force. – President James K. Polk Dec 03 '10 at 01:30
  • Are you saying that you cannot simply store the hash bytes, as a sequence of bytes? That you are required to encode bytes into, e.g., hex encoding or base64 encoding? If you can store the raw bytes, you should store the raw bytes, and take up all 16 bytes' worth of space. – yfeldblum Dec 03 '10 at 02:49
  • I added a link to another SO question whose title is "Is it okay to truncate a SHA256 hash to 128 bits?" – yfeldblum Dec 03 '10 at 02:50
  • The requirements for the file that I must produce are a tab-delimited ASCII flat file, and since I'm not overly familiar with the ramifications of outputting the sequence of bytes to a text file, I was extremely interested in these types of suggestions. – Matt Dec 03 '10 at 14:27
  • Edited. The solution is a combination of choosing a strong cryptographic hash, truncating the output to 12 bytes, and then base64-encoding the result to 16 bytes of ASCII. – yfeldblum Dec 04 '10 at 15:30
  • I am using the SHA256Managed to create the hash but I was curious about what the implications are of substringing the base64-encoded string to 16 vs. truncating to 12 bytes prior to encoding. Is this mainly a performance issue or is there a security risk as well? – Matt Dec 20 '10 at 15:40
  • Generally speaking, it is instructive always to think about hash values as bit-sequences or byte-sequences, with the various text encoding schemes (hex, base64) being used only as the final step. In this case, either order will work out to the same thing (truncating to 12 bytes and then base64-encoding, or base64-encoding and then truncating to 16 bytes). Note that encoding-then-truncating will require encoding the entire hash value, whereas truncating-then-encoding will not. – yfeldblum Dec 20 '10 at 22:23
  • Thanks Justice, I appreciate the explanation. – Matt Dec 21 '10 at 15:54
  • 1
    Truncated hashes are significantly weaker than the originals. You can more easily break the truncated hash, without having an attack on the full hash: http://csrc.nist.gov/groups/ST/hash/documents/Kelsey_Truncation.pdf – naasking Nov 04 '11 at 16:58
3

You can easily use an MD5 hash for this, but you will have to alter the way it is stored. An MD5 hash is 128 bits, which is typically displayed as 32 4-bit (hexadecimal) values. A standard char is 8 bits, however, so 16 characters is exactly enough to store the value of an MD5 hash.

To convert it, try the following:

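// Hex-encoded MD5 digest: 32 hex digits = 16 bytes.
String hash32 = "d41d8cd98f00b204e9800998ecf8427e";
String hash16 = "";

for(int i = 0; i < 32; i+=2)
{
  // Convert.ToUInt32 has no (char, int) overload, so parse each digit as a string.
  uint high = Convert.ToUInt32(hash32[i].ToString(), 16);
  uint low = Convert.ToUInt32(hash32[i+1].ToString(), 16);
  // Pack the two 4-bit values into one 8-bit character code.
  char c = (char) ((high << 4) | low);

  hash16 += c;
}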
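One caveat for a tab-delimited ASCII file: the packed values are arbitrary bytes, not necessarily printable characters. The sketch below, which uses the hypothetical helper names `HexToBytes` and `CountPrintableAscii`, parses the same hex string into raw bytes and counts how many fall in the printable ASCII range 0x20 to 0x7E.

```csharp
using System;
using System.Linq;

public static class HexPacking
{
    // Hypothetical helper: parses a hex string into raw bytes, two hex digits per byte.
    public static byte[] HexToBytes(string hex)
    {
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(hex.Substring(2 * i, 2), 16);
        }
        return bytes;
    }

    // Counts the bytes that are printable ASCII (0x20 through 0x7E).
    public static int CountPrintableAscii(byte[] bytes)
    {
        return bytes.Count(b => b >= 0x20 && b <= 0x7E);
    }
}
```

For the example digest above, only 2 of the 16 bytes are printable ASCII, and one byte is 0x09, the tab character itself, which would collide with the field delimiter.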
Wade Tandy
  • Using OR, XOR or any other binary function will increase the chance of collisions. If you are just using this for a check sum this may work, but you would probably be safer with XOR. Otherwise you might as well just use a parity check. – Matthew Whited Dec 02 '10 at 22:30
  • 3
    If you notice, I am shifting a number that will always be 4 bits or fewer to the left by 4 bits, so the low and high bits will never collide. – Wade Tandy Dec 02 '10 at 22:34
  • **Do not use MD5 hashes anymore.** They haven't been considered secure for years and years. Just if you're arriving from Google or something. – Qix - MONICA WAS MISTREATED Aug 16 '22 at 16:43
  • In the case of security-minded people ending up on this thread, this is important information to convey, regardless of OP's intent. Hence why I said "secure". So no, sorry, but my comment is 100% warranted. – Qix - MONICA WAS MISTREATED Sep 03 '22 at 23:39
1

Any comments about this code? It seems to work well...

var p = new MD5CryptoServiceProvider();
var dic = new Dictionary<long, string>();

for (var i = 0; i < 10000000; i++)
{
    if (i%25000 == 0)
        Console.WriteLine("{0:n0}", i);

    var h = p.ComputeHash(Encoding.UTF8.GetBytes(Guid.NewGuid().ToString()));
    var b = BitConverter.ToInt64(h, 0);

    // "b" is hashed Int64

    // Store the hash itself as the key; adding "i" would never detect a collision.
    if (!dic.ContainsKey(b))
        dic.Add(b, null);
    else
        throw new Exception("Oops!");
}
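As a rough check on what this test can show, the standard birthday approximation estimates the chance of at least one collision among n uniformly random b-bit values as 1 - exp(-n(n-1) / 2^(b+1)). The sketch below assumes the 64-bit values behave like uniform random numbers; `Birthday` and `CollisionProbability` are hypothetical names.

```csharp
using System;

public static class Birthday
{
    // Approximate probability of at least one collision among n uniformly
    // random values of the given bit width: 1 - exp(-n(n-1) / 2^(bits+1)).
    public static double CollisionProbability(double n, int bits)
    {
        double exponent = -n * (n - 1) / Math.Pow(2, bits + 1);
        return 1.0 - Math.Exp(exponent);
    }
}
```

For 10,000,000 values and 64 bits this comes to roughly 2.7e-6, so a run that finishes without throwing is the expected outcome and says little about behavior at much larger volumes.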
Efe Erdoğru
1

I've noticed this question is relatively old, but I'm sure someone will find this answer to it valuable.

My suggestion would be to use BLAKE2b, which supports output sizes from 8 bits through 512 bits. If no hash size is specified, the default is used (512 bits in this case). The BLAKE2s default is 256 bits.

        // BLAKE2b
        // using System.Data.HashFunction;
        //
        // String message to use.
        string str = "The quick brown fox jumps over the lazy dog";
        // Initialize
        System.Data.HashFunction.Blake2B Blake2B = new System.Data.HashFunction.Blake2B();
        // Get string hash bytes; create 64 bit hash.
        var HashBytes = Blake2B.ComputeHash(str, 64);
        // Convert bytes to string and remove the dashes.
        string hexString = BitConverter.ToString(HashBytes).Replace("-", string.Empty);
        // Display results.
        MessageBox.Show(hexString);
        /*
         * "The quick brown fox jumps over the lazy dog" produces a hash value of
         * "A8ADD4BDDDFD93E4877D2746E62817B116364A1FA7BC148D95090BC7333B3673F82401CF7AA2E4CB1ECD90296E3F14CB5413F8ED77BE73045B13914CDCD6A918"
         * and "2FD0F3FB3BD58455" hash for 64 bits.
         */

Hope this helps!

DemarcPoint
0

If you have 16 bytes available, storing a 128-bit number is not an issue. Store the 128-bit value as 16 raw bytes instead of as a 32-character string that hex-encodes those 16 bytes.

As a note, I have used GUID/UUID fields in databases to store MD5 hashes. While no longer cryptographically secure, 128-bit MD5 hashes are fine for checksums (and are much better than 64 bits).

var result = MD5.Create().ComputeHash(new byte[] { 0 });

Console.WriteLine(result.Length);
Console.WriteLine(Convert.ToBase64String(result));
Console.WriteLine(result.Aggregate(new StringBuilder(),
                                    (sb, v) => sb.Append(v.ToString("x2"))));

//16
//k7iFrf4NoInN9jSQT9WfcQ==
//93b885adfe0da089cdf634904fd59f71

File.WriteAllBytes("tempfile.dat", result);

var input = File.ReadAllBytes("tempfile.dat");

Console.WriteLine(input.Length);
Console.WriteLine(Convert.ToBase64String(input));
Console.WriteLine(input.Aggregate(new StringBuilder(), 
                                    (sb, v) => sb.Append(v.ToString("x2"))));

//16
//k7iFrf4NoInN9jSQT9WfcQ==
//93b885adfe0da089cdf634904fd59f71

Note that I don't show the file content because there is a good chance that it will contain "unprintable" characters.

Matthew Whited