0

I'm using Microsoft's FCIV utility to compute SHA1 hash values of xml files. The xml files are periodically re-released in batches, but many of the files don't change from one release to the next. By comparing a newly released file's SHA1 hash value against a stored hash value for a previously released version of the same file I can determine whether the file has changed and thus avoid reprocessing unchanged files. The processing results are stored in a SQL Server 2017 table. I will add a column to that table to store the SHA1 hash values. The question is "What data type should I use for that column?"

The FCIV utility outputs SHA1 hash values in base64-encoded format, like this:

<FILE_ENTRY>
    <name>c:\test hash\test_file1.xml</name>
    <SHA1>a+Q7DhN+0s4+ugp5rFwjzhAT4Sw=</SHA1>
</FILE_ENTRY>
<FILE_ENTRY>
    <name>c:\test hash\test_file2.xml</name>
    <SHA1>R931jYktQJ0orCjSAvlh0ng3eis=</SHA1>
</FILE_ENTRY>
<FILE_ENTRY>
    <name>c:\test hash\test-file3.xml</name>
    <SHA1>V7rg/B8wUxwXX+D3M49oHAteVGw=</SHA1>
</FILE_ENTRY>

So, the output SHA1 hash values are 28-character case-sensitive strings.

It seems to me that it would be appropriate in this case to store the SHA1 hash values output by FCIV in a char(28) column having a case-sensitive collation setting. Is that the best solution, given the above background?

[I'm aware that Microsoft no longer supports FCIV and that SHA1 is insecure. However I'm not concerned about security--all I need are reasonably collision-free hash values.]

BRW
  • 187
  • 1
  • 10
  • You can do that. Binary will be faster internally however as you don’t care about collation rules at all and just want simple binary comparison – Conor Cunningham MSFT Sep 30 '19 at 00:00
  • Thanks Conor. Since my OP, I've found [this post](https://stackoverflow.com/questions/42143245/store-sha-1-in-database-in-less-space-than-the-40-hex-digits) which suggests that blob is the way to go, but I prefer your suggestion, so I'll give binary(20) a try. – BRW Sep 30 '19 at 00:25

0 Answers0