0

For the purpose of getting the content-derived key of a longer text, I do calculate HASHBYTES('SHA1', text). It returns 20 bytes long varbinary. As I know the length of the result, I am storing it as binary(20).

To make it shorter (to be used as a key), I would like to follow the Git idea of a short hash -- as if the first (or last) characters of the hexadecimal representation. Instead of characters, I would like to get binary(5) value from the binary(20).

When trying with the SQL Server 2016 it seems that the following simple way:

DECLARE @hash binary(20) = HASHBYTES('SHA1', N'příšerně žluťoučký kůň úpěl ďábelské ódy')
DECLARE @short binary(5) = @hash

SELECT @hash, @short

Returns the leading bytes (higher order bytes):

(No column name)                            (No column name)
0xE02C3C55FBA0DF13ADA1B626B1E31746D57B4602  0xE02C3C55FB

However, the documentation (https://learn.microsoft.com/en-us/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=sql-server-ver15) warns that:

Conversions between any data type and the binary data types are not guaranteed to be the same between versions of SQL Server.

Well, this is not exactly a conversion. Still, does this uncertainty hold also for getting shorter version of binary from the longer version of binary? What should I expect for future versions of SQL Server?

Dale K
  • 25,246
  • 15
  • 42
  • 71
pepr
  • 20,112
  • 15
  • 76
  • 139
  • 1
    That comment is what we in the industry would call a CYA -- Cover Thine Behind. More complex, wibbly types like `FLOAT`, `HIERARCHYID`, `DATETIME2` and `GEOGRAPHY` might conceivably get different serialization schemes between versions (and note that this is separate from the on-disk format, which isn't always equivalent). Simple truncation of a binary value is not likely to suddenly change semantics. However, you don't need to rely on it: `SUBSTRING` is one of the few functions that supports processing `BINARY` values directly, so `SUBSTRING(@hash, 1, 5)` is an unambiguous way of doing it. – Jeroen Mostert Mar 05 '21 at 19:04
  • @JeroenMostert: I like that SUBSTRING solution as it also is more explicit when read in the souce code. Please, copy your comment as the answer, and I will accept it. – pepr Mar 05 '21 at 19:45

0 Answers0