2

I want to has on both the c# level and the sql server 2012 level.

How do I know if the SHA1 implementation will produce the same results?

In sql server I am doing:

HASHBYTES('SHA1', @data)

I haven't written my c# function yet.

I'm just worried if they don't produce the same hash value, I will get some inconsistencies in my application.

Note: The columns I will be using to create a hash are datetime, nvarchar and int. I have to convert these to strings correct?

cool breeze
  • 4,461
  • 5
  • 38
  • 67
  • The only difference you might find is one in upper-case vs. one in lower-case when converted to strings. `SHA` is a defined standard, and as a result **any** language/construct claiming to provide a `SHA1` hash **must** return the same byte-valued ouput. As I stated, the only difference *might* be the `string` representation. (You may see `FEFF` instead of `feff` for the same two bytes, `254` and `255` respectively.) You may also find this a good read: http://stackoverflow.com/questions/14146636/how-do-i-calculate-the-equivalent-to-sql-server-hashbytessha1-columnname – Der Kommissar May 19 '15 at 19:55
  • The response is yes. Here there is a nearly functional example code: http://stackoverflow.com/q/30155754/613130 Be aware of the encoding (collation) of the `char`/`varchar` columns! And note that NUMBER columns and DATETIME, SMALLDATETIME will surely need some special handling. And in the end, anything that isn't a varchar/nvarchar/int/smallint/bigint will require special handling. – xanatos May 19 '15 at 19:58
  • might be useful: http://weblogs.sqlteam.com/mladenp/archive/2009/04/28/Comparing-SQL-Server-HASHBYTES-function-and-.Net-hashing.aspx – Cadburry May 19 '15 at 20:12
  • @Cadburry That link is written by someone that doesn't seem to comprehend the effect of collation on hashing: he wrote *just to show that collation doesn't change the hash*, but `SELECT '€' COLLATE SQL_Latin1_General_Cp1250_CS_AS, HASHBYTES('sha1', '€' COLLATE SQL_Latin1_General_Cp1250_CS_AS), '€' COLLATE SQL_Latin1_General_Cp1251_CS_AS, HASHBYTES('sha1', '€' COLLATE SQL_Latin1_General_Cp1251_CS_AS)` (sqlfiddle http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1/1021), and that in more than one page of blog doesn't spend one word for "standard" collations (like SQL_Latin1_General_CP1_CI_AS) – xanatos May 19 '15 at 20:54

1 Answers1

0

You won't have any problems using this code:

    public static byte[] GetHash(string inputString)
    {
        HashAlgorithm algorithm = SHA1.Create();
        return algorithm.ComputeHash(Encoding.Unicode.GetBytes(inputString));
    }

It creates just the same hash as you can create on SQL Server:

HASHBYTES('SHA1', CAST(@data AS NVARCHAR(100)))
Alexandre Severino
  • 1,563
  • 1
  • 16
  • 38
  • If there is something sure, is that SQL Server doesn't use UTF8 as its encoding. – xanatos May 19 '15 at 20:08
  • Weird. I never ever had any problems with that and I've been using this code for years now... – Alexandre Severino May 19 '15 at 20:10
  • "In SQL Anywhere, the NVARCHAR data type contains characters encoded using UTF-8 (so normal ASCII characters are one byte in size)." http://sqlanywhere-forum.sap.com/questions/19446/sha1-hash-doesnt-seem-to-match-other-implementations – Alexandre Severino May 19 '15 at 20:12
  • 1
    SQL Anywhere isn't SQL Server: http://en.wikipedia.org/wiki/SQL_Anywhere : *SAP SQL Anywhere is a proprietary relational database management system (RDBMS) product from SAP. SQL Anywhere was known as Sybase SQL Anywhere prior to the acquisition of Sybase by SAP.* – xanatos May 19 '15 at 20:16
  • try hashing something like *perché* (it is an italian word, it means *why*) – xanatos May 19 '15 at 20:16
  • Oh. I see. Thanks for the tip. Does SQL Server use UTF16, then? – Alexandre Severino May 19 '15 at 20:20
  • NCHAR/NVARCHAR are UTF16. CHAR/VARCHAR have a specific encoding (called collation, because it is an encoding + sorting rules). UTF8 isn't supported as string data. XML data (columns) can be UTF8 – xanatos May 19 '15 at 20:22
  • @xanatos, now I'm converting the string directly to a byte array, which is 16 bits by default. I also force the `HASHBYTES` to act with `NVARCHAR`. Is anything else required to avoid problems with this function? – Alexandre Severino May 19 '15 at 20:31
  • 1
    No from what I know.... But if you want an array of bytes, `Encoding.Unicode.GetBytes(inputString)` is enough. – xanatos May 19 '15 at 20:32
  • @xanatos So is this answer correct/updated with your suggestion? My hash will have datetime, nvarchar and int columns. – cool breeze May 19 '15 at 21:36