0

When I am working with none-English characters the generated hash is not the same but otherwise, everything is fine. does anyone have a solution to get the same result?

MYSQL

  SELECT  MD5( 'سلام')
------------- result: 78903c575b0dda53c4a7644a2dd36d0e

SQL-Server

SELECT CONVERT(VARCHAR(50), HASHBYTES('MD5',  'سلام'), 2) 
-------------- result:3C50F6458899B3C0988BE358290F5F24


SELECT CONVERT(nVARCHAR(50), HASHBYTES('MD5',  N'سلام'), 2) 
-------------- result:0381CA5081FBC68B2F55F2F2C21399D7

Hooman Nemati
  • 97
  • 1
  • 7
  • 1
    I think part of your problem is figuring out which character encoding MySQL is using so you can use the same encoding on SQL Server. On MySQL if you `SELECT HEX(CAST('سلام' as binary))` you get `D8B3D984D8A7D985`. Now on SQL Server if you do `declare @blob varbinary(8) = 0xD8B3D984D8A7D985; select hashbytes('md5', @blob);` you get the same hash you were expecting, `0x78903C575B0DDA53C4A7644A2DD36D0E `. – AlwaysLearning May 15 '20 at 00:40
  • yeap that's one way of doing it but I can't figure it out. – Hooman Nemati May 15 '20 at 09:40

1 Answers1

2

Based on MySQL returning D8B3D984D8A7D985 from SELECT HEX(CAST('سلام' as binary)) it appears that MySQL is using UTF-8 character encoding.

On SQL Server 2019 you can use the LATIN1_GENERAL_100_CI_AS_SC_UTF8 collation (it's not supported on SQL Server 2017) such as the following:

create table #Test (
  UTF16 nvarchar(max),
  UTF8 varchar(max)  COLLATE LATIN1_GENERAL_100_CI_AS_SC_UTF8
)
insert #Test values (N'سلام', N'سلام');

select UTF16 from #Test;
select CAST(UTF16 as varbinary) as [UTF16-Bytes] from #Test;
select UTF8 from #Test;
select CAST(UTF8 as varbinary) as [UTF8-Bytes] from #Test;

Which returns:

UTF16
سلام

UTF16-Bytes
0x3306440627064506

UTF8
سلام

UTF8-Bytes
0xD8B3D984D8A7D985

And then with hashbytes():

select hashbytes('MD5', cast(UTF16 as varbinary)) as [UTF16-Hash] from #Test;
select hashbytes('MD5', cast(UTF8 as varbinary)) as [UTF8-Hash] from #Test;

Which returns:

UTF16-Hash
0x0381CA5081FBC68B2F55F2F2C21399D7

UTF8-Hash
0x78903C575B0DDA53C4A7644A2DD36D0E

Hope this helps!

AlwaysLearning
  • 7,915
  • 5
  • 27
  • 35
  • ``` SELECT CONVERT(VARCHAR(50), HASHBYTES('MD5', CAST(N'سلام' COLLATE LATIN1_GENERAL_100_CI_AS_SC_UTF8 AS VARCHAR(max)) ), 2) ``` if you want to convert instead of changing collation of table – Hooman Nemati May 18 '20 at 15:32