
I'm trying to hash some data that includes Turkish characters. For example, when I hash the string "aaç" with PolarSSL, the SHA-1 result is:

10 bf 94 7f 94 65 9f b0 66 76 97 b d4 25 de 9d e4 85 8e ca

but when I looked up the hash of the same string ("aaç") online, the result was:

97 dd 7a 00 e8 ff 49 09 47 60 03 50 83 db 7c ba 87 07 0f d9

Why are these two SHA-1 results different?

Bhargav Rao
nerd

1 Answer


Text encoding differences. The character ç is encoded differently in the ISO 8859-1 and UTF-8 encodings, and this difference causes the SHA-1 hashes of the resulting byte sequences to be different:

SHA1("aa\xe7")     = 10bf947f94659fb06676970bd425de9de4858eca (ISO 8859-1)
SHA1("aa\xc3\xa7") = 97dd7a00e8ff49094760035083db7cba87070fd9 (UTF-8)
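
To reproduce the two digests above, here is a minimal Python sketch. It assumes Python's standard `hashlib` rather than PolarSSL, and the helper name `sha1_hex` is made up for illustration. The point is only that SHA-1 operates on bytes, so the codec chosen before hashing determines the result.

```python
import hashlib


def sha1_hex(text: str, encoding: str) -> str:
    """Encode text with the given codec, then hash the resulting bytes."""
    return hashlib.sha1(text.encode(encoding)).hexdigest()


text = "aa\u00e7"  # the string "aaç", written with an escape to avoid source-encoding issues

for encoding in ("iso-8859-1", "utf-8"):
    raw = text.encode(encoding)
    # Same characters, different byte sequences, therefore different digests.
    print(encoding, raw.hex(), sha1_hex(text, encoding))
```

If the PolarSSL call is fed the raw bytes of a source file or input buffer, the digest depends on whichever encoding those bytes were stored in.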
  • Thank you for your answer. It is correct for this question, but I have problems with letters such as 'ş', 'ğ', and 'İ' too. All of these characters belong to the ISO8859-9 charset, and I still cannot convert a UTF-8 encoded txt file to ISO8859-9, even though I used iconv. – nerd Dec 26 '14 at 10:32
  • Like `ç`, those characters are also encoded differently in UTF-8 and ISO8859-1 (or ISO8859-9). –  Dec 26 '14 at 17:00
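
For the letters raised in the comments, the following is a rough Python sketch of a UTF-8 to ISO 8859-9 conversion before hashing. It is not the iconv invocation from the comment, and the helper name `transcode` is hypothetical. One detail worth noting: 'ş', 'ğ', and 'İ' exist in ISO 8859-9 but have no code point in ISO 8859-1 at all, so the target charset has to be ISO 8859-9.

```python
import hashlib


def transcode(data: bytes, src: str = "utf-8", dst: str = "iso-8859-9") -> bytes:
    """Decode bytes from the source codec and re-encode them in the target codec."""
    return data.decode(src).encode(dst)


# "şğİ" written with escapes: U+015F, U+011F, U+0130
utf8_bytes = "\u015f\u011f\u0130".encode("utf-8")
latin5_bytes = transcode(utf8_bytes)

print("UTF-8     :", utf8_bytes.hex(), hashlib.sha1(utf8_bytes).hexdigest())
print("ISO 8859-9:", latin5_bytes.hex(), hashlib.sha1(latin5_bytes).hexdigest())

# ISO 8859-1 cannot represent these letters, so encoding to it fails.
try:
    "\u015f".encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print("representable in ISO 8859-1:", latin1_ok)
```

Each letter takes two bytes in UTF-8 and one byte in ISO 8859-9, so the two hashes differ for the same reason as with 'ç'.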