
I am calling MessageDigest.digest() to hash a password. If the password contains a Norwegian character such as 'ø', the method returns the same hash for strings that differ only in their last character: "Høstname1" and "Høstname2" have the same hash, but "Hostnøme1" has a different hash because the 'ø' is in a different position. This happens with "utf-8" encoding; with "iso-8859-1" encoding I do not see the issue. Is this a known problem, or am I missing something?

This is my code:

    import java.security.MessageDigest;

    String password = "Høstname1";
    String salt = "6";

    MessageDigest messageDigest = MessageDigest.getInstance("SHA-256");
    byte[] hash = new byte[40];
    messageDigest.update(salt.getBytes("utf-8"), 0, salt.length());
    messageDigest.update(password.getBytes("utf-8"), 0, password.length());
    hash = messageDigest.digest();
Liran Funaro
namang029

1 Answer


You shouldn't pass the length of the string to messageDigest.update:

    messageDigest.update(password.getBytes("utf-8"), 0, password.length());

Pass the length of the byte array instead, because the utf-8 encoded string has more bytes than the string has characters whenever it contains non-ASCII characters:

    byte[] pwd = password.getBytes("utf-8");
    messageDigest.update(pwd, 0, pwd.length);

or even shorter (thanks @Matt):

    messageDigest.update(password.getBytes("utf-8"));

The same applies to the salt.

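Your code was therefore hashing only the beginning of the password: 'ø' takes two bytes in utf-8, so "Høstname1" encodes to 10 bytes while password.length() is 9, and the final byte (the trailing digit) was never hashed.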
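
To make the difference concrete, here is a minimal sketch that compares the original call with the corrected one. The class name DigestLengthDemo and the helpers buggyHash and fixedHash are made up for illustration; only MessageDigest, String.getBytes and Arrays.equals come from the JDK. It uses StandardCharsets.UTF_8 instead of the "utf-8" string, and writes 'ø' as \u00f8 so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical demo class; the names here are illustrative only.
final class DigestLengthDemo {

    private static MessageDigest sha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Reproduces the question's code: String.length() counts chars, not encoded bytes.
    static byte[] buggyHash(String salt, String password) {
        MessageDigest md = sha256();
        md.update(salt.getBytes(StandardCharsets.UTF_8), 0, salt.length());
        md.update(password.getBytes(StandardCharsets.UTF_8), 0, password.length());
        return md.digest();
    }

    // Uses the overload without offset/length, so the whole byte array is hashed.
    static byte[] fixedHash(String salt, String password) {
        MessageDigest md = sha256();
        md.update(salt.getBytes(StandardCharsets.UTF_8));
        md.update(password.getBytes(StandardCharsets.UTF_8));
        return md.digest();
    }

    public static void main(String[] args) {
        String a = "H\u00f8stname1"; // "Høstname1"
        String b = "H\u00f8stname2"; // "Høstname2"

        System.out.println(a.length());                                // 9
        System.out.println(a.getBytes(StandardCharsets.UTF_8).length); // 10

        // The buggy version drops the last byte, so both passwords collide.
        System.out.println(Arrays.equals(buggyHash("6", a), buggyHash("6", b))); // true
        // The fixed version hashes every byte, so the hashes differ.
        System.out.println(Arrays.equals(fixedHash("6", a), fixedHash("6", b))); // false
    }
}
```

With an ASCII-only password both helpers would return the same hash, which is why the bug stays hidden until a non-ASCII character shows up.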

wero
  • Thanks, it is working now. I am just wondering why it was working earlier with only English characters. – namang029 May 16 '17 at 14:58
  • @namang029 if all characters in the password are < 128 then `password.getBytes("utf-8").length == password.length()` so you didn't notice the bug – wero May 16 '17 at 15:21
  • @namang029 use the overload without the length so you don't have to worry about whether or not you're making this mistake – Matt Timmermans May 16 '17 at 16:22