I have a byte array that holds the hash of a file. It is produced with MessageDigest, so there is padding. I then make a shorthash, which is just the first two bytes of the hash, like this:
byte[] shortHash = new byte[2];
System.arraycopy(hash, 0, shortHash, 0, 2);
To make it readable for the user and to store it in a DB, I convert it to a String with a Base64 encoder:
Base64.getUrlEncoder().encodeToString(hash); //Same for shorthash
What I don't understand is:
Why is the String representing my shorthash four characters long? I thought a char was one or two bytes, so since I'm copying only two bytes, I shouldn't have more than two chars, right?
Why isn't my shorthash String the same as the start of the hash String?
For example, I get:
Hash: LE5D8vCsMp3Lcf-RBwBRbO1v4soGq7BBZ9kB_2SJnGY=
Shorthash: Rak=
You can see the = at the end of each. For the hash, it certainly comes from the MessageDigest padding, so that is expected, but why does the shorthash have one? It should be the two FIRST bytes, and the = is at the end!
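In case it helps, here is a minimal, self-contained sketch that reproduces the lengths I'm seeing. It assumes SHA-256 (which matches the 32-byte length of the hash above, but is only an assumption here), and the input string and the ShortHashDemo class name are placeholders for the real file-reading code:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;

class ShortHashDemo {

    // Assumption: SHA-256. Any MessageDigest algorithm could be used instead.
    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    // URL-safe Base64 with the encoder's default settings (padding enabled).
    static String encode(byte[] bytes) {
        return Base64.getUrlEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // Placeholder input standing in for the file content.
        byte[] hash = sha256("example file content".getBytes(StandardCharsets.UTF_8));

        // Same effect as new byte[2] + System.arraycopy(hash, 0, shortHash, 0, 2).
        byte[] shortHash = Arrays.copyOf(hash, 2);

        String hashText = encode(hash);
        String shortText = encode(shortHash);

        // Print byte count and character count side by side.
        System.out.println("Hash:      " + hashText
                + " (" + hash.length + " bytes, " + hashText.length() + " chars)");
        System.out.println("Shorthash: " + shortText
                + " (" + shortHash.length + " bytes, " + shortText.length() + " chars)");
    }
}
```

With this, the 32-byte hash comes out as 44 characters and the 2-byte shorthash as 4 characters, both ending in =, just like in my output above.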
Moreover, since I wanted to get rid of this padding, I tried this:
String finalHash = Base64.getUrlEncoder().withoutPadding().encodeToString(hash);
byte[] shortHash = new byte[2];
System.arraycopy(hash.getBytes(), 0, shortHash, 0, 2);
String finalShorthash = Base64.getUrlEncoder().encodeToString(shortHash);
I didn't want to copy the String directly, since I'm not really sure what two bytes would be in a String.
Then the = is gone for my hash, but not for my shorthash. I guess I need to add the "withoutPadding" option for my shorthash too, but I don't understand why, since it's a copy of my hash, which shouldn't have padding anymore. Unless the padding is gone only in the String representation and not in the bytes behind it?
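A small experiment to check that last guess, as a sketch only: encode the same two bytes with and without withoutPadding() and see whether the byte array itself changes. The PaddingCheck class name is made up for this test, and the two byte values are simply the ones that Rak= decodes to:

```java
import java.util.Arrays;
import java.util.Base64;

class PaddingCheck {

    // Default URL-safe encoder: pads the output with '='.
    static String withPadding(byte[] bytes) {
        return Base64.getUrlEncoder().encodeToString(bytes);
    }

    // Same encoder configured to omit the trailing '='.
    static String withoutPadding(byte[] bytes) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // The two bytes that "Rak=" decodes to.
        byte[] shortHash = {0x45, (byte) 0xA9};
        byte[] before = shortHash.clone();

        System.out.println(withPadding(shortHash));    // prints "Rak="
        System.out.println(withoutPadding(shortHash)); // prints "Rak"

        // Encoding only reads the array, so this prints "true".
        System.out.println(Arrays.equals(before, shortHash));

        // Decoding the unpadded text gives back the same two bytes: prints "true".
        byte[] decoded = Base64.getUrlDecoder().decode(withoutPadding(shortHash));
        System.out.println(Arrays.equals(decoded, shortHash));
    }
}
```

So the = seems to depend only on which encoder produces the String, and the same two bytes give both Rak= and Rak.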
Can someone explain this behavior? Does it come from the conversion between byte[] and String?