I'm trying to understand this JavaScript base64 decoding code but I'm puzzled by this loop at lines 70-84:

for (i=0; i<bytes; i+=3) {  
    //get the 3 octets in 4 ascii chars
    enc1 = this._keyStr.indexOf(input.charAt(j++));
    enc2 = this._keyStr.indexOf(input.charAt(j++));
    enc3 = this._keyStr.indexOf(input.charAt(j++));
    enc4 = this._keyStr.indexOf(input.charAt(j++));

    chr1 = (enc1 << 2) | (enc2 >> 4);
    chr2 = ((enc2 & 15) << 4) | (enc3 >> 2);
    chr3 = ((enc3 & 3) << 6) | enc4;

    uarray[i] = chr1;           
    if (enc3 != 64) uarray[i+1] = chr2;
    if (enc4 != 64) uarray[i+2] = chr3;
}

Specifically, I'd like to know why there are only 3 octets in 4 ASCII chars. Shouldn't there be 4 octets?

cdmckay

2 Answers

Because 3 octets take 24 bits of storage. Base64 carries 6 bits per (ASCII) character, and 4 characters * 6 bits = 24 bits, so 24 bits need exactly 4 Base64 characters.
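
The arithmetic above can be checked with a minimal sketch. It is not the code from the question: `ALPHABET` and `decodeQuad` are hypothetical names introduced here, and the helper assumes exactly four unpadded characters from the standard Base64 alphabet. It glues the four 6-bit values into one 24-bit number and then slices that number into three octets.

```javascript
// Standard Base64 alphabet: a character's index is its 6-bit value (0-63).
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: turn four unpadded Base64 characters into three octets.
function decodeQuad(quad) {
  const [e1, e2, e3, e4] = Array.from(quad, (c) => ALPHABET.indexOf(c));
  // 4 characters * 6 bits = one 24-bit group.
  const bits = (e1 << 18) | (e2 << 12) | (e3 << 6) | e4;
  // 24 bits / 8 bits per octet = 3 octets.
  return [(bits >> 16) & 255, (bits >> 8) & 255, bits & 255];
}

console.log(decodeQuad("TWFu")); // [ 77, 97, 110 ], the character codes of "Man"
```

The shifts in the question's loop (`enc1 << 2`, `enc2 >> 4`, and so on) appear to do the same regrouping one octet at a time instead of going through a single 24-bit intermediate.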

Goz
  • @cdmckay: It is, but since all modern systems use 8-bit bytes it's largely immaterial: the top bit is either wasted or used to carry extra information. You could pack 8 characters into 7 bytes, but memory is cheap and the encode/decode would be fairly expensive by comparison. – Goz Jul 27 '13 at 19:19
  • Right, but you said that you use 6 bits per character in your answer. I was just wondering where the 7th bit went. – cdmckay Jul 28 '13 at 20:33
  • @cdmckay: Base64 uses 6 bits per character – Goz Jul 28 '13 at 23:14
The comment refers to the Base64 algorithm itself. Since you are encoding arbitrary binary data using only a 64-character subset of 7-bit US-ASCII, each output character carries just 6 bits, so the string has to grow, and that's the exact ratio: 3 source bytes become 4 target characters.
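
As a rough illustration of that 3:4 ratio, here is a small sketch that computes the encoded length for a given number of source bytes. The name `base64Length` is hypothetical, and the function assumes standard padded Base64 (output padded with `=` to a multiple of four characters, no line breaks).

```javascript
// Hypothetical helper: length of the padded Base64 text for `byteCount` source bytes.
// Every group of up to 3 source bytes becomes exactly 4 output characters.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

console.log(base64Length(3));   // 4
console.log(base64Length(4));   // 8 (the second group is mostly padding)
console.log(base64Length(300)); // 400
```

For inputs whose length is a multiple of three, the output is exactly one third larger than the input.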

Álvaro González