
I came across the statement below while studying HTML character sets and character encoding:

Since ASCII used 7 bits for the character, it could only represent 128 different characters.

When we convert any decimal value from the ASCII character set to its binary equivalent, we get a binary number that is at most 7 bits long. For example, the capital English letter 'E' has the decimal value 69 in the ASCII table. If we convert 69 to its binary equivalent, we get the 7-bit binary number 1000101.
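For example, a quick Python check (just an illustration, using the built-in ord and format functions) shows the same thing:

    # Look up the ASCII code of 'E' and print its binary form.
    code = ord('E')
    print(code)               # 69
    print(format(code, 'b'))  # 1000101 -- seven significant bits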

Then why does the ASCII table show it as the 8-bit binary number 01000101 instead of the 7-bit binary number 1000101?

This seems to contradict the statement:

Since ASCII used 7 bits for the character, it could only represent 128 different characters.

That statement says that ASCII uses 7 bits per character.

Please clear up my confusion about the binary equivalent of a decimal value. Should I consider the 7-bit or the 8-bit binary equivalent of a decimal value from the ASCII table? Please explain it in easy-to-understand language.

Now consider this statement again:

Since ASCII used 7 bits for the character, it could only represent 128 different characters.

According to this statement, how does the number of characters that ASCII supports (128) relate to the fact that ASCII uses 7 bits to represent each character?
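To make the numbers concrete, here is the arithmetic I assume is involved (please correct me if this is the wrong way to look at it):

    # With 7 bits there are 2**7 possible patterns, numbered 0 through 127.
    print(2 ** 7)              # 128
    print(format(0, '07b'))    # 0000000 -- smallest 7-bit pattern
    print(format(127, '07b'))  # 1111111 -- largest 7-bit pattern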

Please clear up this confusion.

Thank you.

  • The ASCII table on that web site includes Extended ASCII which, as the name suggests, is an extension of the "normal" ASCII. Also, binary numbers are often written in lengths that are multiples of 4 or 8 regardless of what the number range happens to be. – JJJ Jun 11 '18 at 05:45
  • https://en.wikipedia.org/wiki/Extended_ASCII – mplungjan Jun 11 '18 at 05:54

1 Answer


In most processors, memory is byte-addressable and not bit-addressable. That is, a memory address gives the location of an 8-bit value. So, almost all data is manipulated in multiples of 8 bits at a time.

If we want to store a value that by its nature needs only 7 bits, we very often use one byte per value anyway. If the data is a sequence of such values, as text often is, we still use one byte per value to make counting, sizing, indexing, and iterating easier.
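For example (a small Python sketch, nothing specific to your page), encoding a short ASCII string produces exactly one byte per character, even though each value needs only 7 bits:

    text = "EEE"
    data = text.encode('ascii')  # one byte per character
    print(len(text), len(data))  # 3 3
    print(list(data))            # [69, 69, 69] -- each 7-bit value stored in its own byte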

When we describe the value of a byte, we often show all of its bits, either in binary or hexadecimal. If a value is some sort of integer (say of 1, 2, 4, or 8 bytes) and its decimal representation would be more understandable, we would write the decimal digits for the whole integer. But in those cases, we might lose the concept of how many bytes it is.
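To illustrate with a quick Python sketch, the same byte value can be displayed with all 8 bits, with only its significant bits, or in hexadecimal or decimal; which form you see in a table is purely a presentation choice:

    value = 69                   # ASCII code of 'E'
    print(format(value, '08b'))  # 01000101 -- padded to the 8 bits of a full byte
    print(format(value, 'b'))    # 1000101  -- only the 7 significant bits
    print(format(value, '02x'))  # 45       -- two hex digits, i.e. one byte
    print(value)                 # 69       -- plain decimal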

BTW, HTML doesn't have anything to do with ASCII. Also, "Extended ASCII" isn't a single encoding. The fundamental rule of character encodings is to read (decode) text with the same encoding it was written (encoded) with. So a communication consists of the transfer of bytes plus a shared understanding of the character encoding. (That is what makes saying "Extended ASCII" so inadequate as to be nearly useless.)
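As a small illustration (Python again, just a sketch): the same bytes give different text depending on which encoding the reader assumes, which is why both sides have to agree on it:

    data = "café".encode('utf-8')  # written (encoded) as UTF-8
    print(data)                    # b'caf\xc3\xa9'
    print(data.decode('utf-8'))    # café  -- read back with the encoding it was written in
    print(data.decode('latin-1'))  # cafÃ© -- read with the wrong encoding: garbage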

An HTML document represents a sequence of Unicode characters. So one of the Unicode character encodings (UTF-8) is the most common encoding for an HTML document. Regardless, after it is read, the result is Unicode. An HTML document could be encoded in ASCII, but why do that? If you knew it was ASCII, you could just as easily know that it's UTF-8, because every ASCII document is also a valid UTF-8 document.
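One consequence, sketched below with a made-up one-line document, is that a document containing only ASCII characters produces exactly the same bytes whether you label its encoding ASCII or UTF-8:

    html = "<p>Hello</p>"              # ASCII-only content (hypothetical example)
    ascii_bytes = html.encode('ascii')
    utf8_bytes = html.encode('utf-8')
    print(ascii_bytes == utf8_bytes)   # True -- identical bytes
    print(utf8_bytes.decode('utf-8'))  # <p>Hello</p>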

Outside of HTML, ASCII is used billions, if not trillions, of times per second. But unless you know exactly how it pertains to your work, forget about it; you probably aren't using ASCII.

Tom Blodget