As I know, 'a' is an 8-bit character and 'â' is a 16-bit character.
Not really. Java char is an unsigned 16-bit type, so both 'a' and 'â' are 16-bit characters. It is true that the top 8 bits of 'a' are zero, but those bits are there nevertheless. The same goes for 'â' (see below).
How do I know whether a character is 8 bits, 16 bits, or wider?
Compare ch & 0xFF00 to zero. If the result is zero, the upper 8 bits are all zeros; otherwise, at least one of those eight bits is non-zero.
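A quick sketch of that check (the class and helper names are mine):

```java
public class CharWidth {
    // True when the upper 8 bits of ch are all zero, i.e. the code fits in one byte.
    static boolean fitsInOneByte(char ch) {
        return (ch & 0xFF00) == 0;
    }

    public static void main(String[] args) {
        System.out.println(fitsInOneByte('a')); // true:  'a' is 0x0061
        System.out.println(fitsInOneByte('â')); // true:  'â' is 0x00E2
        System.out.println(fitsInOneByte('€')); // false: '€' is 0x20AC
    }
}
```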
Why can't the 'â' character be represented in 8 bits?
It can be represented in 8 bits: the code of 'â' is 0xE2, or 226. It fits in 8 bits, but it does not fit in 7 bits. Here is a convenient table for looking up character codes.
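To make the 7-bit vs. 8-bit point concrete, here is a small sketch (the class and helper names are mine) that counts how many bits a character's code actually needs:

```java
public class BitWidth {
    // Number of bits needed to represent a character's code.
    static int bitsNeeded(char ch) {
        return 32 - Integer.numberOfLeadingZeros(ch);
    }

    public static void main(String[] args) {
        System.out.println(bitsNeeded('a')); // 7: 0x61 = 0b1100001 fits in 7 bits
        System.out.println(bitsNeeded('â')); // 8: 0xE2 = 0b11100010 needs 8 bits
    }
}
```

This is why 'a' survives 7-bit ASCII channels while 'â' does not.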
'a' or 'â' is just the UI form; what do they look like in bit form?
Since char is an integral type, you can convert it to int and print it in binary, decimal, hex, or any other radix to see the bit pattern behind the character representation.
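For example (a minimal sketch; the class name is mine):

```java
public class CharBits {
    public static void main(String[] args) {
        for (char ch : new char[] {'a', 'â'}) {
            int code = ch; // char widens to int implicitly
            System.out.printf("'%c' = %d = 0x%04X = 0b%s%n",
                    ch, code, code, Integer.toBinaryString(code));
        }
    }
}
```

This prints `'a' = 97 = 0x0061 = 0b1100001` and `'â' = 226 = 0x00E2 = 0b11100010`.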
97 is the code of 'a'; how do you calculate this number, or is it just the ordinal number of the character?
Cast 'a' to an int; the number you get is the character's Unicode code point, which for 'a' coincides with its ASCII code:
int a = (int)'a';
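A minimal usage sketch (the class name is mine) showing the cast and the resulting codes:

```java
public class CharCode {
    public static void main(String[] args) {
        int a = (int) 'a';             // explicit cast; char also widens to int implicitly
        System.out.println(a);         // prints 97
        System.out.println((int) 'â'); // prints 226
    }
}
```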