I already have a class that reads 8 bits from a file each time I call its read() method. Every character's corresponding decimal number is in the ASCII table. Now I have encountered the character 'É', whose ASCII code in binary is 11001001, and the result is correct when I call
System.out.println(Integer.toBinaryString('É'));
However, when I open the file in binary format, the actual bits are 11000011 10001001 00001010. I understand that 00001010 is a line feed, but 11000011 and 10001001 definitely don't match 11001001. I changed the file so that it contains only 'a', and now the file contains only 01100001, which is correct. The file's character encoding is UTF-8. Here's my code that puts each character and its frequency into a map:
while ((bit = readInputStream()) != -1) {
    if (!bitOccurrence.containsKey(bit))
        bitOccurrence.put(bit, 1);
    else
        bitOccurrence.put(bit, bitOccurrence.get(bit) + 1);
}
Here is the private readInputStream method
private int readInputStream() throws IOException {
    InputStreamReader r = new InputStreamReader(i); // i is the InputStream
    return r.read();
}
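To make this reproducible without my file, here is a minimal sketch that feeds the same three bytes through both a raw InputStream and an InputStreamReader. The class name EncodingRepro and its two helper methods are made up for illustration, and I am assuming the reader decodes as UTF-8 (my method above relies on the platform default instead).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reproduction class, not part of my real program.
public class EncodingRepro {

    // Reads the data one raw byte (8 bits) at a time.
    public static List<Integer> readBytes(byte[] data) {
        List<Integer> values = new ArrayList<>();
        try (InputStream in = new ByteArrayInputStream(data)) {
            int b;
            while ((b = in.read()) != -1) {
                values.add(b); // value in the range 0..255
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    // Reads the data one decoded character at a time (assumes UTF-8).
    public static List<Integer> readChars(byte[] data) {
        List<Integer> values = new ArrayList<>();
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int c;
            while ((c = r.read()) != -1) {
                values.add(c); // a character code, possibly built from several bytes
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        // The three bytes I see in the file: 11000011 10001001 00001010
        byte[] data = {(byte) 0xC3, (byte) 0x89, 0x0A};
        System.out.println(readBytes(data)); // [195, 137, 10]
        System.out.println(readChars(data)); // [201, 10]
    }
}
```

So the raw stream gives me the three byte values from the file, while the reader gives me 201 (11001001) followed by the line feed.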
So my question is: why does this happen, and what is the workaround if I can only read 8 bits at a time?