A question on reading text files in Java. I have a text file saved with UTF-8 encoding with only the content:
Hello. World.
Now I am using a RandomAccessFile to read this file. But for some reason, there seems to be an "invisible" character at the beginning of the file.
I use this code:
File file = new File("resources/texts/books/testfile2.txt");
try (RandomAccessFile reader = new RandomAccessFile(file, "r")) {
    String readLine = reader.readLine();
    String utf8Line = new String(readLine.getBytes("ISO-8859-1"), "UTF-8");
    System.out.println("Read Line: " + readLine);
    System.out.println("Real length: " + readLine.length());
    System.out.println("UTF-8 Line: " + utf8Line);
    System.out.println("UTF-8 length: " + utf8Line.length());
    System.out.println("Current position: " + reader.getFilePointer());
} catch (Exception e) {
    e.printStackTrace();
}
The output is this:
Read Line: ?»?Hello. World.
Real length: 16
UTF-8 Line: ?Hello. World.
UTF-8 length: 14
Current position: 16
These (1 or 2) characters seem to appear only at the very beginning of the file. If I add more lines to the file and read them, all subsequent lines are read normally. Can someone explain this behavior? What is this character at the beginning?
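For reference, "Hello. World." is 13 characters, so the raw line length of 16 suggests three extra bytes at the start of the file. One way to narrow this down is to dump the first few bytes in hex. The following is only a minimal sketch: the class name LeadingBytes, the helper hexPrefix, and the temporary file are hypothetical, and the sample file is written with the bytes EF BB BF (the UTF-8 byte order mark) in front of the text, on the assumption that this is what my editor put there.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class LeadingBytes {

    // Hypothetical helper: returns the first n bytes of the file as
    // space-separated uppercase hex (fewer if the file is shorter).
    static String hexPrefix(Path path, int n) throws IOException {
        try (RandomAccessFile reader = new RandomAccessFile(path.toFile(), "r")) {
            byte[] buf = new byte[(int) Math.min(n, reader.length())];
            reader.readFully(buf);
            StringBuilder sb = new StringBuilder();
            for (byte b : buf) {
                sb.append(String.format("%02X ", b & 0xFF));
            }
            return sb.toString().trim();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: build a sample file that starts with EF BB BF,
        // followed by the same text as in the question.
        Path path = Files.createTempFile("bomtest", ".txt");
        byte[] text = "Hello. World.".getBytes(StandardCharsets.UTF_8);
        byte[] content = new byte[3 + text.length];
        content[0] = (byte) 0xEF;
        content[1] = (byte) 0xBB;
        content[2] = (byte) 0xBF;
        System.arraycopy(text, 0, content, 3, text.length);
        Files.write(path, content);

        // Dump the first four bytes: three marker bytes plus 'H' (0x48).
        System.out.println(hexPrefix(path, 4)); // prints "EF BB BF 48"

        Files.delete(path);
    }
}
```

Running hexPrefix against the real testfile2.txt instead of the temporary file would show whether it starts with the same three bytes.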
Thanks!