I'm trying to learn about RandomAccessFile but after creating a test program I'm getting some bizarre output.

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class RandomAccessFileTest
{
    public static void main(String[] args) throws IOException
    {
        // Create a new blank file
        File file = new File("RandomAccessFileTest.txt");
        file.createNewFile();
        
        // Open the file in read/write mode
        RandomAccessFile randomfile = new RandomAccessFile(file, "rw");
        
        // Write stuff
        randomfile.write("Hello World".getBytes());
        
        // Go to a location
        randomfile.seek(0);
        
        // Get the pointer to that location
        long pointer = randomfile.getFilePointer();
        System.out.println("location: " + pointer);
        
        // Read a char (two bytes?)
        char letter = randomfile.readChar();
        System.out.println("character: " + letter);
        
        randomfile.close();
    }
}

This program prints out

location: 0

character: ?

Turns out that the value of letter was '䡥' when it should be 'H'.

I've found a question similar to this, and apparently this is caused by reading one byte instead of two, but it didn't explain how exactly to fix it.

  • Why not use [`writeChars`](https://docs.oracle.com/javase/8/docs/api/java/io/RandomAccessFile.html#writeChars-java.lang.String-)? Always read and write with the same encoding. – Radiodef Jan 15 '15 at 20:49
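A minimal sketch of the symmetric approach suggested in this comment: write with `writeChars`, read back with `readChar`. The class name, the `roundTrip` helper, and the temp file are made up for illustration. Each char is stored as two bytes, so an 11-character string takes 22 bytes on disk, and for ASCII text the high byte of every pair is zero.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class WriteCharsDemo {
    // Hypothetical helper: writes text as two-byte chars, then reads the first char back.
    static char roundTrip(File file, String text) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);      // start from an empty file
            raf.writeChars(text);  // two bytes per char, high byte first
            raf.seek(0);
            return raf.readChar(); // reads the same two bytes back
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("WriteCharsDemo", ".bin");
        System.out.println("character: " + roundTrip(file, "Hello World")); // prints character: H
        System.out.println("length: " + file.length());                     // prints length: 22
        file.delete();
    }
}
```

The file is not readable as ordinary single-byte text afterwards, which may or may not matter depending on what else has to open it.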

1 Answer

You've written "Hello World" in the platform default encoding - which is likely to use a single byte per character.

You're then reading with RandomAccessFile.readChar, which always reads two bytes. From the documentation:

Reads a character from this file. This method reads two bytes from the file, starting at the current file pointer. If the bytes read, in order, are b1 and b2, where 0 <= b1, b2 <= 255, then the result is equal to:

   (char)((b1 << 8) | b2)

This method blocks until the two bytes are read, the end of the stream is detected, or an exception is thrown.

So H and e are being combined into a single character. H is U+0048 and e is U+0065, so assuming they've been written as ASCII characters, you're reading bytes 0x48 and 0x65 and combining them into U+4865, which is a Han character for "a moving cart".
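The arithmetic can be checked without touching a file. This is only a sketch that applies the formula quoted from the documentation; the class and the `combine` method are illustrative names, not part of any API.

```java
class ReadCharDemo {
    // Same formula as in the readChar documentation: (char)((b1 << 8) | b2)
    static char combine(int b1, int b2) {
        return (char) ((b1 << 8) | b2);
    }

    public static void main(String[] args) {
        char combined = combine('H', 'e');             // bytes 0x48 and 0x65
        System.out.printf("U+%04X%n", (int) combined); // prints U+4865
    }
}
```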

Basically, you shouldn't be using readChar to try to read this data.

Usually to read a text file, you want an InputStreamReader (with an appropriate encoding) wrapping an InputStream (e.g. a FileInputStream). It's not really ideal to try to do this with RandomAccessFile - you could read data into a byte[] and then convert that into a String but there are all kinds of subtleties you'd need to think about.
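As a rough illustration of the reader-based approach, assuming the file is UTF-8 text (the encoding, the class name, the `firstChar` helper, and the temp file are all assumptions made for this sketch):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ReaderDemo {
    // Hypothetical helper: decodes the file as UTF-8 and returns its first char.
    static char firstChar(Path path) throws IOException {
        try (Reader reader = new InputStreamReader(
                new FileInputStream(path.toFile()), StandardCharsets.UTF_8)) {
            int c = reader.read(); // returns -1 at end of stream
            if (c == -1) {
                throw new IOException("file is empty");
            }
            return (char) c;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("ReaderDemo", ".txt");
        // Write with the same encoding that will be used for reading.
        Files.write(path, "Hello World".getBytes(StandardCharsets.UTF_8));
        System.out.println("character: " + firstChar(path)); // prints character: H
        Files.delete(path);
    }
}
```

The reader decodes bytes into chars, so the code never has to reason about how many bytes a character occupies.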

Jon Skeet
  • Ahhh, I see. Is there an alternate method for writing? I've tried using RandomAccessFile.writeChars but it gave me an unwanted NULL character after every char. – RandomPerson78642 Jan 15 '15 at 20:51
  • @RandomPerson78642: Well we don't really know enough about your context, or why you're using `RandomAccessFile` to start with. I would personally try to avoid that for text data, in most cases. – Jon Skeet Jan 15 '15 at 20:53
  • writeChar does not write unwanted null characters; it writes each of your characters as two bytes. – eckes Jan 15 '15 at 20:55
  • Okay, never mind. You were correct, turns out my problems were being caused by mixing up bytes and chars together. I switched everything to bytes and it works perfectly now. – RandomPerson78642 Jan 15 '15 at 21:06
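For reference, a bytes-only version of the original test might look like the sketch below. It is not the asker's actual code: the class name, the `writeThenReadFirst` helper, and the choice of US-ASCII are assumptions. Casting a single byte to a char is only valid here because every ASCII character fits in one byte.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class BytesOnlyDemo {
    // Hypothetical helper: writes text as ASCII bytes, then reads the first byte back.
    static char writeThenReadFirst(File file, String text) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0); // start from an empty file
            raf.write(text.getBytes(StandardCharsets.US_ASCII));
            raf.seek(0);
            int b = raf.read(); // one byte in 0..255, or -1 at end of file
            if (b == -1) {
                throw new EOFException("file is empty");
            }
            return (char) b; // safe only for single-byte encodings such as ASCII
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("BytesOnlyDemo", ".txt");
        System.out.println("character: " + writeThenReadFirst(file, "Hello World")); // prints character: H
        file.delete();
    }
}
```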