I have a problem with user input. I need to save user input to a binary file, but when I read it back and show it on the screen, it is not displayed properly. I don't want to post a few hundred lines of code, so I will try to describe it in a more compact form. The encoding set in the NetBeans project properties is "UTF-8".

I get input from the user, in the NetBeans console or the cmd console. I store it in an object made up of strings, then add that object to an ArrayList<Ksiazka>, where Ksiazka is my class (basically a book's properties). Then I save the whole ArrayList to the file baza.bin. I do this by looping through the list of Ksiazka objects, taking each String one by one and saving it to baza.bin using writeUTF(oneOfStrings). When I read baza.bin back, I see question marks instead of the special characters (ą, ć, ę, ł, ń, ó, ś, ź). I think the problem is a mismatch between the encoding of the file and the encoding of the input data, but to be honest I have no idea how to solve it.

These are the attributes of my class Ksiazka:

private String id;
private String tytul;
private String autor;
private String rok;
private String wydawnictwo;
private String gatunek;
private String opis;
private String ktoWypozyczyl;
private String kiedyWypozyczona;
private String kiedyDoOddania;

This is the method for reading data from the user:

static String podajDana(String[] tab, int coPokazac){
    System.out.print(tab[coPokazac]);
    boolean podawajDalej = true;
    String linia = "";
    Scanner klawiatura = new Scanner(System.in, "utf-8"); 
    do{
        try {   
            podawajDalej = false; 
            linia = klawiatura.nextLine();
        }
        catch(NoSuchElementException e){
            System.err.println("Wystąpił błąd w czasie podawania wartości!"
                    + " Spróbuj jeszcze raz!");
        }
        catch(IllegalStateException e){
            System.err.println("Wewnętrzny błąd programu typu 2! Zgłoś to jak najszybciej"
                    + " razem z tą wiadomością");
        }
    }while(podawajDalej);
    return linia; 
}

String[] tab is just an array of strings that I want to be able to show on the screen; each array serves its own purpose. int coPokazac is the index of the array element I want to show.

And this one saves all data from the ArrayList<Ksiazka> to the file baza.bin:

static void zapiszZmiany(ArrayList<Ksiazka> bazaKsiazek){
     try{
        RandomAccessFile plik = new RandomAccessFile("baza.bin","rw");
        for(int i = 0; i < bazaKsiazek.size(); i++){
            plik.writeUTF(bazaKsiazek.get(i).zwrocId());
            plik.writeUTF(bazaKsiazek.get(i).zwrocTytul());
            plik.writeUTF(bazaKsiazek.get(i).zwrocAutor());
            plik.writeUTF(bazaKsiazek.get(i).zwrocRok());
            plik.writeUTF(bazaKsiazek.get(i).zwrocWydawnictwo());
            plik.writeUTF(bazaKsiazek.get(i).zwrocGatunek());
            plik.writeUTF(bazaKsiazek.get(i).zwrocOpis());
            plik.writeUTF(bazaKsiazek.get(i).zwrocKtoWypozyczyl());
            plik.writeUTF(bazaKsiazek.get(i).zwrocKiedyWypozyczona());
            plik.writeUTF(bazaKsiazek.get(i).zwrocKiedyDoOddania());
        }

        plik.close();
     }
     catch (FileNotFoundException ex){
        System.err.println("Nie znaleziono pliku z bazą książek!");
     }
     catch (IOException ex){
        System.err.println("Błąd zapisu bądź odczytu pliku!");
     }
}

I think the problem is in one of those two methods (either I am doing something wrong while reading the input, or something goes wrong when the data is saved to the file using writeUTF()), but even though I tried a few things to solve it, none of them worked.

After a quick talk with my lecturer, I learned that I can use JDK 8 at most.

Daniel1490

1 Answer

You are using different techniques for reading and writing, and they are not compatible.

Despite the name, the writeUTF method of RandomAccessFile does not write a UTF-8 string. From the documentation:

Writes a string to the file using modified UTF-8 encoding in a machine-independent manner.

First, two bytes are written to the file, starting at the current file pointer, as if by the writeShort method giving the number of bytes to follow. This value is the number of bytes actually written out, not the length of the string. Following the length, each character of the string is output, in sequence, using the modified UTF-8 encoding for each character.

writeUTF will write a two-byte length, then write the string as UTF-8, except that '\u0000' characters are written as two UTF-8 bytes and supplementary characters are written as two UTF-8 encoded surrogates, rather than single UTF-8 codepoint sequences.
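The difference is easy to see at the byte level. The following is a minimal sketch, not part of the original program: the class and method names (WriteUtfBytes, modifiedUtf8, hex) are made up for illustration, and it uses DataOutputStream.writeUTF on an in-memory buffer, which produces the same format as RandomAccessFile.writeUTF.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class WriteUtfBytes {

    // Returns the bytes produced by writeUTF (two-byte length + modified UTF-8).
    public static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Formats bytes as space-separated uppercase hex, e.g. "C4 85".
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String a = "\u0105"; // the letter "ą", escaped so the source file encoding does not matter

        System.out.println(hex(a.getBytes(StandardCharsets.UTF_8))); // C4 85
        System.out.println(hex(modifiedUtf8(a)));                    // 00 02 C4 85

        String nul = "\u0000";
        System.out.println(hex(nul.getBytes(StandardCharsets.UTF_8))); // 00
        System.out.println(hex(modifiedUtf8(nul)));                    // 00 02 C0 80
    }
}
```

For the Polish letters the character bytes are identical in both encodings; what a plain-text reader trips over is the two-byte length prefix in front of every string.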

On the other hand, you are trying to read that data using new Scanner(System.in, "utf-8") and klawiatura.nextLine();. This approach is not compatible because:

  • The text was not written as a true UTF-8 sequence.
  • Before the text was written, two bytes indicating its numeric length were written. They are not readable text.
  • writeUTF does not write a newline. It does not write any terminating sequence at all, in fact.
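If the file has to stay in the writeUTF format, the matching reader is readUTF rather than a Scanner. Below is a minimal sketch, assuming the file contains nothing but writeUTF records; the names ReadUtfRoundTrip, writeAll, readAll and roundTrip are hypothetical and not taken from the original program.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadUtfRoundTrip {

    // Writes each string as one writeUTF record (length prefix + modified UTF-8).
    public static void writeAll(File file, List<String> values) {
        try (RandomAccessFile out = new RandomAccessFile(file, "rw")) {
            out.setLength(0); // "rw" does not truncate, so drop any older content first
            for (String value : values) {
                out.writeUTF(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads records with readUTF until the end of the file is reached.
    public static List<String> readAll(File file) {
        List<String> values = new ArrayList<>();
        try (RandomAccessFile in = new RandomAccessFile(file, "r")) {
            while (in.getFilePointer() < in.length()) {
                values.add(in.readUTF());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    // Convenience helper for the demo: write to a temporary file and read it back.
    public static List<String> roundTrip(List<String> values) {
        try {
            File file = File.createTempFile("baza", ".bin");
            file.deleteOnExit();
            writeAll(file, values);
            return readAll(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "Zażółć gęślą jaźń", escaped so the source file encoding does not matter
        List<String> input = Arrays.asList("1",
                "Za\u017C\u00F3\u0142\u0107 g\u0119\u015Bl\u0105 ja\u017A\u0144");
        System.out.println(roundTrip(input).equals(input));
    }
}
```

Because each record carries its own length, this format needs no separator and tolerates newlines inside a field, but the file is not readable as plain text.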

The best solution is to remove all usage of RandomAccessFile and replace it with a Writer:

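// FileWriter(File, Charset) exists only since Java 11; on JDK 8, wrap a FileOutputStream instead.
Writer plik = new OutputStreamWriter(new FileOutputStream("baza.bin"), StandardCharsets.UTF_8);
for (int i = 0; i < bazaKsiazek.size(); i++) {
    plik.write(bazaKsiazek.get(i).zwrocId());
    plik.write('\n');
    plik.write(bazaKsiazek.get(i).zwrocTytul());
    plik.write('\n');
    // ...
}
plik.close();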
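A file written as newline-separated UTF-8 text can be read back with a Scanner that is told to use UTF-8. The following is a minimal sketch that runs on JDK 8, assuming no field contains a line break; the names Utf8TextRoundTrip, writeLines, readLines and roundTrip are hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public class Utf8TextRoundTrip {

    // Writes one field per line as plain UTF-8 text.
    public static void writeLines(File file, List<String> lines) {
        try (Writer out = new OutputStreamWriter(
                new FileOutputStream(file), StandardCharsets.UTF_8)) {
            for (String line : lines) {
                out.write(line);
                out.write('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the file back line by line, decoding it as UTF-8.
    public static List<String> readLines(File file) {
        List<String> lines = new ArrayList<>();
        try (Scanner in = new Scanner(new FileInputStream(file), "UTF-8")) {
            while (in.hasNextLine()) {
                lines.add(in.nextLine());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Convenience helper for the demo: write to a temporary file and read it back.
    public static List<String> roundTrip(List<String> lines) {
        try {
            File file = File.createTempFile("baza", ".txt");
            file.deleteOnExit();
            writeLines(file, lines);
            return readLines(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "Zażółć gęślą jaźń", escaped so the source file encoding does not matter
        List<String> input = Arrays.asList("1",
                "Za\u017C\u00F3\u0142\u0107 g\u0119\u015Bl\u0105 ja\u017A\u0144");
        System.out.println(roundTrip(input).equals(input));
    }
}
```

The trade-off compared with writeUTF is that the newline now acts as the field separator, so a multi-line description would have to be escaped or stored differently.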
VGR
  • So changing the reading and writing to use a Writer should work. But is there any option to read the data in a way compatible with RandomAccessFile? – Daniel1490 Jan 19 '21 at 15:20
  • @Daniel1490 Yes. You can use a new RandomAccessFile to read the file, using the [readUTF](https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/io/RandomAccessFile.html#readUTF()) method. – VGR Jan 19 '21 at 15:24
  • Okay, I was in a hurry, so I left something out: is there any option to read data from the user, for example from cmd or the NetBeans console, in a way compatible with RandomAccessFile? I'm asking because I also have a way to read books from a txt file and then add them to baza.bin using RandomAccessFile, and this txt file is, I think by default, encoded as UTF-8. So if Scanner is not compatible with RandomAccessFile, wouldn't changing RandomAccessFile to Writer create another problem? – Daniel1490 Jan 19 '21 at 15:43
  • Normally you read user input as Strings. You can do anything you want with those Strings, including writing them using a RandomAccessFile. If you’re asking whether it’s possible to have the same code be capable of reading user input and reading from a file created using RandomAccessFile.writeUTF, the answer is no. – VGR Jan 19 '21 at 15:52
  • It was more like "Is there any way to read from a .txt file, save/read a .bin file and read input from the user (for example from cmd) and get the same encoding everywhere, so that no matter what I do with the data (read it, save it to a file, show it on screen) it looks the same (no question marks or squares)". So, if I understood correctly, Writer and Scanner should be compatible and shouldn't cause problems? – Daniel1490 Jan 19 '21 at 16:02
  • Yes. Writer and Scanner do not have any special format: UTF-8 text is just text. – VGR Jan 19 '21 at 16:31