
I need to encode my files in "ISO-8859-1". I know how to do this with a Reader like this:

BufferedReader br = new BufferedReader(new InputStreamReader(
            new FileInputStream(src), "ISO-8859-1"));

But how do I specify the encoding for a DataInputStream in the same way?

My declaration right now:

DataInputStream dit = new DataInputStream(new BufferedInputStream(
            new FileInputStream(src)));

I would prefer a solution where the encoding parameter is part of the declaration. The data I want to read was written with a DataOutputStream.

Import and export methods for the data streams:

public void importDST(String src) throws FileNotFoundException, IOException{
    try (DataInputStream dit = new DataInputStream(new BufferedInputStream(new FileInputStream(src)))) {
        while(dit.available() > 0) {
            pupils.add(new Pupil(dit.readInt(), dit.readInt(), dit.readUTF(), dit.readUTF(), dit.readChar(),
                    dit.readUTF(), dit.readInt(), dit.readInt(), dit.readInt(), dit.readUTF(), dit.readUTF(), dit.readUTF(), dit.readUTF(),
                    dit.readUTF(), dit.readUTF()));
        }
    } catch (FileNotFoundException e) {
        throw e;
    } catch (IOException e) {
        throw e;
    }
}

public void exportDST(String dest, ArrayList<Pupil> pupils) throws FileNotFoundException, IOException{
    this.pupils = pupils;
    try (DataOutputStream dot = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(dest)))) {
        for (Pupil p : this.pupils) {
            dot.writeInt(p.getId());
            dot.writeInt(p.getNumber());
            dot.writeUTF(p.getFirstname());
            dot.writeUTF(p.getLastname());
            dot.writeChar(p.getGender());
            dot.writeUTF(p.getReligion());
            dot.writeInt(p.getDay());
            dot.writeInt(p.getMonth());
            dot.writeInt(p.getYear());
            dot.writeUTF(p.getStreet());
            dot.writeUTF(p.getPlz());
            dot.writeUTF(p.getLocation());
            dot.writeUTF(p.getShortName());
            dot.writeUTF(p.getClassName());
            dot.writeUTF(p.getKvLastname());
        }
    } catch (FileNotFoundException e) {
        throw e;
    } catch (IOException e) {
        throw e;
    }
}

class Pupil:

public class Pupil implements Serializable {
    private int id;
    private int number;
    private String firstname;
    private String lastname;
    private char gender;
    private String religion;
    private int day;
    private int month;
    private int year;
    private String street;
    private String plz;
    private String location;
    private String shortName;
    private String className;
    private String kvLastname;

    public Pupil() {}

    public Pupil(int id, int number, String firstname, String lastname, char gender,
                 String religion, int day, int month, int year, String street, String plz, String location,
                 String shortName, String className, String kvLastname) {
        this.id = id;
        this.number = number;
        this.firstname = firstname;
        this.lastname = lastname;
        this.gender = gender;
        this.religion = religion;
        this.day = day;
        this.month = month;
        this.year = year;
        this.street = street;
        this.plz = plz;
        this.location = location;
        this.shortName = shortName;
        this.className = className;
        this.kvLastname = kvLastname;
    }
}
Stefan
  • As an aside, there is no point in your catches: you always rethrow the exception. You occasionally need to catch an exception if you want it to "leap-frog" a broader catch (e.g. if you wanted to propagate FileNotFoundException, but catch and handle IOException). Here, you don't, so just drop all your catches. – Andy Turner Sep 17 '18 at 06:26
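
To illustrate the one case the comment above describes, here is a minimal sketch of a rethrowing catch that is actually useful. The class name `CatchDemo` and the method `firstByte` are made up for this example and do not appear in the question's code.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

class CatchDemo {
    // Returns the first byte of the file, or -1 if the file is empty
    // or reading fails. A missing file is NOT swallowed: it propagates.
    static int firstByte(String path) throws FileNotFoundException {
        try (InputStream in = new FileInputStream(path)) {
            return in.read();
        } catch (FileNotFoundException e) {
            // Rethrow so that the broader catch below does not handle it.
            throw e;
        } catch (IOException e) {
            // Any other I/O problem is handled locally.
            return -1;
        }
    }
}
```

In the question's methods both catch blocks simply rethrow and the method already declares `throws IOException`, so removing them should not change behavior.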

3 Answers

As for ObjectInputStream, the documentation states that

An ObjectInputStream deserializes primitive data and objects previously written using an ObjectOutputStream.

Also, note:

Only objects that support the java.io.Serializable or java.io.Externalizable interface can be read from streams.

That is, the data being read must previously have been serialized (or externalized) with an ObjectOutputStream, from objects that implement Serializable (or Externalizable). You would therefore handle the charset encoding of any String attributes in the readObject and writeObject methods of your Serializable objects.

As for DataInputStream, see this answer: DataInputStream and UTF-8

You would have to specify the encoding when creating a String from the read bytes.
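
As a small illustration of specifying the encoding when building a String from bytes, here is a sketch. The class name `DecodeDemo` and the sample bytes are assumptions made for this example; Unicode escapes are used so the result does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Decodes raw bytes as ISO-8859-1, where every byte maps to exactly one character.
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Decodes the same bytes as UTF-8, for comparison.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xF6 is the letter o with diaeresis in ISO-8859-1.
        byte[] raw = {(byte) 'K', (byte) 0xF6, (byte) 'l', (byte) 'n'};
        System.out.println(decodeLatin1(raw).equals("K\u00F6ln"));
        // A lone 0xF6 is not valid UTF-8, so the decoder substitutes U+FFFD.
        System.out.println(decodeUtf8(raw).indexOf('\uFFFD') >= 0);
    }
}
```

The same bytes produce different strings depending on the charset passed to the String constructor, which is why the charset has to be chosen at the point of decoding rather than on the stream.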

MikkelRJ
  • Thanks for your reply. This does not really fit my code, because I use methods like readInt() and readUTF(), so I do not read raw bytes. – Stefan Sep 16 '18 at 15:22
  • @Stefan, I guess it would help with some more details about your case. Have you written the objects that you need to read? If not, what do the files look like? If you produce the files yourself and want them in human-readable format, why not serialize to XML or JSON instead of using a DataInputStream? If you want to read/write in a binary format, you have to deal with delimitation of the data yourself, including how long Strings are, etc. Charset encoding (like ISO-8859-1) does not make sense for storing integers. – MikkelRJ Sep 16 '18 at 16:29
  • The data in the file was written by a DataOutputStream. – Stefan Sep 16 '18 at 16:39
  • Then what are the data and how do you write it to that stream? – MikkelRJ Sep 16 '18 at 17:58

Your question doesn't really make sense.

Streams model streams of bytes. They don't have a character encoding; they are just bytes.

Readers read streams of characters. These are ultimately streams of bytes too, but there is a character encoding which says how to convert those bytes to chars. As such, it makes sense to be able to specify this encoding in the constructor.

DataInputStreams are streams: they read binary data, so they don't have a character encoding.
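
One consequence worth spelling out: writeUTF and readUTF use a fixed binary format (a 2-byte length followed by modified UTF-8), so a string written with one should come back unchanged from the other, umlauts included. The following is a minimal sketch using in-memory streams; the class name `UtfRoundTrip` is made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class UtfRoundTrip {
    // Writes s with writeUTF into a byte buffer and reads it back with readUTF.
    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buf)) {
                out.writeUTF(s);
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(buf.toByteArray()))) {
                return in.readUTF();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes for o, a and u with diaeresis, independent of source encoding.
        String original = "\u00F6\u00E4\u00FC";
        System.out.println(original.equals(roundTrip(original)));
    }
}
```

If a round trip like this succeeds but the characters still look wrong, the cause may lie outside the stream, for example in the encoding used to compile the source file or to display the output. That is a guess, since the question does not include a reproducible example.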

Andy Turner
  • Why then can't I read all characters correctly? I have chars like ö, ä, ü in my file, and those are not displayed after reading from the DataInputStream. – Stefan Sep 16 '18 at 16:59
  • @Stefan please show a [mcve], showing how the data is written, and subsequently read (not simply how you create the stream). – Andy Turner Sep 16 '18 at 18:28
  • @Stefan your examples do not show an example which writes accented characters and then reads them back incorrectly. Please provide a [mcve], not just some of the relevant code. – Andy Turner Sep 17 '18 at 06:20

I worked around this problem by writing and reading only ints and bytes instead of Strings. I read byte arrays and build a new String from each one using the desired encoding. Here is the changed code:

Reading:

public void importDST(String src) throws IOException{
    try (DataInputStream dit = new DataInputStream(new BufferedInputStream(new FileInputStream(src)))) {
        while (dit.available() > 0) {
            Pupil p = new Pupil();
            byte[] arr;
            int len;

            p.setId(dit.readInt());
            p.setNumber(dit.readInt());
            len = dit.readInt();
            arr = new byte[len];
            dit.readFully(arr);
            p.setFirstname(new String(arr, "ISO-8859-1"));
            len = dit.readInt();
            arr = new byte[len];
            dit.readFully(arr);
            p.setLastname(new String(arr, "ISO-8859-1"));
            p.setGender(dit.readChar());
            len = dit.readInt();
            arr = new byte[len];
            dit.readFully(arr);
            p.setReligion(new String(arr, "ISO-8859-1"));
            p.setDay(dit.readInt());
            p.setMonth(dit.readInt());
            p.setYear(dit.readInt());
            len = dit.readInt();
            arr = new byte[len];
            dit.readFully(arr);
            p.setStreet(new String(arr, "ISO-8859-1"));
            p.setPlz(dit.readInt());
            len = dit.readInt();
            arr = new byte[len];
            dit.readFully(arr);
            p.setLocation(new String(arr, "ISO-8859-1"));
            len = dit.readInt();
            arr = new byte[len];
            dit.readFully(arr);
            p.setShortName(new String(arr, "ISO-8859-1"));
            len = dit.readInt();
            arr = new byte[len];
            dit.readFully(arr);
            p.setClassName(new String(arr, "ISO-8859-1"));
            len = dit.readInt();
            arr = new byte[len];
            dit.readFully(arr);
            p.setKvLastname(new String(arr, "ISO-8859-1"));

            pupils.add(p);
        }
    }
}

Writing:

public void exportDST(String dest, ArrayList<Pupil> pupils) throws IOException{
    this.pupils = pupils;
    try (DataOutputStream dot = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(dest)))) {
        for (Pupil p : pupils) {
            dot.writeInt(p.getId());
            dot.writeInt(p.getNumber());
            dot.writeInt(p.getFirstname().length());
            dot.writeBytes(p.getFirstname());
            dot.writeInt(p.getLastname().length());
            dot.writeBytes(p.getLastname());
            dot.writeChar(p.getGender());
            dot.writeInt(p.getReligion().length());
            dot.writeBytes(p.getReligion());
            dot.writeInt(p.getDay());
            dot.writeInt(p.getMonth());
            dot.writeInt(p.getYear());
            dot.writeInt(p.getStreet().length());
            dot.writeBytes(p.getStreet());
            dot.writeInt(p.getPlz());
            dot.writeInt(p.getLocation().length());
            dot.writeBytes(p.getLocation());
            dot.writeInt(p.getShortName().length());
            dot.writeBytes(p.getShortName());
            dot.writeInt(p.getClassName().length());
            dot.writeBytes(p.getClassName());
            dot.writeInt(p.getKvLastname().length());
            dot.writeBytes(p.getKvLastname());
        }
    }
}
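
The repeated length-plus-bytes pattern above could be factored into two helpers. This is only a sketch: the class `Latin1Strings` and its methods `writeString`, `readString` and `roundTrip` are hypothetical names, not part of the code above. Unlike `writeBytes`, which keeps only the low byte of each char, it encodes explicitly with `getBytes` and writes the encoded byte count as the length prefix.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class Latin1Strings {
    // Writes the encoded byte count first, then the ISO-8859-1 bytes.
    static void writeString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Reads a length prefix, then exactly that many bytes, and decodes them.
    static String readString(DataInputStream in) throws IOException {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // In-memory round trip, included only to make the sketch easy to try out.
    static String roundTrip(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            writeString(out, s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray()))) {
            return readString(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With helpers like these, each string field would shrink to one line per direction, for example `writeString(dot, p.getFirstname());` and `p.setFirstname(readString(dit));`. Note that `getBytes` replaces characters that ISO-8859-1 cannot represent with `?`, so this format only preserves text within that charset.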

Thanks for all your responses!

Stefan