ObjectOutputStream writes the same instances differently depending on how I open the stream

Question

For some reason, writing instances through a single ObjectOutputStream and writing them separately (opening a new ObjectOutputStream on each iteration) produce two different files.

ObjectOutputStream objectOutputStream = new ObjectOutputStream(new FileOutputStream(filepath1));
for (User user : users)
{
    objectOutputStream.writeObject(user);
}

objectOutputStream.close();

for (User user : users)
{
    objectOutputStream = new ObjectOutputStream(new FileOutputStream(filepath2, true));
    objectOutputStream.writeObject(user);
    objectOutputStream.close();
}

When I read the files in a loop like this, it works only for the first file.

for( int i = 0; i< users.length; i++)
{
    readUsers[i] = (User)objectInputStream.readObject();
}

Reading the second file gives me ONE correctly read user, followed by the exception java.io.StreamCorruptedException: invalid type code: AC. I've inspected the content of these files, and there seems to be extra data at the start of each record in the second one (it takes almost twice as much space as the first one). So how can I combine the second way of writing instances to a file with reading them back in a simple loop?

***"...there's an excessive data at the start of each record..."*** -- That is because ObjectOutputStream uses `Modified UTF-8`. This encoding only supports characters up to `0xFFFF`. And each call of `writeUTF()` will add a header which represents the size of the string. I would personally avoid using these streams (input/output) entirely unless you know what you are doing. — Darkman, Mar 02 '22 at 08:41
"*different files*" -> from the [documentation](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/ObjectOutputStream.html#%3Cinit%3E(java.io.OutputStream)) of the constructor of `ObjectOutputStream`: "*This constructor writes the serialization stream header to the underlying stream*" - in other words, every time a new one is created, an additional header is written to the output - the `writeObject` method also does not duplicate objects (when called on the same instance), see the documentation of `writeUnshared()` — user16320675, Mar 02 '22 at 09:03

score 1 · Accepted Answer · answered Mar 02 '22 at 10:25

Java serialization, implemented by Object{Output,Input}Stream, in addition to metadata and data for each object, has a stream header that only occurs once and not for each object. Normally in a file this amounts to a file header. If you want to put multiple streams in one file, you must manage the stream boundaries yourself:

//nopackage
import java.io.*;

public class SO71319428MultipleSerial {
  public static void main (String[] args) throws Exception {
    User[] a = { new User("Alice",1), new User("Bob",2), new User("Carol",3) };
    for( User u : a )
      try( ObjectOutputStream oo = new ObjectOutputStream(new FileOutputStream(filename,true)) ){
        oo.writeObject(u);
      }
    System.out.println("reading "+new File(filename).length());
    try( InputStream fi = new FileInputStream(filename) ){
      for( int i = 0; i < 3; i++ ){
        ObjectInputStream oi = new ObjectInputStream(fi);
        System.out.println( oi.readObject() );
        // DON'T close because that closes the underlying FileInputStream; just leak instead
      }
    }
  }
  public static String filename = "SO71319428.out";

  static class User implements Serializable {
    String name; int id;
    public User(String name, int id){ this.name=name; this.id=id; }
    public String toString(){ return name+" is #"+id; }
  }
}
->
reading 283
Alice is #1
Bob is #2
Carol is #3

This is less flexible than the single-stream (with multiple objects) approach, because that allows you to write any sequence of objects whose types and end can be recognized when reading without any help; with the multiple-stream approach your code must be able to determine when a stream ends and needs to be restarted.
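
If the goal is to keep opening a new stream for every append and still read everything back with one ObjectInputStream, a commonly used alternative is to suppress the header on every stream after the first. The sketch below is not part of the answer above: `AppendDemo`, `AppendingObjectOutputStream`, `append`, `readAll`, and the temp-file name are hypothetical names chosen for illustration. It overrides `writeStreamHeader()` to call `reset()`, which writes a reset marker instead of a second header, and it assumes the reader already knows how many objects to expect.

```java
import java.io.*;
import java.util.ArrayList;
import java.util.List;

public class AppendDemo {
  // Hypothetical helper: writes a reset marker where the stream header would go,
  // so the output can follow an existing serialization stream in the same file.
  static class AppendingObjectOutputStream extends ObjectOutputStream {
    AppendingObjectOutputStream(OutputStream out) throws IOException { super(out); }
    @Override protected void writeStreamHeader() throws IOException {
      reset(); // tells the reader to forget handles from the previous stream
    }
  }

  static class User implements Serializable {
    String name; int id;
    User(String name, int id) { this.name = name; this.id = id; }
    public String toString() { return name + " is #" + id; }
  }

  // Appends one object; only the very first write to an empty file gets a header.
  static void append(File file, Object obj) throws IOException {
    boolean needsHeader = file.length() == 0; // length() is 0 for a missing or empty file
    try (FileOutputStream fo = new FileOutputStream(file, true);
         ObjectOutputStream oo = needsHeader
             ? new ObjectOutputStream(fo)
             : new AppendingObjectOutputStream(fo)) {
      oo.writeObject(obj);
    }
  }

  // Reads 'count' objects through a single ObjectInputStream.
  static List<Object> readAll(File file, int count)
      throws IOException, ClassNotFoundException {
    List<Object> result = new ArrayList<>();
    try (ObjectInputStream oi = new ObjectInputStream(new FileInputStream(file))) {
      for (int i = 0; i < count; i++) {
        result.add(oi.readObject());
      }
    }
    return result;
  }

  public static void main(String[] args) throws Exception {
    File file = File.createTempFile("users", ".ser"); // assumed scratch location
    file.deleteOnExit();
    append(file, new User("Alice", 1));
    append(file, new User("Bob", 2));
    append(file, new User("Carol", 3));
    for (Object o : readAll(file, 3)) {
      System.out.println(o);
    }
  }
}
```

The trade-off is that the writer has to know whether the file already holds a stream, which this sketch decides from the file length alone; a file that contains anything other than a valid serialization stream would still fail on read.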

I stand corrected, will update my answer. — GhostCat, Mar 02 '22 at 10:27

GhostCat · Answer 2 · 2022-03-02T10:28:29.620

0

You can't easily write the data from two different ObjectOutputStreams into the same file.

Meaning:

either you want to write to different files OR
you use the same ObjectOutputStream to write all your objects into that single file.

This is a binary data structure with an implicit layout: the corresponding protocol simply doesn't expect another serialized byte stream to show up in a file after a first sequence of objects has been processed completely - unless you "massage" the exact content going into that file (see the other answer for details).
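
To make the layout point concrete, here is a small sketch (the class and method names `HeaderDemo` and `headerBytes` are made up for illustration) that captures what a freshly constructed ObjectOutputStream emits before any object is written. The four bytes correspond to `ObjectStreamConstants.STREAM_MAGIC` (0xACED) and `STREAM_VERSION` (5). When a second stream is appended to the same file, a reader that has already consumed the first header meets 0xAC where it expects an object type code, which matches the "invalid type code: AC" message in the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class HeaderDemo {
  // Returns the bytes an ObjectOutputStream writes when no object is written at all.
  static byte[] headerBytes() throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream oo = new ObjectOutputStream(bytes)) {
      oo.flush(); // make sure the header has reached the byte array
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    // Print each header byte as two hex digits.
    for (byte b : headerBytes()) {
      System.out.printf("%02X ", b);
    }
    System.out.println();
  }
}
```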

edited Mar 02 '22 at 10:28

answered Mar 02 '22 at 08:26


1

You can, but it's harder. – dave_thompson_085 Mar 02 '22 at 10:20
