Before I come to my question, I want to point out that I know the difference between Serializable and Externalizable, so there is no need to explain it.

What I am basically trying to do is save a class with all its data to a file. Java 9 is out and the JVM is very fast, but there are still people (whose opinions I trust) who say that using Serializable on a huge amount of data is very inefficient compared to using Externalizable.

If I only had around 10 fields of ordinary data types like integers or booleans, I would definitely use Serializable.

But now I have a bit more data to store and load, e.g. a 3-dimensional byte array which contains around 3.3 million elements, and I think it would be very inefficient to save data like this via the reflection-based mechanism behind Serializable. Since I am not 100% sure that Externalizable is more efficient for storing such a huge amount of data, I would like to make sure before I start using it in my program. The program does not need to save the data fast, but it does need to load it very fast, and more than once: it does some calculations first and then loads data multiple times while running, because it needs different datasets depending on its current state. So my idea is to load the byte array via asynchronous multithreading in the Externalizable#readExternal() method.
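A minimal sketch of the background-loading idea, with one change: instead of starting threads inside readExternal(), the whole load runs on a background thread while the calculations continue. AsyncLoader and loadAsync are hypothetical names used only for illustration, and the sketch assumes the file was written with a plain ObjectOutputStream.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper (not part of the project): loads a serialized object in the background.
final class AsyncLoader {

    // Starts reading the file on a background thread and returns immediately.
    static CompletableFuture<Object> loadAsync(File file) {
        return CompletableFuture.supplyAsync(() -> {
            try (ObjectInputStream in = new ObjectInputStream(
                    new BufferedInputStream(new FileInputStream(file)))) {
                return in.readObject();
            } catch (IOException | ClassNotFoundException e) {
                // Surfaces later as the cause of a CompletionException when join() is called.
                throw new IllegalStateException("Could not load " + file, e);
            }
        });
    }
}
```

The caller can keep computing and call join() on the returned future at the point where the dataset is actually needed.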

Please correct me if I am wrong in thinking that Externalizable is the more efficient way here, because I want the program to run as smoothly as possible while it is loading the data!

Kind regards,

Fabian Schmidt!

Fabian Schmidt
  • Easy enough to test & measure. `Externalizable` doesn't have to do all the Reflection that `Serializable` does, but it's a lot more work in terms of coding and maintenance. – user207421 Oct 20 '17 at 00:54
  • I think the idea that `Serializable` is slow is fairly old. Modern JVMs like 7 and 8 implement a lot of speed-ups to help `Serializable` run much faster. I would start with that and only investigate further if it was in fact running slower than acceptable. – markspace Oct 20 '17 at 00:57
  • Well then, I think the best way is just to compare both methods and the times they take to save/load data. – Fabian Schmidt Oct 20 '17 at 00:58
  • I think @markspace is right on the money here. You don't need it to be as fast as possible, you need it to be fast *enough*. In the old days we had to make sort-merges fast enough so they didn't run into a second operator shift. Any faster than that there was really no payback. – user207421 Oct 20 '17 at 01:18
  • Well, as you two can see now, it clearly makes a difference so far, unless something in my export implementation could be done more efficiently, but I do think I implemented it in the best way possible! – Fabian Schmidt Oct 20 '17 at 02:07

1 Answer

Basically, what I have done now is compare the time it takes to save/load via reflection versus my own implementation.

The code for the test:

Main Class (Comparision.class)

package de.cammeritz.chunksaver.util;

import java.io.File;

/**
 * Created by Fabian / Cammeritz on 20.10.2017 at 03:15.
 */

public class Comparision {

    public static void main(String args[]) {

        long start;
        long end;

        //Preparing datasets

        DataSerializable dataSerializable = createSerializable();
        DataExternalizable dataExternalizable = createExternalizable();

        //Storage files

        File sFile = new File(System.getProperty("user.dir"), "sFile.dat");
        File eFile = new File(System.getProperty("user.dir"), "eFile.dat");

        //Saving via reflection

        start = System.currentTimeMillis();

        FileUtil.save(dataSerializable, sFile);

        end = System.currentTimeMillis();

        System.out.println("Time taken to save via reflection in milliseconds: " + (end - start));

        //Saving via my own code

        start = System.currentTimeMillis();

        FileUtil.save(dataExternalizable, eFile);

        end = System.currentTimeMillis();

        System.out.println("Time taken to save via my own code in milliseconds: " + (end - start));

        //Loading via reflection

        start = System.currentTimeMillis();

        dataSerializable = (DataSerializable) FileUtil.load(sFile);

        end = System.currentTimeMillis();

        System.out.println("Time taken to load via reflection in milliseconds: " + (end - start));

        //Loading via my own code

        start = System.currentTimeMillis();

        dataExternalizable = (DataExternalizable) FileUtil.load(eFile);

        end = System.currentTimeMillis();

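        // Label corrected: this measures loading (the original printed "save" here, which is why the recorded output says "save").
        System.out.println("Time taken to load via my own code in milliseconds: " + (end - start));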

    }

    private static DataSerializable createSerializable() {
        DataSerializable data = new DataSerializable(7);
        for (int cx = 0; cx < data.getSideSize(); cx++) {
            for (int cz = 0; cz < data.getSideSize(); cz++) {
                for (int x = 0; x < data.getX(); x++) {
                    for (int y = 0; y < data.getY(); y++) {
                        for (int z = 0; z < data.getZ(); z++) {
                            data.setValue(cx, cz, x, y, z, (byte) 0x7f);
                        }
                    }
                }
            }
        }
        return data;
    }

    private static DataExternalizable createExternalizable() {
        DataExternalizable data = new DataExternalizable(7);
        for (int cx = 0; cx < data.getSideSize(); cx++) {
            for (int cz = 0; cz < data.getSideSize(); cz++) {
                for (int x = 0; x < data.getX(); x++) {
                    for (int y = 0; y < data.getY(); y++) {
                        for (int z = 0; z < data.getZ(); z++) {
                            data.setValue(cx, cz, x, y, z, (byte) 0x7f);
                        }
                    }
                }
            }
        }
        return data;
    }

}
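The FileUtil class is not included in this post. A minimal sketch of what such a helper might look like, assuming it simply wraps ObjectOutputStream and ObjectInputStream around buffered file streams (FileUtilSketch is a hypothetical name, and the real implementation may differ, for example in buffering or exception handling):

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical stand-in for the FileUtil class used above; the real one is not shown.
final class FileUtilSketch {

    // Writes the object graph to the file using Java serialization.
    static void save(Object data, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeObject(data);
        }
    }

    // Reads a single object back from the file.
    static Object load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            return in.readObject();
        }
    }
}
```

Whether the streams are buffered can affect timings like the ones reported below, so it is worth checking in the real helper.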

Serialization via reflection:

package de.cammeritz.chunksaver.util;

import java.io.Serializable;

/**
 * Created by Fabian / Cammeritz on 20.10.2017 at 02:59.
 */

public class DataSerializable implements Serializable {

    private final int x = 16;
    private final int y = 256;
    private final int z = 16;

    private byte[][][][][] ids = null;
    private int sideSize;

    public DataSerializable(int sideSize) {
        this.sideSize = sideSize;
        ids = new byte[sideSize][sideSize][16][256][16];
    }

    public int getX() {
        return x;
    }

    public int getY() {
        return y;
    }

    public int getZ() {
        return z;
    }

    public int getSideSize() {
        return sideSize;
    }

    public byte getValue(int cx, int cz, int x, int y, int z) {
        return ids[cx][cz][x][y][z];
    }

    public void setValue(int cx, int cz, int x, int y, int z, byte value) {
        ids[cx][cz][x][y][z] = value;
        return;
    }

}

Serialization via my own implementation:

package de.cammeritz.chunksaver.util;

import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

/**
 * Created by Fabian / Cammeritz on 20.10.2017 at 02:58.
 */

public class DataExternalizable implements Externalizable {

    private final int x = 16;
    private final int y = 256;
    private final int z = 16;

    private byte[][][][][] ids = null;
    private int sideSize;

    public DataExternalizable() {

    }

    public DataExternalizable(int sideSize) {
        this.sideSize = sideSize;
        ids = new byte[sideSize][sideSize][16][256][16];
    }

    public int getX() {
        return x;
    }

    public int getY() {
        return y;
    }

    public int getZ() {
        return z;
    }

    public int getSideSize() {
        return sideSize;
    }

    public byte getValue(int cx, int cz, int x, int y, int z) {
        return ids[cx][cz][x][y][z];
    }

    public void setValue(int cx, int cz, int x, int y, int z, byte value) {
        ids[cx][cz][x][y][z] = value;
        return;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeObject(ids);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        ids = (byte[][][][][]) in.readObject();
    }
}

Basically, I can agree with what @markspace said above ("I think the idea that Serializable is slow is fairly old. Modern JVMs like 7 and 8 implement a lot of speed-ups to help Serializable run much faster. I would start with that and only investigate further if it was in fact running slower than acceptable") and also with what @EJP said ("I think @markspace is right on the money here. You don't need it to be as fast as possible, you need it to be fast enough. In the old days we had to make sort-merges fast enough so they didn't run into a second operator shift. Any faster than that there was really no payback.")

The problem with the test is that the results are confusing, yet they also show that I will definitely use Externalizable here.

Results from 3 test runs with the same values and the exact dataset sizes I will need later in my project:

Time taken to save via reflection in milliseconds: 746
Time taken to save via my own code in milliseconds: 812
Time taken to load via reflection in milliseconds: 3191
Time taken to save via my own code in milliseconds: 2811

Time taken to save via reflection in milliseconds: 755
Time taken to save via my own code in milliseconds: 934
Time taken to load via reflection in milliseconds: 3545
Time taken to save via my own code in milliseconds: 2671

Time taken to save via reflection in milliseconds: 401
Time taken to save via my own code in milliseconds: 784
Time taken to load via reflection in milliseconds: 3065
Time taken to save via my own code in milliseconds: 2627

What confuses me is that the reflection implementation saves significantly faster than my own implementation, but on the other hand takes roughly 0.4 to 0.9 seconds longer to load the data.
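A caveat about numbers like these: a single System.currentTimeMillis() measurement also includes class loading, JIT warm-up and disk effects, so the differences between runs can be misleading. A rough sketch of a steadier load measurement, assuming in-memory streams and a few untimed warm-up rounds (RoundTripTimer and its method names are hypothetical):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical timing helper; measures deserialization only, without touching the disk.
final class RoundTripTimer {

    // Serializes the object into memory and returns the bytes.
    static byte[] write(Object data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(data);
        }
        return bytes.toByteArray();
    }

    // Deserializes one object from the given bytes.
    static Object read(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    // Average load time in milliseconds over `rounds` timed runs, after `warmup` untimed runs.
    static double averageLoadMillis(Object data, int warmup, int rounds)
            throws IOException, ClassNotFoundException {
        byte[] bytes = write(data);
        for (int i = 0; i < warmup; i++) {
            read(bytes);
        }
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            read(bytes);
        }
        return (System.nanoTime() - start) / 1_000_000.0 / rounds;
    }
}
```

Calling averageLoadMillis once with a DataSerializable instance and once with a DataExternalizable instance would give two averages that are easier to compare than single runs.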

The point is that a difference of this size is very significant for what I am planning to do, since the saving time does not really matter but the loading has to be quick. So the outcome clearly shows me that I should use Externalizable here.

But can anyone here tell me why exactly the reflection way saves faster, and how I could improve my own implementation of saving the data?

Thanks to all!

Fabian Schmidt
  • Please indicate *clearly* where you are using `Serializable` and where you are using `Externalizable`. The part about 'using Reflection' versus 'using my own code' is meaningless, as you are in fact using `Serializable` (via `writeObject(byte[][][][][])`), and therefore reflection, in both cases. – user207421 Oct 20 '17 at 02:17
  • Alright, let me clear up what I may have explained badly. I am confused that saving the DataSerializable class is faster than saving the DataExternalizable class, while loading the DataSerializable class takes significantly more time than loading the DataExternalizable class! – Fabian Schmidt Oct 20 '17 at 02:28
  • This is really of no interest, especially as you have failed to clarify as requested. It would be far more interesting if the `Externalizable` code bypassed serialization altogether and just wrote out the array dimensions and the actual lowest-level byte arrays, and read them back the same way, using nothing more than the methods of `DataInput` and `DataOutput`. – user207421 Oct 20 '17 at 05:04
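The suggestion in the last comment could look roughly like the sketch below. It is an illustration only: RawChunkData is a hypothetical variant of DataExternalizable, it writes just sideSize because the other three dimensions are constants in the classes above, and whether it actually loads faster would still have to be measured.

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

// Hypothetical variant of DataExternalizable that never calls writeObject/readObject on the array.
final class RawChunkData implements Externalizable {

    private static final int X = 16, Y = 256, Z = 16;

    private int sideSize;
    private byte[][][][][] ids;

    // Public no-arg constructor, required for Externalizable deserialization.
    public RawChunkData() { }

    public RawChunkData(int sideSize) {
        this.sideSize = sideSize;
        ids = new byte[sideSize][sideSize][X][Y][Z];
    }

    public byte getValue(int cx, int cz, int x, int y, int z) {
        return ids[cx][cz][x][y][z];
    }

    public void setValue(int cx, int cz, int x, int y, int z, byte value) {
        ids[cx][cz][x][y][z] = value;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(sideSize); // the only variable dimension
        for (byte[][][][] a : ids)
            for (byte[][][] b : a)
                for (byte[][] c : b)
                    for (byte[] row : c)
                        out.write(row); // DataOutput.write(byte[]): raw bytes only
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        sideSize = in.readInt();
        ids = new byte[sideSize][sideSize][X][Y][Z];
        for (byte[][][][] a : ids)
            for (byte[][][] b : a)
                for (byte[][] c : b)
                    for (byte[] row : c)
                        in.readFully(row); // DataInput.readFully fills the whole row
    }
}
```

The object is still saved and loaded through ObjectOutputStream and ObjectInputStream as before; only the array contents bypass the per-array serialization metadata.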