
I have a quite large Java object that represents a graph, with vertices and edges, in memory. Each vertex has an ArrayList of other vertices that it is connected to (and has a HashMap data structure as well for other purposes). The graph can have a few thousand vertices, and many more edges.

When I try to serialize the graph using Java's built-in serialization (implements Serializable, etc.), I always run into a StackOverflowError. Marking other attributes of the graph as transient does not help, and neither does increasing the stack size (e.g. -Xss1g or -Xss512m).

I would not think that I need to make a custom writeObject method since ArrayList and HashMap already have their own implementations, which are called upon serialization.

My question is: is there a way to serialize a large Java object already in memory without getting a StackOverflowError?

Edit: Here is the stack trace:

Exception in thread "main" java.lang.StackOverflowError
at java.lang.reflect.Method.invoke(Method.java:575)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:950)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1482)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1535)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:329)
at java.util.ArrayList.writeObject(ArrayList.java:570)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:950)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1482)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1535)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:329)
at java.util.ArrayList.writeObject(ArrayList.java:570)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:950)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1482)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1535)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:329)
at java.util.ArrayList.writeObject(ArrayList.java:570)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:950)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1482)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1535)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:329)
at java.util.ArrayList.writeObject(ArrayList.java:570)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:950)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1482)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1535)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:329)
at java.util.ArrayList.writeObject(ArrayList.java:570)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:950)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1482)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1535)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:329)
at java.util.ArrayList.writeObject(ArrayList.java:570)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:950)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1482)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1535)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:329)
at java.util.ArrayList.writeObject(ArrayList.java:570)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:950)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1482)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1535)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:329)
at java.util.ArrayList.writeObject(ArrayList.java:570)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:950)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1482)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1535)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1413)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1159)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:329)
at java.util.ArrayList.writeObject(ArrayList.java:570)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
// Many more lines after this

Here is an overview of my Graph class:

public class Graph implements Serializable {

/**
 * 
 */
private static final long serialVersionUID = -2632163054149021990L;
private ArrayList<Vertex> vertices;

private HashMap<Integer, Set<Vertex>> map;

public Graph(int rowMax, int colMax)
{
    map = new HashMap<Integer, Set<Vertex>>();

    this.vertices = new ArrayList<Vertex>();
}

public void connectVertices(Vertex u, Vertex v)
{
    u.addNeighbor(v);
    v.addNeighbor(u);
}

// other unrelated methods after this

And here is my Vertex class:

public class Vertex implements Serializable {

/**
 * 
 */
private static final long serialVersionUID = 8520500010710631610L;
public int row;
public int col;
private ArrayList<Vertex> neighbors; // may change this to Set<Vertex>

public Vertex(int i, int j)
{
    this.row = i;
    this.col = j;
    this.neighbors = new ArrayList<Vertex>();
}

public boolean addNeighbor(Vertex v)
{
    this.neighbors.add(v);
    return true;
}

// unrelated methods after this

Edit 2: Also, smaller graphs do not have this problem, even though their vertices also have "neighbors".

Ryan Dougherty
  • Recursion with respect to what? Also, writing a custom writeObject method with guaranteed no recursion in that method does not change anything. – Ryan Dougherty Aug 05 '14 at 20:21
  • Could it be that you have a circular reference somewhere? If it is already in memory, I don't think the problem is the object size. – Arno van Lieshout Aug 05 '14 at 20:23
  • That may be true, I did not know that could happen. I do have a generated `serialVersionUID` for vertices, though. How would I avoid this problem? – Ryan Dougherty Aug 05 '14 at 20:24
  • Manually design how to serialize/deserialize your objects to avoid these problems. – Luiggi Mendoza Aug 05 '14 at 20:26
  • @LuiggiMendoza Do you have references on how to serialize collections with some unique identifier to avoid recursions? – Ryan Dougherty Aug 05 '14 at 20:28
  • Guys, serialisation of objects that have already been serialized does *not* cause stack overflow. Java serializes a handle instead of the object. Cyclic object graphs can be serialised. You are all barking up the wrong tree. @OP you will have to post some code. Do you have custom writeObject() methods? – user207421 Aug 05 '14 at 20:31
  • @EJP I did not use custom writeObject() methods because I thought each of the data structures I had already have that method. – Ryan Dougherty Aug 05 '14 at 20:40
  • Some do; some don't: some just use default Serialization. You will still have to post some code, and the stack trace come to think of it. – user207421 Aug 05 '14 at 20:42
  • @EJP I will post some code and stack trace. – Ryan Dougherty Aug 05 '14 at 20:43
  • I would certainly change the neighbour list to a Set. This will reduce the data size very considerably. – user207421 Aug 05 '14 at 20:57
  • @EJP the way that it is set up, using a set or arraylist shouldn't make a difference. I'll see if that helps, though. – Ryan Dougherty Aug 05 '14 at 21:25
  • It will reduce the data size. Of course it will make a difference. It may not solve this problem, but I didn't claim it would. – user207421 Aug 05 '14 at 21:26
  • I think I would consider just serializing the co-ordinates, and reconstructing the neighbour sets/lists on deserialization. They're only a computational convenience after all, they're not essential to define the graph. – user207421 Aug 05 '14 at 21:35
  • @EJP A good idea, but the computations that I only want to do once are the edge connections between vertices, since those take by far the longest time for what I'm doing. – Ryan Dougherty Aug 05 '14 at 21:45
  • Well you're certainly going to have to trade space for time somehow. – user207421 Aug 06 '14 at 00:35

3 Answers


Serialization will throw a StackOverflowError if your graph is too deep for the default serialization to handle. This is because the default serialization recursively serializes each node as it traverses your graph, so every level of nesting adds more frames to the call stack.

Flat structures will work fine (e.g. a parent node with 2000 children), but deep structures will fail (e.g. a node with 2000 descendant levels).

For example, the following will overflow the stack:

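import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

public class Node implements Serializable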
{
    private ArrayList<Node> nodes = new ArrayList<Node>();

    public static void main(String[] args) throws Exception
    {
        Node node = new Node();
        int depth = 3000;

        // Add nodes chained down to specified depth
        Node last = node;
        for (int i = 0; i < depth; i++)
        {
            Node temp = new Node();
            last.nodes.add(temp);
            last = temp;
        }

        System.out.println("starting");

        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bos);
        // Below line will cause a stack overflow.
        out.writeObject(node);

        System.out.println("done");
    }
}
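For contrast, here is a minimal sketch of the flat case described above: one parent with 2000 direct children. The class name `FlatNode` and the helper `serializedSize` are made up for illustration. Because the recursion only goes one level below the root, this is expected to serialize without exhausting the stack.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical counterpart to the deep example: a wide but shallow structure.
public class FlatNode implements Serializable
{
    private static final long serialVersionUID = 1L;

    private final ArrayList<FlatNode> children = new ArrayList<FlatNode>();

    // Builds one root with `width` direct children and returns the serialized size in bytes.
    static int serializedSize(int width) throws IOException
    {
        FlatNode root = new FlatNode();
        for (int i = 0; i < width; i++)
        {
            root.children.add(new FlatNode());
        }

        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos))
        {
            // Recursion depth stays constant: root -> children list -> one child at a time.
            out.writeObject(root);
        }
        return bos.size();
    }

    public static void main(String[] args) throws IOException
    {
        System.out.println("serialized " + serializedSize(2000) + " bytes");
    }
}
```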

You will need to either reduce the depth of your graph in order to limit the number of recursive serialization calls, or write custom serialization to work around this. The custom serialization will need to be non-recursive in nature, and unfortunately at first glance seems non-trivial to implement.
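One possible shape for such non-recursive custom serialization is sketched below. It is an illustration rather than a drop-in fix: the classes are simplified stand-ins for the question's `Graph` and `Vertex`, the wrapper class `FlatGraphDemo` and the helpers `chain` and `roundTrip` are invented for the example, and it assumes every neighbour of a vertex is also present in the graph's `vertices` list. The idea is to mark the neighbour list `transient` so default serialization never follows it, and to have the graph write the edges itself as integer indices.

```java
import java.io.*;
import java.util.*;

public class FlatGraphDemo {

    static class Vertex implements Serializable {
        private static final long serialVersionUID = 1L;
        final int row, col;
        // transient: default serialization no longer follows neighbour links
        transient List<Vertex> neighbors = new ArrayList<>();
        Vertex(int row, int col) { this.row = row; this.col = col; }
    }

    static class Graph implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<Vertex> vertices = new ArrayList<>();

        void connect(Vertex u, Vertex v) { u.neighbors.add(v); v.neighbors.add(u); }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // vertex list only; each Vertex is now shallow
            Map<Vertex, Integer> index = new IdentityHashMap<>();
            for (int i = 0; i < vertices.size(); i++) index.put(vertices.get(i), i);
            for (Vertex v : vertices) { // adjacency written as plain ints, no recursion
                out.writeInt(v.neighbors.size());
                for (Vertex n : v.neighbors) out.writeInt(index.get(n));
            }
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            // transient fields come back as null, so recreate the lists first
            for (Vertex v : vertices) v.neighbors = new ArrayList<>();
            for (Vertex v : vertices) {
                int degree = in.readInt();
                for (int k = 0; k < degree; k++) v.neighbors.add(vertices.get(in.readInt()));
            }
        }
    }

    // Builds a path graph 0 - 1 - ... - (n-1), the worst case for recursion depth.
    static Graph chain(int n) {
        Graph g = new Graph();
        for (int i = 0; i < n; i++) g.vertices.add(new Vertex(i, 0));
        for (int i = 1; i < n; i++) g.connect(g.vertices.get(i - 1), g.vertices.get(i));
        return g;
    }

    static Graph roundTrip(Graph g) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) { out.writeObject(g); }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (Graph) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Graph copy = roundTrip(chain(5000));
        System.out.println("vertices restored: " + copy.vertices.size());
    }
}
```

With this layout the edges are stored rather than recomputed, so the expensive edge construction does not have to be repeated after loading; only the cheap relinking loop in `readObject` runs.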

Trevor Freeman
  • Thanks for the answer. What would be custom serialization so that I just have to call readObject on an ObjectInputStream? – Ryan Dougherty Aug 05 '14 at 21:48
  • Custom serialization involves implementing `writeObject` and `readObject` (and optionally `readObjectNoData`) on the objects to be serialized. As long as these methods are implemented, you can use readObject on an ObjectInputStream; the problem is that the implementation you need here is at least somewhat complicated. – Trevor Freeman Aug 05 '14 at 22:06

I had a similar problem. After much hunting, I found a fork of Kryo designed to handle deeply nested objects. Following https://github.com/EsotericSoftware/kryo/issues/103 , clone https://github.com/romix/kryo/tree/kryo-2.23-continuations and run mvn clean install. The resulting artifact is currently com.esotericsoftware.kryo:kryo:2.23-SNAPSHOT.

(Side note: there are a number of questions on SO that can be answered with this. Should I post copies of the answer (like https://stackoverflow.com/a/43327778/513038), or comment a link to this answer, or flag the questions as duplicate pointing to this one, or what?)

Erhannis

I know that this is years late, but I came across this via Google when trying to fix the issue myself.

I wanted a solution that required a far smaller amount of code than writing custom serialisation code.

The problem is that nodes are connected to nodes, which are connected to further nodes. A very large part of the network is reachable from any given node; for a connected undirected graph, it's every node.

When you attempt to serialise a Node, default serialisation follows these links recursively, so a long chain of reachable nodes can end up on the call stack at once.

The simplest solution is not to connect a node directly to its neighbours, but to add a layer of indirection:

class Node implements Serializable{

    private final Map<NodeRef, Connection> forwardConnections = new HashMap<>();
    private final Map<NodeRef, Connection> reverseConnections = new HashMap<>();

    ...
}

class Connection implements Serializable{

    private final NodeRef source;
    private final NodeRef dest;
    private final ConnectionMetadata meta;

    public Connection(Node source, Node dest, ConnectionMetadata meta){
        this.source = new NodeRef(source);
        this.dest = new NodeRef(dest);
        this.meta = meta; // final field must be assigned, otherwise this does not compile
    }

    public Node getSource(){
        return source.resolve();
    }

    public Node getDest(){
        return dest.resolve();
    }

    public ConnectionMetadata getMeta(){
        return meta;
    }

}

public class NodeRef implements Serializable{

    private transient Node node;
    private final Network network;
    private final int[] uid;

    public NodeRef(Node node){
        this.node = node;
        this.network = node.getNetwork();
        this.uid = node.getUID();
    }

    //Won't be called during deserialisation
    public Node resolve(){
        if(node == null){
            node = network.resolve(uid);
        }
        return node;
    }

    //Will be called when connections maps are deserialised.
    public boolean equals(Object o){
        //you might also want to check that the networks are equal
        return (o instanceof NodeRef) && Arrays.equals(uid, ((NodeRef)o).uid);
    }


    //Will be called when connections maps are deserialised.
    public int hashCode(){
        return Arrays.hashCode(uid);
    }
}
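The `Network` class that `NodeRef` resolves against is not shown above. As a rough, self-contained sketch of how the pieces might fit together, the example below assumes a hypothetical `Network` that acts as a registry from uid to node, uses a plain `int` uid instead of `int[]`, and stores neighbours in a list of `NodeRef` rather than the connection maps. The names `IndirectionDemo`, `chain`, and `roundTrip` are likewise invented. The point it tries to show is that only the registry holds `Node` objects directly, so serialization visits them one at a time from a flat map instead of recursing from neighbour to neighbour.

```java
import java.io.*;
import java.util.*;

public class IndirectionDemo {

    // Hypothetical registry: the only place that references Node objects directly.
    static class Network implements Serializable {
        private static final long serialVersionUID = 1L;
        private final Map<Integer, Node> nodes = new HashMap<>();

        Node add(int uid) {
            Node n = new Node(this, uid);
            nodes.put(uid, n);
            return n;
        }
        Node resolve(int uid) { return nodes.get(uid); }
        int size() { return nodes.size(); }
    }

    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final Network network;
        final int uid;
        final List<NodeRef> neighbors = new ArrayList<>();

        Node(Network network, int uid) { this.network = network; this.uid = uid; }

        void connect(Node other) {
            neighbors.add(new NodeRef(other));
            other.neighbors.add(new NodeRef(this));
        }
    }

    static class NodeRef implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient Node node;   // cache only, skipped by serialization
        private final Network network; // written as a back-reference handle
        private final int uid;

        NodeRef(Node node) { this.node = node; this.network = node.network; this.uid = node.uid; }

        // Looks the node up lazily after deserialization.
        Node resolve() {
            if (node == null) node = network.resolve(uid);
            return node;
        }
    }

    // Builds a chain 0 - 1 - ... - (n-1), which would recurse n levels deep with direct links.
    static Network chain(int n) {
        Network net = new Network();
        Node prev = net.add(0);
        for (int i = 1; i < n; i++) {
            Node next = net.add(i);
            prev.connect(next);
            prev = next;
        }
        return net;
    }

    static Network roundTrip(Network net) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) { out.writeObject(net); }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (Network) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Network copy = roundTrip(chain(5000));
        System.out.println("nodes restored: " + copy.size());
    }
}
```

Note that the whole `Network` is serialized here, not a single `Node`; serializing one node would still pull in the network through its `network` field, but the traversal stays shallow either way.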
user1837841