
I am trying to run a Java program that uses the WEKA libraries on a cluster.

This cluster times out submitted jobs after 12 hours, and I can't change this fact because I am a student and not in charge of the cluster.

What I want to do is save the state of my JVM and reload it later: basically, close the program for a time and then pick up where I left off.

Is this possible?

I don't think I can (easily, at least) write the state of the variables in the WEKA objects to a file with ObjectOutputStream and reload them. I'm using the WEKA libraries as they are, and rewriting the code of these machine learning programs would be extremely complicated (though that might be what I have to do).

I tried a library called javaflow, which from reading around I thought might accomplish this, but I cannot get it to work. When I try to run its counting example, I get this error:

Apr 20, 2016 9:15:12 PM org.apache.commons.javaflow.bytecode.StackRecorder execute
SEVERE: stack corruption. Is class test_javaflow.MyRunnable instrumented for javaflow?
java.lang.IllegalStateException: stack corruption. Is class test_javaflow.MyRunnable instrumented for javaflow?
    at org.apache.commons.javaflow.bytecode.StackRecorder.execute(StackRecorder.java:102)
    at org.apache.commons.javaflow.Continuation.continueWith(Continuation.java:170)
    at org.apache.commons.javaflow.Continuation.startWith(Continuation.java:129)
    at org.apache.commons.javaflow.Continuation.startWith(Continuation.java:102)
    at test_javaflow.Test_Javaflow.main(Test_Javaflow.java:16)

Googling this error comes up with a few pages about something called JasperSoft, which I'm fairly certain isn't what I'm looking for.

Paras
  • Have you tried running the JVM in a virtual machine? Virtualization tools like VMware and VirtualBox can save the state of the entire machine. You can run the machine, save its state, and then load it as is a few days later. – 11thdimension Apr 21 '16 at 03:11
  • Possibly a duplicate of http://stackoverflow.com/questions/611134/can-the-jvm-provide-snapshot-persistence – Duong Nguyen Apr 21 '16 at 03:50
  • I have not yet tried running it on a virtual machine. I suppose this is possible. I'll have to ask the person in charge of the cluster. – Arthur Dunbar Apr 21 '16 at 04:06
  • To answer the question in your title, you write the state to a file when your program ends, and read the file when you start the program again. You can use parameters, XML, JSON, or any other format that makes sense for your file. – Gilbert Le Blanc Apr 21 '16 at 07:13
  • This is not quite what I'm looking for. I need it to basically recover from a crash. – Arthur Dunbar Apr 23 '16 at 01:39
  • Also, the virtual machine idea isn't going to work, sadly. The administrator of the cluster said he didn't want to have one set up on it. – Arthur Dunbar Apr 23 '16 at 01:40
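
The approach from Gilbert Le Blanc's comment (write the state to a file, read it back on the next start) could look roughly like the sketch below, using plain Java serialization. Everything named here is hypothetical and not from the question: the CheckpointDemo class, the JobState fields, the job.ckpt file name, and the "steps" that stand in for real work. It does not capture the JVM itself. It only helps if the job can be split into steps and the state between steps is Serializable. Whether a given WEKA object can be stored this way is an assumption to check against your WEKA version.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class CheckpointDemo {

    /** Hypothetical job state: everything needed to resume the loop. */
    static class JobState implements Serializable {
        private static final long serialVersionUID = 1L;
        int nextStep = 0;     // first step that has not run yet
        long runningSum = 0;  // stand-in for real results, e.g. a partially built model
    }

    /**
     * Writes the state to a temporary file and then moves it into place,
     * so an interruption during the write leaves the previous checkpoint intact.
     */
    static void save(Path file, JobState state) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(tmp))) {
            out.writeObject(state);
        }
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Returns the saved state, or a fresh one if no checkpoint exists yet. */
    static JobState loadOrNew(Path file) throws IOException, ClassNotFoundException {
        if (!Files.exists(file)) {
            return new JobState();
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (JobState) in.readObject();
        }
    }

    /** Resumes from the checkpoint and runs at most maxStepsThisRun further steps. */
    static JobState run(Path file, int totalSteps, int maxStepsThisRun)
            throws IOException, ClassNotFoundException {
        JobState state = loadOrNew(file);
        int done = 0;
        while (state.nextStep < totalSteps && done < maxStepsThisRun) {
            state.runningSum += state.nextStep; // placeholder for one unit of real work
            state.nextStep++;
            done++;
            save(file, state); // checkpoint after every step; a real job might do this less often
        }
        return state;
    }

    public static void main(String[] args) throws Exception {
        Path file = Paths.get(args.length > 0 ? args[0] : "job.ckpt");
        JobState state = run(file, 10, 4);
        System.out.println("next step: " + state.nextStep + ", sum so far: " + state.runningSum);
    }
}
```

Each run picks up at nextStep from the previous run, so with these numbers three runs in a row finish all 10 steps. If the job is killed in the middle of a step, only that step is repeated.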

1 Answer


Have a look at Docker's checkpoint command. It lets you save the current state of a Docker container and resume it later. I've been using it for a similar case, a JVM-based system, because initialization takes a long time. With a checkpoint I can restart the container from a known state multiple times.

alanr