I am trying to run a tagger through a batch file for several different files. This is my code:

String runap1 = "cd spt1\n"
        + "java -Xss8192K -Xms128m -Xmx640m -classpath stanford-postagger.jar"
        + " edu.stanford.nlp.tagger.maxent.MaxentTagger"
        + " -model models/bidirectional-wsj-0-18.tagger"
        + " -textFile " + fff[g] + ">tag\\" + r1
        + "\nexit";

FileWriter fw1 = new FileWriter("ac.bat");
BufferedWriter bw1 = new BufferedWriter(fw1);

bw1.write(runap1);
bw1.close();

Runtime rx = Runtime.getRuntime();
Process p = null;

try {
    p = rx.exec("cmd.exe /c start ac.bat");
} catch (Exception e) {
    // Report the cause instead of a bare "Error".
    System.out.println("Error: " + e);
}

try {
    Thread.sleep(15000);
} catch (InterruptedException e) {
    System.out.println("Thread interrupted");
}

This takes a long time to process, and my PC has hung several times. I want to load the tagger into shared memory only once, so that every batch file uses the shared copy instead of loading the tagger again each time. How can I do this?

Ry-
Manoj Gupta

1 Answer

If you store your data in a memory-mapped file, you can load it in multiple processes without making additional copies. You can even make changes in one process and see those changes in another.

The problem with this is that you have to work with off-heap memory. It works most simply if the data file is already in a binary form you can use directly, i.e. you don't need to parse it.
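As a rough illustration of the idea, here is a minimal sketch using `FileChannel.map`. The class name, the temporary file, and the layout (a flat array of doubles) are hypothetical and are not the tagger's actual model format. Two independent mappings of the same file stand in for two processes, and the sketch assumes the operating system keeps shared mappings of one file coherent, which the Java specification leaves platform-dependent.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class SharedMappingSketch {

    // Maps the first `size` bytes of `file` read/write.
    // A READ_WRITE mapping grows the file if it is shorter than `size`.
    static MappedByteBuffer map(Path file, int size) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // The mapping remains valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    // Writes a double through one mapping and reads it back through a
    // second, independent mapping of the same file (off-heap, no parsing).
    static double writeThenReadBack(Path file, int index, double value)
            throws IOException {
        int size = (index + 1) * Double.BYTES;
        MappedByteBuffer writer = map(file, size);
        MappedByteBuffer reader = map(file, size);
        writer.putDouble(index * Double.BYTES, value);
        return reader.getDouble(index * Double.BYTES);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical data file; a real one would hold the shared model data.
        Path file = Files.createTempFile("shared-model", ".bin");
        file.toFile().deleteOnExit();
        System.out.println(writeThenReadBack(file, 3, 0.25));
    }
}
```

The values are read straight out of the mapped region with `getDouble`, which is why this approach is easiest when the file is already in a directly usable binary layout. Data that must first be deserialized into Java objects would still be copied onto each process's heap.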

Peter Lawrey