I want to use GATE over the web, so I decided to create a SOAP web service in Java with the help of GATE Embedded.

However, for the same document and the same saved pipeline, the run time varies when GATE Embedded runs inside the Java web service. The same code has a constant run time when it runs as a standalone Java application.

In the web service, the run time increases after each execution until I eventually get a timeout error.

Has anyone experienced this?

This is my code:

@WebService(serviceName = "GateWS")
public class GateWS {

    @WebMethod(operationName = "gateengineapi")
    public String gateengineapi(@WebParam(name = "PipelineNumber") String PipelineNumber, @WebParam(name = "Documents") String Docs) throws Exception {

        try {

            System.setProperty("gate.home", "C:\\GATE\\");
            System.setProperty("shell.path", "C:\\cygwin2\\bin\\sh.exe");

            Gate.init();

            File GateHome = Gate.getGateHome();
            File FrenchGapp = new File(GateHome, PipelineNumber);
            CorpusController FrenchController;
            FrenchController = (CorpusController) PersistenceManager.loadObjectFromFile(FrenchGapp);

            Corpus corpus = Factory.newCorpus("BatchProcessApp Corpus");
            FrenchController.setCorpus(corpus);

            File docFile = new File(GateHome, Docs);
            Document doc = Factory.newDocument(docFile.toURL(), "utf-8");
            corpus.add(doc);

            FrenchController.execute();

            String docXMLString = null;
            docXMLString = doc.toXml();
            String outputFileName = doc.getName() + ".out.xml";           
            File outputFile = new File(docFile.getParentFile(), outputFileName);
            FileOutputStream fos = new FileOutputStream(outputFile);
            BufferedOutputStream bos = new BufferedOutputStream(fos);

            OutputStreamWriter out;
            out = new OutputStreamWriter(bos, "utf-8");
            out.write(docXMLString);

            out.close();
            gate.Factory.deleteResource(doc);

            return outputFileName;

        } catch (Exception ex) {
            return "ERROR: -> " + ex.getMessage();
        }
    }
}

I really appreciate any help you can provide.

Shi
  • It could be a memory leak problem. Do you call `gate.Factory.deleteResource(Resource)` for temporary resources (Documents)? – dedek Jun 05 '14 at 11:03
  • There isn't enough information here to diagnose the problem - you'll need to show some code. Are you loading your pipeline afresh for every request or are you loading N copies up-front and pooling them? Are you properly freeing any temporary resources you have created? Have you looked through the tutorial documentation on this (module 8 - track 2 Thursday - from [the training course](http://gate.ac.uk/wiki/TrainingCourseJune2013/))? – Ian Roberts Jun 05 '14 at 11:48

1 Answer

The problem is that you're loading a new instance of the pipeline for every request but never freeing it at the end of the request. GATE internally maintains a list of every PR, LR, and controller that has been loaded, so anything you load with Factory.createResource or PersistenceManager.loadObjectFrom... must be freed with Factory.deleteResource once it is no longer needed, typically in a try-finally:

FrenchController = (CorpusController) PersistenceManager.loadObjectFromFile(FrenchGapp);
try {
  // ...
} finally {
  Factory.deleteResource(FrenchController);
}

But...

Rather than loading a new instance of the pipeline every time, I would strongly recommend a more efficient approach: load a small number of pipeline instances once and keep them in memory to serve multiple requests. There is a fully worked example of this technique in the training materials on the GATE wiki, in particular module 8 (track 2, Thursday).
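As a rough illustration of the pooling idea (not the code from the training materials), here is a minimal sketch of a fixed-size pool built on a blocking queue. The class name `PipelinePool`, its methods, and the type parameter `T` are hypothetical; `T` stands in for a loaded pipeline such as a CorpusController, and the loader passed to the constructor is assumed to wrap the PersistenceManager call.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/**
 * Hypothetical fixed-size pool of pre-loaded pipelines.
 * T stands in for a loaded pipeline object; nothing in this class is GATE API.
 */
final class PipelinePool<T> {
    private final BlockingQueue<T> idle;

    /** Loads `size` instances up front, so the expensive load is not repeated per request. */
    PipelinePool(int size, Supplier<T> loader) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(loader.get());
        }
    }

    /**
     * Borrows an idle instance (blocking until one is free), runs the work,
     * and always returns the instance to the pool, even if the work throws.
     * Checked exceptions raised inside the work must be wrapped by the caller.
     */
    <R> R withPipeline(Function<T, R> work) {
        final T pipeline;
        try {
            pipeline = idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a pipeline", e);
        }
        try {
            return work.apply(pipeline);
        } finally {
            idle.add(pipeline); // cannot overflow: we just took one out
        }
    }

    /** Number of instances currently not in use. */
    int idleCount() {
        return idle.size();
    }
}
```

In a web service, the pool would be created once at startup (for example in a static field or an initialisation hook) rather than inside the web method, and each request would call `withPipeline`. Temporary documents and corpora created per request would still need Factory.deleteResource in a finally block, and the pooled pipelines themselves would be freed when the service shuts down.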

Ian Roberts