I am very comfortable with UIMA, but my new work requires me to use GATE.

So I have started learning GATE. My question is about how to measure the performance of my (Java-based) tagging engines.

With UIMA, I generally dump all of my system annotations into an XMI file and then use Java code to compare them against human-annotated (gold standard) annotations to calculate precision, recall and F-score.
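To make that comparison step concrete, here is a minimal sketch in plain Java of a strict (exact-match) precision/recall/F1 calculation. It assumes each annotation has already been reduced to a type plus start and end offsets; `SpanScorer`, `Span` and `score` are hypothetical names for illustration, not part of UIMA or GATE, and partial overlaps are simply counted as errors.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Strict (exact-match) precision/recall/F1 over annotation spans. Hypothetical helper. */
final class SpanScorer {

    /** One annotation reduced to its type and character offsets. */
    static final class Span {
        final String type;
        final long start;
        final long end;

        Span(String type, long start, long end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Span)) return false;
            Span s = (Span) o;
            return start == s.start && end == s.end && type.equals(s.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(type, start, end);
        }
    }

    /** Returns {precision, recall, f1}; a ratio with a zero denominator is reported as 0. */
    static double[] score(Set<Span> gold, Set<Span> system) {
        Set<Span> correct = new HashSet<>(system);
        correct.retainAll(gold); // true positives: same type and same offsets
        double tp = correct.size();
        double precision = system.isEmpty() ? 0.0 : tp / system.size();
        double recall = gold.isEmpty() ? 0.0 : tp / gold.size();
        double f1 = (precision + recall) == 0.0
                ? 0.0
                : 2 * precision * recall / (precision + recall);
        return new double[] {precision, recall, f1};
    }

    public static void main(String[] args) {
        Set<Span> gold = new HashSet<>();
        gold.add(new Span("Person", 0, 4));
        gold.add(new Span("Location", 10, 16));

        Set<Span> system = new HashSet<>();
        system.add(new Span("Person", 0, 4));     // exact match
        system.add(new Span("Location", 10, 15)); // wrong end offset, counted as an error

        double[] prf = score(gold, system);
        System.out.printf("P=%.2f R=%.2f F1=%.2f%n", prf[0], prf[1], prf[2]);
    }
}
```

In this example one of the two system spans matches the gold standard exactly, so precision, recall and F1 all come out at 0.5. A lenient variant that credits overlapping spans would need a different matching rule than set intersection.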

But I am still struggling to find something similar in GATE. After going through GATE's Annotation Diff documentation and the other information on that page, I feel there has to be an easy way to do this in Java, but I have not been able to figure out how. I thought I would ask here in case someone has already worked this out.

  1. How do I store system annotations in an XMI file (or any other file format) programmatically?
  2. How do I create one-time gold standard data (i.e. human-annotated data) for the performance calculation?

Let me know if you need more specifics or details.

Watt

1 Answer

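This code looks helpful for writing the annotations to an XML file: http://gate.ac.uk/wiki/code-repository/src/sheffield/examples/BatchProcessApp.java

The excerpt below is a fragment of that example; it relies on `doc`, `annotTypesToWrite`, `docFile` and `encoding` being defined earlier in the file.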

        String docXMLString = null;
        // if we want to just write out specific annotation types, we must
        // extract the annotations into a Set
        if(annotTypesToWrite != null) {
            // Create a temporary Set to hold the annotations we wish to write out
            Set<Annotation> annotationsToWrite = new HashSet<Annotation>();

            // we only extract annotations from the default (unnamed) AnnotationSet
            // in this example
            AnnotationSet defaultAnnots = doc.getAnnotations();
            Iterator annotTypesIt = annotTypesToWrite.iterator();
            while(annotTypesIt.hasNext()) {
                // extract all the annotations of each requested type and add them to
                // the temporary set
                AnnotationSet annotsOfThisType =
                        defaultAnnots.get((String)annotTypesIt.next());
                if(annotsOfThisType != null) {
                    annotationsToWrite.addAll(annotsOfThisType);
                }
            }

            // create the XML string using these annotations
            docXMLString = doc.toXml(annotationsToWrite);
        }
        // otherwise, just write out the whole document as GateXML
        else {
            docXMLString = doc.toXml();
        }

        // Release the document, as it is no longer needed
        Factory.deleteResource(doc);

        // output the XML to <inputFile>.out.xml
        String outputFileName = docFile.getName() + ".out.xml";
        File outputFile = new File(docFile.getParentFile(), outputFileName);

        // Write output files using the same encoding as the original
        // (platform default if none was given). try-with-resources closes
        // the streams even if the write fails.
        try(FileOutputStream fos = new FileOutputStream(outputFile);
            BufferedOutputStream bos = new BufferedOutputStream(fos);
            OutputStreamWriter out = (encoding == null)
                    ? new OutputStreamWriter(bos)
                    : new OutputStreamWriter(bos, encoding)) {
            out.write(docXMLString);
        }
        System.out.println("done");
Watt