My main program looks something like:

    public static void main(String[] args) throws UIMAException, IOException {
        //TypeSystemDescription tsd = TypeSystemDescriptionFactory.createTypeSystemDescription(Question.class);

        AggregateBuilder builder = new AggregateBuilder();
        //builder.add(SentenceAnnotator.getDescription());
        builder.add(AnalysisEngineFactory.createPrimitiveDescription(POSAnnotator1.class,
                ExampleComponents.TYPE_SYSTEM_DESCRIPTION,
                GenericJarClassifierFactory.PARAM_CLASSIFIER_JAR_PATH, outputDirectory + File.separator + "model.jar",
                CleartkAnnotator.PARAM_IS_TRAINING, true,
                DefaultDataWriterFactory.PARAM_DATA_WRITER_CLASS_NAME, InstanceDataWriter.class.getName(),
                DirectoryDataWriterFactory.PARAM_OUTPUT_DIRECTORY, new File(outputDirectory)));

        JCas jcas = JCasFactory.createJCas();
        jcas.setDocumentText(testData);

        SimplePipeline.runPipeline(jcas, builder.createAggregateDescription());
    }

Can anyone explain this error?

Caused by: java.lang.IllegalArgumentException: Errors initializing [class org.cleartk.classifier.jar.DefaultSequenceDataWriterFactory] Field 'dataWriterClassName' is required

I have tried to replace InstanceDataWriter with other data writers, but they do not work.

demongolem
VJune

1 Answer

The POS annotator uses a sequence data writer to write its training examples, which is why the error names DefaultSequenceDataWriterFactory. You set the data writer class name through DefaultDataWriterFactory.PARAM_DATA_WRITER_CLASS_NAME, so the sequence factory never receives its required 'dataWriterClassName' value. For a sequence data writer, use DefaultSequenceDataWriterFactory.PARAM_DATA_WRITER_CLASS_NAME instead. "Sequence" means that the classifier assigns more than one label within a CAS (many POS tags, one per token), in contrast to a "normal" classifier, which gives only one label, for example for the whole document.

http://cleartk.googlecode.com/svn-history/r4142/tags/cleartk-release-1.2.0/apidocs/org/cleartk/classifier/jar/DefaultSequenceDataWriterFactory.html
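
To illustrate why swapping the factory class in the constant matters, here is a toy model in plain Java. It is only a sketch and does not use ClearTK or uimaFIT: the class `ParameterNameDemo` and its helpers `paramName`, `config`, and `initSequenceFactory` are hypothetical. It assumes that each factory's parameter name is qualified by the factory's own class name, which is consistent with the error message above but is not taken from the ClearTK source.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of the code below is ClearTK or uimaFIT code.
final class ParameterNameDemo {

    // Assumption: a parameter name is the declaring class name plus the field name.
    static String paramName(String declaringClass, String field) {
        return declaringClass + "." + field;
    }

    // Stand-ins for the two PARAM_DATA_WRITER_CLASS_NAME constants.
    static final String DATA_WRITER_PARAM =
            paramName("DefaultDataWriterFactory", "dataWriterClassName");
    static final String SEQUENCE_DATA_WRITER_PARAM =
            paramName("DefaultSequenceDataWriterFactory", "dataWriterClassName");

    // Stand-in for the sequence factory initializing itself: it reads only its own key.
    static String initSequenceFactory(Map<String, Object> config) {
        Object value = config.get(SEQUENCE_DATA_WRITER_PARAM);
        if (value == null) {
            throw new IllegalArgumentException(
                    "Errors initializing [DefaultSequenceDataWriterFactory] "
                    + "Field 'dataWriterClassName' is required");
        }
        return value.toString();
    }

    // Mirrors the name/value varargs style used in the question's configuration call.
    static Map<String, Object> config(Object... pairs) {
        Map<String, Object> map = new HashMap<>();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            map.put((String) pairs[i], pairs[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        try {
            // Value stored under the other factory's key: the sequence factory cannot see it.
            initSequenceFactory(config(DATA_WRITER_PARAM, "SomeDataWriter"));
        } catch (IllegalArgumentException e) {
            System.out.println("wrong constant: " + e.getMessage());
        }
        // Value stored under the sequence factory's own key: initialization succeeds.
        System.out.println("right constant: "
                + initSequenceFactory(config(SEQUENCE_DATA_WRITER_PARAM, "SomeSequenceDataWriter")));
    }
}
```

In this model the first call fails even though a "data writer class name" was supplied, because the value sits under a key the sequence factory never reads. The same reasoning would explain why replacing InstanceDataWriter with other writers did not help in the question: the value was not the problem, the parameter name was.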

Andreas