I am trying to call ClearTK's StanfordCoreNLPAnnotator from within UIMA RUTA, but cannot get it to work. I am using Eclipse with a Maven-enabled RUTA project in which I also have Java code for auxiliary tasks. I have imported cleartk-stanford-corenlp 0.8 using Maven.

I tried using this line in my script:

ENGINE utils.MyStanfordEngine;

... where utils/MyStanfordEngine.xml is an XML descriptor file created using this Java code:

MyStanfordAnnotator.getDescription().toXML(new FileOutputStream("descriptor/utils/MyStanfordEngine.xml"));

No errors appear, but upon execution I get:

Exception in thread "main" org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class ... failed.  
(Descriptor: file:.../descriptor/mainScriptEngine.xml)
...
Caused by: org.apache.uima.resource.ResourceInitializationException: Annotator class 
"org.cleartk.stanford.StanfordCoreNLPAnnotator" was not found. 
(Descriptor: file:.../descriptor/utils/MyStanfordEngine.xml)
...

As far as I understand, the RUTA project does not find the class in the Maven dependencies, but I need to stick to Maven as my dependency tool for collaboration purposes.

Can someone help?


UPDATE:

When I encountered the problem, I was using RUTA 2.1.0. I have updated to 2.2.0rc1 since then, but the problem persisted.

Following Peter's suggestion below (thanks!), I referenced in the Java build path a blank Maven-enabled Java project that does nothing but import cleartk-stanford-corenlp 0.8. I can now run the following RUTA code:

TYPESYSTEM utils.CleartkRutaTypeSystem;
ENGINE utils.MyStanfordEngine;
Document{-> CALL(MyStanfordEngine)};

... which successfully produces what looks like all intended annotations for all documents in the input folder, but eventually crashes with this exception:

[Stanford Tools Logging output ...]
22.02.2014 12:44:22 org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl        callAnalysisComponentProcess(406)
SCHWERWIEGEND: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.    
at org.apache.uima.ruta.engine.RutaEngine.process(RutaEngine.java:477)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:374)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:298)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
at org.apache.uima.ruta.ide.launching.RutaLauncher.processFile(RutaLauncher.java:168)
at org.apache.uima.ruta.ide.launching.RutaLauncher.main(RutaLauncher.java:129)
Caused by: java.lang.NullPointerException
at org.apache.uima.cas.impl.CASImpl.createFS(CASImpl.java:483)
at org.apache.uima.cas.impl.CASImpl.createAnnotation(CASImpl.java:3837)
at org.apache.uima.ruta.action.CallAction.callEngine(CallAction.java:192)
at org.apache.uima.ruta.action.CallAction.execute(CallAction.java:62)
at org.apache.uima.ruta.rule.AbstractRuleElement.apply(AbstractRuleElement.java:130)
at org.apache.uima.ruta.rule.RuleElementCaretaker.applyRuleElements(RuleElementCaretaker.java:111)
at org.apache.uima.ruta.rule.ComposedRuleElement.applyRuleElements(ComposedRuleElement.java:547)
at org.apache.uima.ruta.rule.AbstractRuleElement.doneMatching(AbstractRuleElement.java:84)
at org.apache.uima.ruta.rule.ComposedRuleElement.fallback(ComposedRuleElement.java:468)
at org.apache.uima.ruta.rule.ComposedRuleElement.fallbackContinue(ComposedRuleElement.java:377)
at org.apache.uima.ruta.rule.RutaRuleElement.startMatch(RutaRuleElement.java:100)
at org.apache.uima.ruta.rule.ComposedRuleElement.startMatch(ComposedRuleElement.java:73)
at org.apache.uima.ruta.rule.RutaRule.apply(RutaRule.java:47)
at org.apache.uima.ruta.rule.RutaRule.apply(RutaRule.java:40)
at org.apache.uima.ruta.rule.RutaRule.apply(RutaRule.java:29)
at org.apache.uima.ruta.RutaScriptBlock.apply(RutaScriptBlock.java:63)
at org.apache.uima.ruta.RutaModule.apply(RutaModule.java:48)
at org.apache.uima.ruta.engine.RutaEngine.process(RutaEngine.java:475)
... 6 more
Exception in thread "main" org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.    
at org.apache.uima.ruta.engine.RutaEngine.process(RutaEngine.java:477)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:374)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:298)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
at org.apache.uima.ruta.ide.launching.RutaLauncher.processFile(RutaLauncher.java:168)
at org.apache.uima.ruta.ide.launching.RutaLauncher.main(RutaLauncher.java:129)
Caused by: java.lang.NullPointerException
at org.apache.uima.cas.impl.CASImpl.createFS(CASImpl.java:483)
at org.apache.uima.cas.impl.CASImpl.createAnnotation(CASImpl.java:3837)
at org.apache.uima.ruta.action.CallAction.callEngine(CallAction.java:192)
at org.apache.uima.ruta.action.CallAction.execute(CallAction.java:62)
at org.apache.uima.ruta.rule.AbstractRuleElement.apply(AbstractRuleElement.java:130)
at org.apache.uima.ruta.rule.RuleElementCaretaker.applyRuleElements(RuleElementCaretaker.java:111)
at org.apache.uima.ruta.rule.ComposedRuleElement.applyRuleElements(ComposedRuleElement.java:547)
at org.apache.uima.ruta.rule.AbstractRuleElement.doneMatching(AbstractRuleElement.java:84)
at org.apache.uima.ruta.rule.ComposedRuleElement.fallback(ComposedRuleElement.java:468)
at org.apache.uima.ruta.rule.ComposedRuleElement.fallbackContinue(ComposedRuleElement.java:377)
at org.apache.uima.ruta.rule.RutaRuleElement.startMatch(RutaRuleElement.java:100)
at org.apache.uima.ruta.rule.ComposedRuleElement.startMatch(ComposedRuleElement.java:73)
at org.apache.uima.ruta.rule.RutaRule.apply(RutaRule.java:47)
at org.apache.uima.ruta.rule.RutaRule.apply(RutaRule.java:40)
at org.apache.uima.ruta.rule.RutaRule.apply(RutaRule.java:29)
at org.apache.uima.ruta.RutaScriptBlock.apply(RutaScriptBlock.java:63)
at org.apache.uima.ruta.RutaModule.apply(RutaModule.java:48)
at org.apache.uima.ruta.engine.RutaEngine.process(RutaEngine.java:475)
... 6 more

Sorry for the whole stack trace, but I thought if a RUTA developer is reading this they may want the whole thing.

Is there a way to solve this? What am I doing wrong?

  • Can you try to separate the UIMA Ruta project and the Maven project? I assume that you are using UIMA Ruta 2.1.0, which does not yet resolve the Maven dependencies of the main Ruta project, only those of the projects it depends on. (This is just a guess without testing.) – Peter Kluegl Feb 22 '14 at 11:47
  • I see. I ran into this problem using RUTA 2.1.0, but upgraded to 2.2.0rc1 in hope that it would solve the problem, which it did not. I will try separating the projects and write about it here. – Matthias Grabmair Feb 22 '14 at 17:21
  • 2.2.0rc1 is unfortunately not enough, because the solution should be integrated in 2.2.0rc3, which does not yet exist. The new problem indicates that a type used in the analysis engine is missing. UIMA Ruta does not import the types of the analysis engine. They need to be imported separately, as you do. I will try to reproduce the exception in order to see what the actual problem is. – Peter Kluegl Feb 23 '14 at 11:41
  • I created a simple project that should reproduce the desired functionality and added a short how-to as an answer. Your second problem is most likely caused by the type system import not being able to resolve inner imports, or by some missing types. – Peter Kluegl Feb 23 '14 at 14:38

1 Answer

There are several limitations to consider:

  • UIMA Ruta 2.1.0 does not support mixin projects: Maven dependencies need to be specified in another project, and the Ruta project then has to depend on that additional Java project.
  • UIMA Ruta Workbench 2.1.0 has some problems validating imported type systems that in turn import other type systems by name. Here, import by location should be used instead.
  • UIMA CAS Editor 2.5.0 has some problems resolving type system imports using the datapath, which causes problems visualizing the created annotations if the type system descriptor needs additional information such as the datapath. Here, the type system descriptor generated for a script should include (not only import) all types of the imported type systems. This can be configured in the preferences (I have not used that for a while). This problem can also be avoided by using import by location.
  • UIMA Ruta 2.2.0 supports mixin projects. Here, only the problem with the CAS Editor remains.
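
To see which of the import-related limitations above may apply to a given type system descriptor, it can help to list its imports and check whether they are by name or by location. The following is only a minimal sketch using the JDK's XML parser, not part of UIMA or UIMA Ruta. The class name `TypeSystemImportCheck` and its methods are hypothetical, and it assumes the usual descriptor layout with `<import name="..."/>` and `<import location="..."/>` elements.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not a UIMA API): lists the imports of a type system descriptor.
public class TypeSystemImportCheck {

    /** Values of all by-name imports, i.e. <import name="..."/>. */
    static List<String> importsByName(String descriptorXml) {
        return importAttributes(descriptorXml, "name");
    }

    /** Values of all by-location imports, i.e. <import location="..."/>. */
    static List<String> importsByLocation(String descriptorXml) {
        return importAttributes(descriptorXml, "location");
    }

    private static List<String> importAttributes(String descriptorXml, String attribute) {
        try {
            // The default factory is not namespace-aware, so the plain tag name matches.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(descriptorXml.getBytes(StandardCharsets.UTF_8)));
            NodeList imports = doc.getElementsByTagName("import");
            List<String> values = new ArrayList<>();
            for (int i = 0; i < imports.getLength(); i++) {
                Element element = (Element) imports.item(i);
                if (element.hasAttribute(attribute)) {
                    values.add(element.getAttribute(attribute));
                }
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse descriptor XML", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to a type system descriptor file
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println("Imports by name:     " + importsByName(xml));
        System.out.println("Imports by location: " + importsByLocation(xml));
    }
}
```

Any entries reported as imports by name are the ones that may need the datapath or the classpath to be resolved, and are therefore candidates for switching to import by location.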

A project with this setup can be created as follows (with UIMA Ruta 2.2.0):

  1. Create a new UIMA Ruta Project
  2. Make it a Maven project: popup->Configure->Convert to Maven Project
  3. Add a dependency to cleartk-stanford-corenlp in the pom

    <dependency>
      <groupId>org.cleartk</groupId>
      <artifactId>cleartk-stanford-corenlp</artifactId>
      <version>0.8.0</version>
    </dependency>
    
  4. Provide the type systems in the descriptor folder or in a dependent project, e.g., copy the org folder of cleartk-type-system-1.2.0 to the descriptor folder. Mind that the CAS Editor will have problems resolving the imports if the descriptors are not adapted.
  5. Create a simple script that imports the type system, imports the analysis engine and executes the analysis engine. Here, the uimaFIT component is imported directly instead of a descriptor. The EXEC action needs to be extended with the types of interest if later rules should be able to operate on the result of the imported analysis engine.

    TYPESYSTEM org.cleartk.TypeSystem;
    UIMAFIT org.cleartk.stanford.StanfordCoreNLPAnnotator;
    Document{->EXEC(StanfordCoreNLPAnnotator)};
    
  6. If there is a text file in the import folder, then running this script should be able to annotate it.

This example directly uses the StanfordCoreNLPAnnotator instead of an additional analysis engine, but switching to another implementation or analysis engine should be straightforward.
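
If the annotator class is still reported as not found, one quick sanity check is to test whether it is visible on the classpath of an ordinary Java launch in the same project. The following is only a sketch using plain JDK calls. The class `ClasspathCheck` is a hypothetical helper, and it inspects the classpath of the JVM it runs in, which is not necessarily the classpath that the Ruta launcher uses.

```java
// Hypothetical helper (not part of UIMA or ClearTK): checks whether a class is visible
// to the class loader of the JVM this code runs in.
public class ClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.cleartk.stanford.StanfordCoreNLPAnnotator";
        System.out.println(name + " found: " + isOnClasspath(name));
        // The effective classpath shows whether the Maven dependencies were picked up.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If the class is found here but the Ruta launch still fails, this suggests that Maven resolves the dependency correctly and that the problem lies in how the Ruta launch assembles its classpath, as described in the limitations above.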

Peter Kluegl
  • Since UIMA Ruta 2.2.0 is not yet available, I built a snapshot update site and put it there until the release is available: `http://people.apache.org/~pkluegl/temp/eclipse-update-site/`. Eclipse is sometimes not able to update from such sites due to the missing timestamps. You may have to uninstall the existing version before installing from the snapshot update site. – Peter Kluegl Feb 23 '14 at 14:35
  • Sorry, the actual update site is of course `http://people.apache.org/~pkluegl/temp/eclipse-update-site/ruta/` – Peter Kluegl Feb 23 '14 at 14:57
  • Thanks, Peter! This is very helpful. I took the ClearTK type system jar from the project website and things import just fine. I still get the second exception, though. And yes, the CAS Editor has some issues with the imported type system. I will see whether I can resolve the imports or find a clever way around it. I will post here once I have things all working. Once I have 15 reputation, I will vote your answer up ;) – Matthias Grabmair Feb 24 '14 at 06:12
  • The "resolve imports" option in the RUTA preferences seems to work, but I get the impression it properly resolves only when the imported descriptor is in the same directory as the importing one. I solved the problem by using UIMAFIT's auto detect feature and saving a type system descriptor from there, putting it in the "descriptor" directory and importing it from my main script. Now the CAS editor seems to work like a charm. Thanks a lot again! – Matthias Grabmair Feb 24 '14 at 07:19