0

When I invoke the cTAKES parser from tika-app with the following command, I get the exception shown below:

java -classpath $HOME/src/ctakes-config:${TIKA_HOME}/tika-app/target/tika-app-X.Y-SNAPSHOT.jar:${CTAKES_HOME}/desc:${CTAKES_HOME}/resources:${CTAKES_HOME}/lib/* org.apache.tika.cli.TikaCLI --config=$HOME/src/ctakes-config/tika-config.xml -m Vose-2013-American_Journal_of_Hematology.pdf

Exception

Exception in thread "main" java.lang.NoSuchMethodError: opennlp.tools.sentdetect.SentenceModel.getMaxentModel()Lopennlp/model/AbstractModel;

I followed the steps described in this link, but I do not understand what causes this error or how to resolve it.

I am also getting the following warnings:

Feb 16, 2020 12:19:58 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed. See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io for optional dependencies.

Feb 16, 2020 12:19:59 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem WARNING: org.xerial's sqlite-jdbc is not loaded. Please provide the jar on your classpath to parse sqlite files. See tika-parsers/pom.xml for the correct version.

I tried to resolve them using the answers in this link, but they were not much help. I know these are only warnings, and I hope they are not what causes the error. I am using Tika just as installed.

System Information

  • OS: Ubuntu 16.04
  • JDK: OpenJDK 8
  • Maven: 3.3.9
  • Apache Tika: 1.23
  • Apache cTAKES: 3.2.2
James Z
blueiris

2 Answers

1

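I've addressed this. The error came from incompatible versions of the Apache OpenNLP library. The Tika cTAKES parser was pinned to OpenNLP 1.5.3, which is also the version cTAKES 3.2.2 uses, but Tika Parsers has since moved to a newer OpenNLP version. As a result, the SentenceModel class that gets loaded does not have the getMaxentModel() signature that cTAKES expects, which is what the NoSuchMethodError reports.

The fix was to reference the older OpenNLP 1.5.3 jar in the classpath, ahead of the tika-app jar. I have updated the wiki here: https://cwiki.apache.org/confluence/display/TIKA/CTAKESParser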

java -classpath $HOME/src/ctakes-config:${CTAKES_HOME}/lib/opennlp-tools-1.5.3.jar:${TIKA_HOME}/tika-app/target/tika-app-X.Y-SNAPSHOT.jar:${CTAKES_HOME}/desc:${CTAKES_HOME}/resources:${CTAKES_HOME}/lib/\* org.apache.tika.cli.TikaCLI \
--config=$HOME/src/ctakes-config/tika-config.xml \
-m Vose-2013-American_Journal_of_Hematology.pdf 
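
To check which jar actually supplies SentenceModel under a given classpath, a small diagnostic along these lines may help. It is only a sketch: the class name WhichJar and its two helper methods are made up for illustration, and it uses nothing beyond standard JDK reflection. The JVM takes the first matching class on the classpath, so the reported location should change when the OpenNLP 1.5.3 jar is moved ahead of the tika-app jar.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical helper (not part of Tika or cTAKES): reports where a class
// was loaded from and which public methods with a given name it exposes.
public class WhichJar {

    // Location (jar or directory) the class came from; JDK classes have no code source.
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "(JDK runtime)" : String.valueOf(src.getLocation());
    }

    // One line per public method with the given name, including its return type.
    static String signaturesOf(Class<?> cls, String methodName) {
        StringBuilder sb = new StringBuilder();
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(methodName)) {
                sb.append(m).append('\n');
            }
        }
        return sb.toString();
    }

    // Run with the same -classpath as the failing TikaCLI command, e.g.:
    //   java -classpath <same classpath, plus the directory holding WhichJar.class> WhichJar
    public static void main(String[] args) throws Exception {
        String className = args.length > 0 ? args[0] : "opennlp.tools.sentdetect.SentenceModel";
        String methodName = args.length > 1 ? args[1] : "getMaxentModel";
        // Load without initializing the class; only its origin and signatures are needed.
        Class<?> cls = Class.forName(className, false, WhichJar.class.getClassLoader());
        System.out.println(className + " loaded from " + locationOf(cls));
        System.out.print(signaturesOf(cls, methodName));
    }
}
```

If the printed location is the tika-app jar and no getMaxentModel() returning opennlp.model.AbstractModel is listed, that would be consistent with the NoSuchMethodError above. With opennlp-tools-1.5.3.jar first on the classpath, the location should point at that jar instead.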
0

I was able to invoke cTAKES from tika-app after installing Apache Tika 1.10. The versions of cTAKES and Tika I had been using were incompatible.

blueiris