After reading several posts on SO and the Apache sites, I ended up with the following JARs on my build path:

tika-app-1.10.jar
poi-3.13.jar
poi-examples-3.13.jar
poi-excelant-3.13.jar
poi-ooxml-3.13.jar
poi-ooxml-schemas-3.13.jar
poi-scratchpad-3.13.jar
openxml4j-1.0-beta.jar
xmlbeans-2.6.jar

Despite having these, I cannot seem to parse .doc and .docx files using Tika, although PDF and JPEG files work fine. Why does parsing fail for Office documents when I have all of the dependencies listed above?

The relevant stack trace is also posted here
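
One way to narrow this down, assuming the failure comes from the same classes being present in more than one of the JARs above (an assumption; nothing in the question confirms it), is to scan the JARs for overlapping class entries. The following is a minimal sketch using only the JDK. `DuplicateClassFinder` and its method names are hypothetical and not part of Tika or POI.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper: reports class files that appear in more than one JAR.
public class DuplicateClassFinder {

    /** Lists the .class entries contained in one JAR file. */
    static List<String> listClassEntries(Path jar) {
        try (JarFile jf = new JarFile(jar.toFile())) {
            return jf.stream()
                     .map(e -> e.getName())
                     .filter(n -> n.endsWith(".class"))
                     .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Maps each class entry found in two or more JARs to the JARs containing it. */
    static Map<String, List<String>> findDuplicates(Map<String, List<String>> entriesByJar) {
        Map<String, List<String>> owners = new TreeMap<>();
        for (Map.Entry<String, List<String>> jar : entriesByJar.entrySet()) {
            for (String entry : jar.getValue()) {
                owners.computeIfAbsent(entry, k -> new ArrayList<>()).add(jar.getKey());
            }
        }
        // Keep only the entries that are owned by more than one JAR.
        owners.values().removeIf(jars -> jars.size() < 2);
        return owners;
    }

    public static void main(String[] args) {
        // Each command-line argument is assumed to be a path to a JAR file.
        Map<String, List<String>> entriesByJar = new LinkedHashMap<>();
        for (String arg : args) {
            entriesByJar.put(arg, listClassEntries(Paths.get(arg)));
        }
        findDuplicates(entriesByJar)
            .forEach((cls, jars) -> System.out.println(cls + " -> " + jars));
    }
}
```

Running it as `java DuplicateClassFinder tika-app-1.10.jar poi-3.13.jar poi-ooxml-3.13.jar` prints one line per class found in more than one of the given JARs. Empty output means those JARs do not overlap, in which case the cause lies elsewhere.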

ha9u63a7
  • It would be helpful if you mention why you "cannot seem to parse .doc files". What errors are thrown? Why else you cannot parse? – Axel Richter Jan 17 '16 at 05:26
  • If you want to use Ant, why not have [Apache Ivy](http://ant.apache.org/ivy/) (part of the Ant project) do the dependency stuff for you? – Gagravarr Jan 17 '16 at 06:07
  • Possible duplicate of [issues using apache tika Parser object to parse .doc and .docx file formats](http://stackoverflow.com/questions/34788989/issues-using-apache-tika-parser-object-to-parse-doc-and-docx-file-formats) – centic Jan 18 '16 at 15:19
