
I am trying to extract the text from a bunch of documents (.pdf, .doc, etc.) in an input folder, running the following in Cygwin:

java -jar tika-app-1.14.jar -t -i /Inputfolder -o /Outputfolder 

The causeForTermination is "COMPLETED_NORMALLY" but I can't see any files in the output folder. What am I not specifying?

alapalak
  • @NicomedesE. Do all the documents present in the input folder get parsed and appear in the output folder? That doesn't seem to be happening in my case – alapalak Apr 28 '17 at 08:06
  • Rechecked! This command worked for me too! I guess I messed up the path of the folders. – alapalak Apr 28 '17 at 08:16
  • @NicomedesE. I used this command in my R script via the system command and noticed that it stops abruptly, with the causeForTermination being "USER_INTERRUPTION", and only a couple of files get parsed (I have about 20). When I rerun the script, the command parses some more files. To parse all the files, I have to run the command a few times instead of just once. Any idea how to tackle this? – alapalak May 02 '17 at 21:46
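
One possible workaround for the interruption described in the last comment, offered as a sketch and not a confirmed fix: skip batch mode (-i/-o) and call tika-app once per file. With -t and a single file argument, tika-app prints the extracted text to stdout, so it can be redirected to a file. The names extract_all, to_native and TIKA_JAR are hypothetical, and the sketch assumes the jar sits in the current directory. Since Java on Windows does not understand Cygwin-style paths, each file path is passed through cygpath -w when that tool is available.

```shell
#!/bin/sh
# Sketch: extract text one file at a time instead of using Tika batch mode.
# TIKA_JAR is an assumed location; point it at your own copy of the jar.
TIKA_JAR="${TIKA_JAR:-tika-app-1.14.jar}"

# Convert a path to Windows form under Cygwin; pass it through elsewhere.
to_native() {
    if command -v cygpath >/dev/null 2>&1; then
        cygpath -w "$1"
    else
        printf '%s\n' "$1"
    fi
}

# extract_all INPUT_DIR OUTPUT_DIR
# Writes OUTPUT_DIR/<name>.txt for every regular file in INPUT_DIR.
extract_all() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    failed=0
    for f in "$in_dir"/*; do
        [ -f "$f" ] || continue      # skip subdirectories and unmatched globs
        out="$out_dir/$(basename "$f").txt"
        # -t asks tika-app for plain text, which it prints to stdout
        if ! java -jar "$TIKA_JAR" -t "$(to_native "$f")" > "$out"; then
            echo "failed: $f" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]              # non-zero exit status if any file failed
}
```

After sourcing the script, `extract_all Inputfolder Outputfolder` processes every file in turn and reports any that fail on stderr, so a single problem document does not stop the rest. Starting one JVM per file is slower than batch mode, which is the trade-off for this simplicity.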

0 Answers