I'm trying to use Tesseract to do OCR on an image in Java. I know there are wrappers like Tess4J that offer more functionality, but I've been struggling to get one set up properly. Running a one-line command with Runtime is all I really need anyway, since this is a small personal project and doesn't have to work on other computers.

I have this code:

import java.io.IOException;

public class Test {
    public static void main(String[] args) {
        System.out.println(scan("full-path-to-test-image"));
    }
    public static String scan(String imgPath) {
        String contents = "";
        String cmd = "[full-path-to-tesseract-binary] " + imgPath + " stdout";
        try { contents = execCmd(cmd); }
        catch (IOException e) { e.printStackTrace(); }
        return contents;
    }
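    // Runs cmd and returns everything it wrote to standard output.
    public static String execCmd(String cmd) throws IOException {
        // "\\A" matches the start of input, so the Scanner returns the whole stream as one token.
        try (java.util.Scanner s = new java.util.Scanner(Runtime.getRuntime().exec(cmd).getInputStream()).useDelimiter("\\A")) {
            return s.hasNext() ? s.next() : "";
        }
    }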
}

When I compile and run it directly from the terminal, it works perfectly. When I open the exact same file in Eclipse, however, it throws an IOException:

java.io.IOException: Cannot run program "tesseract": error=2, No such file or directory

What's going on? Thank you for any help.

sc8ing
  • You will want to become familiar with the concept of “current directory” and “relative path.” These are not Java concepts, but fundamental file system concepts. – VGR Dec 15 '17 at 18:59
  • I'm not entirely sure what you're suggesting. Tesseract isn't in the current/working directory whether the program is run from the terminal or from Eclipse, and on the command line the command works regardless of the current directory. – sc8ing Dec 15 '17 at 19:14
  • I may have misinterpreted the output. I took it to mean that tesseract itself could not find your file. But I may be wrong. It may be that Eclipse is running in an environment with a different PATH environment variable than your terminal’s shell. – VGR Dec 15 '17 at 19:19
  • I changed the cmd string to include the full path to the binary, so the PATH variable shouldn't make a difference any more, correct? – sc8ing Dec 15 '17 at 19:25
  • You can try to print out the environment for both runs, from terminal and from IDE, and see if there's anything different there – Alex Savitsky Dec 15 '17 at 19:27
  • Correct, that should fix it. – VGR Dec 15 '17 at 19:28
  • Okay, so I thought the full path to the binary still wasn't working and was very confused. It turns out Eclipse made a copy of the test class with the same name when I opened it, and the version I had modified to include the full path wasn't the one selected in the run configuration. Thanks for your help. – sc8ing Dec 15 '17 at 19:31
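
The suggestion above to print the environment for both runs could look roughly like the sketch below. The class name EnvCheck and the helper findOnPath are made up for illustration; nothing here comes from the original code. Running it once from the terminal and once from the Eclipse run configuration should show whether the working directory or PATH differs between the two.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical diagnostic class, not part of the original question.
public class EnvCheck {

    // Returns the first executable regular file called `name` found in the
    // directories of a PATH-style string, or an empty Optional if none exists.
    public static Optional<Path> findOnPath(String name, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty PATH entries
            }
            Path candidate = Paths.get(dir, name);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String path = System.getenv("PATH");
        // Working directory of this JVM (set by the shell or by the IDE's run configuration).
        System.out.println("user.dir = " + System.getProperty("user.dir"));
        System.out.println("PATH     = " + path);
        // Assumes the binary is simply called "tesseract", as in the error message.
        System.out.println("tesseract on PATH: "
                + findOnPath("tesseract", path).map(Path::toString).orElse("not found"));
    }
}
```

If the two runs print different PATH values, that would explain a "No such file or directory" error for a bare `tesseract` command in one of them.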

1 Answer

Check the working folder in the run configuration for the Test class in Eclipse. I bet it's different from the one when you run the same program from a terminal.

Alex Savitsky
  • I set them to be the same, but no luck. Tesseract was installed with Homebrew, so it isn't located in either of the directories the Java files are in anyway. – sc8ing Dec 15 '17 at 19:11
  • So then in both cases the Tesseract binary is supposed to be located via the system PATH variable? As a wild guess - restart your IDE. If you installed Tesseract after starting your IDE, it won't receive any updates to the PATH that happened after it was started – Alex Savitsky Dec 15 '17 at 19:14
  • Seems that wasn't the problem. Also, to eliminate possible problems coming from the PATH variable I replaced the cmd string with the full path to the binary. – sc8ing Dec 15 '17 at 19:23
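
For what it's worth, here is a minimal sketch of running the command with ProcessBuilder instead of Runtime.exec(String), assuming the binary is given by its full path as described in the comments. The class name RunCommand and the method run are invented for this example. Passing the arguments as separate strings avoids the whitespace splitting that Runtime.exec(String) applies to a single command string, and merging stderr into the output makes messages from the child process visible.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical helper class, not part of the original question.
public class RunCommand {

    // Runs the given command (program followed by its arguments) and returns
    // its combined stdout/stderr. Throws IOException if the program cannot be
    // started or exits with a non-zero status.
    public static String run(String... command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout
        Process process = pb.start();

        String output;
        // "\\A" matches the start of input, so the whole stream is read as one token.
        try (Scanner s = new Scanner(process.getInputStream(), StandardCharsets.UTF_8.name())
                .useDelimiter("\\A")) {
            output = s.hasNext() ? s.next() : "";
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command exited with status " + exitCode + ": " + output);
        }
        return output;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: full path to the tesseract binary, args[1]: full path to the image.
        // "stdout" tells tesseract to print the recognized text, as in the question.
        System.out.println(run(args[0], args[1], "stdout"));
    }
}
```

Because stderr is merged in, any warnings the program prints would end up mixed into the returned text; dropping the redirectErrorStream call keeps the output clean at the cost of hiding those messages.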