2

tess4j is an OCR library packaged with native libraries. I made a Maven project to test it. I added the Maven installation path to Eclipse and set the M2_HOME, MAVEN_HOME and JAVA_HOME environment variables.

Here is my parent pom:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>fr.mssb.ongoing</groupId>
    <artifactId>ongoing-parent</artifactId>
    <packaging>pom</packaging>
    <version>1.0</version>
    <name>ongoing</name>

    <modules>
        <module>capcha-solver</module>
    </modules>

    <build>
        <pluginManagement>
            <plugins>
                <!-- All project will be interpreted (source) and compiled (target) in java 7 -->
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <configuration>
                        <source>1.7</source>
                        <target>1.7</target>
                    </configuration>
                </plugin>
                <!-- this will make eclipse:eclipse goal work and make the project Eclipse compatible -->
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-eclipse-plugin</artifactId>
                    <version>2.5.1</version>
                    <configuration>
                        <downloadSources>true</downloadSources>
                        <downloadJavadocs>true</downloadJavadocs>
                        <classpathContainers>
                            <classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/JavaSE-1.7</classpathContainer>
                        </classpathContainers>
                        <additionalBuildcommands>
                            <buildcommand>net.sf.eclipsecs.core.CheckstyleBuilder</buildcommand>
                        </additionalBuildcommands>
                        <additionalProjectnatures>
                            <projectnature>net.sf.eclipsecs.core.CheckstyleNature</projectnature>
                        </additionalProjectnatures>
                    </configuration>
                </plugin>
            </plugins>
        </pluginManagement>
    </build>

    <!-- All child pom will inherit those dependancies -->
    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>

And here is my child pom:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>fr.mssb.ongoing</groupId>
        <artifactId>ongoing-parent</artifactId>
        <version>1.0</version>
    </parent>

    <groupId>fr.mssb.ongoing</groupId>
    <artifactId>capcha-solver</artifactId>
    <version>1.0</version>
    <packaging>jar</packaging> <!-- I think this is useless -->

    <name>A capcha solver based on terassec ocr</name>

    <build>
        <plugins>
            <!-- autorun unit tests during maven compilation -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <configuration>
                    <argLine>-Xmx1024m -XX:MaxPermSize=256m -XX:-UseSplitVerifier</argLine>
                    <skipTests>-DskipTests</skipTests>
                </configuration>
            </plugin>

            <!--  this should make the tesseract ocr native dll work without doing anything -->
            <plugin>
                <groupId>com.googlecode.mavennatives</groupId>
                <artifactId>maven-nativedependencies-plugin</artifactId>
                <version>0.0.7</version>
                <executions>
                    <execution>
                        <id>unpacknatives</id>
                        <goals>
                            <goal>copy</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

    <dependencies>
        <!-- 
        Log4j 2 is broken up in an API and an implementation (core), where the API 
        provides the interface that applications should code to. Strictly speaking 
        Log4j core is only needed at runtime and not at compile time.
        However, below we list Log4j core as a compile time dependency to improve 
        the startup time for custom plugins. 
        -->
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.1</version>
        </dependency>
        <!--
        Integration of tesseract OCR
        -->
        <dependency>
            <groupId>net.sourceforge.tess4j</groupId>
            <artifactId>tess4j</artifactId>
            <version>1.4.1</version>
        </dependency>
    </dependencies>

</project>

And of course, the code (taken from the tess4j example):

package test;

import java.io.File;

import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

/**
 * Classe d'exemple.
 */
public class TesseractExample {

    public static void main(String[] args) {
        File imageFile = new File("C:\\DEV\\repo\\ongoing\\capcha-solver\\src\\test\\resources\\random.jpg");
        Tesseract instance = Tesseract.getInstance();  // JNA Interface Mapping
        // Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping

        try {
            String result = instance.doOCR(imageFile);
            System.out.println(result);
        } catch (TesseractException e) {
            System.err.println(e.getMessage());
        }
    }
}

When I launch it, I get this exception:

Exception in thread "main" java.lang.NoSuchFieldError: RESOURCE_PREFIX
    at net.sourceforge.tess4j.util.LoadLibs.<clinit>(LoadLibs.java:60)
    at net.sourceforge.tess4j.TessAPI.<clinit>(TessAPI.java:40)
    at net.sourceforge.tess4j.Tesseract.init(Tesseract.java:303)
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:239)
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:188)
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:172)
    at test.TesseractExample.main(TesseractExample.java:19)

I don't know if this is tess4j related or a JNA/JNI problem. As you can see, I have a plugin that "should" make the DLLs work (I have never worked with DLLs before).

Also, in the parent pom my plugins are between pluginManagement tags. I think I should have put them directly between build tags, no?

Any ideas?

Thanks.

sliders_alpha
  • Might be a JNA version mismatch. Try running with `-Djna.nosys=True` to avoid inadvertently picking up an (older) version of JNA installed on your system. – technomage Feb 03 '15 at 22:12
  • Nope, does not change anything – sliders_alpha Feb 03 '15 at 23:16
  • [`RESOURCE_PREFIX`](http://twall.github.io/jna/4.0/javadoc/com/sun/jna/Platform.html) is a JNA constant. Do you have it installed? – nguyenq Feb 04 '15 at 00:14
  • Yes, I have installed the VS2013 Visual C++ Redistributable as indicated here: http://tess4j.sourceforge.net/usage.html – sliders_alpha Feb 04 '15 at 06:59
  • You'll need to add jna 4.1.0 dependency. – nguyenq Feb 04 '15 at 10:49
  • Does not change anything. I can't see the DLLs in target/native; maybe the maven-nativedependencies-plugin is not working properly – sliders_alpha Feb 04 '15 at 17:04
  • I tried exactly as they describe it here for Eclipse, without Maven, and I get the same error: http://tess4j.sourceforge.net/tutorial/ – sliders_alpha Feb 04 '15 at 17:33
  • OK, I don't know where to start, but let's try: 1) you should not set up a Maven project using a tess4j how-to for non-Maven projects, where you have to copy the DLLs by hand (which is not necessary with the tess4j 1.4.1 Maven version); 2) you should provide a complete example so we can reproduce the problem, which does not occur when it is done right with Maven and tess4j is left to copy the DLLs itself – 4F2E4A2E Feb 06 '15 at 10:17

4 Answers

1

There were 2 problems:

1/ Some DLLs and files from tess4j had to be copied to the project root directory.

2/ tess4j had a transitive dependency on com.sun.jna:jna:jar:3.0.9 that conflicted with net.java.dev.jna:jna:jar:4.1.0 (also pulled in by tess4j). Excluding the 3.0.9 version makes everything work; the RESOURCE_PREFIX error came from that conflict.

Below is a pom.xml for the 32-bit version (you need a 32-bit JVM installed) that takes care of both. Change win32-x86 to win32-x86-64 if you want to use this in 64 bits.

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>fr.mssb.ocr</groupId>
    <artifactId>tesseractOcr</artifactId>
    <version>1.0</version>
    <packaging>jar</packaging>

    <name>tesseract ocr project</name>

    <build>
        <plugins>
            <!--
            This extracts the 32-bit DLLs and the tessdata folder from
            tess4j.jar to the project root
            -->
            <plugin>
                <groupId>org.apache.portals.jetspeed-2</groupId>
                <artifactId>jetspeed-unpack-maven-plugin</artifactId>
                <version>2.2.2</version>
                <dependencies>
                    <dependency>
                      <groupId>net.sourceforge.tess4j</groupId>
                      <artifactId>tess4j</artifactId>
                      <version>1.4.1</version>
                    </dependency>
                </dependencies>
                <executions>
                    <execution>
                        <id>unpack-step</id>
                        <phase>compile</phase>
                        <goals>
                            <goal>unpack</goal>
                        </goals>
                        <configuration>
                            <unpack>
                                <artifact>net.sourceforge.tess4j:tess4j:jar</artifact>
                                <overwrite>true</overwrite>
                                <resources combine.children="append">
                                    <resource>
                                        <path>win32-x86</path>
                                        <destination>../</destination>
                                        <overwrite>true</overwrite>
                                        <flat>true</flat>
                                        <include>*</include>
                                    </resource>
                                    <resource>
                                        <path>tessdata</path>
                                        <destination>../tessdata</destination>
                                        <overwrite>true</overwrite>
                                        <flat>true</flat>
                                        <include>*</include>
                                    </resource>
                                    <resource>
                                        <path>tessdata/configs</path>
                                        <destination>../tessdata/configs</destination>
                                        <overwrite>true</overwrite>
                                        <flat>true</flat>
                                        <include>*</include>
                                    </resource>
                                </resources>
                            </unpack>
                            <verbose>true</verbose>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

    <dependencies>
        <dependency>
            <groupId>net.sourceforge.tess4j</groupId>
            <artifactId>tess4j</artifactId>
            <version>1.4.1</version>
            <exclusions>
                <exclusion>
                    <groupId>com.sun.jna</groupId>
                    <artifactId>jna</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
    </dependencies>

</project>
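
If you are unsure which of the two folder names applies to the JVM you are running, something like the following could help. This is only a sketch: `NativeFolderPicker` and `windowsFolderFor` are hypothetical names, not part of tess4j or JNA, and it assumes the `os.arch` system property reflects the architecture of the running JVM rather than of the operating system. It only knows the two folder names used in the pom above.

```java
import java.util.Locale;

// Hypothetical helper, not part of tess4j or JNA.
class NativeFolderPicker {

    /** Maps an os.arch value to one of the two Windows folder names used in the pom above. */
    static String windowsFolderFor(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        if (arch.equals("amd64") || arch.equals("x86_64")) {
            return "win32-x86-64";
        }
        if (arch.equals("x86") || arch.matches("i[3-6]86")) {
            return "win32-x86";
        }
        // Anything else (for example ARM) is outside the scope of this sketch.
        throw new IllegalArgumentException("Unhandled architecture: " + osArch);
    }

    public static void main(String[] args) {
        // Assumption: os.arch describes the running JVM, not the operating system.
        String arch = System.getProperty("os.arch");
        try {
            System.out.println(arch + " -> " + windowsFolderFor(arch));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Run it with the same JVM that Eclipse or Maven uses to launch the OCR code, since a 32-bit and a 64-bit JVM can be installed side by side.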
sliders_alpha
  • I must say that removing the jna 3.0.9 version does not seem to affect the project, but still, I think your project is not set up right: you should not set up a Maven project using a tess4j how-to for non-Maven projects, where you have to copy the DLLs by hand (which is not necessary with the tess4j 1.4.1 Maven version) – 4F2E4A2E Feb 06 '15 at 10:20
  • IMHO this is a wrong assumption you are making. The files do not have to be copied the way you are doing it; the tess4j project takes care of it. Please provide proof or an example project, thanks. – 4F2E4A2E Feb 06 '15 at 10:28
  • @sliders_alpha, I stumbled upon exactly the same problem and the solution you provided worked. Would anyone know of a way to increase the accuracy of the OCR? The image contains only numbers that I'd like to extract. – L P May 15 '15 at 15:52
1

The child pom builds without any problems and without manually copying libs, so this is not tess4j related. In any case, the jna 3.0.9 dependency could be removed if it is no longer needed: https://github.com/nguyenq/tess4j/issues/8

Still, all you need to run tess4j is the Maven dependency:

<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>1.4.1</version>
</dependency>

and correct use of the tess4j API, for example:

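import java.io.File;

import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;
import net.sourceforge.tess4j.util.LoadLibs;

public class TesseractExample {

    public static void main(String[] args) {
        File imageFile = new File("C:\\random.png");
        Tesseract instance = Tesseract.getInstance();

        // In case you don't have your own tessdata, let it also be extracted for you
        File tessDataFolder = LoadLibs.extractTessResources("tessdata");

        // Set the tessdata path
        instance.setDatapath(tessDataFolder.getAbsolutePath());

        try {
            String result = instance.doOCR(imageFile);
            System.out.println(result);
        } catch (TesseractException e) {
            System.err.println(e.getMessage());
        }
    }
}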

That's it!

4F2E4A2E
1

The problem is caused by a conflict between net.java.dev.jna:jna and com.sun.jna:jna. Both jars contain a class com.sun.jna.Platform, and both are declared as tess4j dependencies. To solve this, exclude the second dependency in your pom:

<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>1.4.1</version>
    <exclusions>
        <exclusion>
            <groupId>com.sun.jna</groupId>
            <artifactId>jna</artifactId>
        </exclusion>
    </exclusions>
</dependency>    
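
To check which jar actually supplies com.sun.jna.Platform at run time, and whether that copy has the RESOURCE_PREFIX field, a small diagnostic along these lines may help. It is a sketch using only standard JDK reflection; `ClassOriginCheck` and its two helper methods are hypothetical names and not part of tess4j or JNA.

```java
import java.security.CodeSource;

// Hypothetical diagnostic, not part of tess4j or JNA.
class ClassOriginCheck {

    /** Returns the jar or directory a class was loaded from, or a placeholder if unknown. */
    static String originOf(String className) {
        try {
            // 'false' avoids running static initializers of the inspected class.
            Class<?> type = Class.forName(className, false, ClassOriginCheck.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            return source == null ? "<bootstrap or unknown>" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "<not on classpath>";
        }
    }

    /** Returns true if the class can be loaded and exposes a public field with the given name. */
    static boolean hasPublicField(String className, String fieldName) {
        try {
            Class<?> type = Class.forName(className, false, ClassOriginCheck.class.getClassLoader());
            type.getField(fieldName);
            return true;
        } catch (ClassNotFoundException | NoSuchFieldException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String platform = "com.sun.jna.Platform";
        System.out.println(platform + " loaded from: " + originOf(platform));
        System.out.println("has RESOURCE_PREFIX: " + hasPublicField(platform, "RESOURCE_PREFIX"));
    }
}
```

Run it with the same classpath as the failing program. If the reported location is the 3.0.9 jar and the field check prints false, the older JNA is the one being picked up.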
Marcos Pirmez
0

This happens because of a JNA version mismatch: more than one version of JNA is on your classpath. Keep only one version of JNA.
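
One way to see whether several jars on the classpath provide the same class is to ask the class loader for every copy of the class file. The following is a rough sketch using only the JDK; `DuplicateClassFinder` and `locationsOf` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic, not part of tess4j or JNA.
class DuplicateClassFinder {

    /** Lists every location visible to the class loader that provides the given class file. */
    static List<URL> locationsOf(String className) {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = DuplicateClassFinder.class.getClassLoader();
        try {
            return Collections.list(loader.getResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<URL> hits = locationsOf("com.sun.jna.Platform");
        for (URL hit : hits) {
            System.out.println(hit);
        }
        if (hits.size() > 1) {
            System.out.println("More than one copy found; only one of them is actually used.");
        }
    }
}
```

If more than one URL is printed, each one points at a jar that contains its own com.sun.jna.Platform, and the extra jar is the one to exclude or remove.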