
I am trying to use Tesseract to add OCR functionality to a Java application. To do this, I am using the Java/Tesseract bridge found here.

pom.xml dependency:

<dependency>
    <groupId>org.bytedeco.javacpp-presets</groupId>
    <artifactId>tesseract</artifactId>
    <version>3.04-1.1</version>
</dependency>

It works: I can use the library to OCR an image. But when the Java program finishes, the JVM crashes. Even the very first Tesseract initialization line is enough to reproduce this, as in the following minimal example:

import org.bytedeco.javacpp.tesseract.TessBaseAPI;

public class MinimalExample {

    public static void main(String[] args) {
        System.out.println("Hi!");
        TessBaseAPI tessAPI = new TessBaseAPI();
    }
}

If I run this main, it gives the following:

Hi!

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

And the following Windows error dialog (translated from German): "Java(TM) Platform SE binary has stopped working. Windows can check online for a solution to the problem."

Problem signature:
  Problem Event Name:       APPCRASH
  Application Name:         java.exe
  Application Version:      8.0.650.17
  Application Timestamp:    5614685f
  Fault Module Name:        libgcc_s_dw2-1.dll
  Fault Module Version:     0.0.0.0
  Fault Module Timestamp:   3f263ec2
  Exception Code:           40000015
  Exception Offset:         000149a1
  OS Version:               6.1.7601.2.1.0.256.49
  Locale ID:                1031
  Additional Information 1: 7309
  Additional Information 2: 73092f5dbc78923c702ae5601110d2ea
  Additional Information 3: 9fa1
  Additional Information 4: 9fa11625863fb37077a4ab55be352b96

I've never had Java crash before, but I've also never used native libraries before. ;-) Does anybody have a hint on where to look for a solution to this strange behaviour?

Edit 2015-12-07: Using ListDLLs, I've seen that the DLL in question is located in C:\Users\...\AppData\Local\Temp\javacpp3256864312633\libgcc_s_dw2-1.dll, so "Wrong DLL from %PATH%" is not the answer.
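As a rough complement to the ListDLLs check, the extracted files can also be listed from Java. This is only a sketch: it assumes the native libraries are unpacked into a directory under `java.io.tmpdir` whose name starts with `javacpp` (as the path above suggests), and the class and method names (`JavacppTempDlls`, `findDlls`) are made up for illustration. It shows which files exist on disk, not which DLLs the process actually loaded.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class JavacppTempDlls {

    /** Returns all *.dll files found directly inside directories named dirPrefix* under tempRoot. */
    public static List<Path> findDlls(Path tempRoot, String dirPrefix) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(tempRoot, dirPrefix + "*")) {
            for (Path dir : dirs) {
                if (!Files.isDirectory(dir)) {
                    continue; // only look inside directories
                }
                try (DirectoryStream<Path> dlls = Files.newDirectoryStream(dir, "*.dll")) {
                    for (Path dll : dlls) {
                        result.add(dll);
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the extraction directory lives under the JVM's temp directory.
        Path tempRoot = Paths.get(System.getProperty("java.io.tmpdir"));
        for (Path dll : findDlls(tempRoot, "javacpp")) {
            System.out.println(dll);
        }
    }
}
```

The extraction directory may be cleaned up when the JVM exits, so this is best run while the application is paused in a debugger.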

technomage
Kurtibert
  • Looks like an issue with MSYS2: http://sourceforge.net/p/msys2/mailman/msys2-users/thread/87oat28p6z.fsf@wanadoo.es/ Sounds like this is fixed in the latest version. Would need to rebuild to find out. – Samuel Audet Dec 08 '15 at 13:45
  • @SamuelAudet: Do I understand this correctly, this would mean, _the Tesseract-Libraries_ would have to be recompiled and the error lies there? – Kurtibert Dec 09 '15 at 11:47
  • It looks like the issue lies in the C++ runtime, and we might need to rebuild. Simply replacing `libgcc_s_dw2-1.dll` with the newest version from MSYS2 might also work. – Samuel Audet Dec 10 '15 at 01:23
  • Ohhh, I just got it now: `Samuel Audet == saudet`! ;) How could I replace the DLL? It lies in a jar that is loaded by Maven, and I do not know how to interfere with that process without sacrificing the whole benefit of Maven. – Kurtibert Dec 10 '15 at 11:33
  • Well, try it outside Maven and if it works, we'll figure something out. – Samuel Audet Dec 11 '15 at 02:46
  • I thought, maybe I could go to my local maven archive and replace the libgcc dll in there with a new one, renew the jar's checksum and give it a try. But I am not able to find that new libgcc file! I've looked for it in the `msys2-base-i686-20150916.tar.xz` from http://sourceforge.net/projects/msys2/files/Base/i686/, but it is not in there. Where can I get it from? – Kurtibert Dec 15 '15 at 15:42
  • The latest is in http://repo.msys2.org/mingw/i686/mingw-w64-i686-gcc-libs-5.2.0-4-any.pkg.tar.xz – Samuel Audet Dec 16 '15 at 01:58
  • Thanks a lot. :-) I've tried it: 1) CRC old dll: `8D4B1312`, CRC new dll: `9F228208`. 2) Injected the new DLL into `leptonica-1.72-1.1-windows-x86.jar` and updated its `.sha1` file. 3) Started debugging the minimal example, used listDLLs: It uses the new dll now. 4) Ended debugging: The exact same crash, only the last two numbers were new (`aee84664081fda1f8a832a07ca541b3c`). – Kurtibert Dec 16 '15 at 09:53
  • So I guess we'll need to rebuild from scratch to find out. If you feel up to it, instructions are here: https://github.com/bytedeco/javacpp-presets/wiki/Build-Environments#windows-x86-and-x86_64 And post an issue to make sure I don't forget and test it out later. Thanks! – Samuel Audet Dec 17 '15 at 01:17
  • Do I need to build it on the computer I want to use it, or can I install all the needed stuff in a virtual machine? – Kurtibert Jan 09 '16 at 13:04
  • Sure, virtual machines are fine. – Samuel Audet Jan 10 '16 at 13:52
  • The issue was never in MSYS2's gcc-libs, but rather in gcc itself, I fixed it in GCC's specs file actually. From an initial investigation a while back, upstream GCC came up with their own fix for it rendering my fix redundant but definitely harmless. I have no idea what javacpp-preset is though; we don't have OpenJDK (if that's a dependency? - external dependencies are unacceptable to us) or javacpp-preset packages in MSYS2 and none of you have approached us with PKGBUILDs so I can't help you properly until that happens (hint hint). – Ray Donnelly Jan 22 '16 at 18:19
  • Okay, now my understanding of what has to be done is fading... – Kurtibert Jan 27 '16 at 16:41
  • @Kurtibert We need to wait until the fix percolates to MSYS2 I guess... – Samuel Audet Feb 20 '16 at 06:58
  • Okay. I had to look up what "percolate" means, but now I got it. ;-) – Kurtibert Feb 25 '16 at 09:56
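
The checksum comparison mentioned in the comments (CRC of the old and new DLL, and the jar's `.sha1` file) could be reproduced with something like the sketch below. It uses only the JDK's `java.util.zip.CRC32` and `java.security.MessageDigest`. The class name `FileChecksums` is hypothetical, and whether the `.sha1` file holds exactly this lowercase hex form is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

public class FileChecksums {

    /** CRC-32 of the data as 8 uppercase hex digits, e.g. to compare two DLL versions. */
    public static String crc32Hex(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return String.format("%08X", crc.getValue());
    }

    /** SHA-1 of the data as lowercase hex. */
    public static String sha1Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest(data)) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
        // Each argument is a file path, e.g. the old DLL, the new DLL, or the patched jar.
        for (String arg : args) {
            byte[] data = Files.readAllBytes(Paths.get(arg));
            System.out.println(arg + "  CRC32=" + crc32Hex(data) + "  SHA-1=" + sha1Hex(data));
        }
    }
}
```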

1 Answer


The problem might be with libwinpthread-1.dll.

Replacing the libwinpthread-1.dll that ships in the jar with the latest DLL from mingw32 made it work fine for me:

  1. Install msys2-x86_64-20150916.exe, downloaded from https://msys2.github.io/ .
  2. Install base-devel and mingw-w64-i686-toolchain using pacman.
  3. Extract leptonica-1.72-1.1-windows-x86.jar and put all of its DLLs into the same folder as your application.
  4. Remove leptonica-1.72-1.1-windows-x86.jar from the classpath.
  5. Remove libwinpthread-1.dll from that folder (or replace it with the installed C:\msys64\mingw32\bin\libwinpthread-1.dll). The path "C:\msys64\mingw32\bin" seems to be searched first, so if you can install mingw32, there is no need to remove or replace it.
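
Steps 3 and 5 can also be done programmatically. The following is a minimal sketch using only `java.util.zip`. The class name `DllExtractor` and the `skipName` parameter are made up for illustration, and it assumes it is fine to flatten the jar's internal directory structure when copying the DLLs out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class DllExtractor {

    /**
     * Copies every .dll entry of the jar into targetDir (flattening sub-directories),
     * except the one named skipName. Returns the names of the files written.
     */
    public static List<String> extractDlls(Path jar, Path targetDir, String skipName) throws IOException {
        Files.createDirectories(targetDir);
        List<String> written = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                String name = entry.getName();
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                // Keep only DLLs; ignore odd names that contain a backslash.
                if (!fileName.toLowerCase().endsWith(".dll") || fileName.contains("\\")) {
                    continue;
                }
                if (fileName.equalsIgnoreCase(skipName)) {
                    continue; // e.g. leave out libwinpthread-1.dll (step 5)
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    Files.copy(in, targetDir.resolve(fileName), StandardCopyOption.REPLACE_EXISTING);
                }
                written.add(fileName);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: DllExtractor <jar> <targetDir>");
            return;
        }
        List<String> written = extractDlls(Paths.get(args[0]), Paths.get(args[1]), "libwinpthread-1.dll");
        System.out.println("Extracted: " + written);
    }
}
```

For example, it could be run as `java DllExtractor leptonica-1.72-1.1-windows-x86.jar .` from the application folder.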
nakag