I am using "rmtheis:tess-two" (Tesseract) to recognize numbers (digits only), but it doesn't work.

My code:

import com.googlecode.tesseract.android.TessBaseAPI;
...
mTess = new TessBaseAPI();
// mTess.setPageSegMode(TessBaseAPI.PageSegMode.PSM_OSD_ONLY);
mTess.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, "0123456789");
mTess.setVariable(TessBaseAPI.VAR_CHAR_BLACKLIST,"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmopqrstuvwxyz");
...
mTess.setImage(bitmap);
String str = mTess.getUTF8Text();

I use this image: "number_57689.png"

The variable "str" ends up as "s7esy" instead of "57689".

Curiously, if I use "tess4j.Tesseract" (http://tess4j.sourceforge.net/) in Java outside of Android, with the identical image, it works fine.

Can you help me, please? @rmtheis

Thanks!

Xococode
  • I should add that I have also tried "eng.traineddata". It doesn't work, but it doesn't crash. If I try "digits.traineddata", "digits1.traineddata", "digits_comma.traineddata" or "engmorse.traineddata", the native API crashes immediately. – Xococode Nov 30 '19 at 00:56
  • The blacklist/whitelist issue was corrected in Tesseract 4.1.0. Does your tess-two use that version? – nguyenq Dec 07 '19 at 20:02
  • I have this in my gradle dependencies: implementation 'com.rmtheis:tess-two:9.0.0' – Xococode Dec 08 '19 at 22:20
