I am developing an OCR application for two different platforms:

  1. Android
  2. Windows

I am using Android Studio for the Android application, with the tess-two library, version 5.4.1. The following code is what I use to recognize text in an image:

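// Decode the image file and run Tesseract on it through tess-two
myBitmap = BitmapFactory.decodeFile(imagePaths[i]);
TessBaseAPI tessApi = new TessBaseAPI();
tessApi.init(dirPathSDCard, "eng");  // dirPathSDCard is the parent of the tessdata/ folder
tessApi.setImage(myBitmap);
returnedText = tessApi.getUTF8Text();
Log.e("Returned Text", returnedText);
tessApi.end();  // release the native engine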

For the PC I am using Java with the Eclipse IDE, and Tess4J as the wrapper. The following code shows how I get the recognized text from the image:

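// Needs java.io.File, java.io.IOException and, from net.sourceforge.tess4j:
// ITesseract, Tesseract, TesseractException
public static void main(String[] args) throws IOException {
  // imgFile is a java.io.File for the input image, defined elsewhere in the class
  try {
    ITesseract instance = new Tesseract();
    instance.setDatapath("tessdata");  // folder holding the .traineddata files
    String result = instance.doOCR(imgFile);
    System.out.println(result);
  } catch (TesseractException e) {
    System.err.println(e.getMessage());
  }
}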

My question: I believe both of these wrappers are built on the Tesseract OCR engine, which is written in C/C++. Sometimes the Android application returns accurate text but the Java program does not return an accurate result for the same image, and sometimes it is the other way around. If the core library behind both wrappers is the same, why do they produce different results? I am testing with the following image, for which the Android app returns accurate text but the Java program does not.

enter image description here

Is it possible to get the same result from both programs? Any help would be appreciated. Thank you very much for your time and assistance in this matter.
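
To make "same result" measurable rather than judged by eye, the two outputs can be compared character by character. The following is only a minimal sketch in plain Java: `OcrDiff`, `normalize`, `levenshtein` and `differenceRatio` are hypothetical helper names, not part of Tess4J or tess-two, and the two sample strings are placeholders rather than real OCR output.

```java
import java.util.Locale;

// Hypothetical helper (not part of Tess4J or tess-two): measures how far apart
// two OCR outputs are, character by character.
final class OcrDiff {

    // Collapse runs of whitespace so layout differences are not counted as errors.
    static String normalize(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // Levenshtein edit distance, computed with two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;  // the row just filled becomes the previous row
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Edit distance divided by the longer normalized length; 0.0 means identical text.
    static double differenceRatio(String a, String b) {
        String x = normalize(a);
        String y = normalize(b);
        int longest = Math.max(x.length(), y.length());
        return longest == 0 ? 0.0 : (double) levenshtein(x, y) / longest;
    }

    public static void main(String[] args) {
        // Placeholder strings; in practice, paste the text logged by each application.
        String fromAndroid = "Hello  world";
        String fromDesktop = "Hel1o world";
        System.out.printf(Locale.ROOT, "difference = %.3f%n",
                differenceRatio(fromAndroid, fromDesktop));
    }
}
```

Running both applications over the same set of images and recording this ratio for each one would show whether the mismatch is occasional or systematic. Whitespace is normalized first on the assumption that line-break and spacing differences should not count as recognition errors.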

  • There is no guarantee that the two programs are doing the same thing, since the Tesseract API calls are completely different. At a guess, the underlying implementation is a neural network. Even if the API calls ultimately call the same code, different matrices would return different results. Training the networks against each other might be possible – pojo-guy Sep 24 '17 at 12:43
  • You mean that if I train a model on my own data, I can expect the same results from both applications? – Nasir Rahim Sep 24 '17 at 13:05
  • Train the two models on a larger set of data, with the two models competing during the training phase. It greatly increases the learning rate to train with competing models. – pojo-guy Sep 24 '17 at 13:21
  • I am using the same trained model on both platforms, but I am still getting different results. – Nasir Rahim Sep 24 '17 at 13:49
  • Yes, that's normal with neural nets, until you have a sufficient sample size to eliminate sample bias. Every training execution is different. The success of the approach depends on the non-deterministic nature of systems with three or more independent variables, but the training time is quite extensive. – pojo-guy Sep 24 '17 at 15:40

0 Answers