net.sourceforge.tess4j is throwing wrong results when reading data from image

Question

I am trying to work with OCR (Optical Character Recognition). I have a sample image and I want to read the text from it. Below is my sample image file.

I used the Tess4J API to read the text from the image. Here is the code:

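    import java.io.File;

    import net.sourceforge.tess4j.ITesseract;
    import net.sourceforge.tess4j.Tesseract;
    import net.sourceforge.tess4j.TesseractException;

    // The class name is illustrative; the original snippet showed only the two methods.
    public class OcrDemo {

        // Runs OCR on the image at filePath and returns the recognized text.
        public static String crackImage(String filePath) {
            File imageFile = new File(filePath);
            ITesseract instance = new Tesseract();
            instance.setLanguage("eng");
            try {
                return instance.doOCR(imageFile);
            } catch (TesseractException e) {
                System.err.println(e.getMessage());
                return "Error while reading image";
            }
        }

        public static void main(String[] args) {
            String results = crackImage("D:\\data\\testImage.PNG");
            System.out.print(results);
        }
    }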

Below is the dependency I have in my pom.xml file.

    <dependencies>
        <dependency>  
            <groupId>net.sourceforge.tess4j</groupId>  
            <artifactId>tess4j</artifactId>  
            <version>3.2.1</version>  
        </dependency>
    </dependencies>

I have also created the tessdata\eng.traineddata structure in my project directory.

When I run the code, it executes without errors, but the result is wrong (it may be in a different language), as shown below.

Creale a Voumhe metauzoa mwwer usmg szz

I am not sure why this text is printed as the result, even though I explicitly set the language to English. Can someone help me solve this issue?

Looks like the image requires some pre-processing. Converting it to grayscale or B/W may help. - nguyenq, Jul 18 '17 at 22:15
@nguyenq: I tried several images and all of them give similar results, but I haven't tried converting the image to grayscale or B/W yet. I will give it a try. Thanks. Can you suggest some examples of how to preprocess the image? - Manindar, Jul 19 '17 at 04:48
Try the `ImageHelper.convertImageToGrayscale` method provided by Tess4J. - nguyenq, Jul 19 '17 at 20:11
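
To make the preprocessing suggestion from the comments concrete, here is a minimal sketch of grayscale and black-and-white conversion using only the JDK, as a stand-in for `ImageHelper.convertImageToGrayscale`. The class name `OcrPreprocess` and its two methods are hypothetical helpers, not part of Tess4J, and whether either conversion improves the output for this particular image is untested.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper class (not part of Tess4J): plain-JDK image conversions
// that can be applied before handing an image to an OCR engine.
public class OcrPreprocess {

    // Redraws the source onto an 8-bit grayscale canvas of the same size.
    public static BufferedImage toGrayscale(BufferedImage src) {
        return redraw(src, BufferedImage.TYPE_BYTE_GRAY);
    }

    // Redraws the source onto a 1-bit black/white canvas of the same size.
    // The JDK decides which pixels become black or white; no custom threshold is applied.
    public static BufferedImage toBlackAndWhite(BufferedImage src) {
        return redraw(src, BufferedImage.TYPE_BYTE_BINARY);
    }

    // Copies src into a new image of the given BufferedImage type.
    private static BufferedImage redraw(BufferedImage src, int targetType) {
        BufferedImage out = new BufferedImage(src.getWidth(), src.getHeight(), targetType);
        Graphics2D g = out.createGraphics();
        try {
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose(); // release the graphics context even if drawing fails
        }
        return out;
    }
}
```

Assuming the image is loaded with `ImageIO.read(new File(filePath))`, the converted `BufferedImage` could then be passed to the `doOCR(BufferedImage)` overload of `ITesseract` instead of the `File` overload used in the question.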
