How to extract dotted text from image?

Question

I'm working on my bachelor's degree final project and I want to create an OCR for bottle inspection with python. I need some help with text recognition from the image. Do I need to apply the cv2 operations in a better way, train tesseract or should I try another method?

I tried image processing operations on the image and I used pytesseract to recognize the characters.

Using the code bellow I got from this photo:

to this one:

and then to this one:

Sharpen function:

def sharpen(img):
  sharpen = iaa.Sharpen(alpha=1.0, lightness = 1.0)
  sharpen_img = sharpen.augment_image(img)
  return sharpen_img

Image processing code:

textZone = cv2.pyrUp(sharpen(originalImage[y:y + h - 1, x:x + w - 1])) #text zone cropped from the original image

sharp = cv2.cvtColor(textZone, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(sharp, 127, 255, cv2.THRESH_BINARY)

#the functions such as opening are inverted (I don't know why) that's why I did opening with MORPH_CLOSE parameter, dilatation with erode and so on

kernel_open = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
open = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel_open)

kernel_dilate = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,7))
dilate = cv2.erode(open,kernel_dilate)

kernel_close = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 5))
close = cv2.morphologyEx(dilate, cv2.MORPH_OPEN, kernel_close)

print(pytesseract.image_to_string(close))

This is the result of pytesseract.image_to_string:

22203;?!)

92:53 a

The expected result is :

22/03/20

02:53 A

@OznOg I'm new to tesseract and python so I'm not shure if i did it well. I modified the image_to_string call to this: pytesseract.image_to_string(close, config="-c tessedit_char_whitelist=0123456789abcdefghijklmnopqrstuvwxyz/: -psm 6") The result got a little better: 2343/20 92:53 a — Paul Szabo, May 25 '19 at 14:08

score 2 · Answer 1 · answered May 25 '19 at 14:08

From the result you got and the expected result, you can see that some of the characters are recognized correctly. Assuming you are using a different image from that shown in the tutorial, I recommend you to change the values of threshold and getStructuringElement.

These values work better depending on the image color. The tutorial author must have optimized it for his/her use (by trial and error or some other way).

Here is a video if you want to play around with those value using sliders in opencv. You can also print your result in the same loop to see if you are getting the desired result.

Thank you, I will try this out! – Paul Szabo May 25 '19 at 14:16 — Paul Szabo, May 25 '19 at 14:16

score 2 · Accepted Answer · answered May 26 '19 at 02:55

"Do I need to apply the cv2 operations in a better way, train tesseract or should I try another method?"

First, kudos for taking this project on and getting this far with it. What you have from the OpenCV/cv2 standpoint looks pretty good.

Now, if you're thinking of Tesseract to carry you the rest of the way, at the very least you'll have to train it. Here you have a tough choice: Invest in training Tesseract, or work up a CNN to recognize a limited alphabet. If you have a way to segment the image, I'd be tempted to go with the latter.

score 1 · Answer 3 · answered May 26 '19 at 03:03

One potential thing you could do to improve recognition on the characters is to dilate the characters so pytesseract gives a better result. Dilating the characters will connect the individual blobs together and can fix the / or the A characters. So starting with your latest binary image:

Original

Dilate with a 3x3 kernel with iterations=1 (left) or iterations=2 (right). You can experiment with other values but don't do it too much or the characters will all connect. Maybe this will provide a better result with you OCR.

import cv2

image = cv2.imread("1.PNG")
thresh = cv2.threshold(image, 115, 255, cv2.THRESH_BINARY_INV)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
dilate = cv2.dilate(thresh, kernel, iterations=1)
final = cv2.threshold(dilate, 115, 255, cv2.THRESH_BINARY_INV)[1]

cv2.imshow('image', image)
cv2.imshow('dilate', dilate)
cv2.imshow('final', final)
cv2.waitKey(0)

How to extract dotted text from image?

3 Answers3

Linked