I have this picture:

Structure with painted number

and this is my Region of Interest:

which is a number that I would like to recognize and "read".

I can't detect it using pytesseract, even though I preprocess it and get this noise-free image:

preprocessed and binary image

Here is the configuration I am using to read it:

  1. Only numbers;

  2. One char;

    text = pytesseract.image_to_string(number_5, lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
    

And still, I just get `'\n\x0c'` as the result.

I would like to ask for some tips on how to recognize images that contain a single character (only digits in this case).

I also have a question about number detection: is there a model that can search for numbers in a photo and return the bounding boxes of where they are located?

Ahmet
Alexandre Tavares
  • Your question is asking for different things. If you would like support with your `pytesseract` issue, I would suggest uploading a complete minimal working example (including the image) that demonstrates what is failing. In addition, [here](https://paperswithcode.com/task/scene-text-detection) you will find a collection of text detection models. I would assume that they also work for sequences of digits. – André Oct 13 '21 at 14:22
  • André, thanks for your comments. I am already taking a look at this collection of text detection models. Regarding the complete minimal working example, it is right there. This number 5, for example, cannot be read by the algorithm. When I try it with a photo downloaded from the internet, it works fine. – Alexandre Tavares Oct 14 '21 at 09:45

2 Answers

One way to detect the 5 in the image is to mask the image.

You can threshold the image with inRange. First, we need to find the lower and upper bounds for the threshold. After a few trials, I found that the following values are suitable for recognition.

msk = cv2.inRange(hsv, np.array([0, 0, 175]), np.array([179, 255, 255]))
  • the lower bound is np.array([0, 0, 175])
  • the upper bound is np.array([179, 255, 255])

In the resulting mask, the number 5 is clearly visible.
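To make explicit what the mask keeps, here is a NumPy-only sketch of the same inclusive range test. `in_range` is a hypothetical helper written for illustration, not an OpenCV function, and the two sample pixels are made up. With these bounds, hue and saturation are unconstrained, so the mask simply keeps pixels whose V (brightness) channel is at least 175.

```python
import numpy as np

def in_range(hsv, lower, upper):
    """Illustrative NumPy version of an inclusive per-channel range test."""
    hsv = np.asarray(hsv)
    inside = (hsv >= np.asarray(lower)) & (hsv <= np.asarray(upper))
    # A pixel is kept only if all three channels lie inside the bounds.
    return np.where(inside.all(axis=-1), 255, 0).astype(np.uint8)

# Two made-up HSV pixels: a bright one (V=200) and a darker one (V=90).
pixels = np.array([[[20, 40, 200], [20, 40, 90]]], dtype=np.uint8)
mask = in_range(pixels, (0, 0, 175), (179, 255, 255))
print(mask.tolist())  # [[255, 0]]
```

If the digit in another photo is painted darker or lighter, the V lower bound (175 here) is the value I would expect to need retuning.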

Next, dilate the mask, AND it with the original mask, and invert the result so that the digit is dark on a light background.

krn = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
dlt = cv2.dilate(msk, krn, iterations=1)
thr = 255 - cv2.bitwise_and(dlt, msk)

Now apply Tesseract to the inverted image:

d = pytesseract.image_to_string(thr, config="--psm 10")

The result will be:

5

Code:

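import cv2
import numpy as np
import pytesseract

# Load the image (cv2.imread returns None if the file cannot be read)
img = cv2.imread("MjfJF.png")

# Convert from BGR to HSV
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Binary mask of the bright pixels (V >= 175)
msk = cv2.inRange(hsv, np.array([0, 0, 175]), np.array([179, 255, 255]))

# Dilate, AND with the original mask, then invert (dark digit, light background)
krn = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
dlt = cv2.dilate(msk, krn, iterations=1)
thr = 255 - cv2.bitwise_and(dlt, msk)

# OCR, treating the image as a single character (--psm 10)
d = pytesseract.image_to_string(thr, config="--psm 10")
print(d)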
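
The returned string usually carries trailing whitespace, and the `\n\x0c` from the question is what is left when nothing was recognized at all. A minimal sketch for telling the two cases apart, where `clean_digit` is a hypothetical helper name and not part of pytesseract:

```python
def clean_digit(raw):
    """Return the single digit found in raw OCR output, or None."""
    digits = [ch for ch in raw if ch in "0123456789"]
    # Accept the result only if exactly one digit was recognized.
    return digits[0] if len(digits) == 1 else None

print(clean_digit("5\n\x0c"))  # 5
print(clean_digit("\n\x0c"))   # None
```

Returning None for an empty or ambiguous result makes it easy to fall back to different preprocessing instead of silently accepting a wrong digit.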
Ahmet

I think your image has to be black text on a white background. You could also try another value for the `--psm` argument.
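
A rough sketch of both suggestions, under some assumptions: the input is a 0/255 binary image, the background covers more area than the digit, and the list of page segmentation modes to try is my own guess rather than a recommendation from the Tesseract docs. `to_black_on_white` and `read_digit` are hypothetical helper names.

```python
import numpy as np

def to_black_on_white(binary):
    """Invert a 0/255 image if it looks like light text on a dark background."""
    binary = np.asarray(binary, dtype=np.uint8)
    # Assumption: the background dominates, so a low mean means a dark background.
    if binary.mean() < 127:
        return 255 - binary
    return binary

def read_digit(binary, psm_modes=(10, 8, 7, 6, 13)):
    """Try several --psm values and return the first non-empty result."""
    import pytesseract  # imported here so the helper above works without it

    img = to_black_on_white(binary)
    for psm in psm_modes:
        cfg = f"--psm {psm} -c tessedit_char_whitelist=0123456789"
        text = pytesseract.image_to_string(img, config=cfg).strip()
        if text:
            return text, psm
    return None, None
```

Usage would be something like `text, psm = read_digit(your_binary_image)`. Whether any of these modes succeeds still depends on how clean the preprocessed digit is.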

Thomas