How to find numbers in images and read them?

Question

I have this picture:

and this is my Region of Interest:

which is a number that I would like to recognize and "read".

I don't know why I can't detect it using pytesseract, even though I preprocess it and get this noise-free image:

Here is the configuration I am using to read it:

Only numbers;

One char;

text = pytesseract.image_to_string(number_5, lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')

And still, I just get \n\x0c as an answer.

I would like to ask for some tips on how to recognize images that contain a single character (only digits in this case).

And also a question about number detection. Is there a model that can search for numbers in a photo and return a bounding box of where they are located?

Your question is asking for different things. If you would like support with your `pytesseract` issue, I would suggest uploading a complete minimal working example (including the image) that demonstrates what is failing. In addition, [here](https://paperswithcode.com/task/scene-text-detection) you will find a collection of text detection models. I would assume that they also work for sequences of digits. — André, Oct 13 '21 at 14:22
André, thanks for your comments. I am already taking a look at this collection of text detection models. Regarding the complete minimal working example, it is right there. This number 5, for example, cannot be read by the algorithm. When I try it with a photo downloaded from the internet, it works fine. — Alexandre Tavares, Oct 14 '21 at 09:45

score 2 · Answer 1 · answered Oct 14 '21 at 11:34

One way of detecting the 5 in the image is to mask the image.

You can threshold the HSV image with cv2.inRange. First, we need to find the lower and upper bounds for thresholding. After a few trials, I found that the following values are suitable for recognition.

msk = cv2.inRange(hsv, np.array([0, 0, 175]), np.array([179, 255, 255]))

the lower bound is np.array([0, 0, 175])
the upper bound is np.array([179, 255, 255])
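To make the effect of these bounds concrete, here is a minimal NumPy-only sketch of the per-pixel test that cv2.inRange performs (both bounds inclusive). The helper name `in_range` and the three demo pixels are made up for illustration. Because the H and S ranges span every value OpenCV produces for 8-bit HSV images, these bounds effectively keep only pixels whose V (brightness) is at least 175.

```python
import numpy as np

def in_range(img, lower, upper):
    """Illustrative stand-in for cv2.inRange: 255 where every channel
    of a pixel lies within [lower, upper] (inclusive), otherwise 0."""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    inside = np.all((img >= lower) & (img <= upper), axis=-1)
    return inside.astype(np.uint8) * 255

# Hypothetical 1x3 HSV image: a bright pixel, a dark pixel,
# and a pixel sitting exactly on the lower bound for V.
hsv_demo = np.array([[[10, 200, 230], [10, 200, 100], [0, 0, 175]]], dtype=np.uint8)
mask_demo = in_range(hsv_demo, [0, 0, 175], [179, 255, 255])
print(mask_demo)  # [[255   0 255]]
```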

The result will be:

The number 5 is now clearly visible in the mask.

Next, we can apply the following morphological processing:

krn = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
dlt = cv2.dilate(msk, krn, iterations=1)
thr = 255 - cv2.bitwise_and(dlt, msk)

The result will be:

Now, if we apply Tesseract:

d = pytesseract.image_to_string(thr, config="--psm 10")

The result will be:

Code:

import cv2
import numpy as np
import pytesseract

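# Load the image (cv2.imread returns BGR channel order)
img = cv2.imread("MjfJF.png")

# Convert to HSV so brightness (V) can be thresholded on its own
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Build a binary mask of the bright pixels (V >= 175)
msk = cv2.inRange(hsv, np.array([0, 0, 175]), np.array([179, 255, 255]))
# Dilate with a 5x3 rectangle, AND with the original mask, then invert
krn = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
dlt = cv2.dilate(msk, krn, iterations=1)
thr = 255 - cv2.bitwise_and(dlt, msk)

# OCR: --psm 10 treats the image as a single character
d = pytesseract.image_to_string(thr, config="--psm 10")
print(d)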
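The raw string returned by image_to_string usually carries trailing whitespace and a form feed; the empty result in the question was exactly `\n\x0c`. A small post-processing step, sketched here with a hypothetical helper `extract_digits`, makes it easy to tell "nothing recognized" apart from a real digit:

```python
def extract_digits(ocr_text):
    """Keep only ASCII digits from raw OCR output (hypothetical helper)."""
    return "".join(ch for ch in ocr_text if ch in "0123456789")

print(repr(extract_digits("5\n\x0c")))  # '5'
print(repr(extract_digits("\n\x0c")))   # ''
```

With the code above, this would be called as `extract_digits(d)`; an empty string then means Tesseract found no digit.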

score 0 · Answer 2 · answered Sep 15 '22 at 20:12

I think your image has to be black text on a white background. You could also try another value for the --psm argument.
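As a rough sketch of the polarity idea, the helper below inverts an 8-bit grayscale image when it is mostly dark. The name `ensure_dark_on_light` and the mean-below-127 rule are assumptions for illustration; the rule only makes sense when the background covers most of the image.

```python
import numpy as np

def ensure_dark_on_light(gray):
    """Invert an 8-bit grayscale image if it is mostly dark.

    Assumption: the background dominates the image, so a low mean
    intensity means light text on a dark background.
    """
    if float(np.mean(gray)) < 127:
        return 255 - gray
    return gray

# Hypothetical test image: a small white blob on a black background.
white_on_black = np.zeros((10, 10), dtype=np.uint8)
white_on_black[3:7, 4:6] = 255
fixed = ensure_dark_on_light(white_on_black)
print(fixed[0, 0], fixed[3, 4])  # 255 0
```

The result can then be passed to pytesseract.image_to_string with different page segmentation modes, for example `--psm 10` (single character), `--psm 8` (single word), or `--psm 7` (single text line), to see which one reads the digit.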

Thomas
