EDIT: I forgot to process the image which solves the reading issue, thanks to Nathancy. Still wondering what makes Tesseract read only the top OR the bottom line of an unprocessed image (same image, two different outcomes)
Orignal:
I have an image that contains two lines of text:
random test image for pytesseract
When I open the image within python (IDLE Python 3.6) with PIL Image and use pytesseract to extract a string, it only extracts the last/bottom line correctly. The upper line of text is scrambled garbage.(see code section below)
However, when I use opencv to open the image and use pytesseract to extract a string, it only extracts the top/upper line correctly whilst making a mess of the second/bottom line of text.(see also code section below)
Here is the code:
>>> from PIL import Image, ImageFilter
>>> import pytesseract
>>> pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
>>> import cv2
>>> img = Image.open(r"C:\Users\user\MyImage.png")
>>> img2 = cv2.imread(r"C:\Users\user\MyImage.png", cv2.IMREAD_COLOR)
>>> print(pytesseract.image_to_string(img2))
Pet Sock has 448/600 HP left
A ae eee PER eats ae
>>> print(pytesseract.image_to_string(img))
Le TL
JHE has 329/350 HP left.
When I use pytesseract.image_to_boxes
on both img
and img2
it will show the same bounding box for certain locations with a different letter (only showing 2 extracted lines which contain an identical box)
>>> print(pytesseract.image_to_boxes(img2))
A 4 6 10 16 0
>>> print(pytesseract.image_to_boxes(img))
J 4 6 10 16 0
When I use the pytesseract.image_to_data
on both img
and img2
it shows very high (95+) confidence on the line it reads correctly and very low (30-) on the garbled line.
Excel table output of image_to_data
edit: excel tables are img2 and img accordingly
I fiddled around with the psm config values (I have tried them all) and except for creating more garbage on settings: 5, 7, 8, 9, 10, 13; and some giving an error: 0, 2; it gave no different results than the default (which is 3 I believe)
I must be making some rookie mistake but I can't get my head around why this is happening. If anyone can shine a light in the right direction it would be awesome.
The image was just a fitting, but random, image for an OCR test that I had laying around. No further intentions than experimenting with pytesseract.