"Unsupported image object", using Tesseract

Question

I am building a character identifier from an image using Tesseract and Python.

This is my code:

from PIL import Image
import pytesseract as pyt
     
image_file = 'location'
im = Image.open(image_file)
text = pyt.image_to_string(image_file)
print (text)

I am getting the following error while executing this program:

TypeError: Unsupported image object

Can anyone solve this issue?

Thank you, that worked. Tesseract's accuracy is very low, though. Is there any other method to identify characters in an image in Python? — Srikanth, Jul 16 '18 at 14:49

score 3 · Answer 1 · edited Apr 16 '21 at 22:01

First, remember to add the line

pyt.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'

where C:/Program Files/Tesseract-OCR/tesseract.exe is the location where Tesseract is installed on your machine (since your code uses import pytesseract as pyt, the module is reached through pyt). Giving the image location as a string is fine, but your file name has no extension. You should write, for example, image_file = 'location.png', or use .jpeg or whatever format your image actually has. Then, instead of text = pyt.image_to_string(image_file), write text = pyt.image_to_string(im), because image_to_string should be given the opened image, not the path string. The rest of the code is fine.

Note: You may need to specify the exact location of the image; for example 'C:/Users/Dismas/Desktop/opencv-python/image_text.png'

If you still get the same error, follow the guide "How to install tesseract OCR". I followed its steps exactly as written; I had a problem similar to yours, and it is now resolved.
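As a minimal sketch of the points in this answer, the helper below checks that the path has an extension and points to an existing file, then passes the opened image object (not the path string) to pytesseract. The names checked_image_path and ocr_file are hypothetical and not part of pytesseract. The Pillow and pytesseract imports are deferred so that the path check can be used on its own.

```python
from pathlib import Path


def checked_image_path(path_str):
    """Return an absolute Path, or raise if the name has no extension or the file is missing.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    path = Path(path_str).expanduser().resolve()
    if not path.suffix:
        raise ValueError(f"{path_str!r} has no file extension (e.g. .png, .jpg)")
    if not path.is_file():
        raise FileNotFoundError(f"No image file at {path}")
    return path


def ocr_file(path_str):
    """Open the image with Pillow and pass the image object to Tesseract."""
    # Imported here so that checked_image_path works without these packages.
    from PIL import Image
    import pytesseract

    path = checked_image_path(path_str)
    with Image.open(path) as im:
        # Pass the Image object, not the path string.
        return pytesseract.image_to_string(im)
```

For example, ocr_file('C:/Users/Dismas/Desktop/opencv-python/image_text.png') would return the recognized text, assuming Tesseract itself is installed and that file exists.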

Berk Gaffaroğlu · Answer 2 · 2021-12-03T00:11:28.270

2

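from PIL import Image
import pytesseract as pyt

image_file = 'location'  # path to the image file
im = Image.open(image_file)  # load the file into a PIL Image object
text = pyt.image_to_string(im)  # pass the Image object, not the path string
print(text)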

You are passing the path string, not the image itself. To fix it, change the line text = pyt.image_to_string(image_file) to text = pyt.image_to_string(im). It should then work.

edited Dec 03 '21 at 00:11

answered Apr 16 '21 at 22:16


score 0 · Answer 3 · edited Oct 28 '22 at 18:04

I ran into the same problem and did not want to give up on it, so I tried several Tesseract builds. A recent version of Tesseract has to be installed correctly, for example from https://github.com/UB-Mannheim/tesseract/wiki. After installing it, you also need to install the Python package from the command prompt (pip install pytesseract or pip install --user pytesseract). There is no need to add the path to the environment variables; I just set the Tesseract path directly in the code.

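import cv2
import pytesseract

# Point pytesseract at the Tesseract executable directly (no PATH entry needed)
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\Lenovo\AppData\Local\Tesseract-OCR\tesseract'

img = cv2.imread("image file")  # replace with the path to your image
if img is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError("cv2.imread could not read the image")
img = cv2.resize(img, (600, 500))
cv2.imshow("originalimage", img)

# OpenCV loads images as BGR, while pytesseract assumes RGB
text = pytesseract.image_to_string(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
print(text)
cv2.waitKey(0)
cv2.destroyAllWindows()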

Note: If the installations above were done correctly, the program should run.
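To check the installation before running OCR, a rough sketch like the following may help. find_tesseract and configure_pytesseract are hypothetical helpers, not pytesseract functions. The first prefers an explicitly given executable path and otherwise falls back to whatever tesseract is found on PATH. The second assumes pytesseract is installed and uses its get_tesseract_version() call to confirm that the binary actually runs.

```python
import shutil
from pathlib import Path


def find_tesseract(explicit_path=None):
    """Return a tesseract executable path as a string, or None if none is found.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    if explicit_path is not None and Path(explicit_path).is_file():
        return str(explicit_path)
    # Fall back to searching the PATH environment variable.
    return shutil.which("tesseract")


def configure_pytesseract(explicit_path=None):
    """Point pytesseract at the executable and return the version it reports."""
    import pytesseract  # deferred so find_tesseract works without it

    cmd = find_tesseract(explicit_path)
    if cmd is None:
        raise FileNotFoundError(
            "Tesseract executable not found; install it or pass its full path."
        )
    pytesseract.pytesseract.tesseract_cmd = cmd
    return pytesseract.get_tesseract_version()
```

Calling configure_pytesseract() with the full path to your own tesseract executable (on Windows, the file name including .exe, since is_file() checks the literal name) should return the installed version; if it raises instead, the installation step is the part to revisit.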
