5

I am building a character identifier from an image using Tesseract and Python.

This is my code:

from PIL import Image
import pytesseract as pyt
     
image_file = 'location'
im = Image.open(image_file)
text = pyt.image_to_string(image_file)
print (text)

I am getting the following error while executing this program:

TypeError: Unsupported image object

Can anyone solve this issue?

סטנלי גרונן
Srikanth

3 Answers

3

First, remember to add the line

 pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'

where C:/Program Files/Tesseract-OCR/tesseract.exe is the path where Tesseract is installed on your machine. (With the question's import pytesseract as pyt alias, the attribute is pyt.pytesseract.tesseract_cmd.) You have given the image file as a string, which is fine, but the file name has no extension. For example, you would write image_file = 'location.png', or use .jpeg or whatever format your image actually has. Then, instead of text = pyt.image_to_string(image_file), write text = pyt.image_to_string(im), because you should pass the opened image and not the path string. The rest of the code is fine.

Note: You may need to give the full path to the image, for example 'C:/Users/Dismas/Desktop/opencv-python/image_text.png'.
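The two path checks described above (missing extension, wrong location) can be sketched with the standard library before any OCR call is made. This is only an illustration: check_image_path is a hypothetical helper name, not part of Pillow or pytesseract, and 'location' is the placeholder from the question.

```python
from pathlib import Path


def check_image_path(image_file):
    """Return a list of likely problems with an image path (empty if none found)."""
    path = Path(image_file)
    problems = []
    if not path.suffix:
        problems.append("no file extension (expected something like .png or .jpeg)")
    if not path.is_file():
        problems.append(f"file not found: {path.resolve()}")
    return problems


# 'location' is the placeholder name from the question, not a real file.
for problem in check_image_path("location"):
    print(problem)
```

An empty list only means the path looks plausible; it does not guarantee that the file is a readable image.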

If you still get the same error, follow the guide "How to install tesseract OCR". I followed its steps exactly; I had a similar problem, and it is now resolved. [Image: screenshot]

Abhi
Dismas
2
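from PIL import Image
import pytesseract as pyt

image_file = 'location'  # path to the image file (placeholder)
im = Image.open(image_file)  # open the file as a PIL Image object
text = pyt.image_to_string(im)  # pass the Image object, not the path string
print(text)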

You are passing the path string instead of the image itself. To fix it, change the line text = pyt.image_to_string(image_file) to text = pyt.image_to_string(im), as in the code above.
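One way to avoid this mix-up is a small wrapper that accepts either a path or an already opened image and always hands an image object to Tesseract. This is a minimal sketch, assuming Pillow and pytesseract are installed; to_image and ocr are hypothetical helper names, and the imports are deferred into the functions only so the helpers can be defined without those packages present.

```python
import os


def to_image(source):
    """Return an image object, opening `source` first if it is a file path."""
    if isinstance(source, (str, os.PathLike)):
        # Imported lazily so the helper can be defined without Pillow present.
        from PIL import Image
        return Image.open(source)
    # Already an image object (for example a PIL Image): pass it through unchanged.
    return source


def ocr(source):
    """Run Tesseract on a path or an image object and return the recognized text."""
    import pytesseract as pyt
    return pyt.image_to_string(to_image(source))
```

With this in place, ocr('location.png') and ocr(im) would both reach image_to_string with an image object; 'location.png' is again a placeholder file name.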

0

I faced the same problem and did not want to give up, so I tried installing several Tesseract builds. What worked was downloading and installing a recent build from https://github.com/UB-Mannheim/tesseract/wiki. After installing it, you must also install the Python package from the command prompt (pip install pytesseract or pip install --user pytesseract). There is no need to add the path to the environment variables; I just set the Tesseract path directly in the code.

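import cv2
import pytesseract

# Point pytesseract at the installed Tesseract executable.
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\Lenovo\AppData\Local\Tesseract-OCR\tesseract'

img = cv2.imread("image file")  # placeholder path; imread returns None if the file cannot be read
img = cv2.resize(img, (600, 500))
cv2.imshow("originalimage", img)
# OpenCV loads images as BGR, while pytesseract assumes RGB, so convert first.
text = pytesseract.image_to_string(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
print(text)
cv2.waitKey(0)  # wait for a key press before closing the window
cv2.destroyAllWindows()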

Note: The program runs only if the installations above were completed correctly.
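Because a hard-coded install path is specific to one machine, it can help to look the executable up at runtime and fail with a clear message if it is missing. The sketch below uses only the standard library; find_tesseract is a hypothetical helper name, and the candidate path is the install location quoted in the first answer, which may not match your system.

```python
import shutil
from pathlib import Path


def find_tesseract(extra_candidates=()):
    """Return the path of a Tesseract executable, or None if none is found."""
    # First choice: whatever 'tesseract' resolves to on the PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise try explicit install locations supplied by the caller.
    for candidate in extra_candidates:
        if Path(candidate).is_file():
            return str(candidate)
    return None


cmd = find_tesseract(["C:/Program Files/Tesseract-OCR/tesseract.exe"])
if cmd is None:
    print("Tesseract executable not found; install it or pass its full path.")
else:
    print("Using Tesseract at", cmd)
    # With pytesseract installed, the result would then be assigned like this:
    # pytesseract.pytesseract.tesseract_cmd = cmd
```

Checking the PATH first is a design choice of this sketch; if you prefer the explicit path, pass it as the only candidate and skip the PATH lookup.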

Michael M.
Durga