My code is as follows:

import pytesseract
from PIL import Image

pytesseract.pytesseract.tesseract_cmd = 'B:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe'

img = Image.open("sample.png")
text = pytesseract.image_to_string(img, lang="eng")
print(text)

The error I get is:

Traceback (most recent call last):
  File "C:/PY/tesseract test.py", line 11, in <module>
    text = pytesseract.image_to_string(img, lang="eng")
  File "C:\PY\lib\site-packages\pytesseract\pytesseract.py", line 346, in image_to_string
    return {
  File "C:\PY\lib\site-packages\pytesseract\pytesseract.py", line 349, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "C:\PY\lib\site-packages\pytesseract\pytesseract.py", line 260, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\PY\lib\site-packages\pytesseract\pytesseract.py", line 236, in run_tesseract
    raise TesseractError(proc.returncode, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Error opening data file \\Program Files (x86)\\Tesseract-OCR\\eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'eng\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')

I have searched for other solutions but cannot find anything that works.

e102

2 Answers

0

I'm not familiar with Tesseract in Python, but Tesseract needs to load the eng.traineddata file to work, and the error says it cannot open it. Add a TESSDATA_PREFIX entry to your environment variables and point it to the folder that contains eng.traineddata.
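If you would rather not edit the system environment variables, something along these lines might work from inside the script. It is only a sketch I have not run against your install: configure_tessdata is a hypothetical helper name, and the tessdata path in the comments is my guess based on the tesseract_cmd path in your question. Also note that, as far as I know, Tesseract 4 and later expect TESSDATA_PREFIX to be the tessdata folder itself, while 3.x releases expect its parent folder, so you may need to adjust.

```python
import os
from pathlib import Path


def configure_tessdata(tessdata_dir, lang="eng"):
    """Point TESSDATA_PREFIX at tessdata_dir after checking the model is there.

    Hypothetical helper: it only sets the variable for the current process
    (and any tesseract subprocess started from it afterwards).
    """
    tessdata_path = Path(tessdata_dir)
    model_file = tessdata_path / f"{lang}.traineddata"

    # Fail early with a clear message instead of a TesseractError later.
    if not model_file.is_file():
        raise FileNotFoundError(f"{model_file} not found; check the tessdata folder")

    os.environ["TESSDATA_PREFIX"] = str(tessdata_path)
    return model_file


# Assumed usage, call it before pytesseract.image_to_string(...).
# The folder below is a guess, use wherever eng.traineddata actually lives:
# configure_tessdata(r"B:\Program Files (x86)\Tesseract-OCR\tessdata")
```

The existence check is there because the original error does not make it obvious whether the variable is wrong or the file is simply missing.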

You may want to look at this answer, which looks similar to your case: pytesseract Failed loading language 'eng'

Diogo Andrade
0

I fixed this issue by uninstalling tesseract and installing an older version (3.0.2). So far I haven't noticed any functionality loss. I'm personally just happy that it works.

e102