0

While trying to install and use Tesseract on Windows 10 with Python via pytesseract, I get the following error:

  File "C:\ProgramData\Anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 194, in run_tesseract
    raise TesseractError(status_code, get_errors(error_string))

TesseractError: (1, 'Error opening data file \\Program Files (x86)\\Tesseract-OCR\\eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'eng\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')

I tried reinstalling Tesseract. I have added C:\Program Files (x86)\Tesseract-OCR to the PATH environment variable, I have set TESSDATA_PREFIX to C:\Program Files (x86)\Tesseract-OCR\tessdata, and I have verified that typing 'tesseract' in CMD works.

The code I use:

import cv2
import pytesseract


# Provide the path to the tesseract executable manually
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"

# Define config parameters.
# '-l eng'  for using the English language
# '--oem 1' for using LSTM OCR Engine
config = ('-l eng --oem 1 --psm 3')

# Read image from disk
im = cv2.imread("Serie1/NL83LHL9.JPG", cv2.IMREAD_COLOR)

# Run tesseract OCR on image
text = pytesseract.image_to_string(im, config=config)
# Print recognized text
print(text)

Results:

Running tesseract in CMD shows the Tesseract command-line interface.

tretron
  • Indeed it looks a bit odd. One thing you can try is to add tessdata path to your config - `config = r'--tessdata-dir "C:\Program Files (x86)\Tesseract-OCR\tessdata" -l eng --oem 1 --psm 3'` – Dmitrii Z. Mar 28 '19 at 08:41
  • at the risk of sounding inexperienced: which of the many config files I have should I add this to? – tretron Mar 28 '19 at 12:12
  • You have line `config = ('-l eng --oem 1 --psm 3')`. Replace it with the one which I suggested. – Dmitrii Z. Mar 28 '19 at 12:17
  • That did do the trick! thanks a lot for your help. – tretron Mar 29 '19 at 08:20

2 Answers

1

Solved by Dmitrii Z. in the comments:

Indeed it looks a bit odd. One thing you can try is to add the tessdata path to your config: `config = r'--tessdata-dir "C:\Program Files (x86)\Tesseract-OCR\tessdata" -l eng --oem 1 --psm 3'`
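
For reference, here is a minimal sketch of how the `--tessdata-dir` suggestion could be wired into the script from the question. The helper names (`build_config`, `has_language`, `ocr_image`) are hypothetical and only for illustration, and the install path is the one quoted in the question, so adjust it to your machine. The file check is there because the error complains about a missing `eng.traineddata`.

```python
from pathlib import Path

# Assumed install location, copied from the question; change it if yours differs.
TESSDATA_DIR = Path(r"C:\Program Files (x86)\Tesseract-OCR\tessdata")


def build_config(tessdata_dir, lang="eng", oem=1, psm=3):
    """Build a pytesseract config string that names the tessdata directory."""
    # The directory is quoted because the path contains spaces.
    return f'--tessdata-dir "{tessdata_dir}" -l {lang} --oem {oem} --psm {psm}'


def has_language(tessdata_dir, lang="eng"):
    """Return True if <lang>.traineddata exists inside tessdata_dir."""
    return (Path(tessdata_dir) / f"{lang}.traineddata").is_file()


def ocr_image(image_path, tessdata_dir=TESSDATA_DIR):
    """Run OCR on one image (hypothetical wrapper around the question's code)."""
    # Imported here so the helpers above work without cv2/pytesseract installed.
    import cv2
    import pytesseract

    if not has_language(tessdata_dir):
        raise FileNotFoundError(f"eng.traineddata not found in {tessdata_dir}")
    im = cv2.imread(str(image_path), cv2.IMREAD_COLOR)
    return pytesseract.image_to_string(im, config=build_config(tessdata_dir))
```

Calling `ocr_image("Serie1/NL83LHL9.JPG")` would then replace the last lines of the original script. Checking for the file first turns a confusing Tesseract failure into a clear Python error when the directory is wrong.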

tretron
-1

If you don't have the tesseract executable in your PATH, include the following:

 pytesseract.pytesseract.tesseract_cmd = r'C:/Program Files (x86)/Tesseract-OCR/tesseract'
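
A small sketch of how one might decide whether that line is needed at all: look for the executable on PATH first and fall back to a fixed location. The helper name `find_tesseract` and the default path are assumptions for illustration; the path is the one from the question.

```python
import shutil
from pathlib import Path

# Assumed fallback location, taken from the question; adjust as needed.
DEFAULT_EXE = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallback=DEFAULT_EXE):
    """Return a usable tesseract path, or None if none can be found."""
    # shutil.which searches the directories listed in PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise use the fallback, but only if the file really exists.
    if Path(fallback).is_file():
        return fallback
    return None
```

The result, if not None, can be assigned to `pytesseract.pytesseract.tesseract_cmd`.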
pranita
  • That is the 6th line of the code I posted with the problem. The problem was solved by Dmitrii Z., though. – tretron Apr 29 '19 at 11:46