I've seen a lot of other people getting this error, and I've tried a lot of different things to fix it. Nothing so far has worked. I have:

  • Added the path to my Tesseract-OCR folder AND the tesseract.exe file to PATH
  • Added an environment variable called TESSDATA_PREFIX which leads to the Tesseract-OCR folder
  • Replaced the eng.traineddata file a couple of times
  • Added `pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"` to the program
  • Tried running JUST the quickstart file instead of the program I'm running it in

and none of it has changed the error. At this point, I'll take any suggestion. The full error is as follows.

  File "pytesseract should work please.py", line 12, in <module>
    print(pytesseract.image_to_string(Image.open('text.png')))
  File "C:\Users\matth\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pytesseract\pytesseract.py", line 309, in image_to_string
    }[output_type]()
  File "C:\Users\matth\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pytesseract\pytesseract.py", line 308, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "C:\Users\matth\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pytesseract\pytesseract.py", line 218, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Users\matth\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pytesseract\pytesseract.py", line 194, in run_tesseract
    raise TesseractError(status_code, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Error opening data file \\Program Files (x86)\\Tesseract-OCR\\tessdata/eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language \'eng\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')
Sophie Snoww
  • `TESSDATA_PREFIX` should lead to the folder with traineddata files (eg eng.traineddata) – Dmitrii Z. Jan 27 '19 at 08:00
  • Really? The error says `Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.` I did switch it, and it still didn't fix it or change the error. – Sophie Snoww Jan 27 '19 at 15:19
  • Yeah, you are right, it is tessdata, not traineddata folder. I wanted to point out that it might not be your "Tesseract-OCR" folder as mentioned in question. – Dmitrii Z. Jan 27 '19 at 20:26
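
To check what the comments above are discussing, here is a rough sketch that looks for the language file relative to a given prefix. It tests both layouts (prefix is the tessdata folder itself, or prefix is its parent), since as far as I understand which one Tesseract expects depends on the version. The helper name `find_traineddata` is made up for this example and is not part of pytesseract.

```python
import os
from pathlib import Path


def find_traineddata(prefix, lang="eng"):
    """Return the paths under `prefix` where <lang>.traineddata actually exists.

    Hypothetical helper: it only inspects the file system and does not
    call Tesseract or pytesseract.
    """
    base = Path(prefix)
    candidates = [
        base / f"{lang}.traineddata",               # prefix is the tessdata folder
        base / "tessdata" / f"{lang}.traineddata",  # prefix is its parent folder
    ]
    return [p for p in candidates if p.is_file()]


if __name__ == "__main__":
    prefix = os.environ.get("TESSDATA_PREFIX")
    if prefix is None:
        print("TESSDATA_PREFIX is not set")
    else:
        hits = find_traineddata(prefix)
        print("TESSDATA_PREFIX =", prefix)
        print("found:", [str(p) for p in hits] or "nothing")
```

If this prints "nothing", the variable probably points at the wrong folder, or the file name is misspelled.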

2 Answers

I fixed this issue by fully uninstalling pytesseract and installing an older version (3.2, I think). So far I haven't noticed any loss of functionality. I'm personally just happy that it works.

Sophie Snoww

Try these steps.

Step 1: Use '/' instead of '\' in the path to the executable. Change

    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"

to

    pytesseract.pytesseract.tesseract_cmd = r"C:/Program Files (x86)/Tesseract-OCR/tesseract.exe"

Step 2: Set the tessdata location (what TESSDATA_PREFIX would otherwise provide) with a `--tessdata-dir` config string:

    tessdata_dir_config = r'--tessdata-dir "C:/Program Files (x86)/Tesseract-OCR/tessdata"'

Step 3: Pass that config when extracting the text:

    pytesseract.image_to_string(Image.open('text.png'), lang='eng', config=tessdata_dir_config)
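
Put together, the three steps might look like the sketch below. The function names `tessdata_config` and `ocr_image` are made up for this example. The paths are the ones from the question and are assumed to match your install. I have not verified that this resolves the error on every setup.

```python
from pathlib import PureWindowsPath


def tessdata_config(tessdata_dir):
    """Build a --tessdata-dir option with forward slashes (steps 1 and 2)."""
    # PureWindowsPath treats both '\' and '/' as separators on any OS.
    posix_style = PureWindowsPath(tessdata_dir).as_posix()
    return f'--tessdata-dir "{posix_style}"'


def ocr_image(image_path, tesseract_cmd, tessdata_dir, lang="eng"):
    """Run OCR with an explicit executable and tessdata folder (step 3)."""
    # Imported here so tessdata_config can be used without these packages.
    import pytesseract
    from PIL import Image

    pytesseract.pytesseract.tesseract_cmd = tesseract_cmd
    with Image.open(image_path) as img:
        return pytesseract.image_to_string(
            img, lang=lang, config=tessdata_config(tessdata_dir)
        )
```

Usage would be something like `ocr_image('text.png', r"C:/Program Files (x86)/Tesseract-OCR/tesseract.exe", r"C:/Program Files (x86)/Tesseract-OCR/tessdata")`.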

  • Please fix the formatting on this; it's quite hard to read. Use code blocks on code, an extra blank line between paragraphs, etc. – Ryan M Mar 01 '23 at 16:27