
I have installed Anaconda 2018.12 (Python 3.7 version). I am trying to test out the pytesseract module but I keep encountering:

TesseractNotFoundError: C:\Program Files (x86)\Tesseract-OCR\tesseract.exe is not installed or it's not in your path

I have done:

  • pip install Pillow (already installed it says)
  • pip install pytesseract (successful)
  • Tried to set the tesseract_cmd to the location of tesseract (but I can't find it)

I have searched for tesseract.exe but cannot find it anywhere on the system, so I'm struggling to understand how to reference or import the module in a Jupyter notebook if it has already been bundled into Anaconda.

The code I'm trying to run is:

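from PIL import Image
import pytesseract

# Path to the Tesseract executable (raw string, so single backslashes)
#pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"

# Raw string so the backslashes in the Windows path are taken literally
text = pytesseract.image_to_string(Image.open(r'C:\Temp\IMG_1519.jpg'))

print(text)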

I'm hoping it's simple user error but any assistance would be gratefully received. Many thanks, Ben

FlyingTeller
user7925487

1 Answer


Quoting from the PyPI page:

Python-tesseract is a wrapper for Google’s Tesseract-OCR Engine.

and (under prerequisites):

Install Google Tesseract OCR (additional info how to install the engine on Linux, Mac OSX and Windows)

This means that pytesseract is not a standalone module. It is a Python wrapper around Google's Tesseract-OCR Engine, which you need to install separately. That is why tesseract.exe cannot be found on your system: pip only installed the wrapper, not the engine.
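Once the engine is installed, something along these lines can check whether Python can actually see it before calling pytesseract. This is only a rough sketch: `find_tesseract`, `configure_pytesseract` and the candidate paths are hypothetical names and guesses of mine (the paths are common default install locations, not something taken from your system), so adjust them to wherever the installer put tesseract.exe.

```python
import os
import shutil

# Hypothetical candidate paths: typical default install locations on Windows.
# Change these to match the actual install directory.
DEFAULT_CANDIDATES = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
]


def find_tesseract(candidates=DEFAULT_CANDIDATES):
    """Return a path to the tesseract executable, or None if none is found."""
    # First choice: the engine is already on the PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the explicit candidate locations.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract():
    """Point pytesseract at the engine, or fail with a clear message."""
    import pytesseract  # imported here so find_tesseract works without it

    path = find_tesseract()
    if path is None:
        raise FileNotFoundError(
            "Tesseract-OCR engine not found; install it separately first."
        )
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Calling `configure_pytesseract()` once at the top of the notebook, before `pytesseract.image_to_string(...)`, should either set `tesseract_cmd` or tell you straight away that the engine itself is still missing.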

FlyingTeller
  • FlyingTeller: Thank you - all working now...I think the term I need to learn is RTFM :) Appreciate the prompt response! – user7925487 Feb 13 '19 at 15:38