Questions tagged [pytesser]

PyTesser is an Optical Character Recognition module for Python. It takes as input an image or image file and outputs a string.

PyTesser uses the Tesseract OCR engine: it converts images to an accepted format and calls the Tesseract executable as an external program. A Windows executable is provided along with the Python scripts, which should also work on other operating systems.

http://code.google.com/p/pytesser/
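
The approach described above (hand an image file to the Tesseract executable, then read back the text it writes) can be sketched roughly as follows. This is not PyTesser's actual source, only a minimal illustration. The helper names `build_tesseract_command` and `image_file_to_string` are hypothetical, and the sketch assumes a `tesseract` executable is on the PATH and that the image is already in a format Tesseract accepts.

```python
import subprocess
import tempfile
from pathlib import Path


def build_tesseract_command(image_path, output_base, exe="tesseract"):
    """Build the argument list for `tesseract <image> <output_base>`.

    Hypothetical helper for illustration; `exe` is assumed to be the
    name or path of an installed Tesseract executable.
    """
    return [exe, str(image_path), str(output_base)]


def image_file_to_string(image_path, exe="tesseract"):
    """Run Tesseract on an image file and return the recognized text."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        output_base = Path(tmp_dir) / "ocr_result"
        # Tesseract appends ".txt" to the output base name it is given.
        subprocess.run(
            build_tesseract_command(image_path, output_base, exe),
            check=True,           # raise CalledProcessError if Tesseract fails
            capture_output=True,  # keep Tesseract's console output quiet
        )
        return output_base.with_suffix(".txt").read_text(encoding="utf-8")
```

Calling `image_file_to_string("scan.png")` would then return the recognized text as a string, provided Tesseract is installed; `scan.png` is a placeholder file name.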

105 questions
2
votes
1 answer

How to hide the console window when I run tesseract with pytesseract with CREATE_NO_WINDOW

I am using tesseract to perform OCR on screengrabs. My app has a tkinter window and uses self.after in the initialization of my class to perform constant image scrapes and update labels and other values in the tkinter window. I have searched…
Kethran
  • 116
  • 1
  • 2
  • 11
2
votes
1 answer

ImportError: No module named 'tesserwrap'

Tesseract is already installed on my system. I tried installing tesserwrap but got an error. I installed Tesseract using the command: pip install tesseract. I tried installing the tesserwrap module using the command: pip install tesserwrap Collecting…
2
votes
1 answer

pytesseract error when converting image to string

I keep getting an error with the following code: import pytesseract from PIL import Image, ImageEnhance, ImageFilter im = Image.open("book.jpg") # the second one im = im.filter(ImageFilter.MedianFilter()) enhancer = ImageEnhance.Contrast(im) im =…
KSar
  • 101
  • 1
  • 2
  • 5
2
votes
1 answer

"ValueError: cannot filter palette images" during Pytesseract Conversion

Having trouble with this error code regarding the following code for Pytesseract. (Python 3.6.1, Mac OSX) import pytesseract import requests from PIL import Image from PIL import ImageFilter from io import StringIO, BytesIO def…
gmonz
  • 252
  • 1
  • 5
  • 17
2
votes
1 answer

Improving the text extraction efficiency using some OCR

I am very new to Computer Vision. I have lots of images like this: Sample image I want to extract the entire table as text. I tried pytesseract to extract text from the image. I tried the sample code as below: try: import Image except…
Henil Shah
  • 137
  • 4
  • 14
1
vote
0 answers

OpenCV and pytesseract do not correctly read simple black text on a white background

I'm having trouble reading the text correctly within this image using cv2 and pytesseract. The code I have is here: import pytesseract import cv2 image = cv2.imread(path, cv2.IMREAD_GRAYSCALE) (h, w) = image.shape[:2] img = cv2.resize(image,…
potatopainting
  • 103
  • 1
  • 9
1
vote
1 answer

Python PyTesseract Module returning gibberish from an image

I'm guessing this is because the images I have contain text on top of a picture. pytesseract.image_to_string() can usually scan the text properly but it also returns a crap ton of gibberish characters: I'm guessing it's because of the pictures…
Charlie
  • 11
  • 1
1
vote
1 answer

Improving image pre-processing for tesseract (video game screenshot)

I am trying to read text for prices in a video game and am experiencing difficulty in pre-processing the image. The rest of my code is "complete", as in after the text is extracted I am formatting it and outputting into CSV for later use. This is…
Lumelity
  • 19
  • 3
1
vote
1 answer

TSV output not supported. Tesseract >=3.05 required

I had an issue with the tesseract version. Error log: raise TSVNotSupported() pytesseract.pytesseract.TSVNotSupported: TSV output not supported. Tesseract >=3.05 required. How do I install tesseract 3.05?
Sai Krishnadas
  • 2,863
  • 9
  • 36
  • 69
1
vote
0 answers

Pytesseract: file missing when trying to use the image_to_boxes() API

[Errno 2] No such file or directory: '/tmp/tess_3gyrbu0d_out.box' I am getting this error when I try to use the image_to_boxes() API. image_to_string() works without any error. Thanks in advance!
1
vote
0 answers

How to Automate the extraction of user information from filled Bank Account Form

I'm trying to extract handwritten information from a scanned account opening form. For this I have used the Pytesseract Python library to extract the text data. But using this module I am seeing a lot of irregularities in the output, as I am getting…
1
vote
0 answers

Pytesseract not getting the text from one part of the image

I have the following image, and I want to extract the information from the table it contains. I managed to get the information from the first and third columns. However, I cannot get pytesseract to work with the second column. Here is my code: from…
fmarques
  • 391
  • 2
  • 5
  • 16
1
vote
1 answer

Using optical character recognition in python script

I'd like to accomplish the seemingly simple task of running a python script that uses OCR to give me a string of text from an image. My code: from PIL import Image from pytesseract import * image_file = 'IMG_9296' im = Image.open(image_file) text =…
curious_cosmo
  • 1,184
  • 1
  • 18
  • 36
1
vote
1 answer

Image recognition for Payment bills

I want to extract useful information from images of the bills. I have already converted images to text using OCR + pytesseract and extracting the information based on specific words like total, amount, etc. What will be the best generic approach for…
Sourabh Potnis
  • 1,431
  • 1
  • 17
  • 26
1
vote
1 answer

Python/OpenCV program for image to text converter shows WindowsError: [Error 2] The system cannot find the file specified

I've written a python/cv2 image to text converter. When starting up the program, I enter C:\Users\mikez\Pictures\examples.png when it asks for the image. It then shows the following error: Traceback: "WindowsError: [Error 2] The…