Questions tagged [python-tesseract]

Python-tesseract is a wrapper for Tesseract OCR that reads conventional image files (JPG, GIF, PNG, TIFF, etc.) and returns their text, word-level text data, or a PDF conversion.

Tesseract is advertised as the most accurate open-source OCR engine available. It was developed at HP Labs between 1985 and 1995 and then remained dormant until 2006, when Google revived the project.

For more information, please see the Python-tesseract page or the Tesseract page.
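As a rough illustration of the three outputs the tag description mentions (plain text, text data, and PDF), here is a minimal sketch. It assumes the Tesseract binary, pytesseract, and Pillow are installed; the function name `ocr_image` and its parameters are hypothetical and chosen only for this example.

```python
from pathlib import Path


def ocr_image(image_path, lang="eng", pdf_path=None):
    """Return (text, words) for one image; optionally write a searchable PDF.

    Illustrative helper, not part of pytesseract itself.
    """
    # Imported lazily so this module still loads where OCR is not set up.
    import pytesseract
    from PIL import Image

    image = Image.open(image_path)

    # Plain text of the whole image.
    text = pytesseract.image_to_string(image, lang=lang)

    # Word-level data (text, confidence, bounding boxes) as a dict of lists.
    data = pytesseract.image_to_data(
        image, lang=lang, output_type=pytesseract.Output.DICT
    )
    # Keep only non-empty entries; Tesseract emits blanks for layout blocks.
    words = [w for w in data["text"] if w.strip()]

    # Optional: searchable PDF, returned by pytesseract as bytes.
    if pdf_path is not None:
        pdf_bytes = pytesseract.image_to_pdf_or_hocr(
            image, extension="pdf", lang=lang
        )
        Path(pdf_path).write_bytes(pdf_bytes)

    return text, words
```

A call such as `ocr_image("scan.png", pdf_path="scan.pdf")` (file names are placeholders) would return the recognized text and word list and write the PDF next to it. Results depend heavily on image quality, which is why many of the questions below are about pre-processing.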

1664 questions

1 answer

Blackout number in pdf using OCR

I have a 3-page PDF containing a scanned ID card. The ID card copy can be on any page. I need to black out the ID card number (format: 12 digits and two spaces, i.e. xxxx xxxx xxxx). Please suggest how I can achieve this. I tried microsoft computer…

c# computer-vision ocr tesseract python-tesseract

asked Jun 01 '19 at 12:05

Tony

2 answers

Write image text to a new text file?

I am using tesseract for OCR on Ubuntu 18.04. I have a program that extracts the text from an image and prints it. I want the program to create a new text file and write the extracted content to it, but I am only able to…

python python-3.x python-tesseract

asked Jun 01 '19 at 06:25

Gaurav Bahadur

0 answers

Trouble pre-processing image to make text clearer in preparation for extraction

I have some images of ceramic plates. The one shown below is an example of the worst in the batch. I am having trouble preprocessing it before using tesseract on it to get the text (if it's possible at all). If someone could give…

python-3.x image-processing ocr opencv3.0 python-tesseract

asked May 31 '19 at 13:33

Olariu Lucian

1 answer

OCR with tesseract, pre-processing image

I need to extract digits from images like the one shown below. I'm using tesseract now, but it isn't working. Can anyone help me pre-process the images before feeding them to tesseract?

python python-3.x image-processing ocr python-tesseract

asked May 31 '19 at 03:08

Đức Thắng Nguyễn

1 answer

Pass a directory of pdf files for performing OCR and generate .txt files for each converted file in Python

I have a directory containing PDF files. I have written code that performs OCR when you pass a filename to an object of the wand.image class. What I want to do now is loop over the directory of PDF files and generate an OCR'd txt file…

python loops pdf file-handling python-tesseract

asked May 30 '19 at 11:45

ajai biltu

1 answer

identify clear text from image python

I used pytesseract to identify text from an image: pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe' Then I used the code below to identify the text: textImg =…

python nlp ocr python-tesseract

asked May 25 '19 at 08:55

Kaveesha Chethiyawardena

1 answer

Pytesseract behaving differently in Windows vs. Linux

I'm trying to make use of Pytesseract to do some very basic character recognition. When I run the following code in Linux, the output makes sense: import matplotlib.pyplot as plt import pandas as pd import sys import pytesseract # need to add…

python python-3.x ocr tesseract python-tesseract

asked May 23 '19 at 05:40

ollerend

1 answer

Can pytesseract use ChoiceIterator to search over multiple matches?

Can pytesseract use ChoiceIterator to search over multiple matches? It seems to me that pytesseract is only an interface to the binary. tesserocr gives access to the Tesseract API which allows the use of ChoiceIterator. Example How do I use the…

tesseract python-tesseract

asked May 21 '19 at 20:54

qwr

0 answers

Improving tesseract ocr result in french

I want to perform OCR on an image that I think is fairly clean and "easy" for OCR: But the result using tesseract is quite bad: print(pytesseract.image_to_string(Image.open('file-2.jpg'),lang='fra')) Maintenant ie La QT vieux, lorsque je parcours…

python tesseract python-tesseract

asked May 19 '19 at 19:48

Sulli

1 answer

RuntimeError: TSVNotSupported: TSV output not supported. Tesseract >= 3.05 required (Google Dataflow)

I currently want to distribute text detection on Google Dataflow over a huge dataset. I'm using the Python package for tesseract, which installs without a problem. The problem occurs when installing the tesseract-ocr package. It seems like it's…

tesseract google-cloud-dataflow python-tesseract

asked May 10 '19 at 13:34

Jacob Verschaeve

0 answers

Error importing PDF image to convert to text

I have a PDF image to convert to image format, so I am trying to read the PDF image and store the data in a text file. import pytesseract from PIL import Image img = Image.open('1.pdf') text = pytesseract.image_to_string(img) with open('1.txt',…

python-3.x python-imaging-library python-tesseract python-imageio

asked May 10 '19 at 05:11

Anuj Pratap Singh

0 answers

How do I fetch the source file from pytesseract extract

The gist is that after I extract the OCR/tesseract data from a pool of images, I run re.findall(r'example'). How would I fetch the source file that contains the word "Mountain"? It's still a bit vague on my part. Can you help out? Thanks! for index,…

python ocr python-tesseract

asked May 08 '19 at 05:56

Hanz Mendez

0 answers

Automate covering up text on image

I am wondering whether it is possible to use OCR such as pytesseract to automate covering up text on an image. I know that pytesseract provides image_to_boxes(), which returns the bounding box for each character. However, I do not want to…

python python-3.x ocr tesseract python-tesseract

asked Apr 22 '19 at 23:41

Darren Christopher

0 answers

How to get text from an image using pytesseract?

I have a scenario where I have to fetch some text from an image. But I am getting the following errors when trying to do so: runfile('/Users/vivekchowdary/Documents/untitled folder/pytesseract.py', wdir='/Users/vivekchowdary/Documents/untitled…

python-3.x anaconda python-imaging-library spyder python-tesseract

asked Apr 11 '19 at 08:01

Vivek

1 answer

Python to read text from picture giving some import package errors

Unable to read text from a picture using PIL and pytesseract import PIL from PIL import Image import pytesseract im = PIL.Image.open('C:\\Users\\Edgar.Lizarraga\\Desktop\\Kaizen-Continuous-Improvement-Model.png') x =…

python python-imaging-library python-tesseract

asked Apr 08 '19 at 20:58

edgar lenin lizarraga gastelum
