Questions tagged [python-tesseract]

Python-tesseract is a wrapper class for Tesseract OCR that reads any conventional image file (JPG, GIF, PNG, TIFF, etc.) and returns its text or text data, or converts it to PDF.

Python-tesseract is a wrapper class for Tesseract OCR that allows any conventional image file (JPG, GIF, PNG, TIFF, etc.) to be read and decoded into usable text.

Tesseract is advertised as the most accurate open source OCR engine available. It was developed at HP Labs between 1985 and 1995 and then remained dormant until 2006 when Google revived the project.

For more information, please see the Python-tesseract page or the Tesseract page.

1664 questions

3 answers

How to separate title and headers from body text in image

I am using tesseract (through the python wrapper) in order to extract text from documents. These documents do not include any images or tables, simply text. Is there any option to distinguish the titles/headings from the text? Ideally I want to be…

python opencv ocr tesseract python-tesseract

asked Jul 13 '18 at 07:46

Prikers

1 answer

Can I test tesseract ocr in windows command line?

I am new to Tesseract OCR. I tried to convert an image to TIF and run Tesseract on it from cmd in Windows to see the output, but I couldn't. Can you help me? What command should I use? Here is my sample image:

ocr tesseract python-tesseract

asked Oct 08 '14 at 07:42

Akunar

4 answers

Simple Captcha Solving

I'm trying to solve some simple captchas using OpenCV and pytesseract. Some of the captcha samples are: I tried to remove the noisy dots with some filters: import cv2 import numpy as np import pytesseract img = cv2.imread(image_path) _, img =…

captcha python-tesseract opencv python

asked Jul 17 '20 at 19:58

Mehran Torki

3 answers

How to detect subscript numbers in an image using OCR?

I am using tesseract for OCR, via the pytesseract bindings. Unfortunately, I encounter difficulties when trying to extract text including subscript-style numbers - the subscript number is interpreted as a letter instead. For example, in the basic…

python ocr tesseract python-tesseract

asked May 16 '20 at 16:30

dspencer

3 answers

What is the difference between Pytesseract and Tesserocr?

I'm using Python 3.6 on Windows 10 and already have Pytesseract installed, but I came across Tesserocr in some code, which by the way I can't install. What is the difference?

python ocr tesseract python-tesseract

asked Feb 19 '19 at 08:21

Soufiane S

3 answers

Real time OCR in python

The problem: I'm trying to capture my desktop with OpenCV and have Tesseract OCR find text and set it as a variable. For example, if I were playing a game with the capture frame over a resource amount, I want it to print that amount and use it. A…

python image ocr image-recognition python-tesseract

asked Oct 19 '18 at 20:01

Novet

2 answers

How to get the co-ordinates of the text recognized from an image using OCR in Python

I am trying to get the coordinates or positions of text characters from an image using Tesseract. I want to know the exact pixel position, so that I can click that text using some other tool. Edit : import pytesseract from pytesseract import…

python image-processing ocr tesseract python-tesseract

asked Feb 22 '18 at 13:26

Maddy
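
A hedged sketch related to the question above, not taken from its answers: pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT) returns parallel lists that include "text", "conf", "left", "top", "width" and "height" for each detected word. The helper name word_boxes, the min_conf parameter and the sample values below are hypothetical and only illustrate how pixel positions could be pulled out of that structure.

```python
def word_boxes(data, min_conf=0.0):
    """Collect the bounding box and center point of each recognized word.

    `data` is assumed to have the shape of the dict returned by
    pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT).
    """
    boxes = []
    for i, text in enumerate(data["text"]):
        # Skip structural rows (empty text) and low-confidence words.
        if not text.strip() or float(data["conf"][i]) < min_conf:
            continue
        left, top = data["left"][i], data["top"][i]
        width, height = data["width"][i], data["height"][i]
        boxes.append({
            "text": text,
            "box": (left, top, width, height),
            # The center is a convenient point for a click tool to target.
            "center": (left + width // 2, top + height // 2),
        })
    return boxes


# Hypothetical data in the image_to_data layout (no OCR is run here).
sample_data = {
    "text": ["", "Play", "Quit"],
    "conf": [-1, 96, 40],
    "left": [0, 10, 100],
    "top": [0, 20, 20],
    "width": [200, 40, 30],
    "height": [50, 12, 12],
}

for word in word_boxes(sample_data, min_conf=50):
    print(word["text"], word["center"])
```

This works at word level. For per-character boxes, pytesseract.image_to_boxes is the usual starting point, though its coordinate origin differs, so check the output before feeding it to a click tool.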

1 answer

How to increase resolution of text in scanned images in Python?

I use Tesseract OCR to extract text from scanned images. For a few images the text is not properly recognized due to low resolution, and the output is irrelevant characters. Techniques applied: Increase the dpi to 300. Image pre-processing…

python image python-tesseract

asked May 08 '20 at 09:54

Jennifer

4 answers

How to install tesseract for python on anaconda

Does anyone know how to install tesseract for python on Anaconda? I have a windows system. The anaconda website gives the installation for a linux system: conda install -c auto pytesseract Would there be any alterations required for a windows…

python anaconda python-tesseract

asked Mar 12 '18 at 11:59

VK1

5 answers

Highly inconsistent OCR result for tesseract

This is the original screenshot and I cropped the image into 4 parts and cleared the background of the image to the extent that I can possibly do but tesseract only detects the last column here and ignores the rest. The output from the tesseract…

python opencv python-tesseract pytesser

asked Sep 13 '17 at 19:32

codefreaK

1 answer

How to set init only parameters with python tesseract?

I'm trying to set some Tesseract parameters using the python-tesseract wrapper, but for Init Only parameters I'm unable to do so. I've been reading the Tesseract documentation and it seems I must use Init() to set these. This is what the…

python tesseract python-tesseract

asked Sep 11 '15 at 17:06

tiagosilva

1 answer

How to set tessedit_write_images in python-tesseract?

I'm trying to set tessedit_write_images but can't seem to do it. I can't see the tessinput.tif anywhere. I'm doing: import tesseract api =…

tesseract python-tesseract

asked Jul 22 '15 at 10:45

tiagosilva

3 answers

Cannot import name '_imaging' from 'PIL'

I'm trying to run this code: import pyautogui import time from PIL import _imaging from PIL import Image import pytesseract time.sleep(5) captura = pyautogui.screenshot() codigo = captura.crop((872, 292, 983,…

python python-imaging-library python-tesseract

asked Nov 25 '20 at 03:40

Andresnex

1 answer

Python - OCR - pytesseract for PDF

I am trying to run the following code: import cv2 import pytesseract img = cv2.imread('/Users/user1/Desktop/folder1/pdf1.pdf') text = pytesseract.image_to_string(img) print(text) which gives me the following error: Traceback (most recent call…

python python-tesseract

asked Mar 19 '20 at 10:07

adrCoder

3 answers

How to get confidence of each line using pytesseract

I have successfully set up Tesseract and can translate the images to text... text = pytesseract.image_to_string(Image.open(image)) However, I need to get the confidence value for every line. I cannot find a way to do this using pytesseract. Anyone…

python-3.x image-processing ocr tesseract python-tesseract

asked Mar 28 '19 at 21:19

buydadip
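
One possible approach to the question above, offered as a sketch rather than as the accepted answer: image_to_string returns only text, but pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT) also reports a "conf" value per word along with "page_num", "block_num", "par_num" and "line_num". Averaging word confidences per line is an assumption about what "line confidence" should mean. The helper name line_confidences and the sample values are hypothetical.

```python
from collections import defaultdict


def line_confidences(data):
    """Return (line_text, mean_word_confidence) pairs in reading order.

    `data` is assumed to have the shape of the dict returned by
    pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT).
    """
    lines = defaultdict(list)
    for i, text in enumerate(data["text"]):
        # conf may arrive as a string or a number depending on the version.
        conf = float(data["conf"][i])
        # Rows with conf == -1 describe pages, blocks or lines, not words.
        if conf < 0 or not text.strip():
            continue
        key = (data["page_num"][i], data["block_num"][i],
               data["par_num"][i], data["line_num"][i])
        lines[key].append((text, conf))

    result = []
    for key in sorted(lines):
        texts, confs = zip(*lines[key])
        result.append((" ".join(texts), sum(confs) / len(confs)))
    return result


# Hypothetical data in the image_to_data layout (no OCR is run here).
sample_data = {
    "page_num": [1, 1, 1, 1, 1],
    "block_num": [1, 1, 1, 1, 1],
    "par_num": [1, 1, 1, 1, 1],
    "line_num": [0, 1, 1, 2, 2],
    "conf": ["-1", "90", "80", 70.0, -1],
    "text": ["", "Hello", "world", "Bye", " "],
}

for text, conf in line_confidences(sample_data):
    print(f"{conf:5.1f}  {text}")
```

A plain mean treats short and long words equally. Weighting by word length or taking the minimum per line are alternatives if one weak word should flag the whole line.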
