Questions tagged [python-tesseract]

Python-tesseract (pytesseract) is a Python wrapper for the Tesseract OCR engine. It reads any conventional image file (JPG, GIF, PNG, TIFF, etc.) and returns the recognized text, structured data about that text, or a PDF version of the image.
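
The three kinds of output named in the description (text, data about the text, and PDF) correspond to the pytesseract functions `image_to_string`, `image_to_data`, and `image_to_pdf_or_hocr`. The following is a minimal sketch, assuming pytesseract and the Tesseract binary are installed. The helper name `ocr_summary` and its `engine` parameter are hypothetical, introduced only for illustration; they are not part of pytesseract.

```python
def ocr_summary(image_path, lang="eng", engine=None):
    """Return plain text, recognized words, and PDF bytes for one image.

    `engine` is a hypothetical hook for swapping in a stand-in object;
    by default the real pytesseract module is used.
    """
    if engine is None:
        # Imported here so the real dependency is only needed for real OCR runs.
        import pytesseract as engine

    # Plain text of the whole image.
    text = engine.image_to_string(image_path, lang=lang)

    # Per-element data (bounding boxes, confidences, text) as a dict of lists.
    data = engine.image_to_data(
        image_path, lang=lang, output_type=engine.Output.DICT
    )
    # Keep non-empty entries; a confidence of -1 marks non-word rows.
    words = [
        word
        for word, conf in zip(data["text"], data["conf"])
        if word.strip() and float(conf) >= 0
    ]

    # Bytes of a PDF rendering of the image.
    pdf_bytes = engine.image_to_pdf_or_hocr(image_path, lang=lang, extension="pdf")

    return {"text": text, "words": words, "pdf": pdf_bytes}
```

A call such as `ocr_summary("scan.png", lang="spa")` (the file name is a placeholder) would run the same steps with the Spanish language data, assuming that language pack is installed for Tesseract.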

Tesseract is advertised as the most accurate open source OCR engine available. It was developed at HP Labs between 1985 and 1995 and then remained dormant until 2006 when Google revived the project.

For more information, please see the Python-tesseract page or the Tesseract page.

1664 questions

1 answer

Does anyone know how Tesseract - OCR postprocessing / spellchecking works?

I was using tesseract-ocr (pytesseract) for Spanish, and it achieves very high accuracy when you set the language to Spanish and, of course, the text is in Spanish. If you do not set the language to Spanish, it does not perform as well. So, I'm…

ocr tesseract python-tesseract

asked Jan 20 '20 at 14:24

Tomas

1 answer

Embedding Python in GRPC Server

I am exploring GRPC (C++). Following their examples, I am trying to create a server which accepts an image from the client and returns the text in the image. I have Python code which accepts an image and a JSON file describing the bounding box of the…

grpc python-c-api python-tesseract

asked Jan 20 '20 at 10:23

Raki

1 answer

Getting Pytesseract Error while creating .exe file using pyinstaller

So basically I am trying to create a simple Flask app where we can use pytesseract to do OCR on an image and return the data as a string. I am also packaging the whole app into an .exe file using PyInstaller after doing the obfuscation of the…

python flask pyinstaller python-tesseract

asked Jan 20 '20 at 09:34

Akash

2 answers

How to get coordinates of characters in html document?

reference how to extract only the 369 429 301 123 value from the above code using Python?

python python-3.x web-scraping beautifulsoup python-tesseract

asked Jan 14 '20 at 17:53

CodeDecode

0 answers

Read Clipboard image into function for OCR

I have written a function that reads the clipboard image when it is captured as a screenshot and then passes the captured image data to the OCR engine. I am struggling with passing the data. The code is given below. from tkinter import messagebox from PIL…

python ocr screenshot clipboard python-tesseract

asked Jan 14 '20 at 16:47

Ambrish Dhaka

2 answers

Name error: Image to text error in python

I am working on developing code to convert an image to text using the code below. I see the error below while executing the code. I don't really understand what is causing the issue. Can anyone help me identify the issue? from PIL import…

python pandas spyder python-tesseract

asked Jan 13 '20 at 14:44

KApril

0 answers

Unable to extract text from those images

I tried to detect and extract text from the below images, but I am not able to get the header text properly. Image 1: Image 2: For those kinds of images, I am unable to detect and extract text from it. Please help me with those images. I tried the…

python-3.x python-imaging-library opencv3.0 python-tesseract

asked Jan 13 '20 at 10:10

Vijay

1 answer

pyTesseract not outputting text from image

maybe someone could help me! When I run the following code import pytesseract from pytesseract import image_to_string from PIL import Image import PIL file = Image.open('/usr/local/Cellar/tesseract/4.1.0/share/tessdata/cap.png') we_will =…

python image-processing tesseract python-tesseract

asked Dec 20 '19 at 23:45

Aaron_Loves_Python

0 answers

Install ImageMagick and Ghostscript through Python

I am very new to Python. I have an OCR program that uses Tesseract, ImageMagick and Ghostscript. I created a .exe file to give it to my team so that they can use it on their tool. The problem that I am facing is that all of them will have to…

python python-tesseract

asked Dec 19 '19 at 13:43

Vadiraj Katti

0 answers

Why doesn't my multi processor program take the whole path of my image?

I've been trying to use multiprocessing on a program that uses tesseract to extract text from images. But when I give the name to my image, it only searches for the first letter of the name of the image in the directory def tess(all_clips): …

python-3.x python-multiprocessing python-tesseract

asked Dec 17 '19 at 10:25

PRATHAMESH

0 answers

How to Improve Pytesseract Results

I'm trying to solve some semi-simple CAPTCHA codes using Python 3 on my Raspberry Pi 4. This is my current code. from PIL import Image from pytesseract import image_to_string img=Image.open('/home/pi/Desktop/Captcha Code…

python linux tesseract captcha python-tesseract

asked Dec 16 '19 at 23:49

Michael

1 answer

Unable to properly read text from image which has a Color text in python

What I have tried so far works fine for most images where the text is black and the background is white. from PIL import Image import pytesseract import nltk import cv2 imageName = "p9.png" img = cv2.imread(imageName,cv2.IMREAD_COLOR) #Open the image from…

python-3.x opencv machine-learning computer-vision python-tesseract

asked Dec 09 '19 at 07:06

Nazmul Hasan

0 answers

improper text alignment from pytesseract

I am trying to extract data from a PDF using pytesseract with the code below, but the text alignment is wrong when printing/writing the data to a doc. from PIL import Image import pytesseract import sys from pdf2image import convert_from_path import os PDF_file…

python-3.x nlp python-tesseract pypdf

asked Dec 08 '19 at 19:25

Ashu

1 answer

Solved: Python multiprocessing imap BrokenPipeError: [Errno 32] Broken pipe pdftoppm

Let me first say that this is not a duplicate of the other similar questions, where people tend to manage more closely the pool of workers. I have been struggling with the following exception thrown by my code when using multiprocessing.Pool.imap: …

python python-multiprocessing python-tesseract pdftoppm

asked Dec 05 '19 at 21:07

KantAndr1804

0 answers

Tesseract-OCR / Pillow not working with Pycharm

I have installed Tesseract-OCR using the provided installer and added it to the path. I have also installed Pillow using pip through CMD. But when I attempt to import them in PyCharm, it says that the modules do not exist. When I type 'tesseract'…

python pycharm python-tesseract

asked Nov 25 '19 at 20:18

BradenD
