Questions tagged [python-tesseract]

Python-tesseract is a Python wrapper for Tesseract OCR that reads any conventional image file (JPG, GIF, PNG, TIFF, etc.) and lets you extract its text, get data about that text, or even convert it to PDF.

Python-tesseract is a wrapper class for Tesseract OCR that allows any conventional image file (JPG, GIF, PNG, TIFF, etc.) to be read and decoded into usable text.
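As a rough illustration of the description above, here is a minimal sketch of reading text from an image. It assumes the `pytesseract` and `Pillow` packages are installed and that the Tesseract binary is on your PATH. The helper names `build_config` and `read_text` are hypothetical and made up for this example; only `pytesseract.image_to_string` and `PIL.Image.open` are library calls.

```python
def build_config(psm=None, whitelist=None):
    """Build a Tesseract option string for pytesseract's `config` argument.

    Hypothetical helper for this sketch; it only joins command-line options.
    """
    parts = []
    if psm is not None:
        # Page segmentation mode, e.g. 6 = assume a single uniform block of text
        parts.append(f"--psm {psm}")
    if whitelist:
        # Restrict recognition to the given characters
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def read_text(image_path, lang="eng", psm=None, whitelist=None):
    """Return the text Tesseract finds in the image at `image_path`."""
    # Imported lazily so build_config stays usable without the OCR dependencies
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(
            img, lang=lang, config=build_config(psm, whitelist)
        )
```

For example, `read_text("receipt.png", psm=6)` would return the recognized text as a string, where `receipt.png` is a placeholder file name. Results depend heavily on image quality and on the page segmentation mode you choose.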

Tesseract is advertised as the most accurate open source OCR engine available. It was developed at HP Labs between 1985 and 1995 and then remained dormant until 2006 when Google revived the project.

For more information, please see the Python-tesseract page or the Tesseract page.

1664 questions

votes

6 answers

Pyinstaller and Tesseract OCR

I am using Tesseract OCR for my program and I am going to convert it into a single .exe file using pyinstaller. The problem is that in order for Tesseract to work, I need to reference the path to the program installed on my computer, like this:…

python ocr pyinstaller tesseract python-tesseract

asked Jan 20 '20 at 19:05

Mirrah

votes

1 answer

Tesseract OCR image recognition failed because of `Warning: Invalid resolution` error

I tried to detect text from an image where I draw bounding boxes around select characters and stitch them together to form another image, as below: I used cv2 to draw bounding boxes around the characters using the following code: cnts =…

python opencv ocr tesseract python-tesseract

asked Jan 17 '20 at 09:38

Mayank

votes

1 answer

python pytesseract.image_to_string can't read text in image

I am using Python 3.7 and Tesseract-OCR version 5 on my Windows 10 box. I have pictures containing numbers. However, even though they are perfectly clear to the human eye, Tesseract can't extract them correctly. Some give me a couple of correct…

python image image-processing ocr python-tesseract

asked Dec 27 '19 at 05:24

Difan Zhao

votes

1 answer

What causes pytesseract to read either the top or bottom text-line of a dual-line image depending on whether opencv or pillow is used?

EDIT: I forgot to process the image, which solves the reading issue, thanks to Nathancy. Still wondering what makes Tesseract read only the top OR the bottom line of an unprocessed image (same image, two different outcomes). Original: I have an image…

python opencv python-imaging-library ocr python-tesseract

asked Nov 12 '19 at 17:51

non-english-programmer

votes

2 answers

OCR on floorplan screenshots with pytesseract and OpenCV

I am trying to write a function that will take a JPG of a floorplan of a house and use OCR to extract the square footage that is written somewhere on the image. import requests from PIL import Image import pytesseract import pandas as…

python opencv ocr tesseract python-tesseract

asked Sep 20 '19 at 14:58

Harvs

votes

3 answers

How to extract decimal in image with Pytesseract

Above is the image. I have tried everything I could find on SO or Google, but nothing seems to work. I cannot get the exact value in the image: I should get 2.10, but instead I always get 210. And it is not limited to this image only; any image which has…

python opencv image-processing computer-vision python-tesseract

asked Aug 13 '19 at 14:59

PankajKushwaha

votes

0 answers

Retaining tabular structure after extracting data using OCR Pytesseract

I am using Pytesseract OCR to extract data from an image which has tabular data. I am extracting it to a text file and I wish to store it in an Excel sheet. I couldn't store it directly in an Excel sheet. But the problem I am encountering is that…

python python-3.x dataframe python-tesseract

asked Jan 28 '19 at 09:29

developer

votes

1 answer

Pytesseract does not recognize when it's just a letter

I need to recognize only one letter, but OCR does not recognize it when the image contains just a single letter. In this case I am trying to recognize the letter H, but nothing shows up. What can I do to make it work? from PIL import Image from pytesseract import * import…

python opencv tesseract python-tesseract

asked Jan 09 '19 at 22:54

Bernardo Martins

votes

3 answers

Why does pytesseract fail to recognize digits in this simple image?

I'm trying to use pytesseract to recognize two numbers from an image. I have tried --psm 6 up to 10. I have tried -c tessedit_char_whitelist=0123456789. None of the above returns the number 49. The closest I got was 4 without the 9. Do you have any…

python ocr tesseract python-tesseract

asked Jan 01 '19 at 20:26

Povilas

votes

0 answers

PyTesseract incredibly slow to process single image

I have Tesseract running in Python via pytesseract. Using an image of a newspaper article which happens to contain around 600 words, the pytesseract.image_to_string function takes around 20 seconds to complete. The eventual results are great, but…

python tesseract python-tesseract

asked Dec 08 '18 at 12:42

user3795126

votes

1 answer

Reading low resolution image with pytesseract

I'm trying to read some stats off manually cropped sections of tables in PDF files. Here is the image I'm trying to process. The current result I get has most of the numbers but not all of the text, as seen below: Hmuwinu'fg. cm’:…

python image-processing ocr python-tesseract

asked Nov 23 '18 at 23:39

AndreK

votes

3 answers

How to find, rotate and crop a section of text in OpenCV, Python

I'm struggling with a project that takes an image of a fairly clear font (from a label, for example), reads the "text region", and outputs it as a string using an OCR engine such as Tesseract. I've made quite some progress with it, as I added…

python opencv text raspberry-pi python-tesseract

asked Aug 06 '18 at 02:11

MikeLemo

votes

2 answers

TesseractNotFoundError: tesseract is not installed or it's not in your path

I am trying to use tesseract-OCR to print text from the image. But I am getting the above error. I have installed tesseract OCR using https://github.com/UB-Mannheim/tesseract/wiki and pytesseract in the anaconda prompt using pip install pytesseract…

python python-3.x image-processing python-tesseract

asked Aug 03 '18 at 17:11

manpreet singh

votes

0 answers

Use Pytesseract to Extract Text into Table Arrays Given the Coordinates of the Table Structure

I want to extract text from a scanned table with Tesseract and put it into arrays that have the same structure as the table. I already used OpenCV to detect the table structure, and obtained the coordinates of the table joints as well as the…

opencv ocr tesseract text-mining python-tesseract

asked Jul 17 '18 at 15:15

Bec Zhao

votes

0 answers

Tesseract Unknown Font training

I have been trying to train Tesseract 4.0 to recognise a "customized" font from an engineering blueprint. I've followed the necessary steps using Training Tesseract from here…

fonts ocr cad python-tesseract

asked Jun 22 '18 at 12:16

Ankita
