Questions tagged [python-tesseract]

Python-tesseract is a Python wrapper for Tesseract OCR that can read conventional image files (JPG, GIF, PNG, TIFF, etc.) and return their text, data about that text, or a PDF conversion.

Tesseract is advertised as the most accurate open source OCR engine available. It was developed at HP Labs between 1985 and 1995 and then remained dormant until 2006 when Google revived the project.

For more information, please see the Python-tesseract page or the Tesseract page.
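
As a minimal sketch of the three kinds of output described above (plain text, data about the text, and PDF), assuming pytesseract, Pillow, and the Tesseract binary are installed. The function name `ocr_outputs` is hypothetical and only used for illustration:

```python
def ocr_outputs(image_path):
    """Return plain text, per-word data, and PDF bytes for one image file."""
    # Imported lazily so this module still loads where pytesseract is absent.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        # Recognized text as a single string.
        text = pytesseract.image_to_string(img)
        # Per-word boxes and confidences as a dict of lists.
        data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)
        # Searchable PDF as raw bytes.
        pdf_bytes = pytesseract.image_to_pdf_or_hocr(img, extension="pdf")
    return text, data, pdf_bytes
```

The PDF bytes can then be written to disk with an ordinary binary file write.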

1664 questions

1 answer

pytesseract not identifying digits properly and detecting slashed 0 as 8

Pytesseract is unable to identify the proper characters, and it also predicts the slashed zero wrongly. Here is my image: from PIL import Image import pytesseract import cv2 import numpy as np img = cv2.imread('dilation_1_0.png') #dilation_1.png…

python-3.x python-tesseract

asked Mar 03 '20 at 12:45

Sidey1238

1 answer

Why is pytesseract not identifying this image?

I am trying to identify single digits in python with tesseract. My code is this: import numpy as np from PIL import Image from PIL import ImageOps import pytesseract import cv2 def predict(imageArray): pytesseract.pytesseract.tesseract_cmd =…

ocr tesseract python-tesseract

asked Feb 29 '20 at 16:39

stgy222
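
For single-digit recognition, one commonly tried configuration is page segmentation mode 10 (single character) together with a digit whitelist. The sketch below only illustrates that idea and is not the asker's code. The helper names `single_char_config` and `predict_digit` are hypothetical, and whether the whitelist is honoured may depend on the Tesseract version and engine mode:

```python
def single_char_config(whitelist="0123456789", psm=10):
    """Build a Tesseract config string for one isolated character."""
    # --psm 10 treats the image as a single character;
    # tessedit_char_whitelist restricts the candidate characters.
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def predict_digit(image, config=None):
    """Run OCR on a PIL image or NumPy array and return the stripped result."""
    # Lazy import so the config helper is usable without pytesseract.
    import pytesseract

    raw = pytesseract.image_to_string(image, config=config or single_char_config())
    return raw.strip()
```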

1 answer

Error while performing OCR using pytesseract

I want to use pytesseract. This is my code: import pytesseract from pdf2image import convert_from_path PDF_file = 'file.pdf' text = '' pages = convert_from_path(PDF_file, 500) pageText = str(((pytesseract.image_to_string(pages[0])))) and at…

python python-3.x ocr python-tesseract

asked Feb 27 '20 at 14:20

Robert Aydinyan

1 answer

Read △ as minus

△ means minus ('-') as a business rule. How can I read the following images as expected? Input image 1 (expected value is -74,523) Input image 2 (expected value is -1,794,306) Actual result $ tesseract 1.png stdout -l eng --psm 4 £74 523 $…

opencv ocr tesseract python-tesseract

asked Feb 25 '20 at 15:08

zono
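
One possible workaround, sketched here purely as an assumption and not as a known answer to this question, is to post-process the OCR string and treat a leading marker glyph as a minus sign. The function name `parse_signed_amount` and the marker set are hypothetical. "£" is included only because the excerpt shows Tesseract reading the triangle that way:

```python
import re


def parse_signed_amount(ocr_text, minus_markers=("△", "▲", "£")):
    """Convert OCR output such as '£74 523' to a signed integer."""
    stripped = ocr_text.strip()
    # A leading marker glyph is interpreted as a minus sign (assumption).
    negative = stripped.startswith(tuple(minus_markers))
    # Drop everything that is not a digit: markers, spaces, commas.
    digits = re.sub(r"\D", "", stripped)
    if not digits:
        raise ValueError(f"no digits found in {ocr_text!r}")
    value = int(digits)
    return -value if negative else value
```

A rule like this would misread genuine currency amounts, so the marker set has to match the documents being processed.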

0 answers

Text annotation replacement for Google's Cloud Vision in pytesseract or Microsoft's Cognitive Services

I need an alternative to Google's Cloud Vision code below: client = vision.ImageAnnotatorClient() with io.open(tempFile, 'rb') as image_file: content = image_file.read() image =…

computer-vision ocr azure-cognitive-services python-tesseract google-cloud-vision

asked Feb 25 '20 at 13:39

Santhosh Arockiaxavier

1 answer

OCR using Tesseract simple task failing

I'm doing text recognition in scanned text pages and recently started trying Tesseract. I realize it sometimes struggles with some tasks so I created a region of interest in a field where I will have none to two characters to recognize, like so: I…

ocr tesseract python-tesseract

asked Feb 22 '20 at 19:55

André Fazendeiro

1 answer

Pytesseract is failing with PermissionError: [WinError 5] Access is denied due to undeletable file

I installed the 64-bit version from https://github.com/UB-Mannheim/tesseract/wiki and then ran pip install pytesseract. cv2 didn't cause any issues. My code: import cv2 import pytesseract pytesseract.pytesseract.tesseract_cmd=r"C:\Program…

python computer-vision tesseract python-tesseract text-recognition

asked Feb 21 '20 at 17:01

Nicolai Sergeev

0 answers

Limit the number of characters detected in pytesseract

I have a captcha image that I'm trying to decode using pytesseract. I've done all the preprocessing using OpenCV, and my current image is - Now when I use pytesseract it gives me this output: print(pytesseract.image_to_string(image, config…

python ocr tesseract captcha python-tesseract

asked Feb 19 '20 at 10:34

niel_99

1 answer

How do I pass a RegEx pattern to Pytesseract?

There seem to be two ways to go about this, and neither seems to work. First, you can pass tessedit_char_whitelist, but that seems to work only with characters, not patterns: import pytesseract pytesseract.pytesseract.tesseract_cmd =…

python ocr tesseract python-tesseract

asked Feb 18 '20 at 14:32

Nicolas Gervais
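
Since tessedit_char_whitelist restricts characters rather than patterns, one fallback is to apply the regular expression to the OCR output afterwards. This is only a sketch of that post-filtering idea, not a way of passing a pattern to Tesseract itself. The function name `filter_by_pattern` and the sample pattern are hypothetical:

```python
import re


def filter_by_pattern(ocr_text, pattern):
    """Return every substring of the OCR output that matches the regex."""
    # finditer + group(0) returns whole matches even if the pattern has groups.
    return [match.group(0) for match in re.finditer(pattern, ocr_text)]


# Example: keep only tokens shaped like two capitals, a dash, four digits.
sample = "ID AB-1234 and xx CD-5678"
matches = filter_by_pattern(sample, r"[A-Z]{2}-\d{4}")
```

In practice `ocr_text` would come from `pytesseract.image_to_string`, optionally run with a character whitelist first to reduce noise.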

1 answer

Is there a way to fix permission denied error with pytesseract and python?

I'm trying to create a client/server program in Python that sends recognized text from a picture, and useful information about it, to a client which will then display it on an OLED display. But the problem comes on the server side of the program when…

python windows opencv ocr python-tesseract

asked Feb 17 '20 at 02:16

Bradley C

2 answers

How can I extract names and handwritten numbers from images (or pdf files) in python?

I want to build a project that, given a PDF file, extracts printed names and handwritten numbers from it and then puts them in a CSV file (Excel file). Please note that the PDF files have a table in which we find names in a column and…

python opencv tensorflow ocr python-tesseract

asked Feb 12 '20 at 12:17

user11874369

0 answers

Can we extend the tesseract-ocr library, as it is open source?

I am looking for additional functionality that seems to be unavailable in the current version. Is it possible for the developer community to add functionality or modify the existing libraries?

image-processing ocr tesseract python-tesseract

asked Feb 11 '20 at 12:16

rahul naidu

1 answer

How to convert C++ tesseract-ocr code to Python?

I want to convert the C++ version of the result iterator example in the tesseract-ocr docs to Python. Pix *image = pixRead("/usr/src/tesseract/testing/phototest.tif"); tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI(); api->Init(NULL, "eng"); …

python c++ tesseract python-tesseract

asked Feb 11 '20 at 10:28

iMath

1 answer

ImportError: No module named pytesseract on Jupyter Lab and VSCode but not on my local machine

I have tried running a ProcessImage.py file, in which I import the package pytesseract, in Jupyter Lab and VSCode. This is the error that pops up: import pytesseract ImportError: No module named pytesseract I already know that pytesseract is…

import importerror python-tesseract

asked Feb 07 '20 at 17:09

thmo
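
An import that works in one environment but not another often means the tools are running different interpreters. As a minimal diagnostic sketch (the helper name `describe_module` is hypothetical), the following can be run in each environment to compare which Python is active and whether it can locate the package:

```python
import importlib.util
import sys


def describe_module(name):
    """Report the running interpreter and whether it can find a module."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,              # path of the active Python
        "found": spec is not None,                  # True if importable here
        "location": getattr(spec, "origin", None),  # file the import resolves to
    }


print(describe_module("pytesseract"))
```

If the interpreter paths differ between the terminal, Jupyter Lab, and VSCode, installing the package with that specific interpreter (`python -m pip install pytesseract`) is one thing to try.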

1 answer

OCR detecting E as £

I am using pytesseract (version 5 of tesseract) to scan an image. I have changed the image to black and white to remove the noise, but E is still being detected as £196893 . I also tried setting the language, dpi and psm values which have been suggested by…

ocr tesseract python-tesseract

asked Feb 02 '20 at 17:35

Sandeep Bhutani
