Questions tagged [python-tesseract]

Python-tesseract is a wrapper for Tesseract OCR that reads any conventional image file (JPG, GIF, PNG, TIFF, etc.) and returns its text or text data, or even converts it to PDF.

Python-tesseract is a wrapper class for OCR that allows any conventional image file (JPG, GIF, PNG, TIFF, etc.) to be read and decoded into usable text.
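
As a rough sketch of the outputs the description mentions (plain text, text data, and PDF), the snippet below assumes the `pytesseract` and Pillow packages and a Tesseract binary are installed. The helper name `ocr_outputs` and any image path passed to it are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def ocr_outputs(image_path):
    """Return (text, word-level data, PDF bytes) for one image file.

    `ocr_outputs` is a hypothetical helper name, not part of pytesseract.
    """
    # Imported lazily so this module still loads where the wrapper is missing.
    import pytesseract
    from PIL import Image

    # Image.open raises FileNotFoundError if the path does not exist.
    image = Image.open(Path(image_path))

    # Plain text recognized in the image.
    text = pytesseract.image_to_string(image)

    # Per-word boxes, confidences, and text, returned as a dict of lists.
    data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)

    # Searchable PDF, returned as raw bytes.
    pdf_bytes = pytesseract.image_to_pdf_or_hocr(image, extension="pdf")

    return text, data, pdf_bytes
```

The PDF bytes can be written to disk with `Path("out.pdf").write_bytes(pdf_bytes)`. If Tesseract is not on `PATH`, pytesseract cannot find the binary unless `pytesseract.pytesseract.tesseract_cmd` is set to its location.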

Tesseract is advertised as the most accurate open-source OCR engine available. It was developed at HP Labs between 1985 and 1995 and then remained dormant until 2006, when Google revived the project.

For more information, please see the Python-tesseract page or the Tesseract page.

1664 questions
0
votes
0 answers

Why does Tesseract not load in Python?

I wanted to use Tesseract, so I installed both pytesseract and Tesseract OCR and set up the path. Why, when I try to use Tesseract, do I get an error message which says: 1, 'Error opening data file C:\\Program Files…
Nisox
0
votes
1 answer

Pytesseract fails to find any text, but only on some files

I have the following code, and the problem is that on some images the return value is empty. The structure of the images is always the same: plain black text on a white background, clearly readable. 50% of the results are excellent and other…
Rune
0
votes
0 answers

Pytesseract doesn't recognize a very clear image

I have applied pytesseract to three similar images of the digit "2". Only in the last one does pytesseract recognize the digit correctly. The three images have different dimensions, and if I change the dimensions of the images in the right way, pytesseract…
0
votes
2 answers

pytesseract Failed loading language \'eng\'

I've seen a lot of other people getting this error, and I've tried a lot of different things to fix it. Nothing so far has worked. I have: added the path to my Tesseract-OCR folder AND the tesseract.exe file to PATH; added an environment variable…
Sophie Snoww
0
votes
3 answers

pytesseract installed but missing?

x64, Win 10, Anaconda Python 2.7 I'm trying to do some OCR from captured video frames using OpenCV & pytesseract, my code... import numpy as np import cv2 from PIL import ImageGrab import pytesseract cap = cv2.VideoCapture(0) while True: #…
DrBwts
0
votes
0 answers

OCR all .tif images in directory

I have Python code which runs OCR on one TIFF file and prints the result in the Python window. I have many more TIFF files in a directory, and it would take hours to OCR all the images one by one using my code. Since I'm a beginner, I'm getting error…
Sarath SRK
0
votes
1 answer

How to get the intensity (darkness) of all the text in an image to one level?

I have used pytesseract and OpenCV to read text from an image. I used median blur, normalization, and thresholding to remove the background and was able to read the text. However, some parts of the text have turned too light during the process of…
developer
0
votes
1 answer

Python Real Time OCR With OpenCV and pytesseract

I am just starting out with Python and I am attempting to write code that does real-time OCR on a portion of my screen. I was certain this code would work, but it just throws a bunch of Tesseract errors at me. Does the image need to be saved for…
0
votes
1 answer

PyCharm not finding common modules

I'm unable to import modules in PyCharm; however, I am easily able to do it via cmd (after I've typed python) and via the Python console in PyCharm as well. I am using Python 2.7 and python.exe is in my PATH. In PyCharm when I go into setting…
user10739071
0
votes
0 answers

Extract date from Image and filter dataframe with date

I'm working on image recognition. There is a lot of video taken from security cameras, and I need to identify someone on camera. I'm trying to create the training set right now. For this purpose, I've divided the videos into frames. Then I pulled…
0
votes
1 answer

Why is Python tesserocr not using 4 CPU cores on AWS Batch?

I'm trying to get the tesserocr Python library to run on 4 cores. According to the Tesseract docs, I understand it supports up to 4 cores. I have a tesserocr Python 3.x job running inside AWS Batch (Docker container based on the amazonlinux:latest image) on a…
Sagi Mann
0
votes
1 answer

Tesseract fails to parse text from image

I'm completely new to OpenCV and Tesseract. I spent all day trying to write code that would parse the game duration from images like this one: original image (game duration is in the top left corner). I arrived at code that manages to recognize the duration…
stekatk
0
votes
0 answers

How to make Tesseract OCR distinguish letters from numbers

I have a picture of a plate with the text "991AAA". I binarized it with OpenCV, found the necessary contours, and gave them to Tesseract OCR. But it reads this as "ЧЧЛААА" (rus lang, I guess '9' kinda looks like 'Ч' but no). Is there problem with tesseract…
user8638529
0
votes
1 answer

Concat contours from image

I am making a plate recognition app. I managed to find the contours of each element of the plate. How can I cut them out, resize, and combine them so Tesseract OCR can do its job and recognize the text from the plate? To make it more clear I attach pictures from…
user8638529
0
votes
0 answers

Jupyter Notebook object has no attribute

I get an error converting images into text when working in Jupyter Notebook on Ubuntu, but on Windows I successfully retrieve text from images. Please see the image.
CVK