Questions tagged [python-tesseract]

Python-tesseract is a wrapper for Tesseract OCR that reads conventional image files (JPG, GIF, PNG, TIFF, etc.) and returns their text, data about that text, or a converted PDF.

Python-tesseract is a wrapper class for OCR that allows any conventional image file (JPG, GIF, PNG, TIFF, etc.) to be read and decoded into usable text.

Tesseract is advertised as the most accurate open source OCR engine available. It was developed at HP Labs between 1985 and 1994, was released as open source by HP in 2005, and saw little development until Google began sponsoring the project in 2006.

For more information, please see the Python-tesseract page or the Tesseract page.
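
As a minimal sketch of the wrapper in use, the snippet below assumes the `pytesseract` and `Pillow` packages and a Tesseract binary are installed. The helper names `build_config` and `ocr_image` and the file name `sample.png` are hypothetical and only serve the illustration.

```python
def build_config(psm=None, oem=None, whitelist=None):
    """Assemble a Tesseract option string (hypothetical helper)."""
    parts = []
    if oem is not None:
        parts.append(f"--oem {oem}")  # OCR engine mode
    if psm is not None:
        parts.append(f"--psm {psm}")  # page segmentation mode
    if whitelist:
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def ocr_image(path, lang="eng", psm=3):
    """Return the text Tesseract finds in the image at `path`."""
    # Imported lazily so build_config works without the OCR stack installed.
    from PIL import Image
    import pytesseract

    with Image.open(path) as image:
        return pytesseract.image_to_string(
            image, lang=lang, config=build_config(psm=psm)
        )


# Example call (needs a real image file and an installed Tesseract):
# print(ocr_image("sample.png"))
```

The same wrapper also exposes `pytesseract.image_to_data` for per-word boxes and confidences and `pytesseract.image_to_pdf_or_hocr` for PDF output. A non-English script generally needs the matching traineddata installed and selected, for example `lang="rus"`.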

1664 questions
0
votes
1 answer

How to use OpenCV to detect contour in reversed color (0 and 255) image?

I have a few images with both black and white text on the same page. Sample-input What I'm trying to do is find the contour of each word in this image so that it looks like this. Sample-output *sample-output is just an illustration since I do it inside…
0
votes
0 answers

Improve Tesseract detection on grayscale image

I have an issue with text recognition on some files with low contrast. I'm using PYTESSERACT and some files, like this one, return absolutely nothing: https://i.stack.imgur.com/DbgQm.png I use the LineBoxBuilder from PyTesseract. Before that, I…
Nathan Cheval
0
votes
1 answer

Image to text conversion python

I am trying to extract only the highlighted text from an image using the pytesseract module in Python. The issue is that I am unable to extract the highlighted part and the whole image is getting converted to text, and I have no idea how to extract specific…
0
votes
1 answer

detection of vertical texts (container BIC codes) with tesseract OCR fails

I'm trying to use the Tesseract open source OCR engine to detect the text of intermodal (shipping) container codes in BIC format. BTW, I'm using tesseract through pytesseract and I preprocess the input photos with a few standard OpenCV filters (huge…
Giorgio Robino
0
votes
1 answer

Pytesseract not recognizing text as expected

I am doing a project in which I want to use OCR to read the text from a picture. I am using tesseract for OCR, and to get better results I added image enhancement code. But the OCR results are average before image processing, and after preprocessing there is no…
0
votes
2 answers

Python Tesseract cyrillic characters problem

I am trying to create a script that will highlight specific words inside images using tesseract. My approach works fine for most languages except languages with Cyrillic characters like Russian or Greek. For example, using this image, when I…
Alxdey
0
votes
1 answer

Difficulty reading text with pytesseract

I need to read the highest temperature on thermographic images, as shown below: IR_1544_INFRA.jpg IR_1546_INFRA.jpg IR_1560_INFRA.jpg IR_1564_INFRA.jpg I used the following code, this was the best result. I also tried several other ways, such as:…
0
votes
1 answer

I am using docker for flask and pytesseract container is running but cannot access the page on browser

Using this for DockerFile, on running with docker run -p 5000:5000 flask_app:1.0 It runs but browser is showing 127.0.0.1 refused to connect. RUN apt-get update \ && apt-get install tesseract-ocr -y \ python3 \ #python-setuptools \ …
White_Wolf
0
votes
1 answer

Text Detection of Labels using PyTesseract

A label detection tool that automatically identifies and alphabetically sorts the images based on equipment number (19-V1083AI). I used the pytesseract library to convert the image to a string after the contours of the equipment label were…
0
votes
1 answer

Screenshot desktop, capture words, get coordinates of words and click on them

I want to take a screenshot of my desktop, interpret the characters on my desktop, group words and then get the coordinates of these words so I can click on them. imageName = "images/desktop.png" image = cv2.imread(imageName) # Grab image…
Otto
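
One way to get from recognized words to clickable coordinates is sketched below. It assumes the input is a dict shaped like the one returned by `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)`. The function name `word_centers` and the confidence threshold of 60 are hypothetical choices, not part of the question.

```python
def word_centers(data, min_conf=60.0):
    """Return (word, x, y) tuples for confidently recognized words.

    `data` is expected to hold parallel lists under the keys
    "text", "conf", "left", "top", "width" and "height".
    """
    centers = []
    for i, text in enumerate(data["text"]):
        word = text.strip()
        # Rows without a word carry a confidence of -1; skip them
        # together with low-confidence guesses.
        if not word or float(data["conf"][i]) < min_conf:
            continue
        x = data["left"][i] + data["width"][i] // 2
        y = data["top"][i] + data["height"][i] // 2
        centers.append((word, x, y))
    return centers
```

The returned centers are in the pixel coordinates of the image that was passed to Tesseract. If the screenshot was resized before OCR, they would need to be scaled back before being handed to a mouse automation call such as `pyautogui.click(x, y)`.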
0
votes
1 answer

Inconsistent Pytesseract

I have a directory full of images and want to extract the value from part of it. I won't bother you with the efforts to extract the exact position of the text from the original image. It's just a convolve function. Here's an example of it…
Dr Xorile
0
votes
1 answer

Tesseract not detecting any text on RGB images on Python

Hey I started working with Tesseract OCR but I'm having problems getting the text from really simple RGB images. It works just fine with text2image images. Here is my code: from PIL import Image import pytesseract import argparse import cv2 import…
yarin Cohen
0
votes
1 answer

How do you convert an image into a number in python using pytesseract

I have been trying to convert an image into a string/integer using pytesseract. The only problem is every time I run the code nothing happens. I changed the image into a text image reading "TEXT" and pytesseract detected it fine. Here is what I was…
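
A common starting point for digit-only images is sketched below, under assumptions: the image is taken to hold a single line of digits, `DIGIT_CONFIG` would be passed as the `config` argument of `pytesseract.image_to_string`, and the name `parse_int` is hypothetical. The character whitelist may be ignored by some Tesseract 4.0 builds using the LSTM engine, so the parsing step does not rely on it.

```python
import re

# Treat the image as one text line and restrict output to digits.
DIGIT_CONFIG = "--psm 7 -c tessedit_char_whitelist=0123456789"


def parse_int(ocr_text):
    """Return the first run of digits in OCR output as an int, or None."""
    match = re.search(r"\d+", ocr_text)
    return int(match.group()) if match else None


# Example call (needs pytesseract, Pillow and an installed Tesseract):
# text = pytesseract.image_to_string(image, config=DIGIT_CONFIG)
# value = parse_int(text)
```

Returning `None` instead of raising makes an empty OCR result easy to tell apart from a recognized zero.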
0
votes
1 answer

pytesseract for low resolution img

Disclaimer: I'm a total newbie in OCR and am looking for a way to convert a low-resolution image to text. I've tried pytesseract with different configs, but it still fails to convert the image to text. As far as I understand, I need some kind of…
DobbyBobby
0
votes
2 answers

Is reading the text from this type of image doable? If so, how would I approach doing it?

I think that most OCR tools are used for reading documents. I'm trying to make a program that reads the post-result screen from a game. I was wondering if it's possible using some sort of workaround (I'm new to OCR tools). An example of the image. A…