Questions tagged [tesseract]

Tesseract is an OCR (Optical Character Recognition) engine originally developed at HP Labs and now available as an open source library with development sponsored by Google.

Tesseract is an open-source, multilingual OCR (Optical Character Recognition) engine originally developed at HP Labs. It is now sponsored by Google and licensed under the Apache License 2.0. It currently recognizes 107 languages. Tesseract is primarily written in C++ and C. The project is hosted at https://github.com/tesseract-ocr/tesseract and its support forum is at http://groups.google.com/group/tesseract-ocr.

4350 questions
1
vote
1 answer

IronOCR / Tesseract OCR recognize single digit

I want to use IronOCR to recognize single digits from a screenshot. The problem is that the result of .Read() always ends up as an empty string (""). This is my code: var bmpScreenshot = new Bitmap(105, 25, PixelFormat.Format32bppRgb); …
John Smith
1
vote
1 answer

Why is there a permission denied error while using node-tesseract-ocr?

I am using the node-tesseract-ocr library to perform OCR in my Node.js project. I installed tesseract-ocr on my machine (Windows) using choco, and then node-tesseract-ocr using npm. When I request that particular route, I get the following…
Areeba Akhtar
1
vote
0 answers

Setuptools: include the tesserocr package and custom traineddata in my PyPI package

I'm currently developing a pip package that relies on the tesserocr package. I have my own custom traineddata included in my PyPI package, but when I go to create the PyTessBaseAPI for the package demo, I'm not sure how to set the path. Here's…
1
vote
0 answers

Swift: using Tesseract in a broadcast extension exceeds the 50 MB memory limit

I want to use a Broadcast extension and SwiftyTesseract to recognize text on screen, but loading tessdata uses more than 50 MB of memory and leads to a crash. Is there any solution that allows using more than 50 MB of memory, or that lets the main app handle this in the background? func loadOCR(){ …
1
vote
0 answers

Extracting contour bounding boxes for ROIs from an image using OpenCV

I am trying to extract bounding boxes from this form image. The bounding boxes in my case are all the boxes in the image. My approach was to find contours, obtain the bounding boxes, extract the ROIs, and perform OCR using pytesseract on those ROIs. I…
Chaitanya
1
vote
0 answers

tesseract fails to form shapetable

I am attempting to extract OCR data from a 3-digit counter within a video via Tesseract 4.1.1 on Kubuntu 21.04 (full Tesseract version string below). I am failing to add characters during the shapetable phase, and no other troubleshooting has worked…
person
1
vote
1 answer

How to edit the image so Tesseract OCR recognizes it? (Python)

Here is the image I am trying to get Tesseract to detect: So far, I have tried greyscaling, inverting, blurring, and thresholding, and there is still no recognition. Is there something I'm missing that prevents it from recognizing or is there something (or any…
Alex Besse
1
vote
1 answer

Omit Leptonica Library from Tesseract

I am working on an OCR project and trying to figure out whether there is any way to omit the Leptonica library from Tesseract and perhaps replace it with OpenCV. I already have OpenCV on the C layer and am planning to combine it with Tesseract.
Alma
1
vote
0 answers

R package "tesseract" OCR not recognizing values in an image

I'm using the image OCR from the tesseract package to try to extract a value from an image: Here's my code for trying to extract "2" using OCR: library(tesseract) library(magick) library(dplyr) crop <- image_read("2.jpeg") text <- crop %>% …
millie0725
1
vote
0 answers

How to use blobs with tesseract.js?

I am learning tesseract.js! What I am doing is streaming video from a device and displaying it. When I click a button, it runs a function that draws an image from the video and displays that. It also saves the image as a blob to the img variable. I…
1
vote
0 answers

icu4c unicharset_extractor error Tesseract font training

I'm trying to train a new font using Tesseract and I'm running into issues. I searched online and could not find anything related to this issue other than those for node reinstallation. The issue I encounter has to do with not having…
ken4ward
1
vote
0 answers

How to install Tesseract and Leptonica with header files on Windows

I have a cross-platform application that requires Tesseract and Leptonica to work. Building it on Linux was a piece of cake, but Windows seems to be far more difficult. The problem is that I need both the DLLs and the header files. When I…
Damir Porobic
1
vote
1 answer

Tesseract is not installed in the default location

I have installed Tesseract (built from source) on my RHEL machine as a specific user without root access, as below: /> ./autogen.sh /> ./configure --prefix=$HOME/local/ /> make /> make install When I try to check whether it is installed with…
hazal
1
vote
1 answer

Tesseract can't recognize exclamation marks

I use Tesseract 5 and pytesseract. My picture is: I tried different pre-processing methods: scaling, resizing, binarization, blurring, dilation, etc. At the same time, it works fine for "!?#@abc!!". I would be glad of any advice.
Pablo
1
vote
0 answers

How to recognize text that looks like a CAPTCHA but isn't, using pytesseract?

I need to recognize text that looks like this - Photo. I tried to do it, but some words that are covered by the lines cannot be recognized. import cv2 import pytesseract img = cv2.imread('screen.jpg') img = cv2.cvtColor(img,…