Questions tagged [tesseract]

Tesseract is an OCR (Optical Character Recognition) engine originally developed at HP Labs and now available as an open source library with development sponsored by Google.

Tesseract is an open-source, multilingual OCR (Optical Character Recognition) engine originally developed at HP Labs. It is now sponsored by Google and licensed under the Apache License 2.0. It currently recognizes 107 languages. Tesseract is primarily written in C++ and C. The project is hosted at https://github.com/tesseract-ocr/tesseract and its support forums are found at http://groups.google.com/group/tesseract-ocr.

4350 questions

5 answers

Using tesseract to recognize license plates

I'm developing an app which can recognize license plates (ANPR). The first step is to extract the license plates from the image. I am using OpenCV to detect the plates based on width/height ratio and this works pretty well: But as you can see,…

asked Oct 09 '13 at 09:51

unicorn80

6 answers

How do I segment a document using Tesseract then output the resulting bounding boxes and labels

I'm trying to get Tesseract to output a file with labelled bounding boxes that result from page segmentation (pre OCR). I know it must be capable of doing this 'out of the box' because of the results shown at the ICDAR competitions where contestants…

ocr tesseract hocr

asked Feb 18 '15 at 18:27

James Owers

9 answers

What is the ideal font for OCR?

Does anybody have any experience with different fonts for OCR? I am generating an ID then trying to scan it with tesseract. At the moment I am just trying different fonts by trial and error, but this seems pretty inefficient. I've tried the OCR* family of fonts, and…

fonts ocr tesseract

asked Nov 25 '08 at 01:06

Chris Lloyd

6 answers

Preprocessing image for Tesseract OCR with OpenCV

I'm trying to develop an app that uses Tesseract to recognize text from documents taken by a phone's camera. I'm using OpenCV to preprocess the image for better recognition, applying a Gaussian blur and a threshold method for binarization, but the…

opencv image-processing ocr tesseract

asked Mar 09 '15 at 05:57

Mauricio

6 answers

Recognize a number from an image

I'm trying to write an application to find the numbers inside an image and add them up. How can I identify the written number in an image? There are many boxes in the image; I need to get the numbers on the left side and sum them to give a total. How…

java image-processing ocr tesseract hough-transform

asked Apr 20 '15 at 10:45

Hash

9 answers

Tesseract OCR simple example

Hi, can anyone give me a simple example of testing Tesseract OCR, preferably in C#? I tried the demo found here. I downloaded the English dataset, unzipped it to the C drive, and modified the code as follows: string path =…

c# ocr tesseract

asked May 16 '13 at 22:14

Will Robinson

6 answers

Using Tesseract from java

I'm trying to build a sample application in Java that will read an image file and just output the text extracted from the image. I found the Tesseract project which seems promising, however, it's in C++. In order to use it, should I simply run it as…

java ocr tesseract

asked Dec 20 '12 at 14:45

Omnipresent

5 answers

OCR with the Tesseract interface

How do you OCR a TIFF file using Tesseract's interface in C#? Currently I only know how to do it using the executable.

c# ocr tesseract

asked Aug 27 '08 at 14:46

toh yen cheng

2 answers

Which OCR Engine is better: Tesseract or OCRopus?

I have tried Tesseract with iPhone and assessed its accuracy to be 70% without image preprocessing. I also noticed that it might be poor in extracting digits. I have heard about OCRopus OCR engine: which is better, Tesseract or OCRopus, in terms of…

ocr tesseract feature-extraction

asked Apr 05 '12 at 17:08

Ahmed Hussein

2 answers

What OCR options exist beyond Tesseract?

I've used Tesseract a bit and its results leave much to be desired. I'm currently detecting very small images (35x15, without border, but have tried adding one with ImageMagick with no OCR advantage); they range from 2 chars to 5 and are a pretty…

php python ruby ocr tesseract

asked Mar 13 '12 at 19:31

ylluminate

3 answers

Tesseract training for a new font

I'm still new to Tesseract OCR and, after using it in my script, I noticed it had a relatively high error rate for the images I was trying to extract text from. I came across Tesseract training, which supposedly would be able to decrease the error rate for a…

ocr tesseract

asked Dec 23 '16 at 05:13

user19235

2 answers

How can I run tesseract with multiple languages one time?

I have to analyze an image which contains both English and Japanese text. When I run tesseract by default (-l eng), some Japanese characters are lost. Otherwise, if I run tesseract with Japanese (-l jpn), some English characters are lost (e.g. Email). How…

image-processing ocr tesseract

asked Jun 24 '14 at 06:31

pars
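
As a rough illustration of the multi-language question above: the Tesseract command line accepts several language codes joined with `+` in the `-l` option (for example `-l eng+jpn`), provided the matching traineddata files are installed. The sketch below only assembles such a command; the helper name `build_multilang_command` and the file names are hypothetical, and nothing here runs Tesseract itself.

```python
from typing import List, Sequence


def build_multilang_command(image_path: str, output_base: str,
                            languages: Sequence[str]) -> List[str]:
    """Assemble a tesseract CLI call that loads several languages together.

    Hypothetical helper for illustration; it does not invoke tesseract.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    # Multiple languages are passed as a single '+'-joined value for -l.
    lang_arg = "+".join(languages)
    return ["tesseract", image_path, output_base, "-l", lang_arg]


if __name__ == "__main__":
    # Assumed file names; prints the command line instead of executing it.
    cmd = build_multilang_command("scan.png", "result", ["eng", "jpn"])
    print(" ".join(cmd))
```

The order of the codes may influence results, so it can be worth trying both `eng+jpn` and `jpn+eng` on a sample image.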

5 answers

How to install language in tesseract OCR

I have installed tesseract OCR and it has only 'eng' and 'osd' in the language list. I need the German language. I tried the following command, brew install tesseract-ocr-deu, but I am getting an error. Error: No available formula with the name…

ocr tesseract

asked Oct 19 '18 at 11:34

Lama Madan

4 answers

Removing horizontal underlines

I am attempting to pull text from a few hundred JPGs that contain information on capital punishment records; the JPGs are hosted by the Texas Department of Criminal Justice (TDCJ). Below is an example snippet with personally identifiable…

python c++ opencv tesseract

asked Jan 18 '18 at 17:57

Brad Solomon

2 answers

Can `tesseract-ocr` put the result to STDOUT?

Using tesseract-ocr #3.02.02. The basic usage of tesseract is tesseract sourc.png result, and result.txt is generated. To get the result text, I have to cat this file. Is there any option to dump the result to stdout?

tesseract

asked Jun 22 '14 at 03:13

otiai10
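
A minimal sketch for the stdout question above, assuming a Tesseract release that accepts the literal word `stdout` as the output base (more recent releases do; whether 3.02.02 does is not confirmed here). The helper names `build_stdout_command` and `ocr_to_string` are hypothetical, and `ocr_to_string` requires the `tesseract` executable on the PATH.

```python
import subprocess
from typing import List


def build_stdout_command(image_path: str) -> List[str]:
    """Build a tesseract call whose output base is the special name 'stdout'."""
    # Assumption: this tesseract build treats 'stdout' as "print the text"
    # instead of writing <output_base>.txt.
    return ["tesseract", image_path, "stdout"]


def ocr_to_string(image_path: str) -> str:
    """Run tesseract and return the recognized text captured from stdout."""
    completed = subprocess.run(
        build_stdout_command(image_path),
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout


if __name__ == "__main__":
    # Only prints the command; call ocr_to_string() where tesseract exists.
    print(" ".join(build_stdout_command("scan.png")))
```

Capturing the output through `subprocess` avoids the intermediate `result.txt` file and the extra `cat` step.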
