Questions tagged [tesseract]

Tesseract is an OCR (Optical Character Recognition) engine originally developed at HP Labs and now available as an open-source library with development sponsored by Google.

Tesseract is an open-source, multilingual OCR (Optical Character Recognition) engine originally developed at HP Labs. It is now sponsored by Google and licensed under the Apache License 2.0. It currently recognizes 107 languages. Tesseract is primarily written in C++ and C. The project is hosted at https://github.com/tesseract-ocr/tesseract and its support forum is at http://groups.google.com/group/tesseract-ocr.

4350 questions

4 answers

How to extract text from an image in an Android app

I am working on a feature for my Android app. I would like to read text from a picture then save that text in a database. Is using OCR the best way? Is there another way? Google suggests in its documentation that NDK should only be used if strictly…

asked May 17 '16 at 23:41

MrAnderson1992

1 answer

Tesseract OCR user patterns

Is there any way to get Tesseract to match only user-specified words or patterns? The manual claims it is possible, yet I cannot find a single documented instance on the internet of somebody getting this working. Here are many examples of people…

ocr tesseract

asked Jan 01 '16 at 22:31

Michael Connor

3 answers

Text detection on Seven Segment Display via Tesseract OCR

The problem I am facing is extracting the text from an image, for which I have used Tesseract v3.02. The sample images from which I have to extract text are meter readings. Some of them have a solid sheet background and…

ocr tesseract seven-segment-display

asked Jul 16 '13 at 09:24

yunas

1 answer

Getting error: "bad read of inttemp!" when training a new font in Tesseract 2

I'm trying to train Tesseract for a new font which can be used in my Android app. I need to train for digits only, so I created one training image, a box file, and a unicharset file. I have followed the training instructions, but when I tried to run…

java android tesseract

asked Feb 12 '13 at 09:36

Dipin

4 answers

Tesseract 3 (OCR) - .NET Wrapper

http://code.google.com/p/tesseractdotnet/ I am having a problem getting Tesseract to work in my Visual Studio 2010 projects. I have tried console and winforms and both have the same outcome. I have come across a dll by someone else who claims to…

c# visual-studio-2010 wrapper tesseract

asked Apr 08 '12 at 22:15

Jpin

4 answers

How can I use Tesseract in Android?

I have searched on the net for a couple of hours. I got many answers saying we need to use NDK, etc. for "Tesseract" for WINDOWS. But I didn't get any step-by-step/proper explanation of what should be done when NDK is installed. How to get the .so…

android ocr android-ndk tesseract

asked Oct 10 '11 at 08:20

PrincessLeiha

2 answers

Tesseract OCR fails to detect varying font size and letters that are not horizontally aligned

I am trying to detect the text on these price labels, which is always clearly preprocessed. Although it can easily read the text written above the price, it fails to detect the price values. I am using the Python bindings pytesseract, although it also fails to read from…

python opencv ocr tesseract

asked Mar 28 '18 at 13:25

NONONONONO

2 answers

Tesseract confuses two numbers

I'm writing an application to scan numbers from an image. The numbers are using the OCR-B font and may also contain + and > characters. This is my source image: The scans using Tesseract weren't very good, even when limiting the character set to…

ocr tesseract

asked Sep 03 '11 at 12:04

Danilo Bargen
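
For the character-set restriction mentioned in this question, a minimal sketch of building the command line is shown below. It assumes a Tesseract build whose CLI accepts `-c VAR=VALUE` and `stdout` as the output base; the helper name `build_tesseract_command` and the file name `scan.png` are hypothetical. The question excerpt is truncated, so this may not match what the asker actually tried.

```python
def build_tesseract_command(image_path, allowed_chars):
    """Build an argv list that restricts recognition to allowed_chars.

    Assumption: the installed tesseract CLI supports `-c VAR=VALUE`
    and writing results to standard output via the `stdout` output base.
    """
    return [
        "tesseract",
        image_path,
        "stdout",  # output base; assumed to mean "print to standard output"
        "-c",
        f"tessedit_char_whitelist={allowed_chars}",
    ]


# Digits plus the two extra symbols named in the question.
command = build_tesseract_command("scan.png", "0123456789+>")
```

Passing the list to `subprocess.run(command, capture_output=True, text=True)` without a shell keeps the `>` character from being treated as a redirection.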

1 answer

Tesseract handwriting with dictionary training

I have a dictionary of words in a text file, separated by newlines. And I want to recognize the handwriting using Tesseract, and output the nearest matching line in the text file. This is the first time I'll be using Tesseract, and it's already in…

android tesseract handwriting

asked Sep 07 '12 at 00:39

Ruel

4 answers

What's the best way to OCR as much text as possible from video game screenshots?

I'm trying to use the Tesseract OCR tool to extract text from video games (I'm preprocessing screenshots, passing them to the command-line tool for TSV output, and parsing that). I'd like to use it for test automation, not unlike Selenium web testing.…

python automated-tests ocr tesseract ui-automation

asked May 04 '18 at 07:49

Roman A. Taycher
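
The TSV-parsing step described in this question could look roughly like the sketch below. It assumes the usual twelve-column TSV layout (`level` through `text`) produced by Tesseract's TSV output mode; the function name, the `min_conf` threshold, and the sample rows are made up for illustration.

```python
import csv
import io


def parse_tesseract_tsv(tsv_text, min_conf=0.0):
    """Return word rows from Tesseract TSV output as a list of dicts."""
    # QUOTE_NONE: recognized text may itself contain quote characters.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        if not text:
            continue  # page/block/paragraph/line rows carry no text
        conf = float(row["conf"])
        if conf < min_conf:
            continue
        words.append({
            "text": text,
            "conf": conf,
            "left": int(row["left"]),
            "top": int(row["top"]),
            "width": int(row["width"]),
            "height": int(row["height"]),
        })
    return words


# Hypothetical sample: one page row and two word rows.
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num"
    "\tleft\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t480\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t10\t20\t80\t24\t96.5\tStart\n"
    "5\t1\t1\t1\t1\t2\t100\t20\t70\t24\t41.0\tGarne\n"
)

confident_words = parse_tesseract_tsv(SAMPLE_TSV, min_conf=60)
```

Keeping the bounding box alongside each word is what would let a UI test click or assert on a screen region, in the spirit of the Selenium comparison.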

1 answer

pytesseract cannot find the file specified

My code is straightforward and is the following: import pytesseract from PIL import Image img = Image.open('C:/temp/foo.jpg') img.load() i = pytesseract.image_to_string(img) and the error response I get back is: Traceback (most recent call…

python tesseract python-tesseract

asked Dec 11 '15 at 14:34

jason m
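
The traceback in this excerpt is cut off, so the cause is not confirmed here. One common cause of a "cannot find the file specified" error with pytesseract is that the `tesseract` executable itself is not on `PATH`, since pytesseract runs it as an external program. A minimal standard-library check, with the hypothetical helper name `find_tesseract`, might look like this:

```python
import shutil


def find_tesseract(candidates=("tesseract",)):
    """Return the first candidate that resolves to an executable, else None.

    Each candidate may be a bare command name (looked up on PATH)
    or a full path to the executable.
    """
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


tesseract_path = find_tesseract()
if tesseract_path is None:
    print("tesseract executable not found on PATH")
else:
    print(f"tesseract found at {tesseract_path}")
```

If the binary lives outside `PATH`, pytesseract can be pointed at it by assigning the full path to `pytesseract.pytesseract.tesseract_cmd` before calling `image_to_string`.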

1 answer

Chinese character recognition using Tesseract OCR

I have been using the Tesseract 3.0.2 OCR SDK for image text extraction. But when I pass images of Chinese text through the OCR, Tesseract doesn't return the Chinese characters; instead, I get numeric and English characters. But I…

iphone ios ocr tesseract

asked May 16 '13 at 07:41

Nishant Tyagi

3 answers

What's the best image input type for Tesseract?

I'm using Tesseract on a project and want to know which image input type gives the best output. Is Binary & TIFF the best input, or is there something else?

image-processing ocr tesseract

asked Apr 17 '12 at 14:17

chostDevil

4 answers

Converting cv::Mat for Tesseract

I'm using OpenCV to extract a subimage of a scanned document and would like to use tesseract to perform OCR over this subimage. I found out that I can use two methods for text recognition in tesseract, but so far I wasn't able to find a working…

c++ opencv tesseract

asked Nov 13 '11 at 22:44

Pedro

12 answers

(-215:Assertion failed) !_src.empty() in function 'cv::cvtColor' with cv::imread

I am trying to recognize text from an image and then output it; however, I get this error: Traceback (most recent call last): File "C:/Users/Benji's Beast/AppData/Local/Programs/Python/Python37-32/imageDet.py", line 41, in…

python opencv ocr tesseract python-tesseract

asked Dec 26 '18 at 01:18

Benji
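
The assertion in this question's title typically appears when `cv2.imread` returns `None` (it does not raise when a file cannot be read) and that result is passed on to `cv2.cvtColor`. A minimal sketch of a guard to run before loading is shown below; the helper name `require_image_file` is hypothetical, and only the standard library is used so that the check is independent of OpenCV.

```python
from pathlib import Path


def require_image_file(path):
    """Return the path as a string, or raise if no file exists at that path."""
    p = Path(path)
    if not p.is_file():
        # Failing here gives a clearer message than the later
        # "!_src.empty()" assertion inside cv::cvtColor.
        raise FileNotFoundError(f"image not found: {p}")
    return str(p)
```

The returned string can then be passed to `cv2.imread`, and checking that result against `None` covers the remaining case of a file that exists but cannot be decoded.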
