Questions tagged [ocr]

Optical Character Recognition, usually abbreviated to OCR, is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. Although some are distinct fields of application, the following topics are also commonly referred to as OCR: Handwritten Text Recognition (HTR), Optical Word Recognition (OWR), Intelligent Character Recognition (ICR), and Intelligent Word Recognition (IWR).

OCR is widely used to convert books and documents into electronic files, to computerize a record-keeping system in an office, or to publish the text on a website.

OCR @Wikipedia

Frequently-asked questions:

Simple Digit Recognition OCR in OpenCV-Python

6124 questions

3 answers

Tesseract training for a new font

I'm still new to Tesseract OCR and, after using it in my script, I noticed it had a relatively high error rate for the images I was trying to extract text from. I came across Tesseract training, which supposedly would be able to decrease the error rate for a…

ocr tesseract

asked Dec 23 '16 at 05:13

user19235

2 answers

How can I run tesseract with multiple languages one time?

I have to analyze an image which contains both English and Japanese text. When I run tesseract by default (-l eng), some Japanese characters are lost. Otherwise, if I run tesseract with Japanese (-l jpn), some English characters are lost (e.g. Email). How…

image-processing ocr tesseract

asked Jun 24 '14 at 06:31

pars

8 answers

How to know if a PDF contains only images or has been OCR scanned for searching?

I have a bunch of PDF files that came from scanned documents. The files contain a mix of images and text. Some were scanned as images with no OCR, so each PDF page is one large image, even where the whole page is entirely text. Others were…

search pdf ocr acrobat

asked Sep 28 '09 at 22:45

Bratch

5 answers

How to install language in tesseract OCR

I have installed tesseract OCR and it has only 'eng' and 'osd' in the language list. I need the German language. I tried the following command, brew install tesseract-ocr-deu, but I am getting an error. Error: No available formula with the name…

ocr tesseract

asked Oct 19 '18 at 11:34

Lama Madan

5 answers

Remove background noise from image to make text more clear for OCR

I've written an application that segments an image based on the text regions within it, and extracts those regions as I see fit. What I'm attempting to do is clean the image so OCR (Tesseract) gives an accurate result. I have the following image as…

java c++ opencv ocr

asked Nov 23 '15 at 21:37

Zy0n

5 answers

iOS: Real Time OCR on top of live camera feed (similar to iTunes Redeem Gift Card)

Is there a way to accomplish something similar to what the iTunes and App Store Apps do when you redeem a Gift Card using the device camera, recognizing a short string of characters in real time on top of the live camera feed? I know that in iOS 7…

ios ocr

asked Sep 30 '13 at 18:38

boliva

4 answers

Understanding Freeman chain codes for OCR

Note that I'm really looking for an answer to my question. I am not looking for a link to some source code or to some academic paper: I've already used the source and I've already read papers and still haven't figured out the last part of this…

algorithm ocr

asked Jul 16 '11 at 15:58

SyntaxT3rr0r

5 answers

How can i use tesseract ocr(or any other free ocr) in small c++ project?

So what I heard after research is that the only solid free OCR options are either Tesseract or CuneiForm. Now, the Tesseract docs are plain horrible, all they give you is a bunch of Visual Studio code (for me on Windows) and from there you are on…

c++ c windows image-processing ocr

asked Feb 22 '11 at 14:50

Marko29

4 answers

Tesseract ocr PDF as input

I am building an OCR project and I am using a .Net wrapper for Tesseract. The samples that the wrapper has don't show how to deal with a PDF as input. Using a PDF as input, how do I produce a searchable PDF using C#? I have used the Ghostscript library…

c# ocr tesseract

asked Apr 15 '15 at 17:48

acrab

1 answer

Getting text from image on ios (image processing)

I am thinking of making an application that requires extracting text from an image. I haven't done anything similar and I don't want to implement the whole thing on my own. Is there any known library or open source code (supported for iOS,…

ios objective-c image image-processing ocr

asked Dec 27 '10 at 12:24

Vikram.exe

4 answers

My own OCR-program in Python

I am still a beginner but I want to write a character-recognition program. This program isn't ready yet, and I have edited it a lot, so the comments may not match exactly. I will use 8-connectivity for the connected component labeling. from PIL…

python arrays artificial-intelligence ocr

asked Jan 01 '10 at 23:14

kame

2 answers

Where can I find a free .Net (C#) library that I can use to scan and OCR documents?

I am searching for a free .Net (C#) library that I can use to scan from a document scanner and then OCR the document, so I can get the text from it to save in a database. After some searching I cannot find one that works in Visual Studio 2010 and .Net…

c# .net open-source document ocr

asked May 05 '12 at 05:03

RickardP

8 answers

Can OCR software reliably read values from a table?

Would OCR Software be able to reliably translate an image such as the following into a list of values? UPDATE: In more detail the task is as follows: We have a client application, where the user can open a report. This report contains a table of…

ocr

asked May 30 '11 at 07:31

GarethOwen

1 answer

How do I train tesseract 4 with image data instead of a font file?

I'm trying to train Tesseract 4 with images instead of fonts. The docs explain only the approach with fonts, not with images. I know how it works with a prior version of Tesseract, but I don't understand how to use the box/tiff…

ocr tesseract lstm training-data

asked Apr 11 '17 at 17:47

claim

2 answers

OpenCV MSER detect text areas - Python

I have an invoice image, and I want to detect the text on it. So I plan to use 2 steps: first identify the text areas, and then use OCR to recognize the text. I am using OpenCV 3.0 in Python for that. I am able to identify the text (including…

python opencv image-processing ocr

asked Oct 17 '16 at 04:43

Amit Madan
