Questions tagged [ocr]

Optical Character Recognition, usually abbreviated to OCR, is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. The following topics are also commonly referred to as OCR, although some are distinct fields of application: Handwritten Text Recognition (HTR), Optical Word Recognition (OWR), Intelligent Character Recognition (ICR), Intelligent Word Recognition (IWR).

It is widely used to convert books and documents into electronic files, to computerize a record-keeping system in an office, or to publish the text on a website.

6124 questions
17
votes
1 answer

Google Vision API does not recognize single digits

I have a project that makes use of Google Vision API DOCUMENT_TEXT_DETECTION in order to extract text from document images. Often the API has trouble recognizing single digits, as you can see in this image: I suppose that the problem could be…
17
votes
1 answer

Pytesseract set character whitelist

Does anyone know how to set the character whitelist for Pytesseract? I want it to only output A-z and 0-9. Is this possible? I have the following: img = Image.open('test.jpg') result = pytesseract.image_to_string(img, config='-psm 6') I'm getting…
Minato10
  • 173
  • 1
  • 1
  • 4
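
A minimal sketch of one way to approach the whitelist question above, assuming Tesseract's `tessedit_char_whitelist` config variable is passed through the `config` string with `-c`. The helper name `build_whitelist_config` is hypothetical and not part of pytesseract. The sketch uses `--psm`, which recent Tesseract releases expect; the `-psm` form in the excerpt belongs to older builds. Whether the whitelist is honored can depend on the Tesseract version and engine mode, so treat this as a starting point rather than a confirmed fix.

```python
import string


def build_whitelist_config(allowed_chars: str, psm: int = 6) -> str:
    """Build a Tesseract config string that restricts output characters.

    Hypothetical helper for illustration; it only assembles the string
    that would be passed as pytesseract's `config` argument.
    """
    if not allowed_chars or any(c.isspace() for c in allowed_chars):
        raise ValueError("allowed_chars must be non-empty and contain no whitespace")
    return f"--psm {psm} -c tessedit_char_whitelist={allowed_chars}"


# Letters (both cases) and digits, as the question asks for.
ALNUM = string.ascii_letters + string.digits
config = build_whitelist_config(ALNUM)

# Usage (needs pytesseract, Pillow and a Tesseract install, so it is left
# as a comment here):
#   img = Image.open('test.jpg')
#   result = pytesseract.image_to_string(img, config=config)
```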
17
votes
2 answers

Google Cloud Vision - Numbers and Numerals OCR

I've been trying to implement an OCR program with Python that reads numbers with a specific format, XXX-XXX. I used Google's Cloud Vision API Text Recognition, but the results were unreliable. Out of 30 high-contrast 1280 x 1024 bmp images, only a…
17
votes
3 answers

OpenCV Adaptive Threshold OCR

I am using OpenCV to prepare images for OCR from an iPhone camera, and I have been having trouble getting the results I need for an accurate OCR scan. Here is the code I am using now. cv::cvtColor(cvImage, cvImage, CV_BGR2GRAY); …
user3247146
17
votes
2 answers

Suggestions for digit recognition

I'm writing an Android app to extract a Sudoku puzzle from a picture. For each cell in the 9x9 Sudoku grid, I need to determine whether it contains one of the digits 1 through 9 or is blank. I start off with a Sudoku like this: I pre-process the…
1''
  • 26,823
  • 32
  • 143
  • 200
17
votes
2 answers

PDF and text layer

According to this site http://www.searchable-pdf.com/content.php?lang=en&c=61, a PDF can be searchable when a text layer is added. I was looking for the technical specification of a PDF. I think text can be stored in a PDF in 2 ways: a) as a text…
Jochen Hebbrecht
  • 733
  • 2
  • 9
  • 23
16
votes
3 answers

Windows 7 OCR API

I have been reviewing replacements for the Office 2007 MODI OCR (OneNote 2010's solution gives lower-quality results than 2007 :-( ). I notice that Windows 7 contains an OCR library once you install the optional TIFF filter. The OCR component gets…
slyi
  • 321
  • 1
  • 2
  • 7
16
votes
2 answers

Alternative to Tesseract OCR Training?

For the past 3 months I've been trying to train Tesseract to identify a collection of images I have. Due to a real lack of proper documentation and a very high level of complexity, I'm starting to give up on Tesseract as a solution. I'm…
Asaf
  • 8,106
  • 19
  • 66
  • 116
16
votes
2 answers

Microsoft Azure Cognitive Services Handwriting Detection Bounding Box Parameters

I am currently using Microsoft Azure Cognitive Services Handwriting Detection API. The API returns a set of values for the bounding box: { "boundingBox": [ 2, 52, 65, 46, 69, 89, 7, 95 ], …
Rohan Pillai
  • 917
  • 3
  • 17
  • 26
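
A small sketch of how the sample `boundingBox` above could be interpreted, assuming the eight numbers are four (x, y) corner points listed clockwise from the top-left. That reading fits the sample values but is an assumption; the excerpt does not confirm it. The function names `corners` and `axis_aligned_rect` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def corners(bounding_box: List[int]) -> List[Point]:
    """Split a flat 8-number list into four (x, y) points.

    Assumes the layout [x1, y1, x2, y2, x3, y3, x4, y4].
    """
    if len(bounding_box) != 8:
        raise ValueError("expected exactly 8 numbers (4 corner points)")
    return [(bounding_box[i], bounding_box[i + 1]) for i in range(0, 8, 2)]


def axis_aligned_rect(bounding_box: List[int]) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) enclosing all four corners."""
    pts = corners(bounding_box)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return min(xs), min(ys), max(xs), max(ys)


# Sample values taken from the question excerpt.
box = [2, 52, 65, 46, 69, 89, 7, 95]
points = corners(box)
rect = axis_aligned_rect(box)
```

Because the four corners need not form an upright rectangle (the text may be rotated), the enclosing rectangle is only an approximation that is convenient for cropping or drawing.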
16
votes
7 answers

Tesseract OCR Library - Learning Font

Well, I'm using a compiled .NET version of this OCR, which can be found at http://www.pixel-technology.com/freeware/tessnet2/. I have it working; however, the aim of this is to translate license plates, and sadly the engine really doesn't accurately…
Ash
  • 3,494
  • 12
  • 35
  • 42
16
votes
2 answers

Stroke Width Transform (SWT) implementation (Java, C#...)

I recently discovered the stroke width transform, as documented in the following research paper: Detecting Text in Natural Scenes with Stroke Width Transform. Boris Epshtein, Yonathan Wexler, and Eyal Ofek. IEEE International Conference on…
user496607
  • 442
  • 1
  • 10
  • 21
16
votes
2 answers

Doing OCR with R

I have been trying to do OCR within R (reading PDF data that is stored as scanned images). I have been reading about this at http://electricarchaeology.ca/2014/07/15/doing-ocr-within-r/. This is a very good post. Effectively 3 steps: convert pdf to ppm (an…
anshuk_pal
  • 195
  • 1
  • 8
16
votes
1 answer

Processing an image of a table to get data from it

I have this image of a table (seen below). And I'm trying to get the data from the table, similar to this form (first row of table image): rows[0] = [x,x, , , , ,x, ,x,x, ,x, ,x, , , , ,x, , , ,x,x,x, ,x, ,x, , , , ] I need the number of x's as…
user
  • 715
  • 4
  • 13
  • 32
16
votes
1 answer

Improve Tesseract OCR results with blurred text

I am working on OCR of printed text. In particular I am focusing on the preprocessing step to improve the results of the Tesseract engine. I have already obtained good results with adaptive thresholding, noise removal, text deskew,…
Marco Ancona
  • 2,073
  • 3
  • 22
  • 37
16
votes
2 answers

Suggest an OCR Library for iOS

I want to make an offline iPhone application that can grab text from a picture. Can anyone suggest the best library I can use? I heard ZBAR and ZXING can be used only for barcode reading. Are there any other OCR libraries for iOS to read text…
Vaisakh
  • 1,088
  • 1
  • 8
  • 14