How can I call the Acrobat feature OCR from C#?

Question

I want to write a C# application that can utilize the OCR function in Adobe Acrobat. How can I call this? Is there a public API?

David Vogel · Answer 1 · 2015-06-09T18:01:57.370

There is no direct Adobe OCR API suitable for .NET. There are some alternatives, though, for what you are trying to achieve. There is an open-source .NET wrapper for Google's open-source Tesseract OCR engine, available on GitHub: https://github.com/charlesw/tesseract. This should give you OCR capability within C#.

From the documentation:

Getting started quickly

1. Add the Tesseract NuGet package by running Install-Package Tesseract from the Package Manager Console.
2. Ensure you have the Visual Studio 2012 x86 and x64 runtimes installed.
3. Download the language data files for Tesseract 3.02 from tesseract-ocr and add them to your project; ensure 'Copy to output directory' is set to Always.
4. Check out the samples solution ~/Samples/Tesseract.Samples.sln for a working example.
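Once the package and language data are in place, the steps above might come together roughly as in the sketch below. It assumes the English data file (eng.traineddata) has been copied to a tessdata folder next to the executable; the MakeOcr helper and the OcrSketch class are hypothetical names used for illustration and are not part of the wrapper.

```csharp
using System;
using Tesseract; // from the Tesseract NuGet package

class OcrSketch
{
    // Hypothetical helper: runs OCR on one image file and returns the recognized text.
    // tessDataDir is assumed to contain eng.traineddata.
    static string MakeOcr(string imagePath, string tessDataDir = "./tessdata")
    {
        using (var engine = new TesseractEngine(tessDataDir, "eng", EngineMode.Default))
        using (var image = Pix.LoadFromFile(imagePath))
        using (var page = engine.Process(image))
        {
            return page.GetText();
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: OcrSketch <image file>");
            return;
        }

        Console.WriteLine(MakeOcr(args[0]));
    }
}
```

Note that this reads an image file (for example a TIFF or PNG), not a PDF. Scanned PDF pages would most likely need to be rendered to images with a separate tool before being passed to the engine.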

score 0 · Answer 2 · answered Jul 03 '09 at 11:24

I believe this is part of the Adobe Reader software and is not accessible through an API. There's an API and libraries for constructing PDF documents per the format specifications, but OCR is something that concerns the reader and not the format. I'm afraid you would either have to use another library or implement it yourself.

Slavo

Are you sure? I only need something of the form makeOCR(file); it should open the file and run OCR on it. – subprime Jul 03 '09 at 11:50
@Salvo Any idea whether we can convert HTML to PDF using Acrobat? Can you provide a useful link? I am planning to use it in a .NET application. – shreesha Feb 25 '16 at 06:51
