Comparison of optical character recognition software

This comparison of optical character recognition software includes:

  • OCR engines, which perform the actual character identification
  • Layout analysis software, which divides scanned documents into zones suitable for OCR
  • Graphical interfaces to one or more OCR engines
  • Software development kits that are used to add OCR capabilities to other software (e.g. forms processing applications, document imaging management systems, e-discovery systems, records management solutions)
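As an illustration of how the first two categories relate, the following minimal sketch separates a toy layout analysis step from a pluggable recognition step. It is not the method of any product listed here: the names `Zone`, `split_into_zones` and `recognize_page` are hypothetical, the page is assumed to be a binary image given as rows of 0/1 pixels, and the layout step simply cuts the page at fully blank rows (a horizontal projection profile). The engine is any callable supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, List

Page = List[List[int]]  # assumed input: binary image, 1 = ink, 0 = background


@dataclass
class Zone:
    """Rectangular region of a page in pixel coordinates (bottom/right exclusive)."""
    left: int
    top: int
    right: int
    bottom: int


# Hypothetical engine interface: takes the page and one zone, returns its text.
Engine = Callable[[Page, Zone], str]


def split_into_zones(page: Page) -> List[Zone]:
    """Toy layout analysis: one zone per run of rows that contain ink."""
    zones: List[Zone] = []
    start = None  # index of the first inked row of the current run, if any
    for y, row in enumerate(page):
        has_ink = any(row)
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            zones.append(Zone(0, start, len(row), y))
            start = None
    if start is not None:  # the last run reaches the bottom edge of the page
        zones.append(Zone(0, start, len(page[-1]), len(page)))
    return zones


def recognize_page(page: Page, engine: Engine) -> List[str]:
    """Run the supplied engine on every zone, top to bottom."""
    return [engine(page, zone) for zone in split_into_zones(page)]
```

A graphical interface or SDK, in this picture, would be a layer that calls something like `recognize_page` with a real engine in place of the caller-supplied stub.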
Sortable table
Name | Founded year | Latest stable version | Release year | License | Online | Windows | Mac OS X | Linux | BSD | Android | iOS | Programming language | SDK? | Languages | Fonts | Output formats | Notes
ABBYY FineReader | 1989 | 16 | 2022 | Proprietary | Yes | Yes | Yes | No | Yes | Yes | Yes | C/C++ | Yes | 192 | All fonts | DOC, DOCX, XLS, XLSX, PPTX, RTF, PDF, HTML, CSV, TXT, ODT, DjVu, EPUB, FB2 | ABBYY also supplies SDKs for embedded and mobile devices. Professional, Corporate and Site License Editions for Windows, Express Edition for Mac.
AnyDoc Software | 1989 | ? | ? | Proprietary | No | Yes | No | No | No | ? | ? | VBScript | ? | ? | ? | | Works with structured, semi-structured, and unstructured documents.
Asprise OCR SDK | 1998 | 15 | 2015 | Proprietary | Yes | Yes | Yes | Yes | Yes | ? | ? | Java, C#, VB.NET, C/C++/Delphi | Yes | 20+ | ? | Plain text, searchable PDF, XML | Java, C#, VB.NET, C/C++/Delphi SDKs for OCR and Barcode recognition on Windows, Linux, Mac OS X and Unix.
CuneiForm | 1996 | 1.1 | 2011 | BSD variant | No | Yes | Yes | Yes | Yes | ? | ? | C/C++ | Yes | 28 | Any printed font | HTML, hOCR, native, RTF, TeX, TXT | Enterprise-class system, can save text formatting and recognizes complicated tables of any structure
Dynamsoft OCR SDK | 2003 | 8.2 | 2012 | Proprietary | Yes | Yes | No | No | No | ? | ? | C/C++ | Yes | 40+ | ? | PDF, TXT |
E-aksharayan 2010 Yes No Yes No ? ? 14 RTF, TXT, BRL
GOCR | 2000 | 0.52 | 2018 | GPL | Yes | Yes | Yes | Yes | Yes | ? | ? | C | ? | 20+ | ? | |
Google Drive OCR or Google Cloud Vision | 2015 | | | Proprietary | Yes | Browser | Browser | Browser | Unknown | ? | ? | Unknown | Yes | 200+ | All fonts | text | Google blog post
Microsoft Office Document Imaging | ? | Office 2007 | 2007 | Proprietary | No | Yes | No | No | No | ? | ? | ? | ? | ? | ? | | Uses OmniPage
Microsoft Office OneNote 2007 | 2011 | ? | 2007 | Proprietary | No | Yes | No | No | No | ? | ? | ? | ? | ? | ? | |
OCRFeeder | 2009-03 | 0.8.5 | 2022 | GPL | No | No | No | Yes | No | ? | ? | Python | ? | ? | ? | | Features a full user interface and has a command-line tool for automatic operations. Has its own segmentation algorithm but uses system-wide OCR engines like Tesseract or Ocrad
Ocrad | ? | 0.28 | 2022 | GPL | Yes | No | Yes | Yes | Yes | ? | ? | C++ | Yes | Latin alphabet | ? | | Command line
OCRopus | 2007 | 1.3.3 | 2017 | Apache | No | No | Yes | Yes | Yes | ? | ? | Python | ? | All languages using Latin script (other languages can be trained) | Normal Latin script and Fraktur (other scripts can be trained) | TXT, hOCR, PDF | Pluggable framework under active development, used for Google Books
OmniPage | 1970s | 19.2 | 2015 | Proprietary | Yes | Yes | Yes | Yes | No | ? | ? | C/C++, C# | Yes | 125 | Machine and handprinted fonts | DOC/DOCX, XLS/XLSX, PPTX, RTF, PDF, PDF/A, Searchable PDF, HTML, Text, XML, ePUB, MP3 | Product of Nuance Communications
Puma.NET | ? | ? | 2009 | BSD | No | Yes | No | No | No | ? | ? | C# | Yes | 28 | Any printed font | | .NET OCR SDK based on Cognitive Technologies' CuneiForm recognition engine. Wraps Puma COM server and provides simplified API for .NET applications
ReadSoft | ? | ? | ? | Proprietary | No | Yes | No | No | No | ? | ? | ? | ? | ? | ? | | Scan, capture and classify business documents such as invoices, forms and purchase orders integrated with business processes.
Scantron | ? | ? | ? | Proprietary | No | Yes | No | No | No | ? | ? | ? | ? | ? | ? | | For working with localized interfaces, corresponding language support is required.
SmartScore | 1991 | 10.5.8 | 2015 | Proprietary | No | Yes | Yes | No | No | ? | ? | ? | ? | ? | ? | | For musical scores
Tesseract | 1985 | 5.3.3 | 2023 | Apache | No | Yes | Yes | Yes | Yes | ? | ? | C++, C | Yes | 100+ | Any printed font | Text, ALTO, hOCR, PDF, others with different user interfaces or the API | Created by Hewlett-Packard; under further development by Google
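Because the table is no longer interactively sortable in plain-text form, a comparison such as "non-proprietary software that runs on Linux, newest release first" can be reproduced with a short script. The following is a minimal sketch under stated assumptions: the `Entry` class and the function `linux_non_proprietary` are hypothetical names introduced for illustration, and only a handful of rows are copied from the table, keeping just the Name, License, Release year and Linux columns.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    """One table row, reduced to the four columns used in this sketch."""
    name: str
    license: str
    release_year: int
    linux: bool


# Subset of rows copied from the table; the other rows are omitted for brevity.
ENTRIES = [
    Entry("CuneiForm", "BSD variant", 2011, True),
    Entry("Dynamsoft OCR SDK", "Proprietary", 2012, False),
    Entry("GOCR", "GPL", 2018, True),
    Entry("OCRFeeder", "GPL", 2022, True),
    Entry("Ocrad", "GPL", 2022, True),
    Entry("OCRopus", "Apache", 2017, True),
    Entry("SmartScore", "Proprietary", 2015, False),
    Entry("Tesseract", "Apache", 2023, True),
]


def linux_non_proprietary(entries: List[Entry]) -> List[Entry]:
    """Keep Linux-capable, non-proprietary entries; newest release first, ties by name."""
    picked = [e for e in entries if e.linux and e.license != "Proprietary"]
    return sorted(picked, key=lambda e: (-e.release_year, e.name))


if __name__ == "__main__":
    for entry in linux_non_proprietary(ENTRIES):
        print(entry.release_year, entry.name, entry.license)
```

Sorting on a tuple keeps the ordering deterministic when two entries share a release year, as OCRFeeder and Ocrad do in this subset.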
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.