Tesseract (software)
Tesseract is an optical character recognition (OCR) engine for various operating systems. It is free software, released under the Apache License 2.0. Originally developed by Hewlett-Packard as proprietary software in the 1980s, it was released as open source in 2005, and its development has been sponsored by Google since 2006.
Tesseract 4.1.1 reading an image.

Original author(s) | Ray Smith, Hewlett-Packard |
---|---|
Developer(s) | Google and others |
Written in | C and C++ |
Operating system | Linux, Windows, and macOS |
Available in | Interface: English. Recognition: Afrikaans, Albanian, Arabic, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Catalan, Czech, Cherokee, Croatian, Danish, Dutch, English, Esperanto, Estonian, Finnish, French, Galician, German, Greek, Hindi, Hebrew, Hungarian, Indonesian, Italian, Japanese, Kannada, Korean, Latvian, Lithuanian, Malayalam, Macedonian, Maltese, Malay, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Telugu, Thai, Turkish, Ukrainian, and Vietnamese (more can be added using included training files) |
Type | Optical character recognition |
License | Apache License 2.0 |
Website | github |
In 2006, Tesseract was considered one of the most accurate open-source OCR engines available.