15

I was looking for an OCR library, ideally open-source, that I could use on some Arabic PDFs. Googling didn't turn up anything useful. Does anyone know of a suitable OCR library, or one that works on related languages (Farsi and Urdu could be relevant) to which Arabic support could be added?

Any general suggestions on how to approach this would be appreciated.

Mohammed
  • http://stackoverflow.com/questions/6003630/open-source-ocr-for-arabic http://stackoverflow.com/questions/6825712/need-an-opensource-of-arabic-ocr-either-in-java-or-in-dotnet – 1.01pm Nov 23 '11 at 01:59

3 Answers

10

Starting with version 3.01, Tesseract OCR (tesseract-ocr) supports Arabic.
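As a rough sketch of how that could be applied to PDFs: Tesseract works on images, so each page would first need to be rasterized. The example below assumes the `tesseract` binary with the Arabic (`ara`) language data and Poppler's `pdftoppm` are both installed. The helper names, the `page` file prefix, and the 300 DPI setting are my own illustrative choices, not something from the Tesseract project.

```python
import subprocess
from pathlib import Path


def build_rasterize_cmd(pdf_path, out_prefix, dpi=300):
    """Build the pdftoppm command that renders each PDF page to a PNG.

    The 300 DPI default is an assumption; adjust it for your scans.
    """
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(out_prefix)]


def build_ocr_cmd(image_path, out_base, lang="ara"):
    """Build the tesseract command for one image.

    Tesseract writes the recognized text to <out_base>.txt.
    """
    return ["tesseract", str(image_path), str(out_base), "-l", lang]


def ocr_pdf(pdf_path, work_dir):
    """Hypothetical helper: rasterize a PDF, OCR every page, return the text."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # Render pages as page-1.png, page-2.png, ... inside work_dir.
    subprocess.run(build_rasterize_cmd(pdf_path, work_dir / "page"), check=True)

    texts = []
    for image in sorted(work_dir.glob("page-*.png")):
        out_base = image.with_suffix("")  # strip ".png"
        subprocess.run(build_ocr_cmd(image, out_base), check=True)
        texts.append(Path(f"{out_base}.txt").read_text(encoding="utf-8"))
    return "\n".join(texts)
```

Something like `ocr_pdf("scan.pdf", "out")` (file names hypothetical) would then return the recognized text. Expect the accuracy on Arabic to depend heavily on scan quality.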

1.01pm
0

Arabic is difficult for OCR because of the nature of the language, and no free or commercial software achieves 100% accuracy.

This is from my personal experience, but you can try IRIS Readiris Pro 14.

Vality
  • Please try to reformat your post and fix the grammar; it is very hard to read as it is. I have made a start, but more work is needed. – Vality Aug 01 '14 at 15:47
0

I know nothing about Arabic OCR quality, but some intelligent Googling found Sakhr's Automatic Reader.

Sorry, it's commercial and quite expensive. Arabic is probably one of the hardest languages in the world to do OCR on, so I guess it takes a lot to motivate someone to do it.

Ken Bloom