
My input image is the target image. When I run

tesseract myimage.png result digits

I get the result `80 1 3047490`, although I expect to get only the digits in my image, which are 4749. What am I doing wrong? My Tesseract version is 3.03.

PS: I also tried `tesseract myimage.png result nobatch digits`, with no success.

Bhushan

1 Answer


That is the expected result: with the digits config, the output is forced to be all digits, so any other characters in the image are also read as digits. In this case, I would use a regex to extract the digits from the mixed output, or a substring if you know the position of the numbers in the string.
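As a rough illustration of both suggestions, here is a minimal Python sketch that post-processes the text Tesseract wrote to the result file. The function names are hypothetical, and the layout it relies on (two misread characters before the four wanted digits and one after, at the end of the output) is only an assumption read off the sample output in the question; adjust it to your own images.

```python
import re


def extract_by_regex(ocr_text):
    """Return the 4 wanted digits, or None if the assumed layout is absent.

    Assumption: the output ends with 2 unwanted digits, then the 4 wanted
    digits, then 1 unwanted digit (optionally followed by whitespace).
    """
    match = re.search(r"\d{2}(\d{4})\d\s*$", ocr_text)
    return match.group(1) if match else None


def extract_by_position(ocr_text, start, length=4):
    """Return `length` characters starting at the known offset `start`."""
    return ocr_text[start:start + length]


sample = "80 1 3047490"  # output reported in the question
print(extract_by_regex(sample))
print(extract_by_position(sample, 7))
```

Either approach only works when the position of the number within the output is stable from image to image.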

nguyenq
  • Thanks @nguyenq for the input, but I can't use a regex because Tesseract sometimes reads the letter 'S' as '9' (and vice versa). I want Tesseract to match characters against digits only, if possible. – Bhushan Aug 31 '15 at 11:27
  • You might want to try the [bazaar](https://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html) pattern: `\A\A\A\A\A\d\d\d\d\A` – nguyenq Sep 01 '15 at 01:44
  • Hi @nguyenq, I tried to use the bazaar pattern but couldn't get it to work, as there is almost no documentation with examples. Can you tell me which commands you use for the bazaar pattern? – Bhushan Oct 28 '15 at 07:34