I want to preprocess this image on Apple iOS. Which filters can be applied to this kind of image? I want to remove the double-quote-like characters before and after the numbers, as well as the last character, as marked in the boxes.

I have tried whitelisting:

tessrect.charwhitelist="0123456789";

and

tessrect.blacklist="\":";

I have used the GPUImage library for preprocessing.

Image Sample

Dgan
  • I don't know anything about Apple iOS, but I would suggest finding all the characters with a segmentation method (e.g. adaptive thresholding) and then filtering on blob size. I would expect the blobs of the quote-like marks to be smaller than those of the digits. – Janco de Vries May 18 '16 at 08:27
  • What does Tesseract return? It might be more effective to perform OCR on the characters as they stand and discard whatever was decoded for them (hopefully non-digits). – Yves Daoust May 18 '16 at 09:48
  • @YvesDaoust Yes, but Tesseract sometimes detects them as digits and sometimes as other characters. – Dgan May 18 '16 at 11:09
  • Is the text layout fixed? – Yves Daoust May 18 '16 at 11:50
  • @YvesDaoust Yes, this is a bank cheque image. – Dgan May 18 '16 at 15:26
  • Then isn't it enough to discard the first, eighth and eighteenth characters? – Yves Daoust May 18 '16 at 15:27
  • The characters are `U+2446 OCR BRANCH BANK IDENTIFICATION` and `U+2448 OCR DASH` [see here](https://en.wikipedia.org/wiki/Optical_Character_Recognition_%28Unicode_block%29). You could try training this font as described [here](https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract#training-procedure). – Stef May 18 '16 at 15:30
  • [This](http://stackoverflow.com/questions/25279271/android-how-to-recognize-micr-codes) might be interesting for you. – Stef May 18 '16 at 15:36
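
The fixed-layout suggestion in the comments above (discard the first, eighth and eighteenth decoded characters) can be sketched as a post-processing step on the OCR output. This is a minimal, language-neutral sketch in Python rather than iOS code. The function name `strip_micr_delimiters`, the sample strings, and the choice to remove whitespace before counting positions are all assumptions made for illustration; none of them come from the cheque in the question.

```python
def strip_micr_delimiters(decoded, delimiter_positions=(1, 8, 18)):
    """Drop characters at fixed 1-based positions, then keep only ASCII digits.

    Assumption: `decoded` is the raw string returned by the OCR engine for
    one MICR line with a fixed layout, and any whitespace in it should be
    ignored when counting character positions.
    """
    # Remove whitespace so that stray spaces do not shift the positions.
    compact = "".join(decoded.split())

    # Convert the 1-based positions from the comment to 0-based indices.
    drop = {position - 1 for position in delimiter_positions}

    # Discard whatever was decoded at the delimiter positions, even if
    # the OCR engine misread a delimiter symbol as a digit.
    kept = [ch for index, ch in enumerate(compact) if index not in drop]

    # Keep only plain ASCII digits in case other symbols slipped through.
    return "".join(ch for ch in kept if ch in "0123456789")


# Hypothetical OCR outputs: delimiters read as punctuation, and as digits.
print(strip_micr_delimiters('"123456"987654321:'))
print(strip_micr_delimiters("112345619876543217"))
```

Because the characters are dropped by position rather than by value, this still works when a delimiter symbol is misread as a digit, which is the case a whitelist or blacklist cannot handle. It relies on the layout being truly fixed and on the OCR engine returning exactly one character per printed symbol.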

0 Answers