Reading data from a driving license using Tesseract OCR on iPhone

Question

I am trying to read information from a US driving license, but I am not able to get the correct text from the image.

[image: sample driving license]

I am trying to read an image like the one above, but I am getting strange results, something like the following:

7 WISCONSIN **i_.* 4' L. _-
DRIVER LICENSE Regular
' Q555-5555-2555-00 35533
I5 .4 ClassDMXxX Enduslmmls TPXMXX J
Sex r mnBLQ EyesBl-U 0000.501" 0.00.100
X Restrictions 0n Back MM 08484005
X E0". 00-20-2010
It JANE QUINCY
' * 1' 3913' ECIJ-SWILEKgSJVEEQIJNSRIEMREKBVAY
jilfccgbwm suns 20s
BLACK RIVER FALLS w: 54015-0000

Very few of the words are correct. What do I need to do to get more accurate results?
My code:

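// Create the recognizer with the English trained data from the "tessdata" folder.
Tesseract* tesseract4 = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
// Restrict recognition to the characters listed in the whitelist.
[tesseract4 setVariableValue:@"*'\"-_:.0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" forKey:@"tessedit_char_whitelist"];
// Load the license image from the app bundle and run recognition on it.
[tesseract4 setImage:[UIImage imageNamed:@"dlWI.jpg"]];
[tesseract4 recognize];

// Print the recognized text.
NSLog(@"%@", [tesseract4 recognizedText]);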

IMHO you need to increase the contrast; the background pattern needs to be less distinct. – Volker, Mar 11 '14 at 10:01
So you mean I need to increase the contrast of the text and reduce it for the background? Could you please suggest some directions on how to do that? I am new to image processing. – pankaj, Mar 11 '14 at 10:25
I would start with an app like Photoshop or similar and try to produce an image that works better. Then you know which steps are necessary, and you can try to use CIFilters for that purpose... – Volker, Mar 11 '14 at 10:45
@PoojaM.Bohora I never got accurate results with this SDK; try paid alternatives such as ABBYY, which are better. – pankaj, Jan 01 '18 at 09:43
@pankaj Very true. I am working on the ABBYY sample; let's see. Thanks. – Pooja M. Bohora, Jan 02 '18 at 04:46

score 1 · Answer 1 · edited May 23 '17 at 12:24

Try having a look at this question, which explains how to convert the image to grayscale and process it a bit in order to improve the quality of the results from Tesseract:

iOS Tesseract OCR Image Preperation

It is also worth ensuring that your whitelist only includes the characters you want to recognize. If you don't need : or _ or *, leave them out of the whitelist; this should clean up the results a bit.
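As a rough illustration of the grayscale step (a sketch only, not the code from the linked question), the plain C below converts a tightly packed RGBA buffer to grayscale and then binarizes it. Plain C compiles unchanged inside an Objective-C file. The function names, the RGBA8888 layout, and the threshold value are assumptions made for illustration; getting the pixel buffer out of a UIImage and back into one is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Integer approximation of Rec. 601 luma. The weights 77 + 150 + 29 sum to
   256, so shifting right by 8 keeps the result in the range 0..255. */
static uint8_t luma_from_rgb(uint8_t r, uint8_t g, uint8_t b) {
    return (uint8_t)((77u * r + 150u * g + 29u * b) >> 8);
}

/* Fill gray[0..pixel_count) from a tightly packed RGBA8888 buffer
   (hypothetical layout: 4 bytes per pixel, alpha ignored). */
void rgba_to_gray(const uint8_t *rgba, size_t pixel_count, uint8_t *gray) {
    for (size_t i = 0; i < pixel_count; i++) {
        const uint8_t *p = rgba + 4 * i;
        gray[i] = luma_from_rgb(p[0], p[1], p[2]);
    }
}

/* Binarize in place: values below the threshold become 0 (ink),
   everything else becomes 255 (paper). */
void threshold_gray(uint8_t *gray, size_t pixel_count, uint8_t threshold) {
    for (size_t i = 0; i < pixel_count; i++) {
        gray[i] = (gray[i] < threshold) ? 0 : 255;
    }
}
```

A threshold around 128 is only a starting point; it would need tuning against real license images, and a patterned background may call for something more adaptive than one global value.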

answered Mar 11 '14 at 11:06

Adam Richardson

Hi Adam, thanks for replying and providing the URL. I have tried the methods mentioned in the link: I converted the image to grayscale and also processed it a bit. But I am still not able to get all of the content correctly. If you check the image, there are a lot of colored characters in the background, which is possibly causing the trouble. The last line of the image is read correctly, as it has a white background. Is there a way to remove the background text from an image and make the background white? – pankaj Mar 12 '14 at 10:55
I would say that the background is definitely causing an issue, as it has a lot of noise. The only thing I could suggest is looking up image manipulation for iOS, and specifically colour replacement. You could replace the background colours behind the text to remove the noise. – Adam Richardson Mar 12 '14 at 11:23
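
One possible form of the colour replacement suggested above, assuming the printed text is dark and close to neutral grey while the background pattern is lighter or more colourful: any pixel that does not look like ink is painted white. This is a sketch in plain C (usable from an Objective-C file). The function names, the RGBA8888 layout, and both limits are hypothetical and would need tuning on real images.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when a pixel looks like dark, near-neutral ink:
   its brightest channel is at most max_value, and the gap between its
   brightest and darkest channel is at most max_spread. */
static int is_ink(uint8_t r, uint8_t g, uint8_t b,
                  uint8_t max_value, uint8_t max_spread) {
    uint8_t hi = r > g ? (r > b ? r : b) : (g > b ? g : b);
    uint8_t lo = r < g ? (r < b ? r : b) : (g < b ? g : b);
    return hi <= max_value && (hi - lo) <= max_spread;
}

/* Paint every non-ink pixel white in a tightly packed RGBA8888 buffer.
   The alpha channel is left untouched. */
void whiten_background(uint8_t *rgba, size_t pixel_count,
                       uint8_t max_value, uint8_t max_spread) {
    for (size_t i = 0; i < pixel_count; i++) {
        uint8_t *p = rgba + 4 * i;
        if (!is_ink(p[0], p[1], p[2], max_value, max_spread)) {
            p[0] = 255;
            p[1] = 255;
            p[2] = 255;
        }
    }
}
```

With limits such as max_value = 100 and max_spread = 40, a dark grey pixel is kept while bright or strongly coloured pixels turn white. If the license prints some fields in coloured ink, those fields would be erased too, so the test in is_ink would have to be adapted.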
