I'm trying to build an ANPR system for plates with Persian characters. The only solution I have come up with is to train a classifier for each character, but I'm sure it will take a lot of time to train a classifier per character, and I doubt it will perform well at detection time. Are there any other ways to do OCR? Or, if there is no other solution, how should I choose positive and negative samples to achieve a good detection hit rate?
-
How many characters does the Persian alphabet (for plates) have? – Miki Mar 18 '16 at 13:45
-
9 numbers + 20-30 letters. – alireza_fn Mar 18 '16 at 13:50
-
Opinion: Yes, you should. Training time is not relevant because you train only once (in the best case). And why do you fear it will have bad detection performance (time)? Of course there are a lot of ways to do OCR. How big is your data set? – knivil Mar 18 '16 at 14:37
-
@knivil I'm worried because there are 30-40 possible characters, so I would have to run 30-40 classifications per image, which doesn't seem very efficient. How should I choose samples? Approximately how long does training take? And how many samples do I need? – alireza_fn Mar 18 '16 at 14:43
-
Do it in parallel. – knivil Mar 18 '16 at 14:45
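To illustrate the "in parallel" suggestion, here is a minimal sketch of running one binary scorer per character concurrently and keeping the best-scoring label. Everything in it is an assumption for illustration: the linear `score` function, the toy `models` dictionary, and the three-element feature vector are hypothetical stand-ins for real trained classifiers and real image features. Threads are used only to show the structure; for CPU-bound scoring in CPython, a process pool or a single vectorized matrix product would likely be the more realistic choice.

```python
from concurrent.futures import ThreadPoolExecutor


def score(weights, bias, features):
    """Hypothetical linear scorer: higher means 'more likely this character'."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify(models, features):
    """Run every per-character scorer concurrently and return the best label.

    models maps a label to a (weights, bias) pair; both are made up here.
    """
    with ThreadPoolExecutor() as pool:
        futures = {
            label: pool.submit(score, weights, bias, features)
            for label, (weights, bias) in models.items()
        }
        scores = {label: future.result() for label, future in futures.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy one-vs-rest models for three digits (weights are invented).
models = {
    "1": ([1.0, 0.0, 0.0], 0.0),
    "2": ([0.0, 1.0, 0.0], 0.0),
    "3": ([0.0, 0.0, 1.0], 0.0),
}

label, scores = classify(models, [0.1, 0.8, 0.3])
print(label, scores)
```

With 30-40 such scorers per character crop, the per-image cost is a few dozen cheap evaluations, so it may matter less than the cost of locating and segmenting the plate.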
-
You have to think about that for every method, so it is nothing specific to binary classification. Unfortunately Persian has a lot of similar characters, so binary classification wouldn't be the best approach, but it is a good starting point for comparison. Classifying into clusters/groups first and then differentiating within each cluster/group might work better. E.g. se and te are very similar, so they would go in one cluster, and a separate cluster-specific classifier would tell them apart. More knowledge about Persian would also be helpful. – knivil Mar 18 '16 at 14:55
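The group-then-refine idea from this comment could be sketched roughly as follows. This is not a trained model: the nearest-centroid rule, the group names, the coarse shape features, and the centroid values are all hypothetical. The only grounded detail is that te and se share a base shape and differ in dot count (two versus three), as do sad and zad (none versus one), which is why a dot count is used as the fine feature here.

```python
import math


def nearest(centroids, vec):
    """Return the label whose centroid is closest to vec (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


# Stage 1: coarse groups of look-alike characters (invented shape features).
GROUP_CENTROIDS = {
    "boat_shape": (0.9, 0.1),
    "round_shape": (0.2, 0.8),
}

# Stage 2: one small classifier per group, here over a single dot-count feature.
WITHIN_GROUP = {
    "boat_shape": {"te": (2.0,), "se": (3.0,)},
    "round_shape": {"sad": (0.0,), "zad": (1.0,)},
}


def classify(coarse_features, fine_features):
    """Pick the group first, then separate the similar characters inside it."""
    group = nearest(GROUP_CENTROIDS, coarse_features)
    letter = nearest(WITHIN_GROUP[group], fine_features)
    return group, letter


print(classify((0.85, 0.15), (2.9,)))
```

The design point is that the second-stage classifier only has to separate a handful of near-identical characters, so it can rely on features (such as dots) that would be too noisy to use across the whole alphabet.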
-
@knivil What if I drop the letters and focus only on the numbers (http://www.omniglot.com/images/writing/persian_num.gif)? – alireza_fn Mar 18 '16 at 15:04
-
Western number plate fonts make this easier by having fixed-size, fixed-width characters. The software can then process plates more easily than standard OCR, since it can isolate each character. Certain characters are also exaggerated to aid distinction... If there are any traits like this, I advise you to focus on them first, and then on single-character recognition with a focus on the official font: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/359317/INF104_160914.pdf – Barry Mar 18 '16 at 15:07
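As a rough illustration of why fixed-width characters help, the sketch below cuts an already cropped and binarized plate into equal-width cells, one per character. It assumes the plate is a list of pixel rows with no margins or separators and that the character count is known in advance; the function name `split_fixed_width` and the toy plate are hypothetical, and a real plate would need margin and separator handling first.

```python
def split_fixed_width(plate, n_chars):
    """Split a binarized plate (list of pixel rows) into n_chars equal-width cells.

    Assumes every character occupies the same width and the plate is tightly
    cropped; any leftover columns on the right are dropped.
    """
    cell_width = len(plate[0]) // n_chars
    return [
        [row[i * cell_width:(i + 1) * cell_width] for row in plate]
        for i in range(n_chars)
    ]


# Toy 2x6 "plate" holding three characters of width 2 (1 = ink, 0 = background).
plate = [
    [1, 0, 0, 1, 1, 1],
    [1, 1, 0, 1, 0, 1],
]

cells = split_fixed_width(plate, 3)
for cell in cells:
    print(cell)
```

Each cell can then be passed to a single-character classifier, which avoids sliding a detector across the whole plate image.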