
I am building an app using the Google Mobile Vision Text API that scans Library of Congress Classification numbers from library books and determines whether any books are out of order. I am having trouble writing the algorithm that determines whether a received text block is in a valid format.

The format should be as follows:

First line: one or two letters, e.g. AB

Second line: a decimal number, e.g. 2405 or 234.23

Third line: a combination of letters and a decimal number, e.g. .H65 or F123

  • There may be multiple combination lines, usually no more than 4

Line after the combination lines: the year the book was published, e.g. 2001

  • This line is not always included

I receive the code as a TextBlock, which can be broken up into individual Lines, which are in turn made up of individual Elements.

The Mobile Vision Text API is very poor at recognizing single letters, so I'm just going to ignore the first line, since it's not very important for determining relative order.

The problem I am facing is how to determine if each line matches the criteria above, since I don't know until runtime how many letter/number combo lines are included.

Lines are stored as a `List<? extends Text> lines`.

I am looking for suggestions on how to iterate through that list and determine whether a line breaks the criteria.
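To make the criteria concrete, here is a minimal sketch of one way such a check could go. It is not a tested implementation. It assumes each line has already been converted to a `String` (for example via `Text.getValue()`), that letters are upper case, and that the regular expressions below are a fair reading of the format described above. The class name `CallNumberValidator`, the patterns, and the limit of 4 combination lines are all illustrative assumptions.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: walks the lines in order and checks each against the
// format described in the question. All patterns are assumptions.
final class CallNumberValidator {
    // First line: one or two letters, e.g. AB (optional, since OCR often misses it).
    private static final Pattern CLASS_LETTERS = Pattern.compile("[A-Z]{1,2}");
    // Second line: decimal number, e.g. 2405 or 234.23.
    private static final Pattern CLASS_NUMBER = Pattern.compile("\\d+(\\.\\d+)?");
    // Combination line: optional dot, letters, then digits, e.g. .H65 or F123.
    private static final Pattern COMBO = Pattern.compile("\\.?[A-Z]+\\d+");
    // Optional last line: four-digit year, e.g. 2001.
    private static final Pattern YEAR = Pattern.compile("\\d{4}");
    // Assumed upper bound on combination lines ("usually no more than 4").
    private static final int MAX_COMBO_LINES = 4;

    private CallNumberValidator() {}

    static boolean isValid(List<String> lines) {
        int i = 0;
        int n = lines.size();

        // Skip the letter line if it was recognized at all.
        if (i < n && matches(CLASS_LETTERS, lines.get(i))) {
            i++;
        }
        // The number line is required.
        if (i >= n || !matches(CLASS_NUMBER, lines.get(i))) {
            return false;
        }
        i++;

        // Consume combination lines; the count is only known at runtime.
        int combos = 0;
        while (i < n && matches(COMBO, lines.get(i))) {
            combos++;
            i++;
        }
        if (combos < 1 || combos > MAX_COMBO_LINES) {
            return false;
        }

        // The year line is optional.
        if (i < n && matches(YEAR, lines.get(i))) {
            i++;
        }
        // Anything left over means some line broke the criteria.
        return i == n;
    }

    private static boolean matches(Pattern pattern, String line) {
        return pattern.matcher(line.trim()).matches();
    }
}
```

With this sketch, `CallNumberValidator.isValid(Arrays.asList("2405", ".H65", "2001"))` would be accepted, while a block with no combination line or with trailing unrecognized text would be rejected. Because the loop consumes combination lines until one stops matching, the number of such lines does not need to be known in advance.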

Thank you.

Dez
  • Step 1 is to go to the library and scan 1000 books. Because the issue here is not how to design an algorithm. The issue is the reliability of the API. For example, what are the odds that `B128` will be reported as `8I2B`? If the API is 0% reliable, no algorithm is going to fix that. But if it's 99% reliable, then you need to understand what happens in the other 1% in order to design an algorithm to deal with it. – user3386109 Feb 06 '18 at 01:45
  • @user3386109 It is pretty much 95% reliable. The problem is that it scans once every frame and is constantly correcting itself, so it may scan in 5 wrong codes and then get it right, all in the span of a second. So to the user it appears that it's correct 100% of the time, but if you check the logs there is a bunch of garbage data in there. I'm trying to figure out how to get rid of that garbage. – Dez Feb 06 '18 at 19:54
  • Ah ok, that makes sense, but I didn't get that at all from the question. So the real question is A) how to know from the logs that the character recognition has settled on its final answer, and B) how to know from the logs when the camera was moved to the next input. That's not something that can be answered without studying the logs. – user3386109 Feb 06 '18 at 20:19
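
One possible way to approach the "has the recognition settled" question raised in the comments is to accept a reading only after it has been seen unchanged in several consecutive frames. The sketch below illustrates that idea only. The class name `StableReadingFilter`, the `offer` method, and the repeat threshold are hypothetical, and a suitable threshold would have to come from studying the actual logs.

```java
// Hypothetical debounce filter: a reading is reported once, at the moment it
// has been seen in requiredRepeats consecutive frames.
final class StableReadingFilter {
    private final int requiredRepeats;
    private String candidate; // reading currently being counted
    private int count;        // consecutive frames showing the candidate

    StableReadingFilter(int requiredRepeats) {
        if (requiredRepeats < 1) {
            throw new IllegalArgumentException("requiredRepeats must be >= 1");
        }
        this.requiredRepeats = requiredRepeats;
    }

    /**
     * Feed one per-frame reading (null if nothing was recognized).
     * Returns the reading when it first becomes stable, otherwise null.
     */
    String offer(String reading) {
        if (reading != null && reading.equals(candidate)) {
            count++;
        } else {
            // A different reading restarts the count.
            candidate = reading;
            count = (reading == null) ? 0 : 1;
        }
        // Fire exactly once per run of identical readings.
        return (candidate != null && count == requiredRepeats) ? candidate : null;
    }
}
```

A filter like this could be fed one reading per frame, with only the non-null results logged. Since a changed reading restarts the count, a new stable value would also be one rough signal that the camera has moved on to the next book, though this sketch cannot tell two adjacent books with identical codes apart.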

0 Answers