I am trying to learn the TensorFlow Object Detection API (SSD + MobileNet architecture) by training it to read sequences of Arabic numerals. The input was generated images containing random digit sequences of varying length, from 1 to 20 digits.
Detection and reading are perfect for short sequences (up to 5 digits), but terrible for long sequences: digits are skipped, or several digits are read as one.
What could be the problem? I suspect some kind of built-in pre-processing, but the network also saw sequences of different lengths during training, so I would have expected it to cope with them.
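To make the pre-processing suspicion concrete, here is a rough sketch of what would happen if the pipeline resizes every image to a fixed width before it reaches the network. The numbers are assumptions, not values taken from my setup: a 300 px target width (I believe the sample SSD configs use a fixed 300x300 input, but I have not verified this for my config), 28 px wide digits, and a hypothetical helper name `digit_width_after_resize`.

```python
def digit_width_after_resize(num_digits, digit_w=28, target_w=300):
    """Approximate width (px) of one digit after the whole image is
    resized to a fixed width.

    Assumptions (not from my actual pipeline): digits are laid out side
    by side with no padding, each digit_w pixels wide, and the resize
    scales the full image width to target_w.
    """
    image_w = num_digits * digit_w          # width of the generated image
    return digit_w * target_w / image_w     # digit width after scaling


# Compare how much horizontal space one digit gets for several lengths.
for n in (1, 5, 10, 20):
    print(f"{n:2d} digits -> about {digit_width_after_resize(n):.1f} px per digit")
```

Under these assumptions a digit in a 20-digit image ends up about 15 px wide, versus about 60 px in a 5-digit image, so long sequences might simply be squeezed below what the detector can separate, even though such images were present in training.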