
I've been testing ML Kit and have noticed less accurate results compared with Google Vision. One thing I do have enabled for Google Vision is languageHints, e.g.:

    {
        requests: [
            {
                image: {
                    content: base64String
                },
                features: [
                    {
                        type: "TEXT_DETECTION"
                    }
                ],
                imageContext: {
                    languageHints: ["en"]
                }
            }
        ]
    }

I'm thinking maybe I need to enable language hints for ML Kit too, but I'm not sure how. I am testing on both Android and iOS.

For ML Kit on iOS, the version I'm using is the latest v2 beta:

  pod 'GoogleMLKit/TextRecognition', '3.1.0'

And for Android I'm using:

    implementation 'com.google.mlkit:text-recognition:16.0.0-beta4'

2 Answers

  • 3.1.0 is a beta version of Text Recognition that supports 4 additional scripts beyond Latin. If you create recognizer instances for those scripts, they will return results in the corresponding languages; otherwise, you will only get results for Latin text.

  • If you are comparing en-only results, then there are no other input parameters that can improve the results for ML Kit.

  • Do you have a code snippet of the usage for ML Kit?

  • It would be great if there were data points showing the difference between the ML Kit and Google Vision results; we can investigate further from there. Thank you!
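
If you do want to try the script-specific recognizers the answer mentions, a minimal Android sketch might look like the following. This is an assumption based on the ML Kit v2 API surface (the per-script builders live in their own Gradle artifacts, e.g. com.google.mlkit:text-recognition-chinese); names may differ slightly in the beta4 artifact. Note that, unlike Google Vision's languageHints, ML Kit selects the script by which recognizer client you create, not by a request parameter:

```kotlin
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.chinese.ChineseTextRecognizerOptions
import com.google.mlkit.vision.text.latin.TextRecognizerOptions

// Latin-script recognizer (the default; this is what covers English).
val latinRecognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)

// Script-specific recognizer, e.g. Chinese; similar option builders
// exist for Devanagari, Japanese, and Korean in their own artifacts.
val chineseRecognizer = TextRecognition.getClient(
    ChineseTextRecognizerOptions.Builder().build()
)

fun recognize(image: InputImage) {
    // The recognizer runs asynchronously and returns a Task<Text>.
    latinRecognizer.process(image)
        .addOnSuccessListener { result ->
            // result.text holds the full recognized string.
            println(result.text)
        }
        .addOnFailureListener { e -> e.printStackTrace() }
}
```

Since the question compares English-only results, only the Latin recognizer is relevant; the script-specific clients only help for non-Latin text.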

Julie Zhou
  • Is there an email or other location I can send the info to? – Joshua Augustinus Jul 27 '22 at 11:42
  • You can zip it and upload here. – Julie Zhou Jul 28 '22 at 14:28
  • https://drive.google.com/file/d/1Fj-7RXodIhpwoQkvtqL4rxnAysIBg7p7/view?usp=sharing – Joshua Augustinus Jul 28 '22 at 22:13
  • Also @Julie I've noticed differences between MLKit and the Google Lens app. Is this also something you are interested in or are those differences expected? – Joshua Augustinus Jul 28 '22 at 23:26
  • Thanks, we will take a further look at the code. The difference is expected: ML Kit is an on-device-only SDK, while Google Lens works with the cloud. Generally the server model gets better results, and the choice of SDK depends on your use case. – Julie Zhou Aug 01 '22 at 14:43
  • I've noticed a difference between Google Lens and Google Vision too. The Google Lens app can get the name Peter from this image: https://i.imgur.com/X2gMHE1.jpg But the Google Vision API picks up the noise. – Joshua Augustinus Aug 03 '22 at 02:50

I've noticed this too: ML Kit's text extraction requires the picture of the text to be aligned properly, as in exactly upright, whereas the Google Vision API can extract the text even when it is inverted.

The attached picture is of a tire's serial number that Google Vision can detect even when inverted; it doesn't work with ML Kit.

In real-world applications, it is not always possible to take pictures in the perfect alignment that ML Kit requires.
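
One possible workaround (my own sketch, not an official feature): ML Kit does accept a rotation hint when you build the InputImage, so if the orientation is unknown you can run recognition at all four 90-degree rotations and keep whichever yields the most text. The helper recognizeAnyOrientation below is hypothetical; it assumes the standard ML Kit Android API and a Bitmap source:

```kotlin
import android.graphics.Bitmap
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.latin.TextRecognizerOptions

val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)

// Hypothetical helper: try each 90-degree rotation and keep whichever
// result contains the most text. The recognizer tasks are asynchronous,
// so we count pending callbacks and report once all four have finished.
fun recognizeAnyOrientation(bitmap: Bitmap, onResult: (String) -> Unit) {
    var bestText = ""
    var pending = 4
    for (degrees in listOf(0, 90, 180, 270)) {
        // The second argument is the rotation hint in degrees.
        val image = InputImage.fromBitmap(bitmap, degrees)
        recognizer.process(image)
            .addOnSuccessListener { result ->
                if (result.text.length > bestText.length) bestText = result.text
            }
            .addOnCompleteListener {
                pending -= 1
                if (pending == 0) onResult(bestText)
            }
    }
}
```

This quadruples the on-device processing cost, so it is only worth it when the capture orientation genuinely cannot be controlled, as with the tire photo described above.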


Lekster